Zeyuan Hu's page

"Synthesizing Data Structure Transformations from Input-Output Examples"

  • The paper presents a method to synthesize functional programs that transform recursive data structures (e.g., lists, trees)

    • Examples: see Figure 6 (e.g., join, cprod)
    • essentially shows how one can orchestrate a list of operators to generate the desired program
  • Three techniques

    • type-aware inductive generalization
      • purpose: create hypotheses that represent some or all properties of the target program
        • can be open -> contains "holes"
        • can be closed -> if is consistent with examples, this is the program
      • input: inferred types from examples
        • use types -> prune the search space
      • output: a stream of hypotheses (closed and open) that match types
      • how:
        • For open hypothesis, application of a higher-order combinator (Figure 3) to set of (known or unknown) arguments (use inferred types of examples to guide the selection of combinator)
        • For close hypothesis, recursive procedure like enumerate search
    • deduction
      • purpose: solve unknown functions in hypotheses
      • Two techniques:
        • Refutation: use counter-example to reject hypotheses
        • Example inference: uses properties of combinators to infer new examples on unknown functions
    • best-first enumerate search
      • weighted BFS idea -> use priority queue
      • weight created from cost model: simple is better (avoid degenerate case: prog packed with if-blocks)
  • Main algorithm

    Synthesis procedure
    • a priority queue \(Q\) of subtasks \((e, f, \mathcal{E})\)
      • \(e\): a hypothesis
      • \(f\): a hole in hypothesis
      • \(\mathcal{E}\): a set of examples
    • pop the head of queue and obtain a subtask to work on
      • if \(e\) is closed, checking agains input examples
        • ok -> got a solution; discard otherwise
      • if \(e\) is open
        • synthesize \(f\) in \(e\)
          • infer type of \(f\) from \(\mathcal{E}\)
          • use inductive generalization to create a stream of hypotheses \(H\) for \(f\)
          • replace \(f\) with each \(h \in H\) to obtain a set of new hypotheses. Say one of them is \(e'\)
            • if \(e'\) is closed -> put \((e', \perp, \emptyset)\) as a subtask to queue
            • if \(e'\) is open
              • new subtask for each hole \(f^*\) in \(e'\)
              • uses deduction on \(f^*\) to create new examples \(\mathcal{E^*}\) or refute
              • add \((e', f^*, \mathcal{E^*})\) as a subtask if not rejected
comments powered by Disqus