LooPy: Interactive Program Synthesis with Control Structures

29
153 LooPy: Interactive Program Synthesis with Control Structures KASRA FERDOWSIFARD, UC San Diego, USA SHRADDHA BARKE, UC San Diego, USA HILA PELEG, Technion, Israel SORIN LERNER, UC San Diego, USA NADIA POLIKARPOVA, UC San Diego, USA One vision for program synthesis, and specifcally for programming by example (PBE), is an interactive pro- grammer’s assistant, integrated into the development environment. To make program synthesis practical for interactive use, prior work on Small-Step Live PBE has proposed to limit the scope of synthesis to small code snippets, and enable the users to provide local specifcations for those snippets. This paradigm, however, does not work well in the presence of loops. We present LooPy, a synthesizer integrated into a live programming environment, which extends Small-Step Live PBE to work inside loops and scales it up to synthesize larger code snippets, while remaining fast enough for interactive use. To allow users to efectively provide examples at various loop iterations, even when the loop body is incomplete, LooPy makes use of live execution, a technique that leverages the programmer as an oracle to step over incomplete parts of the loop. To enable synthesis of loop bodies at interactive speeds, LooPy introduces Intermediate State Graph, a new data structure, which compactly represents a large space of code snippets composed of multiple assignment statements and conditionals. We evaluate LooPy empirically using benchmarks from competitive programming and previous synthesizers, and show that it can solve a wide variety of synthesis tasks at interactive speeds. We also perform a small qualitative user study which shows that LooPy’s block-level specifcations are easy for programmers to provide. CCS Concepts: • Human-centered computing Graphical user interfaces;• Software and its engineer- ing Automatic programming; Programming by example. Additional Key Words and Phrases: Program synthesis, live programming ACM Reference Format: Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova. 2021. LooPy: Interactive Program Synthesis with Control Structures. Proc. ACM Program. Lang. 5, OOPSLA, Article 153 (October 2021), 29 pages. https://doi.org/10.1145/3485530 1 INTRODUCTION As software development environments continue to evolve, one feature that has long captured the community’s imagination is a łprogrammer’s assistantž, watching over the programmer’s shoulder and helpfully suggesting code snippets to solve small tasks that continuously arise during develop- ment. The assistant would allow the programmer to focus on the core algorithm instead of getting bogged down in low-level details or interrupting their fow to search for code online. In recent years, this dream seems to be within reach thanks to algorithmic advances in program synthesis and specif- ically programming by example (PBE) [Barke et al. 2020; Barman et al. 2015; Bavishi et al. 2019; Feng Authors’ addresses: Kasra Ferdowsifard, UC San Diego, San Diego, CA, USA, [email protected]; Shraddha Barke, UC San Diego, San Diego, CA, USA, [email protected]; Hila Peleg, Technion, Haifa, Israel, [email protected]; Sorin Lerner, UC San Diego, San Diego, CA, USA, [email protected]; Nadia Polikarpova, UC San Diego, San Diego, CA, USA, [email protected]. © 2021 Copyright held by the owner/author(s). 2475-1421/2021/10-ART153 https://doi.org/10.1145/3485530 Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021. This work is licensed under a Creative Commons Attribution 4.0 International License.

Transcript of LooPy: Interactive Program Synthesis with Control Structures

Page 1: LooPy: Interactive Program Synthesis with Control Structures

153

LooPy: Interactive Program Synthesis with Control Structures

KASRA FERDOWSIFARD, UC San Diego, USA

SHRADDHA BARKE, UC San Diego, USA

HILA PELEG, Technion, Israel

SORIN LERNER, UC San Diego, USA

NADIA POLIKARPOVA, UC San Diego, USA

One vision for program synthesis, and specifically for programming by example (PBE), is an interactive pro-

grammer’s assistant, integrated into the development environment. To make program synthesis practical for

interactive use, prior work on Small-Step Live PBE has proposed to limit the scope of synthesis to small code

snippets, and enable the users to provide local specifications for those snippets. This paradigm, however, does

not work well in the presence of loops. We present LooPy, a synthesizer integrated into a live programming

environment, which extends Small-Step Live PBE to work inside loops and scales it up to synthesize larger

code snippets, while remaining fast enough for interactive use. To allow users to effectively provide examples at

various loop iterations, even when the loop body is incomplete, LooPymakes use of live execution, a technique

that leverages the programmer as an oracle to step over incomplete parts of the loop. To enable synthesis of loop

bodies at interactive speeds, LooPy introduces Intermediate State Graph, a new data structure, which compactly

represents a large space of code snippets composed of multiple assignment statements and conditionals. We

evaluate LooPy empirically using benchmarks from competitive programming and previous synthesizers, and

show that it can solve a wide variety of synthesis tasks at interactive speeds. We also perform a small qualitative

user study which shows that LooPy’s block-level specifications are easy for programmers to provide.

CCS Concepts: •Human-centered computing→Graphical user interfaces; • Software and its engineer-

ing→Automatic programming; Programming by example.

Additional KeyWords and Phrases: Program synthesis, live programming

ACMReference Format:

Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova. 2021. LooPy: Interactive

Program Synthesis with Control Structures. Proc. ACM Program. Lang. 5, OOPSLA, Article 153 (October 2021),

29 pages. https://doi.org/10.1145/3485530

1 INTRODUCTION

As software development environments continue to evolve, one feature that has long captured thecommunity’s imagination is a łprogrammer’s assistantž, watching over the programmer’s shoulderand helpfully suggesting code snippets to solve small tasks that continuously arise during develop-ment. The assistant would allow the programmer to focus on the core algorithm instead of gettingbogged down in low-level details or interrupting their flow to search for code online. In recent years,this dream seems to be within reach thanks to algorithmic advances in program synthesis and specif-ically programming by example (PBE) [Barke et al. 2020; Barman et al. 2015; Bavishi et al. 2019; Feng

Authors’ addresses: Kasra Ferdowsifard, UC San Diego, San Diego, CA, USA, [email protected]; Shraddha Barke, UC

San Diego, San Diego, CA, USA, [email protected]; Hila Peleg, Technion, Haifa, Israel, [email protected]; Sorin

Lerner, UC San Diego, San Diego, CA, USA, [email protected]; Nadia Polikarpova, UC San Diego, San Diego, CA, USA,

[email protected].

© 2021 Copyright held by the owner/author(s).

2475-1421/2021/10-ART153

https://doi.org/10.1145/3485530

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Page 2: LooPy: Interactive Program Synthesis with Control Structures

153:2 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

1 def compress(s):

2 rs = ''

3 count = 1

4 last = s[0]

5 for c in s[1:]:

6 if c == last:

7 count += 1

8 else:

9 rs = rs + str(count) + last

10 count = 1

11 last = c

12 return rs + str(count) + last

(a) (b)

(c)

Fig. 1. (a) Motivating example: Python solution for String Compression. This program loops over the input

string s, updating the result rs and two auxiliary variables: last, the previous character from s, and count,

the length of the current run. (b) Prior work: Small-Step Live PBE. The user enters desired values for z into a

projection box, and the synthesizer replaces ??with len(x). (c) Our work: block-level synthesis with LooPy. The

user enters values for last, count, and rs, and the synthesizer replaces line 6 with the entire loop body.

et al. 2018; Lubin et al. 2020; Shi et al. 2019; So and Oh 2017;Wang et al. 2017]; despite these advances,however, the programmer’s assistant remains elusive.Where do existing PBE synthesizers fall short?

Big-step synthesis.Consider a Python programmerwhowants to solve the String Compression taskfrom the popular book Cracking the Coding Interview [McDowell 2015, p. 91]: Implement basic string

compression using the counts of repeated characters. For example, the string "aabccca"would become

"2a1b3c1a". Traditional PBE synthesizers adopt a big-step interaction model, where the programmerspecifies inputs and outputs at the function level, and the synthesizer is expected to generate theentire function body in one shot. In our example, the programmer might provide the input-outputexample "aabccca"→"2a1b3c1a", and expect the synthesizer to generate the body of the functioncompress in Fig. 1a. Unfortunately, this function contains several features that make it challengingfor program synthesis techniques to discover: it is relatively long and contains both library functioncalls and a loop. Although state-of-the-art PBE synthesizers such as FrAngel [Shi et al. 2019] arecapable in principle of generating programs of this complexity,1 they require minutes, not seconds,to do so, and the result can be unpredictable and sensitive to the provided examples, making big-stepsynthesizers unsuitable for the interactive setting of a programmer’s assistant.

Small-step synthesis. To enable program synthesis in interactive settings, a different line ofwork [Ferdowsifard et al. 2020; Galenson et al. 2014] has developed a small-step interaction model,where the synthesizerÐtypically integrated into the IDEÐis used to generate just the next line ofcode, and the programmer is expected to provide a local specification for that line. In particular, the

1We attempted to solve the String Compression task with FrAngel by providing up to nine input-output examples of varying

complexity, but it failed to find the right solution within a 30 minute timeout.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 3: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:3

interaction model we developed in [Ferdowsifard et al. 2020], dubbed Small-Step Live PBE, uses alive programming environmentÐan enviroment where the program state is continuously displayedÐnamed Projection Boxes [Lerner 2020]. With Small-Step Live PBE, the programmer may edit the livestate to specify desired values for a single output variable, prompting the synthesizer to generatean assignment to that variable; for example in Fig. 1b, when the programmer enters desired valuesfor z into the projection box for the two rows representing live values from the different invocationsof test, the synthesizer replaces the prompt ??with a generated expression len(x).An important advantage of this interaction model is that the programmer needs to specify only

the after-state for the synthesis problem, while the before-state is supplied by the live programmingenvironment from the live state before the update, seen on the left of the projection box. And becausethe code for each individual assignment is smaller and simpler than a full big-step solution, thesynthesizer can produce useful results in seconds, making it suitable for interactive use.

Unfortunately, Small-Step Live PBE would not be very helpful when solving the String Compres-sion task, because this task requires control structures (loops and conditionals). Intuitively, it is hardto specify code inside a loop one assignment at a time because of the dependent before-state prob-lem [Peleg et al. 2020]: the before-state of a given loop iteration depends on the code executed in theprevious loop iterations. In our example, suppose the programmer is in the middle of implementingcompress from Fig. 1a and is yet to write the entire else branch; if they now try to synthesize theassignment to rs in line 9 by providing the value of rs for the first few loop iterations, this will failbecause the right-hand side of this assignment uses the variables rs, count, and last, which all shouldbe updated in the loop; hence the synthesizer does not have access to the correct before-state forthese variables for later loop iterations.

Our approach: Block-Level Live PBE. To overcome this limitation and enable interactive synthesisin the presence of control structures, we propose a new block-level interaction model, which wecall Block-Level Live PBE, and implement this model in a synthesizer called LooPy.2 LooPy buildson Small-Step Live PBE, but allows the programmer to specify the after-state formultiple outputvariables at a time, prompting the synthesizer to generate a code block, i.e., a sequence of assignments,possibly including conditionals. In our running example, the programmer can use LooPy to generatethe entire loop body, by simultaneously specifying the values for last, count, and rs for the firstfive loop iterations, as shown in Fig. 1c. Given this input, LooPy generates the code on lines 6ś11of Fig. 1a in under two seconds. A video showing LooPy is found at https://youtu.be/EIWtF4BJpmo.The key technical insight that makes Block-Level Live PBE effective is live execution, a concept

inspired by the live evaluation of Lubin et al. [2020]. In live execution, the programmer performsthe natural act of providing variable values for loop iterations in order. In doing so, the programmeressentially serves as an interactive oracle to execute missing statements. As such, live execution canaccurately propagate the before-state through a loop, even when the loop is not yet complete, whichsolves the dependent before-state problem. Indeed, given the specification in Fig. 1c, LooPy can usethe programmer-provided after-state at iteration 0 as the before-state for iteration 1,3 and similarlyfor the next three iterations.

Although synthesizing the entire loop body at once is often convenient, we deliberately designedLooPy tobeflexiblewith respect to thegranularityof theblock. Inparticular, it also supports synthesiz-ing loopbodies one assignment at a time, aswell asmixinghand-written and synthesized code (as longas the output variables of each synthesis problem only depend on variables that are already correctlyupdated or are part of the same synthesis problem). For example, the programmer might manually

2The code for LooPy is found at https://github.com/KasraF/LooPy, and as a VM image at https://doi.org/10.5281/zenodo.

5459013.3Iteration counts are displayed in the ł#ž column of the projection box.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 4: LooPy: Interactive Program Synthesis with Control Structures

153:4 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

write the code that updates last and count, and then use LooPy to synthesize the assignment to rs online9, andLooPywill correctly computeanddisplay thebefore-stateof future loop iterationsby takinginto account both the specified after-state for rs and the hand-written code for the other variables.

Efficient synthesis of code blocks.The core technical challenge in building a block-level synthesizerlike LooPy is to make synthesis scale at interactive speeds to larger code snippets, such as the loopbody inFig. 1a (lines 6-11). Toavoidblindly enumeratingall sequencesof assignments,we leverageourblock-level interactionmodel: givenacompletebefore-andafter-states fora synthesisproblem,LooPycan enumerate single assignment subprograms considering all intermediate states that the programmight go through.We introduce a data structure called the Intermediate State Graph (ISG), which com-pactly represents all paths from the before-state to the after-state through those intermediate states.

Evaluation.We empirically evaluate the performance of our new synthesizer LooPy, and show thatit can handle a wide range of synthesis tasks at interactive speeds. Through a small-scale qualitativeuser study with five participants, we also evaluate the feasibility of providing block-level specifi-cations. In our study, participants used LooPy to solve two Python programming tasks that involvedloops. Our study shows that generally programmers are able to provide block-level specifications.

Contributions. In summary, this paper makes two main contributions:

(1) A new interaction model called Block-Level Live PBE, whose key technical insight is live ex-ecution, an approach that uses programmer input as an oracle to compute the before-state offuture loop iterations, even when the loop body is incomplete. Our user study demonstratesthe feasibility of this interaction model.

(2) A block-level program synthesis algorithm using Intermediate State Graphs. Our empiricalevaluation shows that this algorithm can synthesize a variety of code blocks using smallspecifications and in interactive times (seconds).

2 MOTIVATING EXAMPLE

In this section we illustrate the interaction model and the inner workings of LooPy using the StringCompression task frominFig. 1 as the runningexample.Weassume thatouruser is anexperiencedpro-grammer, but does not use Python very often; hence they have a high-level idea of the algorithm theywant to implement, but theywould like to use program synthesis to figure out the details at every step.

Prior work: Small-Step Live PBE.One paradigm for integrating synthesis into the programmer’sworkflow is our recent work on Small-Step Live PBE [Ferdowsifard et al. 2020], an approach thatcombines live programming in the Projection Boxes environment [Lerner 2020] with program-ming by example. Small-Step Live PBE allows the user to edit the live state of the program after amissing assignment statement, prompting the synthesizer to generate a statement that modifiesthe state accordingly. Fig. 1b shows a simple example: the user wants to compute the length ofthe list x, but they have forgotten the name of the corresponding Python function. To invoke thesynthesizer, they introduce a hole, z = ??, which spawns a projection box where the output variablez can be edited. The user might provide the value for z in the first line only; this gives rise to asynthesis task that asks to transform the before-state 𝜎start= {x ↦→[1,2,3],z ↦→⊥} into the after-state𝜎end = {x ↦→ [1,2,3],z ↦→ 3}. If the user specifies z in both lines, the synthesis task becomes a setof examples (before-after pairs) 𝜎0

start→𝜎0end

,𝜎1start→𝜎1

end. Given this specification, the synthesizer

generates the assignment z = len(x), which replaces the hole.The Small-Step Live PBE interaction model has two important properties. First, the user only

needs to provide the after-state 𝜎end for each example; the before-state 𝜎start is computed by the liveprogramming environment simply by executing the program up until the point of the hole. This saves

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 5: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:5

user effort, since they need notmanually construct relevant before-states from themiddle of an execu-tion. Second, the user’s expectation of the synthesizer is that once the hole is replaced by the synthesisresult, the program execution will indeed pass through the exact after-states they had specified.

Challenge: loops.These properties become non-trivial to realize in the presence of loops. Going backto the compress function in Fig. 1a, if the user wants to invoke Small-Step Live PBE inside the loop(say, to help with the assignment on line 9) the projection box cannot display an accurate before-statefor any loop iteration beyond iteration 1, because that would require executing the loop body, whichhas not yet been completed. While in principle it is possible to ask the user to provide after-statesfor arbitrary before-states, this violates the user’s mental model of Live PBE: that they are specifyingstates that are a part of a single program execution, and the synthesized programs will actually passthrough those states.

Our solution: Block-Level Live PBE. To extend the Live PBE paradigm to work in the presenceof loops, we propose a new interaction model we dub Block-Level Live PBE and implement thismodel in a synthesizer called LooPy. In our compress function in Fig. 1a, LooPy can synthesize theentire block of code inside the for loop, lines 6ś11. In the rest of the section we explain how LooPy

synthesizes this code, gradually building up to it through a series of smaller-scale synthesis problemsfor different fragments of the compress function. We start with explaining how LooPy synthesizesa single assignment, then multiple assignments, and finally the whole conditional.

2.1 Handling Loops with Live Execution

1 def compress(s):

2 rs = ''

3 count = 1

4 last = s[0]

5 for c in s[1:]:

6 if c == last:

7 count += 1

8 else:

9 rs = ??

10 count = 1

11 last = c

To start, we explain how our approach works for a single assign-ment. Consider for example the setting where the user has alreadyfigured out how to update the auxiliary variables count and last,and only needs help with appending to rs. In other words, the userstarts with a sketch shown on the right, where the hole rs = ?? online 9 indicates the statement they would like to synthesize. (Thisprogram is a prefix of the full solution in Fig. 1a; we assume thatthe user will add the return statement later by hand).

Live execution.To enable Live PBE in the presence of loops, LooPyperforms live execution (inspired by the live evaluation of Lubinet al. [2020]), where the programmer serves as an oracle to executemissing statements and accuratelypropagate the before-state through the sketch.In our example, when the user first invokes LooPy, they see the projection box in Fig. 2a and are

prompted to enter the output value for rs in iteration 1.4 Because the execution until that point hasnot encountered any holes, the projection box displays an accurate before-state for this iteration5:

𝜎0start= {c ↦→'b',rs ↦→'',count ↦→2,last ↦→'a'}

Once the user enters the desired value '2a' for rs, LooPy uses it as an oracle to łjump overž themissing assignment and compute the state after line 9 as (with the modified part highlighted):

𝜎0end= {c ↦→'b', rs ↦→'2a' ,count ↦→2,last ↦→'a'}

Starting from this state,LooPy executes the rest of the loop body, to compute the accurate before-statefor the next time the execution encounters the hole (in iteration 2):

𝜎1start= { c ↦→'c' ,rs ↦→'2a', count ↦→1 , last ↦→'b' }

4Iteration 0 is missing from this box, since it takes the other branch of the conditional and hence does not execute the hole.5In this section we omit s from the states for brevity, since it does not change.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 6: LooPy: Interactive Program Synthesis with Control Structures

153:6 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

(a) Before the first iteration (b) After the first iteration (c) The complete specification

Fig. 2. Specifying after-states for the hole rs = ?? in LooPy. Note that the projection box only displays those

iterations that execute the hole (1, 3, and 5).

At this point the user is prompted to enter the after-state for iteration 2, as shown in Fig. 2b, whichagain can be used as an oracle to continue live execution and further update the projection box (seeFig. 2c). After any number of iterations, the programmer can decide to stop and invoke the synthesizerwith the specifications provided so far.

Thanks to live execution, the two desirable properties of Live PBE are preserved inside the loop:(1) the before-states like 𝜎0

start, 𝜎1start are supplied by the environment, and (2) once synthesis is

complete, program execution passes through the states specified by the user. From the UI standpoint,live execution is supported by enforcing that the user specify after-states for each evaluation of thehole in order ; for example, in Fig. 2b the user cannot skip iteration 2 and proceed to enter the valuefor iteration 5, because the before-state in the latter iteration is not yet accurate.

Synthesis. Live execution reduces the sketch and the user input from the projection box into a localsynthesis problem, defined simply as a set of before-after pairs. For example, user input from Fig. 2cgenerates the following pairs:

𝜎0start= { c ↦→'b',rs ↦→'', 𝜎0

end= { c ↦→'b', rs ↦→'2a' ,

count ↦→2,last ↦→'a'} count ↦→2,last ↦→'a'}

𝜎1start= { c ↦→'c',rs ↦→'2a', 𝜎1

end= { c ↦→'c', rs ↦→'2a1b' ,

count ↦→1,last ↦→'b'} count ↦→1,last ↦→'b'}

𝜎2start= { c ↦→'a',rs ↦→'2a1b', 𝜎2

end= { c ↦→'a', rs ↦→'2a1b3c' ,

count ↦→3,last ↦→'c'} count ↦→3,last ↦→'c'}

Given this specification, the synthesizer takes less than a second to generate the expressionrs + str(count) + last for the right-hand side of the hole rs = ??, and LooPy replaces the statementwith a pretty-printed version: rs += str(count) + last.

To solve the local synthesis problem, LooPy uses a popular synthesis technique called bottom-up

enumeration with Observational Equivalence reduction [Albarghouthi et al. 2013; Udupa et al. 2013].In this technique, an expression enumerator gradually builds more and more complex expressions bycomposing previously enumerated simpler expressions; all expressions are evaluated on the set ofbefore-states 𝜎𝑖start, and expressions with the same output are pruned.

2.2 Synthesizing Assignment Sequences with Intermediate State Graphs

Next we explain how our approach works for blocks of statements. In our running example, theorder of assignments in the else branch of compress can be tricky to get right, so it would benice to delegate the entire else branch to the synthesizer. In LooPy the user can achieve thisby writing a hole with multiple output variablesÐlast, count, and rsÐas shown in Fig. 3. This

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 7: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:7

Fig. 3. Specifying after-states for a hole with multiple variables in LooPy.

9 rs += str(count) + last

10 count = 1

11 last = c

spawns a projection box that prompts the user to enter valuesfor all three output variables, at each loop iteration. Given theuser input in Fig. 3, LooPy again takes less than a second togenerate the sequence of assignments shown on the right.

While live execution extends straightforwardly to multi-variable holes, the challenge is to extendthe synthesis back-end to synthesize the correct assignment order while maintaining interactivespeeds. Fortunately, because in our setting the entire before- and after-states are given, we candecompose the synthesis of a sequence of assignments into sub-problems for individual assignments.Specifically, consider the before-after pair for iteration 1 in Fig. 3:

𝜎start= {c ↦→'b',last ↦→'a',count ↦→2,rs ↦→''}

𝜎end= {c ↦→'b', last ↦→'b' , count ↦→1 , rs ↦→'2a' }

A sequence of three assignments that transforms 𝜎start into 𝜎end must pass through two intermediatestates. Let us first assume that the order of assignments is given (first rs, then count, then last), andthe synthesizer only needs to generate their right-hand sides. In this case, the two intermediatestates are fixed: the first one is the state where only rs is updated and the rest of the variables havethe same values as in 𝜎start (we denote it 𝜎{rs}); similarly, the second state is the one where only rs

and count are updated (we denote it 𝜎{rs,count}). Hence our synthesis problem has been reduced tothree independent sub-problems, 𝜎start →𝜎{rs} , 𝜎{rs} →𝜎{rs,count} , and 𝜎{rs,count} →𝜎end, each onlyrequiring an assignment to a single variable.

Intermediate State Graph.Of course, in reality the order of assignments is not given: this is whatthe user needed help with in the first place! The bad news is that for a hole with 𝑛 output variablesthere are 𝑛! possible assignment orders to consider, but the good news is that the synthesizer neednot enumerate them all explicitly, because different orders share intermediate states and assignmentsub-sequences. For example, both the assignment order last, rs, count and the assignment orderlast, count, rs pass through the intermediate state 𝜎{last} , and likewise both rs, last, count and last,rs, count pass through 𝜎{rs,last} .

To take advantage of these shared states, we propose a new data structure we dub an Intermediate

State Graph (ISG). The ISG for our running example is shown in Fig. 4. The nodes in this graph are allthe relevant states of a synthesis problemÐ𝜎start, 𝜎end, and the intermediate states 𝜎𝑋 for all subsets𝑋 ⊂𝑉 of the output variables; the edges in this graph connect a state to all stateswith exactly onemoreupdated variable.6 Each ISG node except 𝜎end holds a separate bottom-up expression enumerator.When the enumerator at the ISG node 𝜎{rs} encounters the expression c, this expression gets

evaluated in the state 𝜎{rs} and tested as a possible assignment for all outgoing edges from that node.

6For simplicity, here we ignore Python’s simultaneous assignment statements, which update multiple variables at a time. In

Sec. 4 and Sec. 5.3 we show how LooPy synthesizes such assignments.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 8: LooPy: Interactive Program Synthesis with Control Structures

153:8 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

𝜎𝑠𝑡𝑎𝑟𝑡𝜎{rs}𝜎{count}𝜎{last}

𝜎{rs,count}𝜎{count,last}𝜎{rs,last}

𝜎𝑒𝑛𝑑rs = ?

rs = ?count = ?

count = ?

last = ?

Fig. 4. Intermediate State Graph for synthesizing rs, last, and count.

In our example, the assignment last = c transforms 𝜎{rs} into 𝜎{rs,last} , hence we label the outgoingedge last from𝜎{rs} with this program andmark the edge solved. A solution to the synthesis problemis foundwhen there is a path from𝜎start to𝜎end along solved edges. For this example, we get a solutionby traversing the path 𝜎start,𝜎{rs},𝜎{rs,count},𝜎end.

2.3 Synthesizing Conditional Statements

6 if c == last:

7 count += 1

8 else:

9 rs += str(count) + last

10 count = 1

11 last = c

As a final challenge, assume the user now wants thesynthesizer to generate the entire loop body, by insertingthe samemulti-variable hole immediately after the loopheader and providing after-states for the first five itera-tions of the loop, as shown in Fig. 1c. Again, in less than asecond, LooPy replaces the hole with the code shown onthe right. To understand how LooPy generates this codeso quickly, the final missing piece is efficient synthesis of conditionals.

Our technique for synthesizing conditionals is inspired by the divide-and-conquer approach fromEUSolver [Alur et al. 2017], but is adapted to the context of ISGs. As an example, consider a simplifiedsynthesis problem defined by the before-after pairs 𝜖0 and 𝜖1 from the first two loop iterations inFig. 1c, where 𝜖𝑖 =𝜎𝑖start→𝜎𝑖

endfor 𝑖 =0,1 and

𝜎0start= { c ↦→'a',rs ↦→'', 𝜎0

end= { c ↦→'a',rs ↦→'',

count ↦→1,last ↦→'a'} count ↦→2 ,last ↦→'a'}

𝜎1start= { c ↦→'b',rs ↦→'', 𝜎1

end= { c ↦→'b', rs ↦→'2a' ,

count ↦→2,last ↦→'a'} count ↦→1 , last ↦→'b' }

The synthesizer has to decide whether to generate a single assignment sequence that satisfies both ofthese examples, or to synthesize two branches, where the first one satisfies 𝜖0 and the second onesatisfies 𝜖1; more generally, with 𝑘 examples, the synthesizer needs to guess the right partitioning ofthese examples into the two branches. LooPy is able to consider all possible partitions simultaneouslyusing the following modification to the ISG.Consider again the ISG in Fig. 4, but now imagine that every node is associated with a vector

of states (one for each example 𝜖0 and 𝜖1). Let us focus on the transition between 𝜎start and 𝜎{count} .Instead of a single edge between these nodes that satisfies both examples, LooPy now constructstwo edges, one for each partition of the example set: ⟨{𝜖0,𝜖1},∅⟩ and ⟨{𝜖0},{𝜖1}⟩. The edge label nowcontains two separate solutions for count: one for the then branch of the conditional and one for theelse branch. Whenever a new expression is enumerated at the node 𝜎start, it is tested as a potentialsolution for both of the cases on each of the two edges: for example, the expression count + 1 is

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 9: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:9

𝑝 ::=𝑠 atomic statement|𝑥 =?? hole|𝑝 ; 𝑝 sequential composition| if 𝑒 : 𝑝 else : 𝑝 conditional| for 𝑥 in 𝑒 : 𝑝 for-loop

Atom⟦𝑠⟧(𝜎) =𝜎′

[𝑠 ]O (𝜎) =𝜎→𝑠 𝜎′Hole

[ℎ]O (𝜎) =𝜎→ℎ O(𝜎)

Seq

[𝑝1 ]O (𝜎0) =𝜎0 ...→𝑠1 𝜎1

[𝑝2 ]O (𝜎1) =𝜎1 ...→𝑠2 �̂�2

[𝑝1 ; 𝑝2 ]O (𝜎0) =𝜎0 ...→𝑠1 𝜎1 ...→𝑠2 �̂�2

SeqBot[𝑝1 ]O (𝜎0) =𝜎0 ...→ℎ⊥

[𝑝1 ; 𝑝2 ]O (𝜎0) =𝜎0 ...→ℎ⊥

Fig. 5. (Left) The syntax of sketches 𝑝 in LooPy; programs 𝑝 use the same grammar excluding holes. (Right)

Definition of a live trace [𝑝]O (𝜎) of a sketch 𝑝 starting from store 𝜎 using a live execution oracle O; rules forconditionals and loops are omitted for brevity.

correct for 𝜖0 but not 𝜖1, so we save it on the edge ⟨{𝜖0},{𝜖1}⟩, in the then case; the expression 1 iscorrect for 𝜖1 but not 𝜖0, so we save it on the same edge but in the else case.Finally, to synthesize the guard expression for the conditional, the 𝜎start node of the ISG also

accumulates boolean expressions that partition the examples into all possible partitions. To constructthe synthesis result, we now require the ISG to have a path from 𝜎start to 𝜎end via edges that belong tothe same partition, and a boolean expression that matches that partition.

3 FROMLIVE PBE TOBLOCK-LEVEL SYNTHESIS

In this section, we define two forms of synthesis tasks: a live PBE task, specified by a LooPy user,and a block-level synthesis task, solved by the the LooPy synthesis engine. We also formalize liveexecution as the mechanism that reduces the former task to the latter.We define our synthesis tasks over a subset of Python shown in Fig. 5 (for now ignore the hole

statement, which can appear in sketches but not in programs). The structure of expressions 𝑒 andatomic statements𝑠 is irrelevant for purposes of this section, and therefore omitted from thefigure.Weassume, however, that they are equipped with semantics ⟦𝑒⟧ : Store→Val and ⟦𝑠⟧ : Store→Store

(whereVal is the set of values and Store is the set of stores𝜎 thatmap variables to values).We combinethe semantics of atomic statements with the standard behavior of control structures to define aprogram trace [𝑝] (𝜎0)=𝜎0→𝑠1 𝜎1→𝑠2 ···→𝑠𝑛 𝜎𝑛 as the sequences of stores 𝜎𝑖 and atomic statements𝑠𝑖 that a program execution starting at 𝜎0 goes through, such that 𝜎𝑖 =⟦𝑠𝑖⟧(𝜎

𝑖−1).

Sketches and live exectuion. A sketch 𝑝 is a program with exactly one hole statement 𝑥 = ??,where 𝑥 denotes one or more program variables, called the output variables of the hole. We use themeta-variableℎ to range over holes and 𝑠 to range over the union of atomic statements and holes.A live execution oracle is a function O that, given a store 𝜎 , returns either a new store 𝜎 ′ or ⊥.

Note that the oracle need not take the hole as input since a sketch has only one hole (we do assume,however, that the oracle is specific to a sketch). We also assume that O only updates the outputvariables of the hole, i.e., for a sketch with the hole 𝑥 =?? and for any store 𝜎 and program variable𝑦 ∉𝑥 , O(𝜎) (𝑦)=𝜎 (𝑦).

Using the oracle, we define a live trace [𝑝]O (𝜎0) of a sketch 𝑝 starting in the store𝜎0. A live trace is

a sequence 𝜎0→𝑠1 𝜎1→𝑠2 ···→𝑠𝑛 �̂�𝑛 , where the last state �̂�𝑛 can be either a store 𝜎𝑛 or⊥. Live tracesare defined using rules in Fig. 5 (right). The differences from regular program traces are capturedin the rulesHole, which uses the oracle to execute a hole statement, and SeqBot, which suspendslive execution once the oracle returns⊥ (a similar suspension happens in the rule for loops, whichis omitted for brevity). We say that a program trace 𝑡 refines a live trace 𝑡 , written 𝑡 ≺ 𝑡 , if 𝑡 can beobtained from 𝑡 by replacing every step of the form 𝜎→ℎ �̂� ′ with a trace 𝜎→𝑠1 ...→𝑠𝑛 𝜎 ′′, where

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 10: LooPy: Interactive Program Synthesis with Control Structures

153:10 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

either �̂� ′=𝜎 ′′ or �̂� ′

=⊥. In other words, 𝑡 passes through the same stores as 𝑡 , except that oracle stepscan be replaced with multiple atomic steps, and if 𝑡 was suspended, 𝑡 instead continues execution.With these preliminaries, we can define the live PBE task:

Definition 1 (Live PBE task). A live PBE task is a triple ⟨𝑝,O,𝜎0⟩ of a sketch 𝑝 , a live execution oracleO, and an initial store 𝜎0. A solution to a synthesis task is a program 𝑝 such that

1) ∃𝑝∗ .𝑝 =𝑝 [𝑝∗/ℎ], i.e., 𝑝 is the sketch with its hole replaced by some program 𝑝∗; and2) [𝑝] (𝜎0) ≺ [𝑝]O (𝜎

0), i.e., the execution trace of 𝑝 refines the live trace of the sketch.

The two conditions above capture the syntactic and semantic expectations of a live PBE user,respectively. The first condition prevents the synthesizer from replacing any part of the sketchother than the hole. The second condition requires the synthesized program to behave like the liveexecution for as long as possible (until the point where the latter was suspended). For simplicity,we define a live PBE task using a single initial store 𝜎0; the definition can be easily generalized tomultiple initial stores; note, however, that if the hole is inside a loop, even a single initial store canlead to multiple occurrences of the hole in the live trace, and hence multiple invocations of the oracle.

The LooPy UI. In LooPy, the user provides the sketch 𝑝 and the initial store 𝜎0 (see e.g., the callcompress('aabccca') in line 12 in Fig. 3), and also serves as the live execution oracle O. The LooPyUI incrementally builds the live trace [𝑝]O (𝜎

0) by querying the oracle via projection boxes, as shownin Fig. 2. More precisely, LooPy first builds the prefix 𝜎0···→𝑠1 𝜎1→ℎ of the live trace until it firstencounters the hole; it then displays the store 𝜎1 in the left-hand side of the projection box, andprompts the user to enter new values for the output variables ofℎ in the right-hand side. If the userpresses łenterž, O(𝜎1) is taken to be⊥, and the live trace is complete; otherwise execution continueswith the updated store until it either terminates or encounters the hole again (in a later loop iteration),which triggers another query to the oracle (in the next line of the projection box).

Block-level synthesis.Given a live PBE task ⟨𝑝,O,𝜎0⟩, we can now use its live trace, generated bythe LooPyUI, to derive a simpler, local specification for 𝑝’s holeℎ, whichwe refer to as the block-levelsynthesis task. The LooPy synthesizer can then simply solve this local task, ignoring the fact that theoriginal sketch might have contained a loop.

Definition 2 (Block-level synthesis task). A block-level synthesis task is defined by the set ofexamples E, where each example 𝜖 is a pair of stores 𝜎start→𝜎end. A solution to a block-level task is aprogram 𝑝∗ with no holes or loops, such that for every 𝜎start→𝜎end ∈E, [𝑝

∗] (𝜎start)=𝜎start...→𝑠 𝜎end.

To reduce the live PBE task ⟨𝑝,O,𝜎0⟩ to a block-level task, we define E = {𝜎 → 𝜎 ′ | 𝜎 →ℎ 𝜎 ′ ∈[𝑝]O (𝜎

0)}. It is easy to show by comparing the definitions of the two synthesis tasks, that if 𝑝∗ is asolution to the block-level task E, then 𝑝 [𝑝∗/ℎ] is a solution to original live PBE task ⟨𝑝,O,𝜎0⟩.The following two sections will address the synthesis of loop-free blocks of code as a solution

to the block-level synthesis task. For the sake of presentation, Sec. 4 focuses on straight-line codeblocks (sequences of assignments); then Sec. 5 extends the synthesis algorithm to conditionals.

4 SYNTHESIZING SEQUENCESOF ASSIGNMENTS

In this section, we describe LooPy’s block-level synthesis algorithm for straight-line programs, i.e.,when the solution 𝑝∗ is drawn from the following grammar:

𝑝 ::= 𝑥 =𝑒 | 𝑝 ; 𝑝

Here 𝑥 =𝑒 is Python’s simultaneous assignment statement, which has one or more variables on theleft and the same number of expressions on the right. The semantics of a simultaneous assignment isto first evaluate each 𝑒𝑖 in the current store and then to assign its value to the corresponding 𝑥𝑖 . Forexample, executing x, y = x + x, x + x + x in the store {x ↦→1,y ↦→1} yields the store {x ↦→2,y ↦→3}.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 11: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:11

We begin our exposition in Sec. 4.1 with the simplest version of the algorithm, where the synthesistask is restricted to a single example (|E |=1), and the solution is restricted to a single (simultaneous)assignment. Sec. 4.2 then extends the algorithm to generate a sequence of assignments, and Sec. 4.3extends it to handle multiple examples.

Expression enumerators.All variants of LooPy’s synthesis algorithm require enumerating expres-sions 𝑒 , used on the right-hand side of assignments (and in the guards of conditionals later on). Tothis end, LooPy relies on an existing algorithm called bottom-up enumeration with Observational

Equivalence reduction [Albarghouthi et al. 2013; Udupa et al. 2013]. Bottom-up enumeration graduallybuilds up a bank of larger and larger expressions, by combining sub-expressions that are already inthe bank. Observational Equivalence (OE) speeds up this process by evaluating every expression ona given set of inputs and only retaining one expression per result in the bank.

For the purposes of Sec. 4.1ś4.2, we can think of an expression enumerator as a black-box functionEnum(𝜎), which is parameterized by a store,7 and produces a (possibly infinite) streamof expressions𝑒0,𝑒1,.... The reason an enumerator takes 𝜎 as a parameter is twofold: first, it uses the store’s domain,Vars(𝜎), as the set of free variables in the expressions it builds; second, it uses 𝜎 to evaluate theexpressions for the purposes ofOE reduction.OE reduction guarantees that the enumeratorEnum(𝜎)is complete on 𝜎 , that is, for any value 𝑣 , if there exists an expression 𝑒 such that ⟦𝑒⟧(𝜎)=𝑣 , then a(possibly different) expression 𝑒𝑖 such that ⟦𝑒𝑖⟧(𝜎)=𝑣 will eventually be enumerated; in other words,OE discards programs but never discards distinct evaluation results on 𝜎 .When LooPy solves a synthesis task 𝜎start → 𝜎end and uses an enumerator to synthesize a sub-

expression 𝑒 of a program 𝑝 , it is crucial that the enumerator be initialized with the store 𝜎 in which 𝑒

will be evaluated during the execution of 𝑝 on𝜎start. Depending onwhere 𝑒 is located inside 𝑝 ,𝜎 mightor might not be equal to 𝜎start. Passing a wrong store to the enumerator leads to incompleteness: wecan no longer assume that if an expression with the required value exists, it will be enumerated. Forthis reason, LooPy often needs to create multiple enumerators with different stores.

4.1 Single Example, Single Assignment

We begin by examining the case where E= {𝜖} and the target program contains a single assignment.We will denoteMod𝜖 the set of variablesmodified by an example 𝜖 =𝜎start→𝜎end, i.e., those variables𝑥 where 𝜎end (𝑥)≠𝜎start (𝑥). The solution to our block-level synthesis task is hence of the form 𝑥 =𝑒 ,where the left-hand side contains exactly the variables inMod𝜖 .

Synthesis algorithm. The synthesis algorithm maintains a variable assignment A, which mapseach variable inMod𝜖 to an expression or⊥; the assignment is initialized by settingA(𝑥)=⊥ forevery 𝑥 ∈Mod𝜖 . The algorithm then creates a single expression enumerator Enum(𝜎start). In eachiteration, it draws another expression 𝑒𝑖 from the enumerator stream, and for every unassignedvariable 𝑥 (such thatA(𝑥)=⊥), it tests whether 𝑒𝑖 is valid for 𝑥 , i.e., whether ⟦𝑒𝑖⟧(𝜎start)=𝜎end (𝑥); ifso, the algorithm updatesA(𝑥) to 𝑒𝑖 . WhenA is complete (i.e.,A(𝑥)≠⊥ for every 𝑥 ), the algorithm

terminates and returns 𝑥 =A(𝑥) as the solution to the block-level synthesis task. Note that a singleexpression enumerator is sufficient in this case because all right-hand sides of the simultaneousassignment 𝑥 =𝑒 are evaluated in the same store 𝜎start.

Example 1. Consider the specification 𝜎start = {x ↦→ 1,y ↦→ 1} and 𝜎end = {x ↦→ 2,y ↦→ 3}, makingMod𝜖 = {x,y}. We first initializeA= {x ↦→⊥,y ↦→⊥} and create an enumerator Enum({x ↦→1,y ↦→1}).After several iterations, the enumerator yields the expression x + x, which evaluates to 2. Thismatches 𝜎end (x), so we setA(x) to x + x. Next, the enumerator yields x + x + x, which evaluates

7In Sec. 4.3 we will generalize the notion of expression enumerators from a single store to a vector of stores.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 12: LooPy: Interactive Program Synthesis with Control Structures

153:12 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

𝑥 ↦ 1𝑦 ↦ 1𝑥 ↦ 1𝑦 ↦ 3𝑥 ↦ 2𝑦 ↦ 1

𝑥 ↦ 2𝑦 ↦ 3

𝑁𝑠𝑡𝑎𝑟𝑡

𝑁𝑒𝑛𝑑𝑁{𝑥} 𝑁{𝑦}x = ⊥

x = ⊥y = ⊥

x,y = ⊥,⊥

y = ⊥

(a) The initial ISG

𝑥 ↦ 1𝑦 ↦ 1𝑥 ↦ 1𝑦 ↦ 3𝑥 ↦ 2𝑦 ↦ 1

𝑥 ↦ 2𝑦 ↦ 3

𝑁𝑠𝑡𝑎𝑟𝑡

𝑁𝑒𝑛𝑑𝑁{𝑥} 𝑁{𝑦}x = x + x

x = x + x

y = ⊥y = x + y

x,y = x + x,⊥

(b) The ISG with a solution

Fig. 6. The Intermediate State Graph for Ex. 1. The path of complete edges is marked in red.

to 3, matching 𝜎end (y). At this point,A = {x ↦→ x + x,y ↦→ x + x + x} is complete, so the algorithmterminates and returns the simultaneous assignment statement:

x, y = x + x, x + x + x

4.2 Single Example, Sequence of Assignments

The synthesis task described above has an even simpler solution if instead of a single simultaneousassignment we perform two assignments in sequence: x = x + x; y = x + y. We can synthesize thissolution by decomposing the overall synthesis task into two single-assignment sub-tasks:𝜎start→𝜎{x} ,which transforms the start state into an intermediate state where only x has been updated, and𝜎{x}→𝜎end, which transforms the intermediate state into the end state. Each sub-task can then besolved independently using the algorithm from Sec. 4.1. Since the order and grouping of assignmentsin the solution 𝑝∗ is not known a-priori, the algorithm has to consider decomposing the problemusing all intermediate states that 𝑝∗ could possibly pass through. If we assume that 𝑝∗ assigns eachvariable only once, there is exactly one intermediate state for each non-empty, strict subset ofMod𝜖 .Formally, for each𝑋 ⊂Mod𝜖 , let a partially-updated state 𝜎𝑋 be the state where only those variablesin𝑋 have been updated:

𝜎𝑋 = {𝑥 ↦→𝜎end (𝑥) |𝑥 ∈𝑋 } ∪ {𝑥 ↦→𝜎start (𝑥) |𝑥 ∈Mod𝜖 \𝑋 }

Note that 𝜎∅=𝜎start and 𝜎Mod𝜖 =𝜎end, and all other partially-updated states are intermediate states.

Example 2. In Ex. 1, there are two possible intermediate states to consider: 𝜎{x} = {𝑥 ↦→2,𝑦 ↦→1}and 𝜎{y} = {𝑥 ↦→ 1,𝑦 ↦→ 3}. A solution 𝑝∗ can transition from 𝜎start to 𝜎end through 𝜎{x} , where x ismodified first, through 𝜎{y} , where y is modified first, or directly, in a simultaneous assignment.

Intermediate State Graph.We can compactly represent the space of all possible solutions to asynthesis task using a DAGwhose nodes are partially-updated states and whose edges are single-assignment synthesis sub-tasks. We dub this data structure an Intermediate State Graph.

Definition 3 (Intermediate State Graph (ISG)). Given a synthesis task {𝜖}= {𝜎start→𝜎end}, its ISG isa directed acyclic graph where:(1) there is a node 𝑁𝑋 for each𝑋 ⊂Mod𝜖 , which represents the partially-updated state 𝜎𝑋 ;(2) there is an edge (𝑁𝑋 ,𝑁𝑋 ′) iff𝑋 ⊊𝑋 ′;(3) each edge (𝑁𝑋 ,𝑁𝑋 ′) is labeled with a variable assignmentA (𝑋,𝑋 ′) , whose domain is𝑋 ′\𝑋 .

Example 3. Fig. 6 depicts the ISG for Ex. 1 with different variable assignmentsA𝐸 on the edges: onthe left all the assignments are empty; on the right, some (but not all) the assignments are complete.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 13: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:13

Synthesis algroithm. The synthesis algorithmmaintains an ISG, where the assignment for eachedge 𝐸 is initialized with A𝐸 (𝑥) = ⊥ for every 𝑥 in its domain. The algorithm then creates anexpression enumerator Enum(𝜎𝑋 ) for each ISG node𝑁𝑋 except the final node𝑁end. In each iteration,the algorithm draws an expression 𝑒𝑖 from an enumerator at some node 𝑁𝑋 . For every outgoingedge (𝑁𝑋 ,𝑁𝑋 ′) and for every unassigned variable 𝑥 on that edge (such thatA (𝑋,𝑋 ′) (𝑥)=⊥), it testswhether 𝑒𝑖 is valid for 𝑥 on that edge, i.e., whether ⟦𝑒𝑖⟧(𝜎𝑋 ) =𝜎𝑋 ′ (𝑥); if so, the algorithm updatesA (𝑋,𝑋 ′) (𝑥) to 𝑒𝑖 . When all variables on an edge are assigned (i.e.,A (𝑋,𝑋 ′) (𝑥) ≠⊥ for every 𝑥), theedge (𝑁𝑋 ,𝑁𝑋 ′) is marked complete. The algorithm terminates when there is a path from 𝑁start to𝑁end via complete edges. The sequence of assignments along the path is the solution to the synthesisproblem.

Example 4. To solve the synthesis problem defined in Ex. 1, we first initialize the ISG as shown inFig. 6a. We then create three expression enumerators, one each in 𝑁start, 𝑁 {𝑥 } , and 𝑁 {𝑦 } .Consider the iteration of the algorithmwhere we query the enumerator 𝑁start, and it yields the

expression x + x. This expression, which evaluates to ⟦x + x⟧(𝜎start)=2, is then tested for validityfor every variable on each of the three outgoing edges from 𝑁start:

− (𝑁start,𝑁 {𝑥 }), variable 𝑥 : 𝜎{x} (𝑥)=2, soA (start,{𝑥 }) (𝑥) is set to x + x.− (𝑁start,𝑁 {𝑦 }), variable𝑦, 𝜎{y} (𝑦)≠2, soA (start,{𝑦 }) (𝑦) remains⊥.− (𝑁start,𝑁end), variables 𝑥 and𝑦: 𝜎end (𝑥)=2 but 𝜎end (𝑦)≠2, soA (start,end) (𝑥) is set to x + x, and

A (start,end) (𝑦) remains⊥.After this iteration, the edge (𝑁start,𝑁 {𝑥 }) is complete, but the two other outgoing edges are not.

At a later iteration, we encounter the expression x + y at a different node, 𝑁 {𝑥 } . This expression,which evaluates to ⟦x + y⟧(𝜎{x})=3 is tested for validity for the variable𝑦 of the sole outgoing edge(𝑁 {𝑥 },𝑁end) from 𝑁 {𝑥 } . Since 𝜎end (𝑦)=3,A ( {𝑥 },end) (𝑦) is set to x + y, and this edge is now complete.Moreover, as shown in Fig. 6b, there is now a path of complete edges from 𝑁start to 𝑁end, which istranslated into the solution x = x + x; y = x + y.

Multiple enumerators.As we alluded to at the beginning of this section, our synthesis algorithmfor a sequence of assignments needs to use multiple enumerators, initialized with different stores 𝜎𝑋 ,because the synthesized expressions are meant to be evaluated in different stores during programexecution. To illustrate potential completeness issues when using a wrong store, assume that theISG in Fig. 6 had a single shared enumerator Enum(𝜎start). Note that ⟦𝑥⟧(𝜎start)=⟦𝑦⟧(𝜎start)=1, sofrom the standpoint of Enum(𝜎start) these two expressions are equivalent, and one of them (say𝑦)is discarded by OE. As a result Enum(𝜎start) will never enumerate any expressions with variable𝑦; in particular, using just this enumerator, we would not be able to generate the desired solutionx = x + x; y = x + y. Instead, the enumeratorEnum(𝜎{x}) yields both𝑥 and𝑦 (and larger expressionsbuild from both of these variables), since the two variables are not equivalent in 𝜎{x} .

Design considerations.A shrewd readermight be concerned that the enumerators at different nodesduplicate each other’s work, since they do enumerate some of the same expressions. An alternativedesign that eliminates this work duplication is to use a single shared enumerator initialized with avector of all partially updated states: Enum𝜎𝑋 . This enumerator treats each 𝜎𝑋 as a separate exampleit must consider, and prunes expressions by evaluating them point-wise on 𝜎𝑋 and comparing theiroutput vectors. Indeed, we can safely share this single enumerator between all nodes in the ISG,without the danger of any relevant expressions being lost, as all states on which the expressionsmight possibly be evaluated are taken into account. Perhaps surprisingly, this design turned out tobe so inefficient, that it did not even warrant a quantitative evaluation. The reason is that the sharedenumerator has amuchmore fine-grained equivalence relation between expressions, which preventsOE reduction from pruning expressions efficiently; as a result, the shared enumerator producesmanymore expressions than all per-node enumerators combined.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 14: LooPy: Interactive Program Synthesis with Control Structures

153:14 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

The algorithm above does not specify the order in which different enumerators are queried. Ourcurrent implementation interleaves them in a round-robin fashion. In principle they could easily runin parallel, as the only operation that requires coordination between multiple ISG nodes is checkingfor a complete path. As long as the scheduling of enumerators is starvation-free, it does not affect thesoundness or completeness of the algorithm, although it can affect the size of the generated solution.

Picking the smallest program. Because a single edge can be shared bymultiple paths, the synthesisalgorithmmight discover multiple complete paths from𝑁start to𝑁end in a single iteration. In this caseLooPy selects the programwith the smallest total size. To that end, each edge (𝑁𝑋 ,𝑁𝑋 ′) is assigned aweight equal to

∑𝑥 ∈𝑋 ′\𝑋 size(A (𝑋,𝑋 ′) (𝑥)), where size is the size of an expression in AST nodes (with

size(⊥)=∞). LooPy then uses Dijkstra’s algorithm to find the shortest path from 𝑁start to 𝑁end.In general, the synthesis algorithm is not guaranteed to find the smallest solution overall, because

the size of expressions enumerated at different nodes increases at a different rate (e.g., in Fig. 6𝑁start starts producing larger expressions sooner than the other nodes, since it only has to considerexpressions with a single free variable)8; our experiments show, however, that the round-robininterleaving produces small programs in practice.

4.3 Multiple Examples, Sequence of Assignments

Recall that in Live PBE, a hole inside a loop typically generates a block-level synthesis task withmultiple examples (one for each loop iteration entered by the user). We now extend our synthesisalgorithm to this setting. More specifically, let E = ⟨𝜖1,...,𝜖𝑛⟩ be a vector of examples (the orderingof examples is irrelevant, but fixed once and for all before synthesis begins). We define the set ofmodified variablesModE to include variables modified in any of the examples:ModE =

⋃𝜖∈EMod𝜖 .

Expression enumerators.An expression enumerator Enum(⟨𝜎1,...,𝜎𝑛⟩) must now be parametrizedby a vector of stores. Each expression 𝑒𝑖 it constructs is evaluated in all stores to produce an outputvector ⟨𝑣1, ...,𝑣𝑛⟩; output vectors are compared point-wise, and only one expression per vector isretained in the bank. As usual with OE reduction, the more examples we have, the more expressionsare retained in the bank, and the slower the enumeration.

Intermediate State Graph. The only change in the ISG is that a node 𝑁𝑋 now corresponds to avector of partially-updated stores, rather than a single store:

𝑁𝑋 = ⟨𝜎1𝑋 ,...𝜎

𝑛𝑋 ⟩ where 𝜎𝑘𝑋 = {𝑥 ↦→𝜎𝑘end (𝑥) |𝑥 ∈𝑋 }∪{𝑥 ↦→𝜎𝑘start (𝑥) |𝑥 ∈ModE \𝑋 }

Importantly, the topology of the graph is unchanged, in the sense that the set of nodes and edges isdetermined only byModE and does not directly depend on 𝑛.

Synthesis algorithm.The synthesis algorithm remains largely the same. The expression enumeratoris eachnode𝑁𝑋 isnowEnum(⟨𝜎1

𝑋,...𝜎𝑛

𝑋⟩).Anexpression𝑒 isvalid foravariable𝑥 onanedge (𝑁𝑋 ,𝑁𝑋 ′)

if it is valid for every example: ∀1≤𝑘 ≤𝑛.⟦𝑒⟧(𝜎𝑘𝑋)=𝜎𝑘

𝑋 ′ (𝑥).In the next section, we tackle synthesis of conditionals, which requires multiple examples; hence

from now on we use these generalized versions of the ISG and the synthesis algorithm.

5 SYNTHESIZINGCONDITIONALS

In this section, we extend our block-level synthesis algorithm to support conditional statements. Webegin in Sec. 5.1 with a restricted settingwhere the solution has a single assignment in each branch ofthe conditional; Sec. 5.2 extends the algorithm to combine conditionals with assignment sequences.

8It is theoretically possible to pause each enumerator that finishes program size 𝑘 until all enumerators finish size 𝑘 , and start

size 𝑘+1 together. This is impractical, however, since the set of programs of size 𝑘 can be very large even for a moderate 𝑘 .

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 15: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:15

5.1 Single Conditional Assignment

Consider a block-level synthesis task E with multiple examples (|E | > 1) and a single modifiedvariable 𝑥 (ModE = {𝑥}), and assume that we are looking for a solution 𝑝∗ of the form:

if 𝑒cond : 𝑥 =𝑒then else : 𝑥 =𝑒else

Unlike an unconditional assignment 𝑥 =𝑒 , where we had to find a single expression 𝑒 that is valid forall examples E, in this case we have to find three expressions, 𝑒cond, 𝑒then, and 𝑒else such that:(1) 𝑒then is valid for some subset of examples Ethen ⊂E;(2) 𝑒else is valid for the rest of the example Eelse=E\Ethen; and(3) 𝑒cond is a boolean expression that separates these two subsets, i.e., evaluates to True on all Ethen

and to False on all Eelse.The general idea behind the synthesis algorithm is to use a single enumerator Enum(⟨𝜎1

start,...,𝜎𝑛start⟩)

to generate a stream of expressions, and keep track of promising candidates for 𝑒then, 𝑒else, and 𝑒cond,until we have encountered three expressions that together satisfy the above requirements.

Partitions. Since we do not know in advance how to partition E into Ethen and Eelse, the algorithmmust consider all possible partitions 𝜋 = ⟨E1,E2⟩. Note, however, that a solution for ⟨E1,E2⟩, canbe transformed into a solution for the symmetric partition ⟨E2,E1⟩, by swapping 𝑒then and 𝑒else andnegating 𝑒cond. Hence the algorithm only needs to explicitly track half of all partitions (modulosymmetry); we denote this set of partitions of interest ΠE , with |ΠE |=2

|E |−1.

Example 5. Consider a synthesis task E = {𝜖1,𝜖2} where 𝜖1= {𝑥 ↦→−1 }→{𝑥 ↦→1} and 𝜖2= {𝑥 ↦→2}→{𝑥 ↦→2}. There are four partitions of E:

𝜋1= ⟨E,∅⟩ 𝜋3= ⟨{𝜖2},{𝜖1}⟩

𝜋2= ⟨{𝜖1},{𝜖2}⟩ 𝜋4= ⟨∅,E⟩

For thepurposesof synthesis,𝜋2 and𝜋3 are symmetric, and soare𝜋1 and𝜋4 (the latterpair correspondsto an unconditional solution). Hence we select ΠE = {𝜋1,𝜋2}.

Storing candidate expressions. For each partition ⟨E1,E2⟩ ∈ΠE , the algorithm maintains a pairof variable assignments, ⟨AE1 ,AE2⟩. An assignment AE 𝑗 maps each variable in ModE (in thissubsection, just 𝑥) to an expression 𝑒 that is valid over E 𝑗 : that is, ∀𝜎start →𝜎end ∈ E 𝑗 .⟦𝑒⟧(𝜎start) =𝜎end (𝑥). The expressions stored inA

E1 andAE2 are used as candidates for 𝑒then and 𝑒else, respectively.Note that in total there is one variable assignment for each subset of E.In addition to the 2 |E | variables assignments, the algorithm also maintains a single condition

store C, which maps every partition ⟨E1,E2⟩ ∈ ΠE to a boolean expression 𝑏 that matches thispartition: that is, evaluates to True on E1 (∀𝜎start→𝜎end ∈ E1.⟦𝑏⟧(𝜎start) =True) and to False on E2

(∀𝜎start→𝜎end ∈E2.⟦𝑏⟧(𝜎start)=False). The expressions stored in C are used as candidates for 𝑒cond.BothAE𝑖 and C are partial maps, i.e., some of their keys may be mapped to⊥.

Example 6. Fig. 7 shows two different states of the synthesis algorithm for the task from Ex. 5. Eachstate is depicted as a degenerate ISG with only two nodesÐ𝑁start and 𝑁endÐsince in this subsectionwe are not dealing with sequences of assignments. Instead of a single edge connecting the two nodes,there are now two: one edge per partition 𝜋 ∈ΠE . Each edge is labeled with the pair of assignmentsassociated with its partition. The node 𝑁start is also labeled with the condition store, which stores aboolean expression for each of the two partitions.

Synthesis algorithm. The algorithm begins by initializing all AE 𝑗 (𝑥) and C(𝜋) to ⊥, exceptC(⟨E,∅⟩) ↦→True. In each iteration, the algorithm draws one expression 𝑒𝑖 from the enumerator andupdates the state as follows:

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 16: LooPy: Interactive Program Synthesis with Control Structures

153:16 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

𝜎𝑠𝑡𝑎𝑟𝑡1 = {𝑥 ↦ −1}𝜎𝑠𝑡𝑎𝑟𝑡2 = {𝑥 ↦ 2}𝜎𝑒𝑛𝑑1 = {𝑥 ↦ 1}𝜎𝑒𝑛𝑑2 = {𝑥 ↦ 2}

𝑁𝑠𝑡𝑎𝑟𝑡𝑁𝑒𝑛𝑑

𝜋1: True𝜋2: ⊥𝜋1 𝜋2𝒜ℰ: x = ⊥𝒜∅: x = ⊥ 𝒜{𝜖1}: x = ⊥𝒜{𝜖2}: x = ⊥

𝒞 𝜎𝑠𝑡𝑎𝑟𝑡1 = {𝑥 ↦ −1}𝜎𝑠𝑡𝑎𝑟𝑡2 = {𝑥 ↦ 2}𝜎𝑒𝑛𝑑1 = {𝑥 ↦ 1}𝜎𝑒𝑛𝑑2 = {𝑥 ↦ 2}

𝑁𝑠𝑡𝑎𝑟𝑡𝑁𝑒𝑛𝑑

𝜋1: True𝜋2: x < 0𝜋1 𝜋2𝒜ℰ: x = ⊥𝒜∅: x = x

𝒜{𝜖1} : x = -x𝒜{𝜖2}: x = x

𝒞

Fig. 7. Initial (left) and final (right) state of the synthesis algorithm for the task in Ex. 5. Complete variable

assignments and complete edges (partitions) are highlighted in red. On the right, partition 𝜋2 is complete,

because both of its variable assignments are complete, and it has a condition in C.

(1) For each partition ⟨E1,E2⟩ ∈ΠE , the algorithm tests whether 𝑒𝑖 is valid over either of the E 𝑗 ; in

that case,AE 𝑗 (𝑥) is updated to 𝑒𝑖 unless already set to something other than⊥.(2) If 𝑒𝑖 is a boolean expression that evaluates without errors on all examples, the algorithm

searches for a partition 𝜋 ∈ΠE such that 𝑒𝑖 matches 𝜋 ; if found, C(𝜋) is updated to 𝑒𝑖 , unlessalready set. Note that a matching partition 𝜋 might not exist in ΠE , since ΠE only stores halfof the partitions; in this case, there must be a 𝜋 ′∈ΠE that is symmetric with 𝜋 , and moreoverthe expression not 𝑒𝑖 matches 𝜋 ′. Hence, if a matching partition for 𝑒𝑖 is not found, then thealgorithm searches for one for not 𝑒𝑖 , and updates C accordingly.

The algorithm terminates as soon as some partition 𝜋∗= ⟨E∗

1,E∗2⟩ is complete, that is, C(𝜋∗)≠⊥ and

bothAE∗1 andAE∗

2 are complete (do not contain⊥). The algorithm returns a solution where:

𝑒cond=C(𝜋∗) 𝑒then=A

E∗1 (𝑥) 𝑒else=A

E∗2 (𝑥)

Example 7. For the synthesis task in Ex. 5, the state is initialized as shown in Fig. 7 (left). Thefirst iteration enumerates expression x, which is valid for the example 𝜖2, and hence we update

A {𝜖2 } (x) = x (and also, trivially, A ∅ (x) = x). A later iteration enumerates -x , which is valid for

𝜖1, updating A {𝜖1 } (x) = -x . At this point, the partition 𝜋2 = ⟨{𝜖1}, {𝜖2}⟩ has both of its variableassignments complete, but the partition itself is not yet complete, since it does not have a conditionin C. Once the enumerator yields the expression x < 0, we notice that it evaluates to True on 𝜖1

and to False on 𝜖2, hence we update C(𝜋2)=x < 0. At this point, 𝜋2 is complete, and the algorithmterminates. This final state is depicted in Fig. 7 (right); the highlighted partition creates the finalsolution:

if x < 0: x = -x else: x = x

Multiple solutions. Because expressions 𝑒then and 𝑒else might be valid on overlapping subsets ofexamples, our algorithmmaycompletemultiple partitions in the same iteration. Just like inSec. 4.2,wepick the solutionwith the smallest overall size in AST nodes, i.e., minimizing size(𝑒cond)+size(𝑒then)+size(𝑒else). For the partition ⟨E,∅⟩, the size is computed simply as size(𝑒then), since this solution canbe simplified into an unconditional assignment.

5.2 Conditionals and Assignment Sequences

Finally,wepresentourblock-level synthesis algorithminall generality, combining thenotionofan ISGfrom Sec. 4.2 to handle sequential compositionwith partitions from Sec. 5.1 to handle conditionals. Intheory this approach can support programswith arbitrary combinations of assignments, conditionals,and sequential composition, drawn from the grammar:

𝑝 ::= 𝑥 =𝑒 | 𝑝 ; 𝑝 | if 𝑒 : 𝑝 else: 𝑝

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 17: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:17

In practice, however, the full search space turns out to be too large, slowing down the search andleaving the user to weed through many irrelevant solutions. Hence, the version of the algorithmimplemented in LooPy and described in this section restricts the search space to programs with asingle top-level conditional, with a sequence of assignments in each branch:

𝑝 ::= 𝑞 | if 𝑒 : 𝑞 else: 𝑞

𝑞 ::= 𝑥 =𝑒 | 𝑞 ; 𝑞

Conditional ISG. To represent programs from this space, we generalize the notion of an ISG fromDef. 3 into a conditional ISG. We have already seen a simple conditional ISG with just two nodes inFig. 7. In general, to turn an ISG into a conditional ISG we simply clone every edge 2 |E |−1 times (onefor each partition), and associate two variable assignments (instead of one) with each edge. We alsoadd a single condition store C, associated with the node 𝑁start.

Definition 4 (Conditional ISG). Given a synthesis taskE= {𝜖1,...,𝜖𝑛}, its conditional ISG is a directedacyclicmulti-graphwhere:(1) there is a node 𝑁𝑋 for each𝑋 ⊂ModE , which represents the vector of partially-updated stores

⟨𝜎1𝑋,...,𝜎𝑛

𝑋⟩;

(2) for each pair of states 𝑁𝑋 ,𝑁𝑋 ′ such that𝑋 ⊊𝑋 ′, and for each partition 𝜋 ∈ΠE , there is an edge(𝑁𝑋 ,𝑁𝑋 ′)𝜋 ;

(3) each edge (𝑁𝑋 ,𝑁𝑋 ′)𝜋 where 𝜋 = ⟨E1,E2⟩ is labeled with a pair of variable assignmentsAE1

(𝑋,𝑋 ′)

andAE2

(𝑋,𝑋 ′), whose domain is𝑋 ′\𝑋 .

(4) the node 𝑁start is labeled with a condition store C.

Given a conditional ISG G and a partition 𝜋 , we denote G/𝜋 the sub-graph of G spanned by all theedges of the form (𝑁,𝑁 ′)𝜋 . Each G/𝜋 is a regular DAG (not a multi-graph), which represents thecurrent candidate solution for partition 𝜋 .

Synthesisalgorithm.Thehigh-level structureof thealgorithm is similar to thatwithout conditionals:i.e., each ISG node has an associated expression enumerator, and in each iteration a new expression𝑒𝑖 is produced at some node 𝑁𝑋 . State updates are a straightforward combination of Sec. 4.2 andSec. 5.1: namely, 𝑒𝑖 is tested for validity for both E1 and E2, for each outgoing edge (𝑁𝑋 ,𝑁𝑋 ′) ⟨E1,E2 ⟩ ,

and each variable 𝑥 ∈𝑋 ′\𝑋 , and the assignmentAE 𝑗

(𝑋,𝑋 ′)is updated accordingly.Whenever a boolean

expression is enumerated at the node 𝑁start, the algorithm additionally tests whether it matches anypartitions and updates C accordingly. Note that we are looking for a programwith a single top-levelconditional, where the condition is always evaluated in the initial state; this is why the algorithmonly needs one C, and the conditions are produced by the enumerator at 𝑁start.

An edge (𝑁𝑋 ,𝑁𝑋 ′) ⟨E1,E2 ⟩ is considered completewhen both of its assignmentsAE1

(𝑋,𝑋 ′)andAE2

(𝑋,𝑋 ′)

are complete (i.e., do not contain⊥). The algorithm terminates when there is a partition 𝜋∗ such that:• it has a condition in C: C(𝜋∗)≠⊥, and• the subgraph G/𝜋 has a path from 𝑁start to 𝑁end along complete edges.

The algorithm then returns the solution ifC(𝜋∗) : 𝑞then else:𝑞else, where𝑞then and𝑞else are sequences

of assignments collected fromAEthen

(𝑋,𝑋 ′)andAEelse

(𝑋,𝑋 ′)along the complete path (with ⟨Ethen,Eelse⟩=𝜋

∗).

Once again, since multiple solutions may be discovered simultaneously, the algorithm searches for ashortest complete path in each G/𝜋 , and then selects the smallest program among these candidates.

5.3 Post-Processing

Because the synthesis algorithm described in Sec. 5.2 generates programs in a restricted form (asingle top-level conditional with assignments to the same variables in both branches), the resulting

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 18: LooPy: Interactive Program Synthesis with Control Structures

153:18 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

if 0 <= x:

x = x

else:

x = -x

if 0 <= x:

else:

x = -x

if x < 0:

x = -x

1 2

x, y = x + z, y + zx = x + z

y = y + z

3

if 10 > a:

a += 1

c = a + b

d = a + b

b -= 1

else :

a += 1

d = a + b

c = a - b

b -= 1

4

a += 1

d = a + b

if 10 > a:

c = a

else :

c = a - b

b -= 1

a

b

c

Fig. 8. Examples of post-processing of synthesized conditional programs to make themmore readable.

program is not always the most concise or natural. To improve readability of synthesized programs,LooPy post-processes the solution to simplify it before presenting to the user.

More specifically, the following rewrite rules are repeatedly applied until a fixed point is reached:(1) Removing self-assignment.A variable 𝑥 ∈ModE might be actually modified only in one

branch of the solution and not the other. In this case, the algorithm generates a self-assignment𝑥 =𝑥 , which can be simply removed during post-processing, as shown in step 1○ in Fig. 8a.

(2) Removing empty branches.As a result of applying other rules, one branch of the conditionalcan become empty and can be removed. If the remaining branch is the else branch, LooPy turnsit into the then branch and negates the condition, as shown in step 2○ in Fig. 8a.

(3) Splitting simultaneous assignments. The generated solutions might include simultaneousassignments even when they are not strictly required (recall the example in Sec. 4.1). Ourexperience shows that a sequence of simple assignments is usually more familiar and hencemore readable than a simultaneous assignment. For this reason, LooPy splits a simultaneousassignment into a sequence of simple assignments whenever this is sound, i.e., when there isno cross-variable dependency between left- and right-hand sides. Step 3○ in Fig. 8b shows anexample of this rewrite; this is sound because x + z does not use y and y + z does not use x. Onthe other hand, x, y = x + z, y + x cannot be straightforwardly split, because the right-handside assigned to y uses x, and specifically its old value. Splitting assignments has the additionalbenefit that it makes other rewrite rules more likely to apply.

(4) Factoring out unconditional code. Some of the assignments might be duplicated betweenthe then and else branches, and hence can be factored out of the conditional. An example isshown in step 4○ in Fig. 8c. More specifically, LooPy extracts the identical prefix and suffixof assignments from the two branches, and places these statements before and after theconditional, respectively. In order to maximize the common prefix/suffix, LooPy also re-ordersassignments inside each branch whenever this is sound; for example in Fig. 8c the statementd = a + b is re-ordered before c = a + b (since there is no dependency between the two) andbecomes part of the common prefix.

6 EMPIRICAL EVALUATION

Wedesign our experiments to answer the following research questions: overall, wewant to show thatLooPy requires less time and less input from the user than big-step synthesizers, and yet generatescorrect and general programs most of the time.

(RQ1) Can LooPy handle a wide range of synthesis tasks at interactive speeds?

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 19: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:19

Table 1. Summary of the four benchmark suites used in LooPy’s empirical evaluation.

Number of Avg. Avg. solution Avg. number Avg. iterations

Benchmark suite benchmarks |ModE | size (AST nodes) of examples per example

No control flow 61 1.7 8.9 1.6 ś

LooPy conditional 9 1.0 14.6 2.6 ś

LooPy 38 1.8 11.9 1.1 4.1

FrAngel 23 1.0 7.9 1.0 6.0

(RQ2) Does LooPy require less user input to synthesize correct programs with loops compared tothe state of the art?

(RQ3) Does enumerating programs using the conditional ISG affect LooPy’s ability to solve simplenon-conditional synthesis tasks?

(RQ4) Are assignment sequences necessary to solve benchmarks with loops, or is simultaneousassignment sufficient?

Implementation.WeimplementedLooPy in Scala basedon the algorithm inSec. 5.2, using a standardsize-based bottom-up synthesizer as the expression enumerator in each ISG node. Each enumeratorhas a vocabulary of 84 components, plus all the variables available in the before-state, and stringliterals extracted from the after-state. Our implementation is single-threaded; there is much room forimprovement via parallelism, as each ISG node can be handled independently.

Benchmarks.We evaluated LooPy on 131 benchmarks from four benchmark suites:

(1) No control flow: a suite of 61 benchmarks from previous synthesizers that generate assignmentswithout conditions or loops.

(2) LooPy conditional: a suite of 9 benchmarks that contain a conditional statement outside thecontext of a loop.

(3) LooPy: a suite of 38 block-level synthesis tasks extracted from programs with loops via liveexecution; the original looping programs are curated from competitive programming andeducational problems, as well as our user study tasks, described in Sec. 7.

(4) FrAngel: 23 benchmarks from FrAngel’s ControlStructures benchmark suite [Shi et al. 2019].

The statistics for these benchmarks, including the size of specification provided, are shown in Tab. 1.For each benchmark, we created a set of gold standard solutions to compare synthesis results against.We next detail the process of selecting and converting FrAngel’s benchmarks for LooPy.

Selection criteria.We selected 23 tasks from FrAngel’s ControlStructures benchmark suite, whichtests manipulating list-like data structures in various ways. Of the 40 benchmarks in the originalsuite, we excluded those with constructs not supported by LooPy according to the following criteria:

− Benchmarks that contain chained conditionals or multiple chained loops. Such benchmarkswere broken up into multiple benchmarks and added to the LooPy benchmarks set.

− Benchmarks that contain unsupported types, where no comparable type exists in LooPy (e.g.,matrices, benchmarks that hinge on null pointers).

− Benchmarks that require external library functions not present in the standard library.

Benchmark translation. Since FrAngel synthesizes Java programs from end-to-end specifications,we converted FrAngel’s benchmarks into Python and block-level specifications. Additionally,the typical structure of a FrAngel benchmark is a set of approximately five examples where onerepresents the general case and the remainder are mutations covering corner cases. We convertedthe general case example from each of the selected FrAngel benchmarks to the LooPy specificationformat using the following steps:

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 20: LooPy: Interactive Program Synthesis with Control Structures

153:20 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

(a) Correctly solved benchmarks (b) All benchmarks that did not time out

Fig. 9. Performance of the LooPy synthesizer on the full benchmark suite at 300s timeout.

(1) The representative example was translated from Java to Python; built-in Java collections werereplaced with Python lists or sets, and Java-specific elements such as null references wereremoved if present.

(2) Wemanually specified the intermediate variables (if needed) to synthesize the solution.(3) Wemanually specified the loop structure, typically a for loop over the elements or the indexes

of the input.(4) Starting with the representative example’s input, we provided the intermediate output values

for each iteration (typically six iterations).(5) FrAngel’s gold standard solution was translated from Java to Python.

Additionally, benchmarks that can be solved by LooPywithout the use of a loop were also added tothe No Control Flow or LooPy Conditional benchmark suite.

Experimental setup.All benchmarks were run on an AMD Ryzen 3800X processor, with the JVMmaximum heap size set to 24 GB.

6.1 RQ1: Synthesis at Interactive Speeds

To test RQ1, we ran LooPy on all 131 benchmarks in our joint benchmark suite. We set a timeout offive minutes, and measured the time to completion and the correctness of the synthesis result foreach of the benchmarks. We show the results in Fig. 9.

Results.Of the 131 benchmarks, LooPy terminates on 125 (95%) within five minutes, and correctlysolves 99 (76%). However, since five minutes is far too long a wait within an actual programmingworkflow, we would like to examine LooPy at an interactive timeout, seven seconds.

By a seven-second timeout, LooPy terminates on 120 benchmarks (92%), it correctly solves 96(73%). This difference is small enough to consider the shorter timeout extremely beneficial: mostsynthesis tasks still behave the same, but the wait is sufficiently short to prevent the user from losingthe context of their work. We therefore answer RQ1 in the affirmative.

6.2 RQ2: Specifications Required for LooPy’s InteractionModel

To test whether LooPy’s interaction model requires less specification effort than the state of the art,we picked FrAngel as our baseline and used their manually crafted ControlStructures benchmarks.We performed two experiments:

1) We manually minimized the example sets in FrAngel’s original benchmarks to the minimumset required for correctness. This gives us a lower bound on the number of end-to-end examplesthat are necessary to use a big-step synthesizer.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 21: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:21

4% 17% 43%

4% 30%

4% 43%

9%

52% 70%

61% 70% 87%

61% 78% 78%

52% 70% 65% 65%

1

2

3

4

1 2 3 4 5

# of iterations provided

# o

f exa

mp

les p

rovid

ed

(a) Correctly solved benchmarks

83%

74% 65%

96% 96% 96% 96% 96%

96% 96% 96% 91% 91%

96% 96% 91% 91%

96% 96% 87%

1

2

3

4

1 2 3 4 5

# of iterations provided

# o

f exa

mp

les p

rovid

ed

(b) All benchmarks that did not time out (c) FrAngel end-to-end examples

Fig. 10. The effect of the number of examples and the number of iterations from each example on LooPy using

the FrAngel benchmark set, and the number of examples provided and required by FrAngel.

2) We tested LooPywith a varying number of examples and a varying number of loop iterationsper example.

6.2.1 Minimizing FrAngel. FrAngel’s original ControlStructures benchmarks have an average of5.5 end-to-end examples per benchmark. Tominimize the example sets, wemodified the benchmarksby successively reducing the number of examples as long as the result remained correct. BecauseFrAngel uses stochastic search, we ran each modified benchmark three times with a timeout of 30seconds, and considered the result correct if it was correct in at least one of the three attempts.

Results.The results are shown in Fig. 10c. Formost benchmarks, FrAngel needsmost of the originalexamplesÐan average of almost four end-to-end examples per benchmark. We further observe thateven though for some of the benchmarks half or more of the examples could be removed, selectingthose examples was far from intuitive. The original example set is more representative of a typicalinput to a big-step PBE synthesizer, since it was likely curated via the usual method of addingexamples as long as the result is incorrect.

6.2.2 Varying Number of Examples for LooPy. Tomeasure the size of user input LooPy needs, weexercised it with varying numbers of examples and loop iterations per example.

We translated each FrAngel benchmark with all its original examples, excluding examples wherethe loop is not entered (e.g., an empty list as input).Asmentioned above, each FrAngelbenchmarkfiletypically starts with a łmainž example (sometimes two) demonstrating the main success scenario ofthedesiredprogram, followedbymutations introducingcorner cases.Manyof these aredifferentiatingexamples, which do not capture representative behaviors but rather distinguish the desired programfrom simpler programs. For this reason, we cannot select 𝑛 examples at random, as we may wind upwith just the corner cases and nothing demonstrating the core behavior. Therefore, we consider theexamples in the order in which they appear in the FrAngel benchmark file.

For each FrAngel benchmark,we run LooPy on the first 1ś4 examples, with the first 1ś5 iterationsof the loop for each example. The percentage of benchmarks that did not time out and the percentageof benchmarks where LooPy found the correct program appear in Fig. 10.

Results. LooPy’s correctness peaks around 4ś5 iterations of 2ś3 examples, but even at five iterationsof one example, correctness is at 70%. This is despite several FrAngel benchmarks where the end-to-end specification requires at least two examples, such as the benchmark łAre all list elementspositive?ž. Here, the end-to-end specification requires two examples: one where the property holdsand one where it does not. In LooPy, however, it is sufficient to invoke the program on a single list

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 22: LooPy: Interactive Program Synthesis with Control Structures

153:22 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

(a) Time: single assignment (b) Time: assignment sequence

5

0

0

2

0

0

2

5

47

Tim

eout

Incorr

ect

Corr

ect

Timeout Incorrect Correct

Single Enumeration

LooP

y

(c) Correctness

Fig. 11. Differences in synthesis time and correctness between LooPy and a single expression enumerator.

that does not satisfy the property, as long it has a prefix that does. If we use all iterations of a singleexample (often more than five), LooPy succeeds on all FrAngel benchmarks but one.

It is also rather expected that providing just one iteration has a low success rate, evenwithmultipleexamples (see the leftmost column of Fig. 10a). It is similarly unsurprising that LooPy times outmore often as the number of examples increases (see Fig. 10b), due to the properties of ObservationalEquivalence we discussed in Sec. 4.2.

To conclude, LooPy does quite well when provided fewer inputs than FrAngel: LooPy solvesmostof the benchmarks using only five inputs (more specifically, five iterations of one examples), whileFrAngel requires 5.5 examples on average. LooPy reaches its peak accuracywith 6ś10 inputs, whichis comparable with the number of examples FrAngel needs on its most demanding benchmarks.Moreover, we believe that multiple loop iterations for one example are easier to provide than to comeup with entirely new end-to-end examples, which is supported by our user evaluation in Sec. 7. Withthis in mind, we answer RQ2 in the affirmative.

6.3 RQ3: The Overhead of Conditional ISG

The goal of this experiment is to measure the overhead of maintaining multiple expression enu-merators and keeping track of multiple example partitions. To this end, we compare the full LooPyalgorithm from Sec. 5.2 to a baseline synthesizer, which consists of a single bottom-upOE enumerator(of the same kind as used in each node of the ISG). Because the baseline synthesizer cannot handleconditionals or sequential composition (it can only synthesize a single assignment statement), werestrict this experiment to theNo Control Flow benchmark suite. While none of these benchmarksrequire conditionals, some of them do requiremultiple assignments.When solving these benchmarkswith the baseline synthesizer we manually decompose them into independent single-assignmenttasks and set the timeout of seven seconds for each task.We then compare synthesis times and resultsbetween the baseline synthesizer and the full LooPy algorithm; the results are shown in Fig. 11.

Results.Wefirst focus on synthesis times for those benchmarks that only require a single assignment(Fig. 11a). We notice that with the exception of two benchmarks synthesis times remain mostlyunchanged, as does the number of expressions enumerated. This is understandable because when atask has just one modified variable, the ISG contains only two nodes, 𝑁start and 𝑁end, and a singleenumerator at 𝑁start (which is identical to the enumerator of the baseline synthesizer). Hence, bothalgorithms are exploring exactly the same stream of expressions; the only difference in performancecomes from the fact that LooPy has to test validity of each expression against multiple partitions (and

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 23: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:23

(a) Running Time

0

1

21

24

1

2

0

9

73

Tim

eout

Incorr

ect

Corr

ect

Timeout Incorrect Correct

LooPySim

LooP

y

(b) Correctness

Fig. 12. Differences in running time and correctness between LooPy and LooPysim.

also test boolean expressions as candidates for the condition store). The two outliers are benchmarkswhere the baseline synthesizer times out while LooPy finds a short but incorrect conditional solution.

For the benchmarks that require assigning multiple variables (Fig. 11b), LooPy predictably takeslonger, because it has to figure out the order of assignments, while for the baseline synthesizer theorder is predefined. Despite this handicap, LooPy takes no longer than two seconds to solve eachbenchmark that the baseline synthesizer can also solve.

There are five benchmarks that were correctly solved by the baseline synthesizer and incorrectlysolved by LooPy (Fig. 11c); all five require multiple assignments. In three of those, LooPy finds adifferent order of assignments, which happens to match the examples; in the remaining two, LooPyintroduces a spurious condition, which leads to a shorter but incorrect solution.

Overall, using a conditional ISG has some effect on LooPy’s ability to solve benchmarks withoutcontrol flow, but this effect is small compared to thewealth of new benchmarks that can now

be solved.

6.4 RQ4: The Effect of Assignment Sequences

Maintaining an ISG with multiple enumerators enables LooPy to synthesize assignment sequences,which, aswe illustrated in Sec. 4.2, may result in a simpler solution compared to a single simultaneousassignment. In this experiment we evaluate whether this actually happens in practice (and hence,whether the additional complexity and performance overhead of the full ISG is justified). To this end,we created LooPysim, a version of LooPy capable of synthesizing only simultaneous assignmentsbut not assignment sequences. LooPysim maintains a conditional ISG with only two nodes, 𝑁start

and 𝑁end. We ran LooPysim on our entire benchmark suite and measured time to termination andcorrectness, compared to the original LooPy synthesizer. The results appear in Fig. 12.

Results. When it comes to synthesis times, on average, LooPy tends to be slightly slower thatLooPysim, because of the overhead ofmultiple enumerators. Predictably, this effect is not observed forany single-variable benchmarks, as the two ISGs are identical in this case. Formost other benchmarks,the performance overhead is small: most importantly, there are only two benchmarkwhere LooPysimfinishes and LooPy times out. There are also two benchmarks where the opposite happens: LooPyfinishes and LooPysim times out9; this happens precisely because the simultaneous assignmentsolutions are larger (in this case, sufficiently large to cause a time out).

9In Fig. 12a there appear to be three such benchmarks, but one of them actually finishes right before the timeout.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 24: LooPy: Interactive Program Synthesis with Control Structures

153:24 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

1 """

2 k-th digital root:

3 Compute the k-th digital root of a natural

4 number n, that is , replace n with the sum

5 of its digits k times.

6 >>> task (1798 , 3)

7 7

8 (because

9 1. 1 + 7 + 9 + 8 -> 25

10 2. 2 + 5 -> 7

11 3. 7 -> 7)

12 """

13 def task(n, k):

14 rs = n

15 return rs

16

17 task(1798, 3)

1 """

2 Extract Numbers:

3 Given a string containing numbers separated by

4 exactly one non -integer character , return a

5 list containing all the numbers in the string.

6 >>> task('13 a7b42 ')

7 [13, 7, 42]

8 """

9 def task(s):

10 rs = []

11 return rs

12

13 task('13 a7b42 ')

Fig. 13. The initial state of the study tasks, as provided to the users in VSCode.

With regard to the quality of solutions, Fig. 12b shows that there are 21 benchmarks that LooPysolves correctly and LooPysim solves incorrectly. Note that we did not include any programs with si-multaneous assignments in the set of gold standard solutions, becausewe consider them less readable.As a result, we consider a LooPysim solution łincorrectž whenever it still contains a simultaneousassignment after post-postprocessing (even if it is semantically equivalent to the gold standard). Outof these 21 incorrect solutions, 10 are actually larger in size than the sequential version because theyrepeat sub-expressions instead of using an intermediate variable.To conclude, with the same number of timeouts and more correct programs, we find assignment

sequences to benefit LooPy and answer RQ4 in the affirmative.

7 LOOPY IN THEHANDSOFUSERS

Block-level synthesis relies on user-generated block-level specifications, and we need to supportthe assumption that these specifications are reasonable and convenient for users to provide. To thatend, we ran a preliminary qualitative user study focusing primarily on the question:

(RQ5) Is providing block-level specifications feasible for users?

Implementation.Wemodified our VSCode extension for Small-Step Live PBE [Ferdowsifard et al.2020], adding: i) sketchholeswithmultiple variables, ii) separationbetween thebefore- and after-statein the projection box, and iii) live execution of sketches with the user as oracle. A live PBE synthesistask is then converted into a block-level synthesis task as described in Sec. 3.

Study method. We recruited five participants (one male, one female, one non-binary, and twopreferred not to state), with 4ś10 years of programming experience for a two-hour study. Participantswere recruited online and screened by the question łIn the past year, how often did you use Python?ž,selecting participants who reported more than łneverž and less than łonce a dayž.The study was conducted over a remote-controlled Zoom session on the same desktop machine

used for the experiments in Sec. 6. In the first part of the study, users watched a tutorial video andsolved a training task (String Compression from Sec. 2), where they could get assistance and wereencouraged to ask questions. In the second part of the study, they were asked to solve two studytasks (depicted in Fig. 13) and explain their thought process throughout. Since the focus of the studywas not on solving the tasks, but on providing specifications, users were specifically asked to solvethe tasks using a loop, and if they were struggling with the algorithm after 20 minutes, we verballyprovided an algorithmic hint. Each task had a 30 minute time limit. At the end of the session, users

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 25: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:25

were asked to fill a short survey about their experience, what they found helpful or frustrating, andwhat suggestions they have for improving LooPy.

Observations. Four users solved both tasks and the remaining one only solved the second task;two users required a hint for one of the tasks. Given the small size and scope of the study, we forgoa quantitative analysis of their sessions, and focus on qualitative responses and observations. Wefound that users naturally provided block-level specificationswhere appropriatewithout any notableissues in terms of the specification itself.

P1 andP3 explained they foundLooPy challengingbecause of theneed tohave a concrete algorithmin mind before providing the specification. P3 stated that łthe hardest part for me was the need tounderstand the structure of the skeleton before handing things to LooPyÐhowmany and whichvariables I want, and on what to iteratež. We observed a related pattern, where P1, P2 and P4 startedwith a small-step specification, and once they had a more holistic view of the loop body tried againwith a block-level specification for the entire loop body. P3 and P5 provided correct block-levelspecifications from the start. Given that P3 and P5 were not notably more experienced than otherparticipants, we think that this is most likely because they had the complete algorithm in mind fromthe beginning, whereas other participants incrementally discovered the algorithm and realized aftera few tries that they would need to modify multiple variables at once.Users also found LooPy useful in multiple ways. P1 and P2 both mentioned that it was useful in

writing more idiomatic code. P3 mentioned that they could have solved the tasks without LooPy, butit would have beenmore frustrating, and P4 found synthesizing loop bodies less tedious than writingthemmanually, saying łanywhere there’s a loop, and I kind of knowwhat it’s supposed to do, [...] it’smuch easier to just write the input-output examples for each loop body iterationž.

Conclusions.While we would need a much larger-scale study to make strong empirical claims,this study suggests that block-level specifications are indeed reasonable and intuitive to provide.Additionally, we notice that block-level specifications allow for a data-driven pattern of exploratoryprogramming, where the programmer explores different intermediate states instead of code.One major limitation of LooPy (mentioned by P1, P4 and P5) was users wanting a better under-

standing of why LooPy fails when it does. This is not unique to LooPy, e.g., it is discussed as a part ofthe łUser-Synthesizer Gapž in our prior work [Ferdowsifard et al. 2020].

8 LIMITATIONS

In this section, we discuss some of the limitations to LooPy’s generality and usability, one stemmingfrom the interaction model and others from the UI design.

Correcting specifications. The current LooPy UI makes it difficult for users to iterate on theirspecifications. While the user is providing examples in the projection box, they can move betweenvariables and loop iterations and change their input (while live execution updates the rest of theprojection box accordingly). Once they launch the synthesis task, however, the projection boxdisappears. If synthesis fails or produces a wrong result, the user cannot go back and edit their input;instead, they have to restart the interaction, providing the entire specification from scratch. Similarly,when the focus is inside the projection box, the user cannot modify the surrounding code or the setof output variables of the hole without exiting from the box and restarting the interaction.The need to correct an erroneous specification has been pointed out by several of our study

participants. We believe this can be fixed just by changing the UI so that undoing a synthesis taskreturns the user to the projection boxwith the latest specification.More complex forms of storing andrestoring specifications, however, are non-trivial to implement, especially if the code surroundingthe hole has changed, which invalidates the before-states inside the specification.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 26: LooPy: Interactive Program Synthesis with Control Structures

153:26 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

While loops.Wedesignedand testedLooPy in the context offor loops iteratingover afixed collection.In this context the number of loop iterations is known a-priori, making it natural for the user tospecify one iteration at a time, as shown in Fig. 2. Although live execution and block-level synthesisgeneralize straightforwardly to while loops with a user-provided loop condition, the user experienceis far more confusing in that case, as the number of loop iterations displayed in the projection boxmight change as the user is entering the specification. Specifically, assume that the user enters ahole in place of the entire body of a while loop. Upon entering this hole’s projection box, the loopcondition must evaluate to True, making the loop infinite; in this case the projection box displaysthe first several iterations. The user can then proceed to enter after-states for as many iterations asthey would like. As soon as the state they entered causes the loop condition to evaluate to False,however, any further iterations disappear from the projection box. Next, assume that the body of theloop contains more code around the hole. Now the number of iterations may change beyond the nextiteration, making the change even less comprehensible. Although this interaction is supported byLooPy, we deemed it too confusing for the user to include in our evaluation.

Comprehensions. LooPy inherits the ability to synthesize Python’s list and dictionary comprehen-sion from Small-Step Live PBE [Ferdowsifard et al. 2020]. Unlike loops, however, the user cannotobserve individual iterations within a comprehension and specify intermediate values for the datastructure it is building. Instead, a comprehension is handled and specified like any other expressionon the right-hand side of an assignment. This, however, is strictly a limitation of the current UI imple-mentation: the live PBE interaction model could certainly be applied to synthesize comprehensionsfrom block-level specifications.

9 RELATEDWORK

There is a long and rich history of work on program synthesis. Broadly speaking our work dis-tinguishes itself from prior work by providing a block-level synthesis approach and associatedinteraction model that allows small-step synthesis with control structures at interactive speeds. Wenow discuss the most closely related work to LooPy.

Synthesis with loops and recursion. Relatively few PBE tools support loops and recursion (orequivalent higher-order functions). Perhaps the most closely related to LooPy is FrAngel [Shiet al. 2019], which supports component-based synthesis for Java programs with control structures.Because FrAngel uses big-step (function-level) specifications, in principle it does not require users tohave knowledge of the algorithm or intermediate variables. In practice, however, to make the searchtractable, FrAngel requires users to provide a variety of examples including base cases and cornercases, and so some knowledge of the algorithm is still required. Also, as discussed more throughoutthe paper, FrAngel’s approach is not fast enough for an interactive setting.

Other PBE tools that efficiently support recursion and higher-order functions include Escher [Al-barghouthi et al. 2013],Myth [Osera and Zdancewic 2015], Smyth [Lubin et al. 2020], 𝜆2 [Feseret al. 2015], Big𝜆 [Smith and Albarghouthi 2016], and RESL [Peleg et al. 2020]. There is a generaltheme behind all these tools: efficient synthesis is achieved by extracting a local specification forthe recursive call (or the higher-order argument). Different tools use different approaches to makesuch extraction possible. For example,Myth requires the user examples to be trace complete; 𝜆2

does not require trace completeness, but only works efficiently when examples happen to be tracecomplete; RESL restricts iteration patterns to map and filter (as opposed to general folds) whichenables extraction of a local specificationwithout a trace completeness requirement. LooPy is similarto all these tools in that it uses local specifications of loop bodies to achieve efficient synthesis, butuses a different approach to make this feasible: LooPy supports dependent (fold-like) loops, andleverages its interaction model to solicit local specifications from the user.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 27: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:27

Our solution to synthesizing loops by asking the user to provide more convenient specificationsis partly inspired by Rousillon [Chasins et al. 2018], a tool for web scraping by demonstration.Rousillon can synthesize programs with loops that extract tabular data from a webpage by makinga łcontract with a userž that they will demonstrate just the first row of the table; Rousillon evensupports nested loops using the same technique and domain-specific insights. The main differencewith our work is that Rousillon is a domain-specific end-user tool, and all of its loops are essentiallymaps (from the DOM to a table), whereas LooPy handles more general loops in a context of ageneral-purpose programming environment.

Synthesis with conditionals.Our technique for generating conditional statements is related tothe various techniques for condition abduction [Alur et al. 2015; Kneuss et al. 2013; Leino andMilicevic 2012; Polikarpova et al. 2016; Shi et al. 2019]. It is most related to the the approach usedby EUSolver [Alur et al. 2017], which enumerates programs until a set of programs covers allinput-output examples, and then attempts to synthesize a condition that separates the examplesbetween branches; the main difference is that in EUSolver branches are just expressions, whereas inLooPy branches are sequences of assignments, so testing whether all examples are covered is moreinvolved. On the other hand, LooPy only generates binary conditionals with an atomic condition,whereas EUSolver uses decision tree learning to generate multi-branch conditionals.

Synthesis with sequential composition. Brahma [Gulwani et al. 2011] proposes an efficient SMTencoding for synthesizing straight-lineprogramswithmultiple assignments to intermediate variables.Their problem is, however, very different from our assignment sequences: Brahma specificationsare still big-step (the relation between the inputs and a single output variable), and the values of thetemporary variables must be guessed by the synthesizer. Instead LooPy takes advantage of the factthat the final values of all variables are provided to perform synthesis more efficiently using ISGs.

Synthesizers with limited support for control structures. There are alsomany other synthesizersin the literature, but compared to LooPy they have limited support for control structures. Thisincludes some interactive synthesizers that integrate into a general-purpose programmingworkflow,for example SnipPy [Ferdowsifard et al. 2020] and CodeHint [Galenson et al. 2014]; various otherPython synthesizers, for example TFCoder [Shi et al. 2020], AutoPandas [Bavishi et al. 2019],Wrex [Drosos et al. 2020]; and synthesizers for other languages [Feng et al. 2017; Galenson et al.2014; Gvero et al. 2013; James et al. 2020; Mandelin et al. 2005; Yang et al. 2018]. These all handleone-liners or sequences of method calls, with only limited support for control structures. In contrast,our proposed approach supports control structures and generating multiple statements at once.

Bottom-up enumerative synthesis. Bottom-up enumerative synthesis is a technique that origi-nated in Transit [Udupa et al. 2013] and Escher [Albarghouthi et al. 2013], and is used in manysynthesizers [Barke et al. 2020; Peleg et al. 2020; Peleg and Polikarpova 2020; Shi et al. 2020]. Thistechnique was originally used for enumerating expressions; we build ISGs on top of it to develop anefficient algorithm for enumerating assignments to multiple variables and introducing conditionals.

Live Execution. LooPy’s live execution uses concepts from live evaluation introduced by Omar et al.[2019] and further adapted by Lubin et al. [2020], such as evaluating around holes and pausing theevaluation at holes that cannot be executed. Live execution is adapted from a functional domain toan imparative one, and employs no logic for resolving holes, deferring instead to an oracleÐthe user.

ACKNOWLEDGMENTS

We thankMichael B. James and Shravan Narayan for their invaluable feedback on earlier iterationsof the LooPy UI design. This work was supported by the National Science Foundation under GrantsNo. 1911149, 1943623, 1955457, and 2107397.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 28: LooPy: Interactive Program Synthesis with Control Structures

153:28 Kasra Ferdowsifard, Shraddha Barke, Hila Peleg, Sorin Lerner, and Nadia Polikarpova

REFERENCES

Aws Albarghouthi, Sumit Gulwani, and Zachary Kincaid. 2013. Recursive program synthesis. In International conference on

computer aided verification. Springer, 934ś950. https://doi.org/10.1007/978-3-642-39799-8_67

Rajeev Alur, Pavol Cerný, and Arjun Radhakrishna. 2015. Synthesis Through Unification. In Computer Aided Verification -

27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceedings, Part II. 163ś179.

RajeevAlur, Arjun Radhakrishna, andAbhishekUdupa. 2017. Scaling Enumerative Program Synthesis via Divide andConquer.

In Tools and Algorithms for the Construction and Analysis of Systems - 23rd International Conference, TACAS 2017, Held as

Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017, Uppsala, Sweden, April 22-29, 2017,

Proceedings, Part I. 319ś336. https://doi.org/10.1007/978-3-662-54577-5_18

Shraddha Barke, Hila Peleg, and Nadia Polikarpova. 2020. Just-in-Time Learning for Bottom-up Enumerative Synthesis. Proc.

ACM Program. Lang. 4, OOPSLA, Article 227 (Nov. 2020), 29 pages. https://doi.org/10.1145/3428295

Shaon Barman, Rastislav Bodik, Satish Chandra, Emina Torlak, Arka Bhattacharya, and David Culler. 2015. Toward Tool

Support for Interactive Synthesis. In 2015 ACM International Symposium on New Ideas, New Paradigms, and Reflections on

Programming and Software (Onward!) (Pittsburgh, PA, USA) (Onward! 2015). Association for Computing Machinery, New

York, NY, USA, 121ś136. https://doi.org/10.1145/2814228.2814235

Rohan Bavishi, Caroline Lemieux, Roy Fox, Koushik Sen, and Ion Stoica. 2019. AutoPandas: Neural-Backed Generators for

ProgramSynthesis. Proc. ACMProgram. Lang. 3, OOPSLA,Article 168 (Oct. 2019), 27 pages. https://doi.org/10.1145/3360594

Sarah E. Chasins, Maria Mueller, and Rastislav Bodik. 2018. Rousillon: Scraping Distributed Hierarchical Web Data. In

Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology (Berlin, Germany) (UIST ’18).

Association for Computing Machinery, New York, NY, USA, 963ś975. https://doi.org/10.1145/3242587.3242661

Ian Drosos, Titus Barik, Philip J. Guo, Robert DeLine, and Sumit Gulwani. 2020. Wrex: A Unified Programming-by-Example

Interaction for Synthesizing Readable Code for Data Scientists. In Proceedings of the 2020 CHI Conference on Human Factors

in Computing Systems (Honolulu, HI, USA) (CHI ’20). Association for ComputingMachinery, New York, NY, USA, 1ś12.

https://doi.org/10.1145/3313831.3376442

YuFeng, RubenMartins,Osbert Bastani, and IsilDillig. 2018. ProgramSynthesisUsingConflict-DrivenLearning. InProceedings

of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (Philadelphia, PA, USA) (PLDI

2018). Association for Computing Machinery, New York, NY, USA, 420ś435. https://doi.org/10.1145/3192366.3192382

Yu Feng, RubenMartins, YuepengWang, Isil Dillig, and ThomasW. Reps. 2017. Component-Based Synthesis for Complex

APIs. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (Paris, France) (POPL

2017). Association for Computing Machinery, New York, NY, USA, 599ś612. https://doi.org/10.1145/3009837.3009851

KasraFerdowsifard,AllenOrdookhanians,HilaPeleg, SorinLerner, andNadiaPolikarpova. 2020. Small-StepLiveProgramming

by Example. In Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology (UIST ’20).

Association for ComputingMachinery, New York, NY, USA, 614ś626. https://doi.org/10.1145/3379337.3415869 event-

place: Virtual Event, USA.

John K. Feser, Swarat Chaudhuri, and Isil Dillig. 2015. Synthesizing Data Structure Transformations from Input-Output

Examples. SIGPLAN Not. 50, 6 (June 2015), 229ś239. https://doi.org/10.1145/2813885.2737977

Joel Galenson, Philip Reames, Rastislav Bodik, Björn Hartmann, and Koushik Sen. 2014. CodeHint: Dynamic and Interactive

Synthesis of Code Snippets. In Proceedings of the 36th International Conference on Software Engineering (Hyderabad, India)

(ICSE 2014). Association for ComputingMachinery, NewYork, NY, USA, 653ś663. https://doi.org/10.1145/2568225.2568250

Sumit Gulwani, Susmit Jha, Ashish Tiwari, and Ramarathnam Venkatesan. 2011. Synthesis of Loop-Free Programs. In

Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (San Jose,

California, USA) (PLDI ’11). Association for Computing Machinery, New York, NY, USA, 62ś73. https://doi.org/10.1145/

1993498.1993506

Tihomir Gvero, Viktor Kuncak, Ivan Kuraj, and Ruzica Piskac. 2013. Complete Completion Using Types and Weights.

In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (Seattle,

Washington, USA) (PLDI ’13). Association for Computing Machinery, New York, NY, USA, 27ś38. https://doi.org/10.1145/

2491956.2462192

Michael B. James, Zheng Guo, ZitengWang, Shivani Doshi, Hila Peleg, Ranjit Jhala, and Nadia Polikarpova. 2020. Digging for

Fold: Synthesis-Aided API Discovery for Haskell. Proc. ACM Program. Lang. 4, OOPSLA, Article 205 (Nov. 2020), 27 pages.

https://doi.org/10.1145/3428273

Etienne Kneuss, Ivan Kuraj, Viktor Kuncak, and Philippe Suter. 2013. Synthesis modulo Recursive Functions. In Proceedings

of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications

(Indianapolis, Indiana, USA) (OOPSLA ’13). Association for Computing Machinery, New York, NY, USA, 407ś426. https:

//doi.org/10.1145/2509136.2509555

K. Rustan M. Leino and Aleksandar Milicevic. 2012. Program Extrapolation with Jennisys. In Proceedings of the ACM

International Conference on Object Oriented Programming Systems Languages and Applications (Tucson, Arizona, USA)

(OOPSLA ’12). Association for Computing Machinery, New York, NY, USA, 411ś430. https://doi.org/10.1145/2384616.

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.

Page 29: LooPy: Interactive Program Synthesis with Control Structures

LooPy: Interactive Program Synthesis with Control Structures 153:29

2384646

Sorin Lerner. 2020. Projection Boxes: On-the-fly Reconfigurable Visualization for Live Programming (CHI ’20). ACM.

https://doi.org/10.1145/3313831.3376494

Justin Lubin, Nick Collins, Cyrus Omar, and Ravi Chugh. 2020. Program Sketching with Live Bidirectional Evaluation. Proc.

ACM Program. Lang. 4, ICFP, Article 109 (Aug. 2020), 29 pages. https://doi.org/10.1145/3408991

David Mandelin, Lin Xu, Rastislav Bodík, and Doug Kimelman. 2005. Jungloid Mining: Helping to Navigate the API Jungle. In

Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (Chicago, IL, USA)

(PLDI ’05). Association for Computing Machinery, New York, NY, USA, 48ś61. https://doi.org/10.1145/1065010.1065018

Gayle LaakmannMcDowell. 2015. Cracking the coding interview: 189 programming questions and solutions; 6th ed. CareerCup,

Palo Alto, CA. https://cds.cern.ch/record/2669252

Cyrus Omar, Ian Voysey, Ravi Chugh, and Matthew A. Hammer. 2019. Live Functional Programming with Typed Holes. Proc.

ACM Program. Lang. 3, POPL, Article 14 (Jan. 2019), 32 pages. https://doi.org/10.1145/3290327

Peter-Michael Osera and Steve Zdancewic. 2015. Type-and-Example-Directed Program Synthesis. SIGPLAN Not. 50, 6 (June

2015), 619ś630. https://doi.org/10.1145/2813885.2738007

Hila Peleg, Roi Gabay, Shachar Itzhaky, and Eran Yahav. 2020. Programmingwith a Read-Eval-Synth Loop. Proc. ACMProgram.

Lang. 4, OOPSLA, Article 159 (Nov. 2020), 30 pages. https://doi.org/10.1145/3428227

Hila Peleg and Nadia Polikarpova. 2020. Perfect Is the Enemy of Good: Best-Effort Program Synthesis. In 34th European

Conference onObject-Oriented Programming (ECOOP 2020) (Leibniz International Proceedings in Informatics (LIPIcs), Vol. 166),

Robert Hirschfeld and Tobias Pape (Eds.). Schloss DagstuhlśLeibniz-Zentrum für Informatik, Dagstuhl, Germany, 2:1ś2:30.

https://doi.org/10.4230/LIPIcs.ECOOP.2020.2

Nadia Polikarpova, Ivan Kuraj, and Armando Solar-Lezama. 2016. Program Synthesis from Polymorphic Refinement Types. In

Proceedings of the 37th ACMSIGPLANConference on Programming Language Design and Implementation (Santa Barbara, CA,

USA) (PLDI ’16). Association for Computing Machinery, New York, NY, USA, 522ś538. https://doi.org/10.1145/2908080.

2908093

Kensen Shi, David Bieber, and Rishabh Singh. 2020. TF-Coder: Program Synthesis for Tensor Manipulations. arXiv preprint

arXiv:2003.09040 (2020).

Kensen Shi, Jacob Steinhardt, and Percy Liang. 2019. FrAngel: Component-Based Synthesis with Control Structures. Proc.

ACM Program. Lang. 3, POPL, Article 73 (Jan. 2019), 29 pages. https://doi.org/10.1145/3290386

Calvin Smith and Aws Albarghouthi. 2016. MapReduce Program Synthesis. SIGPLAN Not. 51, 6 (June 2016), 326ś340.

https://doi.org/10.1145/2980983.2908102

Sunbeom So andHakjooOh. 2017. Synthesizing imperative programs from examples guided by static analysis. In International

Static Analysis Symposium. Springer, 364ś381. https://doi.org/10.1007/978-3-319-66706-5_18

Abhishek Udupa, Arun Raghavan, Jyotirmoy V. Deshmukh, Sela Mador-Haim, Milo M.K. Martin, and Rajeev Alur. 2013.

TRANSIT: Specifying Protocols with Concolic Snippets. SIGPLAN Not. 48, 6 (June 2013), 287ś296. https://doi.org/10.1145/

2499370.2462174

ChenglongWang, Alvin Cheung, and Rastislav Bodik. 2017. Synthesizing Highly Expressive SQL Queries from Input-Output

Examples. SIGPLAN Not. 52, 6 (June 2017), 452ś466. https://doi.org/10.1145/3140587.3062365

Zijiang Yang, Jinru Hua, KaiyuanWang, and Sarfraz Khurshid. 2018. EdSynth: Synthesizing API sequences with conditionals

and loops. In 2018 IEEE 11th International Conference on Software Testing, Verification and Validation (ICST). IEEE, 161ś171.

https://doi.org/10.1109/ICST.2018.00025

Proc. ACM Program. Lang., Vol. 5, No. OOPSLA, Article 153. Publication date: October 2021.