In 1969, McCarthy and Hayes tackled the problem of making agents that can formulate strategies to complete goals. The problem has two parts: representing the state of the world at various moments in time, and searching for a sequence of actions whose final world state satisfies the goal. Like good software engineers, they aspired to decouple the parts, and they had a clever idea. They formalized in first-order logic:
This solved the first half of the problem, and now the second problem could be solved by a generic theorem prover. Unfortunately, in practice, formalization #3 ended up being really large.
We were obliged to add the hypothesis that if a person has a telephone, he still has it after looking up a number in the telephone book. If we had a number of actions to be performed in sequence, we would have quite a number of conditions to write down that certain actions do not change the values of certain fluents [fluent = a proposition about the world which changes over time]. In fact, with n actions and m fluents, we might have to write down n*m such conditions.
They called this problem of n*m blowup the frame problem, but made the mistake of including the word "philosophical" in the title of their paper, provoking AI doomsayers to cite it as yet another example of why computers could never think like humans. The discussion became more interesting when Daniel Dennett directed the attack away from the AI researchers and toward the philosophers. He caricatured epistemology as a comically profound but very incomplete theory, because for thousands of years no one had ever noticed the frame problem.
… it is turning out that most of the truly difficult and deep puzzles of learning and intelligence get kicked downstairs by this move [of leaving the mechanical question to some dimly imagined future research]. It is rather as if philosophers were to proclaim themselves expert explainers of the methods of a stage magician, and then, when we ask them to explain how the magician does the sawing-the-lady-in-half trick, they explain that it is really quite obvious: the magician doesn’t really saw her in half; he simply makes it appear that he does. ‘But how does he do that?’ we ask. ‘Not our department’, say the philosophers – and some of them add, sonorously: ‘Explanation has to stop somewhere.’
Some philosophers and AI researchers argued that the original mistake leading to the frame problem was McCarthy and Hayes choosing first-order logic for world representation. Their case is easily made with the Tweety Bird problem. The premises

1. All birds can fly.
2. All broken-winged creatures cannot fly.
3. Tweety is a bird.
4. Tweety has a broken wing.

can prove both

5. Tweety can fly.
6. Tweety cannot fly.
Clearly premise 1 is too strong, but attempting to modify first-order logic to support "most" statements instead of "all" statements breaks monotonicity: under a most-supporting logic, premises 1, 2, 3 would prove 5, but premises 1, 2, 3, 4 would prove 6. An agent learning premise 4 would change its mind from conclusion 5 to conclusion 6. This is, of course, the desired behavior, but dropping the stability of truth means the agent can no longer use a generic theorem prover. The agent is using a modified logic system, and so it must use a specialized theorem prover. The question becomes: which logic system to use?
In standard first-order logic, every proposition is either true, false, or unknown. Learning new information can only ever change the status of unknown statements. To solve the Tweety Bird problem, a logic must enable assuming unknowns to be false until proven otherwise (the closed-world assumption). The symbolic AI community eventually converged on circumscription, a logic that assumes particular propositions to be false until proven otherwise.
McCarthy updated his situation calculus by circumscribing the proposition Abnormal, allowing him to formalize "Most birds fly" as "All birds fly unless they are abnormal" and adding the premise "Broken-winged creatures are abnormal". Since the Abnormal proposition is assumed to be false until proven otherwise, Tweety is assumed to be a normal flying bird until the agent learns that Tweety has a broken wing.
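The default-reasoning flavor of circumscription is easy to sketch in runnable Haskell (the Creature type and its fields are my own illustration, not McCarthy's notation):

```haskell
-- A runnable sketch of the circumscribed Abnormal predicate. Abnormal
-- holds only when one of its known sufficient conditions holds;
-- otherwise it defaults to False (the closed world).
data Creature = Creature { name :: String, brokenWing :: Bool }

-- "Broken-winged creatures are abnormal" is the only abnormality axiom.
abnormal :: Creature -> Bool
abnormal = brokenWing

-- "All birds fly unless they are abnormal."
flies :: Creature -> Bool
flies = not . abnormal

tweety :: Creature
tweety = Creature "Tweety" False  -- no broken wing known yet
```

Here flies tweety evaluates to True; learning premise 4 is just a record update, after which flies (tweety { brokenWing = True }) evaluates to False, which is the desired nonmonotonic change of mind.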
Shanahan took a time-oriented approach instead. In his circumscriptive event calculus, he circumscribed Initiates and Terminates, so he could formalize "Most birds fly" as "All birds can fly at birth", and he could replace "All broken-winged creatures cannot fly" with "Breaking a wing Terminates the flying property". Since the Terminates proposition is assumed to be false until proven otherwise, Tweety's birth state (capable of flight) is assumed to persist until the agent learns that Tweety's wing was broken.
Personally, I find circumscription unsatisfying. To me, the most obvious answer to "How do you turn 'all' into 'most'?" is probability theory. As E. T. Jaynes showed, logic is merely a special case of probability theory (the case in which all of the probabilities are 0 or 1), so the jump from logic to probability theory seems more natural to me than circumscription. I am not alone in thinking this, of course. Many people attempted to solve the frame problem using probability theory, but as Pearl showed in 1988 regarding the Yale Shooting Problem, probability theory can never be enough, because it cannot describe counterfactuals, and thus cannot describe causality.
But that limitation disappeared in 1995, when Pearl figured out how to generalize probability theory. He discovered a complete set of axioms for his “calculus of causality”, which distinguishes between observed conditional variables and intervened conditional variables.
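The distinction between observed and intervened conditioning can be illustrated with a toy structural causal model (my own example, with arbitrary numbers, computed by brute-force enumeration of worlds):

```haskell
-- A toy structural causal model: rain and sprinkler are independent
-- causes, and wet = rain || sprinkler.
pRain, pSprinkler :: Bool -> Double
pRain True  = 0.2
pRain False = 0.8
pSprinkler _ = 0.5

wet :: Bool -> Bool -> Bool
wet r s = r || s

worlds :: [((Bool, Bool), Double)]
worlds = [ ((r, s), pRain r * pSprinkler s)
         | r <- [True, False], s <- [True, False] ]

-- Observing wet = True: filter the worlds and renormalize.
pRainSeeWet :: Double
pRainSeeWet = num / den
  where den = sum [ w | ((r, s), w) <- worlds, wet r s ]
        num = sum [ w | ((r, s), w) <- worlds, wet r s, r ]

-- Intervening with do(wet = True): replace wet's mechanism with the
-- constant True. Rain's mechanism is untouched, so no filtering of
-- rain's worlds happens at all.
pRainDoWet :: Double
pRainDoWet = sum [ w | ((r, _), w) <- worlds, r ]
```

Seeing that the grass is wet raises the probability of rain to 1/3, but making the grass wet tells us nothing about rain: pRainDoWet stays at the prior 0.2. Plain conditional probability can only express the first quantity.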
Logic > Probability Theory > Calculus of Causality (wow!)
According to the linked paper, the circumscriptive event calculus and Thielscher’s fluent calculus have adequately solved the frame problem. But I still wonder, has anyone reattempted a solution using the calculus of causality?
]]>

After static differentiation, the code becomes:^{1}

When optimizations are applied, grandTotal' becomes the implementation that a programmer would have written:

In this case, the resulting grandTotal' makes no reference to the original multisets at all. The authors of the paper call this "self-maintainability", by analogy to self-maintainable views in databases.
The problem of inferring redis update operations from database update operations, then, is simply a matter of differentiating and then optimizing the cache schema. (The "cache schema" is the mapping from redis keys to the database queries that populate those keys.) The mappings whose derivatives are self-maintainable can be translated into redis commands.
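As a sketch of what self-maintainability buys us (types and names here are mine, not the paper's): the derivative of count consumes only the change, never the original collection, so a cache can maintain the count without ever re-reading the database.

```haskell
-- A self-maintainable derivative: count' needs only the change.
data Change a = Insert a | Delete a

count :: [a] -> Int
count = length

-- the delta to apply to the cached count
count' :: Change a -> Int
count' (Insert _) = 1
count' (Delete _) = -1

-- applying a change to the underlying collection, to state the law:
-- count (apply ch xs) == count xs + count' ch
apply :: Eq a => Change a -> [a] -> [a]
apply (Insert x) xs = x : xs
apply (Delete x) xs = deleteFirst xs
  where deleteFirst [] = []
        deleteFirst (z:zs) | z == x    = zs
                           | otherwise = z : deleteFirst zs
```

In redis terms, count' (Insert _) = 1 is exactly an INCR, with no query against the source of truth.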
Here is the source transformation described in the paper:

Returning to an example from the first post:

The derivative is

In the case of an insert, we have


which means that userIds' can be reduced to

^{1}: I’m being a little imprecise when I define the derivative of a type as another type, since the type of the derivative can vary depending on the value. The derivative of 3 is all integers from −3 to positive infinity, not all integers.
]]>IORefs and monadic state. The interpreter below uses a completely different tactic: exploiting unsafeInterleaveIO. All function arguments are evaluated "right away", but in the context of an unsafeInterleaveIO (so, in fact, they are actually not evaluated right away). With this hack, we get to write an interpreter which looks like an interpreter for a strict functional language, but actually behaves lazily (by lifting Haskell's own lazy semantics into our interpreter).
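A minimal sketch of the trick, on a toy language of my own devising (not the interpreter the post refers to):

```haskell
import System.IO.Unsafe (unsafeInterleaveIO)

-- A tiny lambda calculus with integer literals.
data Expr = Lit Int
          | Var String
          | Lam String Expr
          | App Expr Expr

data Value = IntV Int | FunV (Value -> IO Value)

type Env = [(String, Value)]

-- Reads like an interpreter for a strict language...
eval :: Env -> Expr -> IO Value
eval _   (Lit n)      = return (IntV n)
eval env (Var x)      = maybe (error ("unbound " ++ x)) return (lookup x env)
eval env (Lam x body) = return (FunV (\v -> eval ((x, v) : env) body))
eval env (App f a)    = do
  FunV g <- eval env f
  -- ...but the argument is "evaluated" inside unsafeInterleaveIO, so
  -- the work really happens only if the function forces the Value.
  arg <- unsafeInterleaveIO (eval env a)
  g arg
```

Applying the K combinator to a crashing argument never runs the crash: eval [] (App (App k (Lit 42)) bad) returns IntV 42 even when evaluating bad would throw.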
]]>But that was not the only benefit! It turns out that having extra constructs also makes the binding-time analysis easier. (Binding-time analysis is the task of figuring out which parts of a program are static and which are dynamic for a given partial input.) An obvious example is booleans. Using Church-encoded booleans is more minimal than having primitive booleans and an if-then-else construct, but analyzing the former is harder, since it requires analysis of higher-order functions, which usually requires writing a type-inference algorithm. Maps are another example. Lisp-style association lists seem like a natural approach, but, unless you do some very sophisticated analysis, the specializer will fail to recognize when the keys are static and the values are dynamic, and so will approximate by marking the entire data structure as dynamic (which usually kills optimality). By making maps a primitive in the language, you can code specifically for that scenario.
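The static/dynamic split is easiest to see on the classic power example, sketched here with my own illustrative types: the exponent is static, the base is dynamic, and the recursion unfolds entirely at specialization time.

```haskell
-- The residual program: what remains after the static parts are gone.
data Residual = TheBase                 -- the dynamic input
              | LitR Int
              | MulR Residual Residual
  deriving Show

-- Specializing power to a static exponent unfolds the recursion,
-- leaving only multiplications of the dynamic base.
specPower :: Int -> Residual
specPower 0 = LitR 1
specPower n = MulR TheBase (specPower (n - 1))

-- Running the residual program on an actual base value.
runResidual :: Int -> Residual -> Int
runResidual _ (LitR n)   = n
runResidual x TheBase    = x
runResidual x (MulR a b) = runResidual x a * runResidual x b
```

specPower 3 is the residual program x * (x * (x * 1)), and runResidual 2 (specPower 3) evaluates to 8.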
For anybody interested in partial evaluation, I highly recommend the Jones, Gomard, and Sestoft book. It is extremely lucid in its exposition, not only of partial evaluation, but of many other analysis and transformation techniques. For instance, a year or so ago I was trying to understand abstract interpretation, but I could not find a succinct explanation of the algorithm anywhere. It turns out they provide one in chapter 15. They do it in only five pages, most of which is examples. Another example is supercompilation, which was opaque to me until I read Neil Mitchell’s excellent paper on Supero. But if he hadn’t written it, I could have turned to chapter 17 of the book, which incidentally also covers deforestation in the same breath. I think the only computer science book I have revisited more frequently than this one is Norvig and Russell’s book on artificial intelligence. Pierce’s Types and Programming Languages is a close third.
]]>In my previous post, I demonstrated that a library could infer cache update operations from database insert operations by performing algebraic manipulations on the queries that define the cache keys. The algebraic laws needed were the distribution laws between monoids, e.g. count distributes over the Set monoid to produce the Sum monoid. A library could also infer the arguments of the cache keys (e.g. taskIds.{userId} -> taskIds.65495) by performing functional-logic evaluation on the cache key’s query. If the library’s goal became suspended during evaluation, it could proceed by unifying expressions of low multiplicity with all possible values. For instance, if the goal for a filter query became suspended, the library could proceed by considering the true and false cases of the filter separately.
In this post I would like to talk about sorting and limiting, as well as flesh out some of the data structures that might be used in an automatic redis library.
Set is the simplest data structure, and forms the foundation for two of our other collection types.

The monoidal operation for Set is simply set union.
List is a Set with an embedded sorting function. Tracking the sorting function enables us to compute redis sorted set keys if necessary.

A commonly used sorting function would be x => x.modifiedDate.

The monoidal operation for List is the merge operation from mergesort, with one restriction: the sorting functions of both lists must be the same.
LimitedList is a List with an upper bound on its size.

The length of the contained List must be less than or equal to the upper bound. Tracking the length enables us to know how to trim cache entries, e.g. when using the ZREMRANGEBYRANK command.

The monoidal operation for LimitedList is to mergesort the two lists and truncate the result to the limit. Similarly to List, the library expects both lists to have the same upper limit.
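The List and LimitedList monoidal operations can be sketched like this (the representations are my guesses, not the post's own definitions; both inputs are assumed to already be sorted by the same key function):

```haskell
-- List's monoidal operation: the merge step from mergesort, keyed by
-- a sorting function such as x => x.modifiedDate.
mergeBy :: Ord k => (a -> k) -> [a] -> [a] -> [a]
mergeBy _ [] ys = ys
mergeBy _ xs [] = xs
mergeBy key (x:xs) (y:ys)
  | key x <= key y = x : mergeBy key xs (y:ys)
  | otherwise      = y : mergeBy key (x:xs) ys

-- LimitedList's monoidal operation: merge as above, then truncate to
-- the shared upper bound.
mergeLimited :: Ord k => (a -> k) -> Int -> [a] -> [a] -> [a]
mergeLimited key limit xs ys = take limit (mergeBy key xs ys)
```

The shared limit is what tells the library how far a ZREMRANGEBYRANK trim should go.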
First and Last are essentially LimitedLists whose upper bound is 1. Making specialized types for singleton LimitedLists makes working with non-collection redis data structures easier.

Although First and Last have the same representation, they have different monoidal operations, namely (x,y) => x and (x,y) => y.
The Maybe type is useful for queries that always generate a unique result (such as lookup by primary key), and as such the Maybe type does not need to contain a sorting function.

The monoidal operation is to pick Just over Nothing, but with the restriction that both arguments cannot be Justs.
Collision of Justs can happen if the application developer misuses the The operation (defined below). Unfortunately this error cannot be caught by an automatic redis library, because the library never actually computes the value of mappend. The library only tracks monoidal types so that it can know what the final redis commands will be.
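The Maybe monoidal operation described above might look like this (a sketch in my own rendering; the collision case is simply reported as an error):

```haskell
-- Just wins over Nothing; two Justs indicate a misuse upstream.
combineMaybe :: Maybe a -> Maybe a -> Maybe a
combineMaybe Nothing  y        = y
combineMaybe x        Nothing  = x
combineMaybe (Just _) (Just _) = error "collision: two Justs"
```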
Speaking of query operations, it’s about time I defined them. But first… one more monoid.

Query operations are parameterized over an input type and an output type.

A few more data structures and we will have all the pieces necessary for an application developer to define a cache schema.

Putting it all together, we can showcase the cache schema for a simple task management website.

It’s important to keep in mind that although I have made the above code look like Haskell, no library in Haskell could actually use the above code. The variables occurring after the $= sign are logic variables, not function parameters. An EDSL could get close to something like the above, but the normal types for == and && are unusable, and the lambdas inside the Where clauses would need to be reified anyway.
Still to come: deletes, updates, uniqueness constraints (maybe?), and pseudocode for the generation of redis commands.
]]>These are some initial thoughts on how to automate cache updates. The question I want to answer is this: given a mapping from redis keys to the queries that produce their values, how can I infer which redis commands should be run when I add, remove, and update items in the collections which are my source of truth?
The code in this post is pseudo-Haskell. What appears to the left of an = sign is not always a function, and the . is used for record field lookup as well as function composition.
I’ll start with a simple example. Suppose I run a website which is a task manager, and I want to display on my website the number of users who have signed up for an account, i.e. I want to display count users. I don’t want to count the entire collection every time I add an item to it, so instead I keep the count in redis and increment it whenever a new account is created. Proving that INCR is the right command to send to redis is straightforward:

Notice that when count distributes, it changes the plus operation from union (++) to addition (+).
Here is a similar example, this time storing the ids instead of a count.

Obviously the appropriate redis command to use in this case is SADD.
Filtering is also straightforward.

Obviously a pipeline of SADDs will be correct, and the expression to the right of the ++ gives my automatic cache system a procedure for determining which SADD operations to perform. When the cache system gets the user object to be added, it will learn that the number of SADD operations is either zero or one, but it doesn’t have to know that ahead of time.
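The filter derivation can be sketched concretely (User and its fields are my stand-ins, since the post's own definitions aren't shown):

```haskell
-- Inserting one user changes the cached set by zero or one elements.
data User = User { userId :: Int, isActive :: Bool }

activeUserIds :: [User] -> [Int]
activeUserIds users = [ userId u | u <- users, isActive u ]

-- the addend to the right of the ++: what SADD must insert for user u
delta :: User -> [Int]
delta u = if isActive u then [userId u] else []

-- the law: activeUserIds (users ++ [u]) == activeUserIds users ++ delta u
```

For an inactive user the delta is [], i.e. a no-op; for an active user it is a single SADD of their id.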
A computer can easily verify the above three proofs, as long as they are properly annotated. But can I get the computer to create the proof in the first place?
Rewriting the activeUserIds example to use function composition suggests one approach.

In general, it seems that queries of the form


become


provided f, g, h, etc. all distribute over mappend. The actual value of mappend will determine which redis operation to perform. Integer addition becomes INCR, set union becomes SADD, sorted set union becomes ZADD, list concatenation becomes LPUSH or RPUSH, etc. An important monoid which may not be obvious is the Last monoid (mappend x y = y), which becomes SET.
So much for updates on constant cache keys. Parameterized cache keys are much more interesting.
On my task manager website, I want to have one cache entry per user. The user’s id will determine the cache key that I use.


It’s tempting to think of this definition as a function:


But an automatic caching system will not benefit from this perspective. From its perspective, the input is a task object, and the output is any number of redis commands. The system has to implicitly discover the userId from the task object it receives. The userId parameter of taskIds.{userId} is therefore more like a logic variable (e.g. from Prolog) than a variable in imperative or functional languages.
The monoidal shortcut rule is still valid for parameterized redis keys.

The caching system does not need to reduce this expression further until it receives the task object. When it does, it can evaluate the addend as an expression in a functional-logic language (similar to Curry).

Unfortunately at this point the goal becomes suspended. The cache system can cheat a little by unifying task.owner == userId with True and False.

In the true case, userId unifies with task.owner, which I’ll say is 65495:

In the false case, userId remains unbound, but that’s OK, because the expression reduces to a no-op:

In general, whenever the cache system’s goals become suspended, it can resume narrowing/residuation by picking a subexpression with low multiplicity (e.g. booleans, enums) and nondeterministically unifying it with all possible values.
Most of the time, each unification will result in either a no-op, or a redis command with all parameters bound. An exception (are there others?) is queries which affect an infinite number of redis keys, e.g. caching all tasks that do NOT belong to a user.


This is clearly a bug, so the caching system can just log an error and perform no cache updates. It may even be possible for the caching system to catch the bug at compile time by letting the inserted entity (e.g. a task) be an unbound variable, and seeing if a nondegenerate redis command with unbound redis key parameters can be produced.
This post has focused mostly on inserts and queries that fit the monoidal pattern. In another post I’ll take a look at deletes and queries which are not so straightforward.
]]>In my day-to-day work, there are approximately five things that take up most of my time. Ordered from most time-consuming to least time-consuming, they are:
So the obvious place to start is reducing my time spent debugging. The best way to reduce debugging time is to avoid doing it in the first place, and I’ve accomplished this a number of ways. From best to worst:
(To the weenies who are angry at me for putting unit tests at the bottom: it’s only because I hit the point of diminishing returns once I’ve applied the other approaches. I found writing unit tests in ruby to be enormously helpful, because ruby is neither statically typed nor does it have smart editors. But when I’m writing scala in IntelliJ, the type system and the editor catch so many of my bugs that there’s usually nothing left for the unit tests to find. I still write unit tests, but they provide more value in discovering regressions than in discovering bugs the first time around.)
Despite using all these approaches, debugging still takes up more of my time than the actual writing of the code. The only exception has been haskell, but I don’t use haskell at work.
My approaches are fairly standard, but a few days ago I discovered an approach that I haven’t heard described elsewhere. I was practicing the habit of “noticing when I’m surprised”. Being frequently surprised is bad because it means I’m not learning. I noticed that sometimes when I ran my programs, they did not behave the way I expected. i.e. I was surprised.
How could I stop being surprised? I decided to start documenting my surprises. I created a document with a table of two columns. In the left column I would record each surprise: what I did, what I expected to happen, and what actually happened. In the right column I would record the resolution (once I had finished debugging it), and why my expectations were wrong in the first place.
I was hoping that after doing this for a few days, I would have enough data to find the persistent errors in my thinking. But something pleasant happened before I got that far!
I have not been very disciplined about this. I have only remembered to document my surprises twice since I started this experiment, and I almost missed the second one. I was about to bust out the printlns and the debugger before I caught myself. Although it felt tedious, I opened up my document and wrote down what I did, what I expected to happen, and what actually happened. When I added that last part, it suddenly hit me what my mistake was. No debugging necessary! Apparently the very act of articulating the difference between my expectations and reality was sufficient for me to recognize the error in my thinking (and my coding).
Perhaps it was a fluke. Perhaps the reason would have come to me anyway. But I am now definitely motivated to continue this experiment.
]]>entity, predicate, operation, and realize.
The entity directive gives the “platonic” description of a type.

The predicate directive tells how these types are represented. For example, to represent the user object in a relational database:

or in a key value store:

You can also use predicates to specify how types are embodied in classes.

Notice that I left out userName; classes do not have to be perfectly aligned with the platonic entities. You can even combine different entities into a single class. For example, imagine a Java class like this:

Even though tasks and comments are separate entities, you can still map between them and the task class:

It’s a little crazy, but it could be made simpler with a library function and/or syntactic sugar saying “this embodied list matches this list of entities”. I just wanted to give you some idea of how flexible I want this language to be.
The operation directive gives names to operations that might be performed on the entities.

The realize directive indicates how operations will be realized using concrete classes.

Compiling would generate a code block for each realize directive. It would fail if any of the operations were impossible (e.g. getTasksForUser would be impossible for a key-value store if you had stored only Task => [User] pairs and forgotten the User => [Task] pairs). It would generate a warning if any of the operations were slow (e.g. getCommentsForTask on an ordered key-value store when the comments were indexed by commentId and not by $taskId:$commentId).
So, does a language like this already exist? I know there are several things that come close, ORMs being the obvious example. Most ORMs require you to build schemas according to THEIR rules, not your own rules, and the exceptional ones require you to write custom code, usually 4 different times, for the get, set, update, and delete cases, when the representation is anything nonstandard.
I want something that can handle cases like these:

- UserTask in a sharded database: the code generated for createUserTask should do two inserts.
- Comments stored inline with their tasks: the first comment in the comment1 column of the row for that task, the second comment in the comment2 column, etc.

Since it seems really useful, I would love to write this language, but honestly, I don’t even know where to begin.
Conceptually, how do you translate quantified logic into imperative code? What would abstraction look like in this language? (e.g. Can I make a listEqualsList function?) Outside of the entity/predicate/operation/realize directives, what primitives would I need to provide so that other people can write modules for their favorite pet database?

And I’m trying to find a two-coloring for it. (i.e. I want to color each of the nodes black or white in such a way that directly connected nodes have opposite colors.)
Obviously any realistic constraint solver is going to solve this problem in linear time, since any assignment causes a propagation to the rest of the graph.
(e.g. A being black causes B, C, and D to be white, which causes E to be black, F to be white, G, H, and I to be black, and J to be white.)
But suppose (since this is just an illustration) my constraint solver doesn’t maintain arc consistency, but it does do some kind of constraint learning. Also, suppose that I already know some of the symmetry in this problem. In particular, I know that [A, B, C, D, E] is symmetric with [F, G, H, I, J]. (The constraint solver doesn’t have to discover this symmetry; I know it in advance.)
The constraint solver might learn at some point that A == E, because it combines the constraint A != B with B != E. It would be a shame if the constraint solver later also learned that F == J. It would be nice if it could learn F == J at the same time that it learns A == E, since I have told it about the symmetry of the problem.
Notice that the learning is valuable even though the two halves of the problem have different assignments. (If A is black, then F is white.)
How can a constraint solver make these kind of inferences?
Here’s my current solution:
A constraint satisfaction problem is a collection of variables and constraints. We declare an ordered subset X of variables as isomorphic to a subset Y of variables if, for every constraint involving only variables in X, there is an identical constraint involving the corresponding variables in Y. (Constraints involving variables both inside and outside X are not required to have a corresponding constraint.)
It follows that if X and Y are isomorphic, then their corresponding subsets must also be isomorphic.
Whenever a constraint solver learns a constraint, it can add all of the isomorphic constraints to its collection of learned constraints. There might even be a space optimization here, if I can find an appropriate lazy data structure, e.g. by allowing “abstract” constraints in the solver’s collection of learned constraints. The hard part is figuring out how to do watched literals.
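Mapping a learned constraint through a declared isomorphism can be sketched like this (the representation is my own, far simpler than a real solver's clause database):

```haskell
-- A declared variable correspondence from subset X to subset Y.
type Iso = [(String, String)]

data Learned = EqVars String String  -- "these two variables are equal"
  deriving (Show, Eq)

-- Translate a constraint over X into the corresponding constraint over
-- Y, provided every variable it mentions lies inside X.
translate :: Iso -> Learned -> Maybe Learned
translate iso (EqVars a b) = EqVars <$> lookup a iso <*> lookup b iso

-- the example from the text: [A..E] is isomorphic to [F..J]
iso1 :: Iso
iso1 = zip ["A","B","C","D","E"] ["F","G","H","I","J"]
```

translate iso1 (EqVars "A" "E") yields Just (EqVars "F" "J"), so the solver can record F == J the moment it learns A == E.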
Has this problem already been tackled?
Original post: https://www.reddit.com/r/artificial/comments/18rttb/symmetric_constraint_learning/
]]>I think it is safe to say that the programming language of the future, if it exists at all, will involve some kind of artificial intelligence. This post is about why I think that theorem provers will be standard in languages of the future.


This simple function takes two arguments. The first is a predicate distinguishing between desirable (True) and undesirable (False) values for A. The second is a size restriction on A (e.g. number of bytes). The function returns a random value of A, if one exists, meeting two constraints:
Also, the solve function is guaranteed to terminate whenever the predicate terminates.
First I will try to convince you that the solve function is more important than any of your petty opinions about syntax, object-orientation, type theory, or macros. After that I will make a fool of myself by explaining how to build the solve function with today’s technology.
It can find fixpoints:

It can invert functions:

It can solve Project Euler problems:

It can check that two functions are equal:

So it’s useful for detecting the introduction of bugs when you are optimizing things.
In fact, the solve function can find a more efficient implementation on your behalf.

The speed check is crude, but the idea is there. Keeping the size constraint reasonable prevents the solve function from just creating a giant table mapping inputs to outputs.
Curry and Howard tell us that programs and proofs are one and the same thing. If our solve function can generate programs, then it can also generate mathematical proofs.

If the proof is ugly, we can decrease the search size, and we will get a more elegant proof.
The solve function can find bugs:

The solve function will never get people to stop arguing, but it will at least change the dynamic vs. static types argument from a pragmatic one to an artistic one.
One last example:
Testdriven development advocates writing tests which are sufficient to construct the missing parts of a program. So why write the program at all?

In fact, unit_tests can be replaced with any assertion about the desired program: e.g. that it type checks under Hindley-Milner, that it terminates within a certain number of steps, that it does not deadlock within the first X cycles of the program’s execution, and so on.
Are you excited yet? Programming in the future is awesome!
Always start with the obvious approach:

Correct, but useless. If the predicate consisted of only one floating point operation, the Sequoia supercomputer would take 17 minutes to solve a mere 8 bytes.
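For concreteness, the obvious brute-force version can be sketched as follows (my reconstruction of the idea, not the author's code):

```haskell
import Control.Monad (replicateM)
import Data.List (find)

-- Enumerate every bit string of the requested size and return the
-- first one the predicate accepts. Correct, and exponentially slow.
solve :: ([Bool] -> Bool) -> Int -> Maybe [Bool]
solve predicate sizeBits =
  find predicate (replicateM sizeBits [False, True])
```

replicateM n [False, True] is the list of all 2^n bit strings, which is precisely why this version is hopeless beyond a few bytes.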
The complexity of solve is clear. The variable num can be nondeterministically chosen from the range in linear time (size * 8), decode takes linear time, and the predicate takes polynomial time in most of our examples from above. So solve is usually in NP, and no worse than NP-complete, as long as our predicate is in P.
It’s a hard problem. Were you surprised? Or did you get suspicious when the programmers of the future started exhibiting godlike powers?^{1}
Thankfully, a lot of work has been put into solving hard problems.
Today’s sat solvers can solve problems with 10 million variables. That’s 1.2 megabytes of search space, which is large enough for almost all of the examples above, if we’re clever enough. (The Kadane example is the definite exception, since the predicate takes superpolynomial time.)
The Cook-Levin theorem gives us a procedure for writing the solve function more efficiently.

I call this approach "solving the interpreter trace" because the imaginary processors act as an interpreter for the predicate, and we ask the sat solver to trace out the processor execution.
The approach is elegant, but it has three major problems:
We can get rid of these problems if we compile our predicate directly into a boolean formula. Compilation is easy enough if our predicate contains neither loops nor conditionals.

becomes

Actually, conditionals aren’t that hard either.

becomes

A sat solver would immediately assign w2 the value 0. If we were solving over an interpretational trace, w2 wouldn’t be a single variable, but would be one of two variables depending on whether b was True or False.
By compiling the predicate, we have enabled the solver to work from end to beginning (if it so chooses).
Can we handle loops?

One approach is to unroll the loop a finite number of times.

With loops and conditionals, we are Turing complete. Function calls can be inlined up until recursion. Tail-recursive calls can be changed to while loops, and the rest can be reified as loops around stack objects with explicit push and pop operations. These stack objects will introduce symmetry into our sat formulas, but at least it will be contained.
When solving, we assume the loops make very few iterations, and increase our unroll depth as that assumption is violated. The solver might then look something like this:

max_unroll_count does static analysis to figure out the maximum number of unrolls that are needed. The number of unrolls will either be a constant (and so can be found by doing constant reduction within the predicate), or it will somehow depend on the size of the predicate argument (and so an upper bound can be found by doing inference on the predicate).
The solver is biased toward finding solutions that use fewer loop iterations, since each loop iteration sets another boolean variable to 1, and thus cuts the solution space down by half.
If the solver finds a solution, then we return it. If not, then we try again, this time allowing _longer_loop_needed to be true. If it still can’t find a solution, then we know no solution exists, since i and j were set to arbitrary values. By “arbitrary”, I mean that, at compilation time, no constraints will connect the later usages of i and j (there are none in this example) with the earlier usages.
I admit that this approach is ugly, but the alternative, solving an interpreter trace, is even more expensive. The hacks are worth it, at least until somebody proves P == NP.
Some of the examples I gave in the first section used eval. Partial evaluation techniques can be used to make these examples more tractable.
I’ve only talked about sat solvers. You can probably get better results with an smt solver or a domain-specific constraint solver.
In thinking about this problem, I’ve realized that there are several parallels between compilers and sat solvers. Constant reduction in a compiler does the same work as the unit clause heuristic in a sat solver. Dead code removal corresponds to early termination. Partial evaluation reduces the need for symmetry breaking. Memoization corresponds to clause learning. Is there a name for this correspondence? Do compilers have an analogue for the pure symbol heuristic? Do sat solvers have an analogue for attribute grammars?
If you want to use languages which are on the evolutionary path toward the language of the future, you should consider C# 4.0, since it is the only mainstream language I know of that comes with a built-in theorem prover.
Update (2013-11-24):
I am happy to report that I am not alone in having these ideas. “Search-assisted programming”, “solver-aided languages”, “computer-augmented programming”, and “satisfiability-based inductive program synthesis” are some of the names used to describe these techniques. Emina Torlak has developed an exciting language called Rosette, which is a DSL for creating solver-aided languages. Ras Bodik has also done much work combining constraint solvers and programming languages. The ExCAPE project focuses on program synthesis. Thanks to Jimmy Koppel for letting me know these people exist.
^{1}: Even many computer scientists do not seem to appreciate how different the world would be if we could solve NP-complete problems efficiently. I have heard it said, with a straight face, that a proof of P = NP would be important because it would let airlines schedule their flights better, or shipping companies pack more boxes in their trucks! One person who did understand was Gödel. In his celebrated 1956 letter to von Neumann, in which he first raised the P versus NP question, Gödel says that a linear or quadratic-time procedure for what we now call NP-complete problems would have “consequences of the greatest magnitude.” For such a procedure “would clearly indicate that, despite the unsolvability of the Entscheidungsproblem, the mental effort of the mathematician in the case of yes-or-no questions could be completely replaced by machines.” But it would indicate even more. If such a procedure existed, then we could quickly find the smallest Boolean circuits that output (say) a table of historical stock market data, or the human genome, or the complete works of Shakespeare. It seems entirely conceivable that, by analyzing these circuits, we could make an easy fortune on Wall Street, or retrace evolution, or even generate Shakespeare’s 38th play. For broadly speaking, that which we can compress we can understand, and that which we can understand we can predict. — Scott Aaronson
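The program in question was presumably the classic closure-in-a-loop puzzle; here is a reconstruction, using the num and num_closures names referred to below:

```python
# Build a list of closures, one per loop iteration.
num_closures = []
for num in [1, 2]:
    num_closures.append(lambda: num)

# Naively, we might expect [1, 2]. In CPython, both closures see the
# final value of num, so this prints [2, 2].
print([f() for f in num_closures])
```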
The solution has to do with the implementation of the for loop in Python. (I ran the program in CPython; it may be interesting to see what other implementations of Python do.) Rather than creating a new binding for the num variable on every iteration of the loop, the num variable is mutated (probably for efficiency or just simplicity of implementation). Thus, even though num_closures is filled with distinct anonymous functions, they both refer to the same instance of num.
I tried writing similar routines in other languages. Ruby and C# do the same thing as Python.
Please excuse the use of the NumClosure delegate. For some reason I could not get Mono to compile with Func<int>.
Fortunately, all of these languages provide some kind of workaround. Ruby has Array#each, and C# has List<>.ForEach. Python has the map built-in.
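For Python, a sketch of the map workaround (a reconstruction, not the original listing): each call of the mapped function creates a fresh scope, so each closure captures its own num.

```python
num_closures = []

def make_closure(num):
    # each call creates a new scope, so each closure gets its own num
    num_closures.append(lambda: num)

list(map(make_closure, [1, 2]))     # list() forces evaluation in Python 3
print([f() for f in num_closures])  # prints [1, 2]
```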
Not everybody mutates their enumerators, however. Lisp, the language which normally requires every programmer to be an expert in variable scoping, handles iteration very cleanly.
And despite its messy syntax, Perl also scores a 10 for clean variable semantics.
As before, our game is the children’s “Guess a number between 1 and 100” game.
I like this version much better, because the ugly yield-lambda wart of before has been simplified to a plain call on the interface object.
The interface between the game and the user interface is the same as before, with one addition.
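As a hedged sketch (method names other than get are my assumptions), the contract might look like:

```python
class Interface:
    # methods every user interface must provide (names assumed)
    def output(self, text):
        raise NotImplementedError

    def input(self, prompt):
        raise NotImplementedError

    def get(self, fn):
        # the addition: fetch a value through the interface. The default
        # just calls fn; a web interface overrides this to return a
        # stored result when replaying a request.
        return fn()
```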
The new method “get” is used when the game generates its answer.
Retrieving the random number through self.interface.get ensures that the game will not be constantly changing its answer while a user is playing through the web interface.
As before, the command line interface is very simple.
The web interface works by raising a StopWebInterface exception when execution of the game needs to be paused so that the user can input some data into a form. Our abstraction is thus slightly leaky, in that a game which at some point generically caught all types of exceptions might interfere with the behavior of the web interface. The yield lambda solution did not have this problem.
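A minimal sketch of the mechanism just described; everything except the StopWebInterface name is my assumption. The web interface replays stored answers and raises when it runs out, pausing the game at that point.

```python
class StopWebInterface(Exception):
    """Raised to pause the game until the user submits the next form."""

class WebInterface:
    def __init__(self, stored_answers):
        self.stored_answers = list(stored_answers)

    def input(self, prompt):
        if self.stored_answers:
            return self.stored_answers.pop(0)  # replay an earlier answer
        raise StopWebInterface(prompt)         # pause: render a form instead

def run_game(interface):
    # stand-in for the real game logic
    name = interface.input("What is your name? ")
    guess = interface.input("Guess a number between 1 and 100: ")
    return (name, guess)

# First request: nothing stored yet, so the game pauses at the first prompt.
try:
    run_game(WebInterface([]))
except StopWebInterface:
    pass  # the web layer would render a form for the name here

# A later request: both answers stored, so the game runs to completion.
result = run_game(WebInterface(["Alice", "42"]))
```

A game that caught all exceptions would swallow StopWebInterface, which is exactly the leak in the abstraction described above.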
Our target application will be a “guess a number” game.
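A hypothetical reconstruction of the simple, interface-entangled version (the prompts and structure are my guesses):

```python
import random

def play():
    # every interaction with the outside world is hard-wired here
    name = input("What is your name? ")
    print("Hello, %s!" % name)
    answer = random.randint(1, 100)
    while True:
        guess = int(input("Guess a number between 1 and 100: "))
        if guess < answer:
            print("Too low!")
        elif guess > answer:
            print("Too high!")
        else:
            print("You got it, %s!" % name)
            break
```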
Next, the program is rewritten using coroutines.
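A much-abbreviated sketch of the pattern (the prompt wording and the driver named run are my assumptions): the game yields a zero-argument lambda for every effect, and the driver invokes each lambda and sends its result back in.

```python
import random

def game():
    # every interaction with the outside world is wrapped in a
    # zero-argument lambda and yielded; the driver sends back the result
    name = yield (lambda: input("What is your name? "))
    answer = yield (lambda: random.randint(1, 100))
    guess = None
    while guess != answer:
        guess = yield (lambda: int(input("Guess a number between 1 and 100: ")))
    yield (lambda: print("You got it, %s!" % name))

def run(routine):
    # a command-line driver: perform each yielded effect for real
    result = None
    try:
        while True:
            thunk = routine.send(result)
            result = thunk()
    except StopIteration:
        pass
```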
Essentially, all read and write actions with the outside world have been replaced with the yield lambda pattern. That includes the call to rgen.randint, because rgen has been initialized according to the current time.
All we need now is an interface that implements the four methods the game uses to interact with the outside world.
We’ll start with the simpler command line version.
The behavior of cli.py + game.py is completely identical to simple.py. Remarkably, though, the core logic of the game (in game.py) is now reusable with any user interface supporting the four methods given above.
A typical web-MVC-style solution to the “guess a number” game would probably have a controller which dispatched on one of three different situations: the user has input her name, the user has input a guess, or the user has told us whether or not she would like to keep playing. The three different situations would likely be represented as distinct URIs. In our game.py, however, a situation corresponds to the “yield lambda” at which execution has been paused.
The essential idea to writing a coroutinebased web interface is this: only run the game routine up to the point where more information is needed. Store the result of every lambda yielded so far. On successive page requests, replay the routine with the stored results, but only invoke the lambdas that were not invoked on a previous page request. The medium for storing the results of the lambdas does not matter. It could be embedded in hidden input elements in HTML (though this raises issues of trust), or stored in a database tied to a session ID. For simplicity, the following implementation stores the values in memory, tied to a value stored in a hidden input element.
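A minimal sketch of the replay idea (resume and demo are my own names, and in-memory storage stands in for the hidden-input or session store): re-run the routine from the top, feed back the stored results, and return the first lambda whose result is not yet known.

```python
def resume(make_routine, stored_results):
    # Replay the routine against the results of previous requests.
    # Returns the next not-yet-invoked lambda (the next form to render),
    # or None when the routine has finished.
    routine = make_routine()
    result = None
    for prev in stored_results:
        routine.send(result)   # this step already ran on an earlier request,
        result = prev          # so replay its stored result instead
    try:
        return routine.send(result)
    except StopIteration:
        return None

# a toy routine in the yield-lambda style, standing in for the game
def demo():
    a = yield (lambda: "what is a?")
    b = yield (lambda: "what is b?")
    yield (lambda: a + b)
```

On each page request, the server would call resume with everything stored so far plus the user’s new form value appended.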