This post describes some speculative ideas for new AI/AGI/Artificial-Life/Artificial-Chemistry development that could be done on top of the OpenCog platform, leveraging OpenCog tools in a novel way.
These are not necessarily proposed for immediate-term development – we have a lot of other stuff going on, and a lot of stuff that’s half-finished or ¾-finished or more, that needs to be wrapped up and improved, etc. They are, however, proposed as potentially fundamental enhancements to the OpenCog and PrimeAGI (CogPrime) designs in their current form.
The core motivation here is to add more self-organizing creativity to the AGI design. One can view these ideas as extending what MOSES does in the PrimeAGI design – MOSES (at least in its aspirational version) is artificial evolution enhanced by pattern mining and probabilistic reasoning; whereas Cogistry is, very loosely, more like artificial biochemistry similarly enhanced.
Walter Fontana, in a fascinating 1990 paper, articulated a novel approach to “artificial life” style digital evolution of emergent phenomena, called “algorithmic chemistry.” The basic concept was: Create small codelets, so that when codelet A acts on codelet B, it produces a new codelet C. Then put a bunch of these codelets in a “primordial algorithmic soup” and let them act on each other, and see what interesting structures and dynamics emerge.
The paper reports some interesting emergent phenomena, but then the research programme was dropped off at an early stage. Very broadly speaking, the story seems to have been similar to what happened with a lot of Alife-related work of that era: Some cool-looking self-organizational phenomena occurred, but not the emergence of highly complex structures and dynamics like the researchers were looking for.
These sorts of results spawned the natural question “why?” Did the simulations involved not have a large enough scale? (After all, the real primordial soup was BIG and based on parallel processing and apparently brewed for quite a long time before producing anything dramatic.) Or were the underlying mechanisms simply not richly generative enough, in some way? Or both?
What I am going to propose here is not so much a solution to this old “why” question, but rather a novel potential route around the problem that spawned the question. My proposal – which I call “Cogistry” — is to enhance good old Fontana-style algorithmic chemistry by augmenting its “self-modifying program soup” with AI algorithms such as hypergraph pattern mining and probabilistic reasoning. I believe that, if this is done right, it can lead to “algorithm soups” with robust, hierarchical, complex emergent structures – and also to something related but new and different: emergent, self-organizing program networks that carry out functions an AI agent desires for achievement of its goals.
That is: my aim with Cogistry is to use probabilistic inference and pattern mining to enhance algorithmic chemistry, so as to create “digital primordial soups” that evolve into interesting digital life-forms, but ALSO so as to create “life-like program networks” that transform inputs into outputs in a way that carries out useful functions as requested by an AI agent’s goal system. The pure “digital primordial soup” case would occur when the inference and pattern mining are operating with the objective of spawning interesting structures and dynamics; whereas the “useful program network” case would occur when the inference and pattern mining are operating with an additional, specific, externally-supplied objective as well.
There is a rough analogy here with the relation between genetic algorithms and estimation of distribution algorithms (EDAs). EDAs aim to augment GA mutation and crossover with explicit pattern mining and probabilistic reasoning. But there are significant differences between EDAs and the present proposal as well. There is a lot more flexibility in an algorithmic chemistry network than in an evolving populations of bit strings or typical GP program trees; and hence, I suspect, a lot more possibility for the “evolution/self-organization of evolvability” and the “evolution/self-organization of inferential analyzability” to occur. Of course, though, this added flexibility also provides a lot more potential for messes to be made (including complex, original sorts of messes).
From an Alife point of view, the “chemistry” in the “algorithmic chemistry” metaphor is intended to be taken reasonably seriously. A core intuition here is that to get rich emergent structures and dynamics from one’s “digital biology” it’s probably necessary to go a level deeper to some sort of “digital chemistry” with a rich combinatorial potential and a propensity to give rise to diverse stable structures, including some with complex hierarchical forms. One might wonder whether this is even deep enough, whether one needs actually to dig down to a “digital physics” from which the digital chemistry emerges; my intuition is that this is not the case and focusing on the level of algorithmic chemistry (if it’s gotten right) is deep enough.
Actually, the “digital physics,” in the analogy pursued here, would be the code underlying the algorithms in the algorithm-soup: the programming-language interpreter and the underlying C and assembly code, etc. So part of my suggestion here will be a suggestion regarding what kind of digital physics is likely to make algorithmic chemistry work best: e.g. functional programming and reversible computing. But, according to my intuition, the core ideas of this “digital physics” have already been created by others for other purposes, and can be exploited for algorithmic-chemistry purposes without needing dramatic innovations on that level.
An Algorithmic Chemistry Framework
First I will define a framework for algorithmic chemistry in a reasonably general way. I will then fill in more details, bit by bit.
The basic unit involved I will call a “codelet” – defined simply as a piece of code that maps one or more input codelets into one or more output codelets. What language to use for specifying codelets is a subtle matter that I will address below, but that we don’t need to define in order to articulate a general algorithmic chemistry framework.
Fontana’s original work involved a chemical soup with no spatial structure, but other work with artificial life systems suggests that a spatial structure may be valuable. So we propose a multi-compartment system, in which each compartment has its own “algorithmic chemical soup.” A codelet, relative to a specific compartment, can be said to have a certain “count” indicating how many copies of that codelet exist in that compartment.
Inside each compartment, multiple dynamics happen concurrently:
- Migration: codelets migrate from one compartment to neighboring compartments. This can be done initially via random diffusion.
- Importance spreading: codelets have short and long term importance values, which spread e.g. via ECAN dynamics
- Forgetting: codelets with low long-term importance are “forgotten” from a compartment (and may or may not be saved to some persistent store, depending on system configuration)
- Reaction: codelets in the compartment act on other codelets in the compartment, producing new codelets. Codelets may be chosen for reaction with probability based on their short term importance values.
- Generation: new codelets are introduced into the compartment, either representing inputs to the system from some external source, or generated randomly from some distribution, or drawn from some set of codelets believed to be generally useful
I will use the word “codenet” to refer to a collection of codelets that interact with each other in a coherent way. This is intentionally a vague, intuitive definition – because there are many kinds of coherent networks of codelets, and it’s not obvious which ones are going to be the most useful to look at, in which contexts. In some cases a chain of the form “In the presence of background codelet-set B, A1 reacts with B to make A2, which reacts with B to make A3, etc. …” may be very influential. In other cases cycles of the form “A and B react to make C; then A and C create to make more B; and B and C react to make more A” may be critical. In other cases it may be more complex sorts of networks. Exactly how to chop up a soup of codelets, growing and interacting over time, into distinct or overlapping codenets is not entirely clear at present and may never be. However, it’s clear that for understanding what is going on in an algorithmic-chemistry situation, it’s the codenets and not just the codelets that need to be looked at. If codelets are like chemicals, then codenets are like chemical compounds and/or biological systems.
In terms of current computing architectures, it would be natural to run different compartments on different machines, and to run the four processes in different threads, perhaps with multiple threads handling reaction, which will generally be the most intensive process.
If implemented in an OpenCog framework, then potentially, separate compartments could be separate Atomspaces, and the dynamic processes could be separate MindAgents running in different threads, roughly similar to the agents now comprising the ECAN module. Also, in OpenCog, codelets could be sub-hypergraphs in Atomspace, perhaps each codelet corresponding to a DefineLink.
Reactions would naturally be implemented using the Forward Chainer (a part of the Rule Engine, which leverages the Pattern Matcher). This differs a bit from PLN’s use of the Forward Chainer, because in PLN one is applying an inference rule (drawn from a small set thereof) to premises, whereas here one is applying a codelet (drawn from a large set of possible codelets) to other codelets.
One interesting question is how to measure the interestingness of a codelet, or codenet.
For codelets, we can likely just bump the issue up a level: A codelet is as interesting as the codenets it’s part of.
For codenets, we can presumably rely on information theory. A compartment, or a codenet, as it exists at a particular point in time, can be modeled using a sparse vector with an entry for each codelet that has nonzero count in the compartment or codenet (where the entry for a codelet contains the relevant count). A compartment or codenet as it exists during an interval of time can then be modeled as a series of sparse vectors. One can then calculate the interaction information of this vector-series (or the “surprisingness” as defined in the context of OpenCog’s Pattern Miner). This is a good first stab at measuring how much novelty there is in the dynamics of a codenet or compartment.
In a codenet containing some codelets representing Inputs and others representing Outputs, one can also calculate interaction information based only on the Input and Output codelets. This is a measure of the surprisingness or informativeness of the codenet’s relevant external behaviors, rather than its internal dynamics.
Pattern Mining and Inference for Algorithmic Chemistry
Given the above approach to assessing interestingness, one can use a modification of OpenCog’s Pattern Miner to search for codenets that have surprising dynamics. One can also, in this way, search for patterns among codenets, so that specific codenets fulfilling the patterns have surprising dynamics. Such patterns may be expressed in the Atomspace, in terms of “abstract codelets” — codelets that have some of their internal Atoms represented as VariableNodes instead.
An “abstract codenet” may be defined as a set of (possibly-abstract codelet, count-interval, time-interval) triples, where the time-interval is defined as a pair (start, end), where start and end are defined as offsets from the initiation of the codenet. The interpretation of such a triple is that (C, (m,n) (s,e)) means that some codelet instantiating abstract codelet C exists with count between m and n, during the time interval spanning from s to e.
Note that in order to form useful abstractions from codenets involving different codelets, it will be useful to be able to represent codelets in some sort of normal form, so that comparison of different codelets is tractable and meaningful. This suggests that having the codelet language support some sort of Elegant Normal Form, similar to the Reduct library used in OpenCog’s MOSES subsystem, will be valuable.
Using the Pattern Miner to search for abstract codenets with high degrees of informational surprisingness, should be a powerful way to drive algorithmic chemistry in interesting directions. Once one has found abstract codenets that appear to systematically yield high surprisingness, one can then use these to drive probabilistic generation of concrete codenets, and let them evolve in the algorithmic soup, and see what they lead to.
Furthermore, once one has abstract codenets with apparent value, one can apply probabilistic inference to generalize from these codenets, using deductive, inductive and abductive reasoning, e.g. using OpenCog’s Probabilistic Logic Networks. This can be used to drive additional probabilistic generation of concrete codenets to be tried out.
“Mutation” and “crossover” of codenets or codelets can be experimented with on the inferential level as well – i.e. one can ask one’s inference engine to estimate the likely interestingness of a mutation or crossover of observed codenets or codelets, and then try out the mutation or crossover products that have passed this “fitness estimation” test.
This kind of pattern mining and inference certainly will be far from trivial to get right. However, conceptually, it seems a route with a reasonably high probability of surmounting the fundamental difficulties faced by earlier work in artificial life and algorithmic chemistry. It is something conceptually different than “mere self-organization” or “mere logical reasoning” – it is Alife/Achem-style self-organization and self-modification, but explicitly guided by abstract pattern recognition and reasoning. One is doing symbolic AI to accelerate and accentuate the creativity of subsymbolic AI.
The above pertains to the case where one is purely trying to create algorithmic soups with interesting internal dynamics and structures. However, it applies also in cases where one is trying to use algorithmic chemistry to learn effective input-output functions according to some fitness criteria. In that case, after doing pattern mining of surprising abstract codenets, one can ask a different question: Which codenets, and which combinations thereof, appear to differentially lead to high-fitness transformation of inputs into outputs? One can then generate new codenets from the distribution obtained via answering this question. This is an approach to solving the “assignment of credit” problem from a “God’s eye” point of view – by mining patterns from the network over time … a fundamentally different approach to assignment of credit than has been taken in subsymbolic AI systems in the past.
Desirable Properties of a Cogistry Codelet Language
Designing the right language for the above general “Cogistry” approach is a subtle task, and I won’t try to do so fully here. I’ll just sketch some ideas and possible requirements.
Fontana’s original algorithmic chemistry work uses a variety of LISP, which seems a sound and uncontroversial choice (and the same choice we made in OpenCog’s MOSES GA-EDA tool, for example). However, a few variations and additions to this basic LISP-ish framework seem potentially valuable:
- Addition of unordered sets as primitives, alongside ordered lists; and then associated set operations
- Addition of float-weighted lists and sets as primitives; and of basic vector operations acting on such lists, such as the dot product, and the multiplication of a vector by a scalar.
Float-weighted lists are very handy for dealing with perceptual data, for example. They also provide an element of continuity, which may help with robustness. Codelets relying on float vectors of weights can be modified slightly via modifying the weights, leading to codelets with slightly different behaviors – and this continuity may make learning of new codelets via sampling from distributions implied by abstract codenets easier.
Further, it seems to me we may want to make all operations reversible. If the atomic operations on bits, ints and floats are reversible, then corresponding operations on lists and sets and weighted vectors can also easily be made reversible. (For instance, removing the final element of a list can be made into a reversible operation by, instead, using an operation that splits a list into two parts: the list with the final element removed, and the final element itself.) The intuition here is that reversibility introduces a kind of “conservation of information” into the system, which should prevent the advent of various sorts of pathological runaway dynamics like Fontana observed in his early simulations. If codelets can produce “more” than they take in, then evolution will naturally lead to codelets that try to exploit this potential and produce more and more and more than they take in. But if codelets have to be “conservative” and essentially act only by rearranging their inputs, then they have to be cleverer to survive and flourish, and are more strongly pushed to create complexly interlocking self-organizing structures.
I’m well aware that mining patterns among functions of float variables is difficult, and it would be easier to restrict attention to discrete variables – but ultimately I think this would be a mistake. Perceptual data seems to be very naturally represented in terms of float vectors, for example. Perhaps an innovative approach will be needed here, e.g. instead of floats one could use confidence intervals (x% chance of lying in the interval (L,U) ). Reversible division on confidence intervals would require a bit of fiddling to work out, but seems unlikely to be fundamentally difficult.
The idea of Cogistry seemed very simple when I initially thought of it; but when I sat down to write this post summarizing it, it started to seem a lot more complicated. There are a lot of moving parts! But, hey, nobody said building a thinking machine and creating digital life in one fell swoop was going to be easy…. (Well actually OK, some people did say that… er, actually, I did say that, but I took it back a decade and a half ago!! …)
What I like about the idea of Cogistry, though, is that – if it works (and I’m sure it will, after a suitable amount of research and fiddling!) — it provides a way to combine the fabulous generative creativity of biological systems, with the efficiency and precision of digital-computer-based pattern mining and probabilistic reasoning. Such a combination has the potential to transcend the “symbolic versus subsymbolic” and “biological versus computational” dichotomies that have plagued the AI field since nearly its inception (and that indeed reflect deeper complex dichotomies and confusions in our contemporary culture with its mixture of information-age, machine-age, and biological/emotional/cultural-humanity aspects). Some of the details are gonna be freakin’ complicated to work out but, though I realize it sounds a bit vain or whatever, I have to say I feel there is some profound stuff lurking here…
Notes Toward a Possible Development Plan
At first blush, it seems to me that most of the hard work here is either
- A) testing and experimenting; or
- B) re-using OpenCog AI tools that we need anyway for other reasons
From this perspective, an approach to making Cogistry real would be to start by
- designing the programming language (with room for experimentation/variation) for codelets
- implementing the basic framework in OpenCog
- doing some basic experimentation with this as an Alife/Achem framework, without the added AI bells and whistles
Now, I would not expect this initial work to yield great results… since basically it’s a matter of reimplementing good old Alife/Achem stuff in a context where inference, pattern mining, ECAN etc. can be layered on. Without the layering on of these AI tools, one would expect to find familiar Alife-y issues: some interesting structures emerging, but hitting a complexity ceiling … and then being uncertain whether increased scale or a change to the codelet language might be the key to getting more interesting things to emerge.
But beyond this basic framework, the other things needed for Cogistry are all things needed for other OpenCog AGI work anyway:
- more general use of pattern mining than has been done so far
- PLN on output of pattern mining
- ECAN working together w/ pattern mining and PLN
- probabilistic programming for guiding the forward chainer
With the basic codelet-system framework in place, using these things for Cogistry alongside their other uses would be “straightforward in principle”
— Thanks are due to Cassio Pennachin and Zar Goertzel for comments on an earlier version of the text contained in this post.