This summer OpenCog was chosen by Google to participate in the Google Summer of Code (GSoC) program: Google funded 11 students from around the world to work on OpenCog coding projects under the supervision of experienced mentors from the OpenCog project and its sister project, OpenBiomind.
Applying for GSoC was originally David Hart’s idea, and David and the Singularity Institute did much of the work needed to make it happen, so I need to extend very hearty thanks to both of them; all in all, it worked out wonderfully well.
There were plenty of ups and downs over the summer, but overall the GSoC projects went extremely well and a lot of fabulous work got done. Furthermore, a number of the projects are going to be continued during the fall and beyond, either via students continuing them as course or thesis projects, or via students continuing to work on them in their spare time … and in one case, via a student being funded to continue the project by a commercial organization interested in using their OpenCog work.
OpenCog is a large AI software project with hugely ambitious goals (you can’t get much more ambitious than “creating powerful AI at the human level and beyond”) and a lot of “moving parts” — and the most successful OpenCog GSoC projects seemed to be the ones that successfully split off “summer sized chunks” from the whole project, which were meaningful and important in themselves, and yet also formed part of the larger OpenCog endeavor … moving toward greater and greater general intelligence.
Should OpenCog be chosen to participate in GSoC next year, I believe the projects will take on a quite different flavor, because OpenCog will be more mature by then: I would hope to see more 2009 GSoC projects involving the integrated functionality of the OpenCog system. But this year OpenCog was young, so the best approach was to have students work on various important pieces of the overall system, and that’s what happened, generally to quite good effect.
The OpenCog GSoC page contains brief summaries of the projects that were done, along with links to web pages, blogs and code repositories that let you dig into the work in more detail if you’re interested. Here I’ll just give an extremely high-level summary.
Many of the projects were outstanding, but perhaps the most dramatically successful (in my own personal view) was Filip Maric’s project (mentored by Predrag Janicic), which involved pioneering an entirely new approach to natural-language parsing technology. The core parsing algorithm of the link parser, a popular open-source English parser (used within OpenCog’s RelEx language-processing subsystem), was replaced with a novel parsing algorithm based on a Boolean satisfiability (SAT) solver. The good news is, it actually works: it finds the best parses of a sentence faster than the old, standard parsing algorithm and, most importantly, provides excellent avenues for future integration of NL parsing with semantic analysis and other aspects of language-utilizing AI systems. This work was very successful but needs a couple more months’ effort to be fully wrapped up, and Filip will continue working on it during September and October.
Cesar Maracondes, working with Joel Pitt, made a lot of progress on porting the code of the Probabilistic Logic Networks (PLN) probabilistic reasoning system from a proprietary codebase to the open-source OpenCog codebase, resolving numerous software-design issues along the way. This work is very important, as PLN is a key aspect of OpenCog’s long-term AI plans. Cesar also helped port OpenCog to MacOS.
There were two extremely successful projects involving OpenBiomind, a sister project to OpenCog:
- Bhavesh Sanghvi (working with Murilo Queiroz) designed and implemented a Java user interface to the OpenBiomind bioinformatics toolkit, an important step which should greatly increase the appeal of the toolkit within the biological community (not all biologists are willing to use command-line tools, no matter how powerful).
- Paul Cao (working with Lucio Coelho) implemented a new machine learning technique within OpenBiomind, in which recursive feature selection is combined with OpenBiomind’s novel “model ensemble based important features analysis.” The empirical results on real bio datasets seem good. This is novel scientific research embodied in working open-source code, and should be a real asset to scientists doing biological data analysis.
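To give a rough feel for the second of these projects: the general pattern of recursive feature selection driven by an ensemble's feature-importance scores can be sketched as below. This is purely an illustrative sketch, not OpenBiomind's actual code; the function names and the toy importance callback are hypothetical.

```python
# Illustrative sketch only -- NOT OpenBiomind's implementation.
# Recursive feature selection: repeatedly score features (here via a
# caller-supplied ensemble-importance callback), keep the strongest,
# and discard the rest until few enough remain.

def recursive_feature_selection(features, importance_fn, keep=5, drop_frac=0.5):
    """features: list of feature names.
    importance_fn: hypothetical callback that fits an ensemble on the
    given features and returns {feature: importance score}.
    Repeatedly discards the least important features until `keep` remain."""
    while len(features) > keep:
        scores = importance_fn(features)
        ranked = sorted(features, key=lambda f: scores[f], reverse=True)
        n_next = max(keep, int(len(features) * (1 - drop_frac)))
        features = ranked[:n_next]
    return features

# Toy example: pretend a feature's importance is just its index,
# so the highest-numbered "genes" survive the selection.
toy = [f"gene{i}" for i in range(20)]
result = recursive_feature_selection(
    toy, lambda fs: {f: int(f[4:]) for f in fs}, keep=5)
print(result)  # -> ['gene19', 'gene18', 'gene17', 'gene16', 'gene15']
```

In the real system the importance scores would come from something like OpenBiomind's ensemble-based important-features analysis rather than a toy callback; re-fitting the ensemble after each round of elimination is what makes the selection "recursive."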
Two projects dealt with improvements to OpenCog’s probabilistic program learning system: Shuo Chen (working with Moshe Looks) experimented with ways of improving the internals of the MOSES algorithms; whereas Alesis Novik (working with Nil Geissweiller) implemented an initial version of the PLEASURE algorithm, an alternative to MOSES that shares some of the latter’s code infrastructure. Both these difficult research-coding projects yielded promising though preliminary results and will be continued into the fall.
And the list goes on and on: in this short post I can’t come close to doing justice to all that was done, but please see the above page and the links in it for more details!
Costa Ciprian worked with Boris Iordanov on designing and creating a distributed version of HypergraphDB, a persistent store for OpenCog; and Rich Jones worked with David Hart on creating a distributed web crawler suitable for massively distributed text parsing using OpenCog’s RelEx language parser.
In a different direction, Kino High Coursey (working with Andre Senna) designed and implemented a very elegant approach to interfacing between OpenCog and online simulation worlds such as OpenSim, implementing a framework that uses LISP to execute OpenCog-originated actions in simulation worlds. There is (conceptual and code-level) work still to be done integrating this with other OpenCog work involving the control of agents in simulated worlds, but Kino has introduced excellent code and ideas into the project that are sure to be of value as things unfold.
Junfei Guo (working with Ben Goertzel) attacked a problem deep in the heart of OpenCog: mapping OpenCog’s unique AtomTable hypergraph knowledge representation into the more conventional graph format used by the open-source Boost Graph Library. This opened up some important new discussions regarding the extent to which various graph algorithms (applied to the graph derived from a hypergraph) can serve as heuristic approximations to less tractable hypergraph algorithms.
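One standard way to derive an ordinary graph from a hypergraph, and a plausible reading of this kind of mapping (though not necessarily what the actual OpenCog code does), is the bipartite encoding: each hyperedge becomes a node of its own, linked to every vertex it contains. A minimal sketch, with a hypothetical toy AtomTable-like structure:

```python
# Hypothetical sketch (not the actual OpenCog/BGL code): encoding a
# hypergraph as a bipartite ordinary graph.  Each hyperedge becomes a
# node connected to every vertex it contains, so plain graph algorithms
# (BFS, connected components, centrality) can run on the result as
# heuristic stand-ins for hypergraph algorithms.

from collections import defaultdict

def bipartite_encoding(hyperedges):
    """hyperedges: dict mapping edge name -> iterable of vertex names.
    Returns an adjacency map of the derived bipartite graph, with nodes
    tagged ("edge", name) or ("vertex", name) to keep the two sides apart."""
    adj = defaultdict(set)
    for edge, vertices in hyperedges.items():
        for v in vertices:
            adj[("edge", edge)].add(("vertex", v))
            adj[("vertex", v)].add(("edge", edge))
    return adj

# Toy example loosely styled after AtomTable links (names are made up):
hg = {
    "InheritanceLink1": ["cat", "animal"],
    "EvaluationLink1": ["likes", "Bob", "cat"],
}
g = bipartite_encoding(hg)
# "cat" participates in both links, so it has degree 2 in the derived graph
print(len(g[("vertex", "cat")]))  # -> 2
```

The appeal of this encoding is that paths and connectivity in the derived graph approximate their hypergraph counterparts, which is exactly the kind of heuristic trade-off the discussions above concern.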
Elizabeth Dawn Alpert (working with Luke Kaiser) investigated the problem of making the link parser (used within OpenCog’s RelEx language framework) better handle ungrammatical text as seen in chats, IM, Twitter and so forth. This proved a thorny issue, and the most progress was made at the level of cleaning up nonstandard renderings of individual words.
All in all, we are very grateful to Google for creating the GSoC program and including OpenCog in it; thanks to Google, and most of all to the students and mentors involved.