<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>OpenCog Brainwave &#187; GSoC</title>
	<atom:link href="http://blog.opencog.org/tag/gsoc/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.opencog.org</link>
	<description>The latest developments in building an open-source mind</description>
	<lastBuildDate>Thu, 04 Aug 2011 02:45:12 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Frequency of grammatical disjuncts</title>
		<link>http://blog.opencog.org/2009/07/06/frequency-of-grammatical-disjuncts/</link>
		<comments>http://blog.opencog.org/2009/07/06/frequency-of-grammatical-disjuncts/#comments</comments>
		<pubDate>Mon, 06 Jul 2009 18:14:56 +0000</pubDate>
		<dc:creator>Linas Vepstas</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Theory]]></category>
		<category><![CDATA[frequency]]></category>
		<category><![CDATA[grammar]]></category>
		<category><![CDATA[GSoC]]></category>
		<category><![CDATA[linguistics]]></category>
		<category><![CDATA[link-grammar]]></category>
		<category><![CDATA[NLP]]></category>

		<guid isPermaLink="false">http://opencog.wordpress.com/?p=123</guid>
		<description><![CDATA[The link-grammar parser uses labeled links to connect together pairs of words.  In order to capture the idea of proper grammatical construction, any given word is only allowed to have very specific links to its right or left: for ...]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.abisource.com/projects/link-grammar/">link-grammar parser</a> uses labeled links to connect together pairs of words.  In order to capture the idea of proper grammatical construction, any given word is only allowed to have very specific links to its right or left: for example, verbs have their subject on the left, and an object on the right.  Link-grammar defines hundreds of different link types, and there are typically dozens or even hundreds of ways that these can attach to a word. Each allowed set of links is called a &#8220;disjunct&#8221;. So, for example:</p>
<p style="text-align:center">MVp- Js+</p>
<p>is a disjunct that says &#8220;there must be an MVp link from this word, going to the left, and a Js link, going to the right&#8221;. This disjunct commonly connects prepositions to a verb on their left (the MV- link) and the object of the preposition on the right (the J+ link).</p>
<p>A good way to think about disjuncts is to imagine them as very fine-grained part-of-speech tags. Thus, when one sees &#8220;MVp- Js+&#8221; associated with a word, one knows not only that the word is a preposition, but a bit more: it&#8217;s a preposition that took a singular object.  Disjuncts classify words not just into crude part-of-speech categories, but into much finer ones: thus verbs are not just transitive or intransitive, but might be transitive verbs that take both direct and indirect objects, or participles, etc.</p>
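<p>To make the notation concrete, here is a minimal Python sketch that splits a disjunct into its connectors and spells out what each one requires. The names Connector, parse_disjunct and describe are made up for this illustration and are not part of the link-grammar API; the sketch handles only the simple space-separated form shown in this post.</p>

```python
from typing import NamedTuple


class Connector(NamedTuple):
    link_type: str   # e.g. "MVp" or "Js"
    direction: str   # "-" means the link goes left, "+" means it goes right


def parse_disjunct(text):
    """Split a disjunct such as "MVp- Js+" into a list of connectors."""
    connectors = []
    for token in text.split():
        if token[-1] not in "+-":
            raise ValueError("connector must end in + or -: " + token)
        connectors.append(Connector(token[:-1], token[-1]))
    return connectors


def describe(text):
    """Render a disjunct as a plain-English requirement."""
    side = {"-": "left", "+": "right"}
    parts = [c.link_type + " link to the " + side[c.direction]
             for c in parse_disjunct(text)]
    return " and ".join(parts)


print(describe("MVp- Js+"))
# prints: MVp link to the left and Js link to the right
```

<p>Treating the whole list of connectors as a single tag is what makes a disjunct behave like a fine-grained part-of-speech label.</p>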
<p>Siva Reddy, a GSOC 2009 summer student, prepared a table of the frequency of occurrence of different disjuncts in a large collection of text. The top six entries are</p>
<p style="text-align:center">Ds+           950275.635843<br />
Xp-           838569.90527<br />
A+          616522.664867<br />
AN+        566658.997313<br />
MVp- Js+       563082.649325<br />
MVp- Jp+      446487.310222</p>
<p style="text-align:left">and these are exactly what one might expect:</p>
<ul>
<li>Ds+ connects the determiner &#8220;the&#8221; to nouns: and of course, &#8220;the&#8221; is the most frequent word in the English language.</li>
<li>Xp- connects the period at the end of the sentence to the start of the sentence, so of course it&#8217;s frequently observed.</li>
<li>A+ connects adjectives to nouns, AN+ connects noun modifiers to nouns.</li>
<li>As noted above, MV connects verbs to modifying phrases, and J connects prepositions to objects, so that MV- J+ is the disjunct that most prepositions will get. Js connects to a singular object, Jp connects to a plural count or mass noun.</li>
</ul>
<p>A graph of rank vs. frequency is shown below:</p>
<div id="attachment_132" class="wp-caption alignnone" style="width: 490px"><img class="size-full wp-image-132" src="http://blog.opencog.org/files/2009/07/disjunct-true-rank2.png" alt="Disjunct rank vs. frequency of occurrence" width="480" height="360" /><p class="wp-caption-text">Disjunct rank vs. frequency of occurrence</p></div>
<p>As can be seen, the distribution is more or less Zipfian, with a power-law exponent of 1.5.  The fact that the long tail appears to be linear indicates that grammatical construction in the English language is more or less scale-free: difficult and awkward constructions are increasingly rare.  The fact that the graph is not purely Zipfian, but instead has a knee for the most common grammatical connections, suggests that the most common grammatical constructions are &#8220;less common than they should be&#8221;: almost as if English speakers are resisting the use of formulaic sentence constructions. So, for example, since adjectives and noun-modifiers appear near the top of the rank, this suggests that English speakers &#8220;could have&#8221; used more adjectives and noun-modifiers, but didn&#8217;t. Quite why this is so is not clear.  Perhaps the use of anaphora and references in general helps decrease the need for lots of modifiers.</p>
<p>The open questions are then:</p>
<ol>
<li>Why a power law of 1.5?</li>
<li>Why is there a knee?</li>
<li>Does this result hold for other languages?</li>
</ol>
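<p>For anyone who wants to check the exponent on their own counts, one simple (and fairly crude) approach is a least-squares fit of log-frequency against log-rank. The sketch below is an assumption about how one might do this, not the procedure used to produce the graph above; fit_power_law_exponent is a made-up name, and the input here is synthetic data generated from an exact power law, not the real disjunct table.</p>

```python
import math


def fit_power_law_exponent(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Returns the exponent s in the model frequency ~ rank ** (-s).
    """
    ranked = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(freq) for freq in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return -covariance / variance


# Synthetic counts that follow an exact power law with exponent 1.5.
# These are NOT the measured disjunct counts, only a self-check of the fit.
synthetic = [1000000.0 * rank ** -1.5 for rank in range(1, 2001)]
print(round(fit_power_law_exponent(synthetic), 3))
```

<p>On real data the knee at the top of the ranking would pull such a fit away from the slope of the tail, so one would probably want to fit only the ranks beyond the knee.</p>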
<p>The corpus used here consists of approximately 1 million sentences, obtained by parsing entire Wikipedia articles, Voice of America news stories, and 10 books from Project Gutenberg, including War and Peace, Jane Austen, and some scientific or medical texts.</p>
<p>&#8211; Linas Vepstas</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.opencog.org/2009/07/06/frequency-of-grammatical-disjuncts/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>OpenCog and Google Summer of Code 2009</title>
		<link>http://blog.opencog.org/2009/03/25/gsoc-2009-student-applications/</link>
		<comments>http://blog.opencog.org/2009/03/25/gsoc-2009-student-applications/#comments</comments>
		<pubDate>Wed, 25 Mar 2009 01:53:53 +0000</pubDate>
		<dc:creator>David Hart</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[GSoC]]></category>
		<category><![CDATA[OpenCog]]></category>

		<guid isPermaLink="false">http://brainwave.opencog.org/?p=107</guid>
		<description><![CDATA[We are happy to announce that the SIAI has been selected again this year to participate in the Google Summer of Code program as a mentoring organization. GSoC is an annual program that awards successful student contributors a 4500 ...]]></description>
			<content:encoded><![CDATA[<p>We are happy to announce that the SIAI has been selected again this year to participate in the Google Summer of Code program as a mentoring organization. GSoC is an annual program that awards successful student contributors a 4500 USD summer stipend to work on open source and free software projects for three months. Around one thousand students worldwide participated in GSoC 2008, with eleven students working on <a href="http://google-opensource.blogspot.com/2008/11/opencog-and-gsoc.html">OpenCog related projects</a>. Students may <a href="http://google-opensource.blogspot.com/2009/03/students-apply-now-for-google-summer-of.html">apply for GSoC 2009</a>, beginning at the <a href="http://socghop.appspot.com/org/show/google/gsoc2009/opencog">SIAI organization page</a>. The student application period closes on April 3, 2009 at 19:00 UTC.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.opencog.org/2009/03/25/gsoc-2009-student-applications/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OpenCog Google-Summer-Of-Code Roundup</title>
		<link>http://blog.opencog.org/2008/09/05/opencog-google-summer-of-code-roundup/</link>
		<comments>http://blog.opencog.org/2008/09/05/opencog-google-summer-of-code-roundup/#comments</comments>
		<pubDate>Fri, 05 Sep 2008 02:55:38 +0000</pubDate>
		<dc:creator>Ben Goertzel</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[GSoC]]></category>
		<category><![CDATA[OpenCog]]></category>
		<category><![CDATA[RelEx]]></category>

		<guid isPermaLink="false">http://opencog.wordpress.com/?p=46</guid>
		<description><![CDATA[This summer OpenCog was chosen by Google to participate in the Google Summer of Code project: Google funded 11 students from around the world to work on OpenCog coding projects under the supervision of experienced mentors associated with the ...]]></description>
			<content:encoded><![CDATA[<p>This summer OpenCog was chosen by Google to participate in the Google Summer of Code project: Google funded 11 students from around the world to work on OpenCog coding projects under the supervision of experienced mentors associated with the OpenCog project and the associated OpenBiomind project.</p>
<p>Applying for GSoC was originally David Hart&#8217;s idea, and David and the Singularity Institute did a lot of the work needed to make it happen, so I need to extend very hearty thanks to both of them, as all in all it worked out wonderfully well.</p>
<p>There were plenty of ups and downs over the summer, but overall the GSoC projects went extremely well and a lot of fabulous work got done.  Furthermore, a number of the projects are going to be continued during the fall and beyond, either via students continuing them as course or thesis projects, or via students continuing to work on them in their spare time &#8230; and in one case, via a student being funded to continue the project by a commercial organization interested in using their OpenCog work.</p>
<p>OpenCog is a large AI software project with hugely ambitious goals (you can&#8217;t get much more ambitious than &#8220;creating powerful AI at the human level and beyond&#8221;) and a lot of &#8220;moving parts&#8221; &#8212; and the most successful OpenCog GSoC projects seemed to be the ones that successfully split off &#8220;summer sized chunks&#8221; from the whole project, which were meaningful and important in themselves, and yet also formed part of the larger OpenCog endeavor &#8230; moving toward greater and greater general intelligence.</p>
<p>Should OpenCog be chosen to participate in GSoC next year, I believe the projects will take on a quite different flavor because OpenCog will be more mature then: I would hope to see more 2009 OpenCog GSoC projects involving the integrated functionality of the OpenCog system.  But this year OpenCog is young, so the best approach was to have students work on various important pieces of the overall system, and that&#8217;s what happened, generally to quite good effect.</p>
<p>This page</p>
<p><a id="r7qd" title="http://opencog.org/wiki/GSoCProjects2008" href="http://opencog.org/wiki/GSoCProjects2008" target="_blank">http://opencog.org/wiki/GSoCProjects2008</a> </p>
<p>contains brief summaries of the projects that were done, and links to Web pages, blogs and code repositories allowing you to dig in more detail into the work if you&#8217;re interested.  Here I&#8217;ll just give an extremely high level summary.</p>
<p>Many of the projects were outstanding, but perhaps the most dramatically successful (in my own personal view) was Filip Maric&#8217;s project (mentored by Predrag Janicic), which involved pioneering an entirely new approach to natural language parsing technology.  The core parsing algorithm of the link parser, a popular open-source English parser (that is used within OpenCog&#8217;s RelEx language processing subsystem), was replaced with a novel parsing algorithm based on a Boolean satisfiability (SAT) solver: and the good news is, it actually works &#8230; getting the best parses of a sentence faster than the old, standard parsing algorithm; and, most importantly, providing excellent avenues for future integration of NL parsing with semantic analysis and other aspects of language-utilizing AI systems.  This work was very successful but needs a couple more months of effort to be fully wrapped up, and Filip will continue working on it during September and October.</p>
<p>Cesar Marcondes, working with Joel Pitt, made a lot of progress on porting the code of the Probabilistic Logic Networks (PLN) probabilistic reasoning system from a proprietary codebase to the open-source OpenCog codebase, resolving numerous software design issues along the way.  This work was very important as PLN is a key aspect of OpenCog&#8217;s long-term AI plans.  Cesar also helped with porting OpenCog to MacOS.</p>
<p>There were two extremely successful projects involving OpenBiomind, a sister project to OpenCog:</p>
<ul>
<li>Bhavesh Sanghvi (working with Murilo Queiroz) designed and implemented a Java user interface to the OpenBiomind bioinformatics toolkit, an important step which should greatly increase the appeal of the toolkit within the biological community (not all biologists are willing to use command-line tools, no matter how powerful)</li>
<li>Paul Cao (working with Lucio Coelho) implemented a new machine learning technique within OpenBiomind, in which recursive feature selection is combined with OpenBiomind&#8217;s novel &#8220;model ensemble based important features analysis.&#8221;  The empirical results on real bio datasets seem good.  This is novel scientific research embodied in working open-source code, and should be a real asset to scientists doing biological data analysis.</li>
</ul>
<p>Two projects dealt with improvements to OpenCog&#8217;s probabilistic program learning system: Shuo Chen (working with Moshe Looks) experimented with ways of improving the internals of the MOSES algorithms, whereas Alesis Novik (working with Nil Geisweiller) implemented an initial version of the PLEASURE algorithm, an alternative to MOSES that shares some of the latter&#8217;s code infrastructure.  Both of these difficult research-coding projects yielded promising though preliminary results and will be continued into the fall.</p>
<p>And the list goes on and on: in this short post I can&#8217;t come close to doing justice to all that was done, but please see the above page and the links in it for more details!</p>
<p>Costa Ciprian worked with Boris Iordanov on designing and creating a distributed version of the HypergraphDB, a persistent store for OpenCog; and Rich Jones worked with David Hart on creating a distributed web crawler suitable for massively distributed text parsing using OpenCog&#8217;s RelEx language parser.</p>
<p>In a different direction, Kino High Coursey (working with Andre Senna) designed and implemented a very elegant approach for interfacing between OpenCog and online simulation worlds such as OpenSim, implementing a framework using LISP to execute OpenCog-originated actions in simulation worlds.  There is (conceptual and code-level) work to be done integrating this with other OpenCog work that involves OpenCog control of agents in simulated worlds, but Kino has introduced some excellent code and ideas into the project that are sure to be of value as things unfold.</p>
<p>Junfei Guo (working with Ben Goertzel) attacked a problem deep in the heart of OpenCog: mapping OpenCog&#8217;s unique AtomTable hypergraph knowledge representation into the more standard graph format used by the standard open-source Boost Graph Library.  This opened up some important new discussions regarding the extent to which various graph algorithms (applied to the graph derived from a hypergraph) can serve as heuristic approximations to less-tractable hypergraph algorithms.</p>
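<p>As a rough illustration of what deriving a graph from a hypergraph can look like, here is a small Python sketch of the textbook incidence (bipartite) expansion, in which every hyperedge becomes a node of its own, joined to each of its members. This is an assumption chosen for illustration: it is not necessarily the mapping Junfei implemented, it makes no calls to the Boost Graph Library, and it ignores the fact that AtomTable links can themselves contain links. The function name and the toy data are invented.</p>

```python
def hypergraph_to_bipartite(hyperedges):
    """Expand each hyperedge into a node joined to all of its members.

    hyperedges maps a (hypothetical) edge name to the collection of
    vertices it connects. The result is an ordinary undirected graph,
    stored as a mapping from each node to the set of its neighbours.
    """
    graph = {}
    for edge_name, members in hyperedges.items():
        edge_node = ("edge", edge_name)
        graph.setdefault(edge_node, set())
        for vertex in members:
            vertex_node = ("vertex", vertex)
            graph.setdefault(vertex_node, set()).add(edge_node)
            graph[edge_node].add(vertex_node)
    return graph


# Toy example: one three-way relation and one ordinary two-way link.
toy = {"eats": ["cat", "fish", "kitchen"], "likes": ["cat", "fish"]}
graph = hypergraph_to_bipartite(toy)
print(sorted(graph[("vertex", "cat")]))
# prints: [('edge', 'eats'), ('edge', 'likes')]
```

<p>Once the data is in this form, ordinary graph algorithms such as shortest paths or connected components can be run on it directly, which is the kind of heuristic approximation discussed above.</p>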
<p>Elizabeth Dawn Alpert (working with Luke Kaiser) investigated the problem of making the link parser (used within OpenCog&#8217;s RelEx language framework) better handle ungrammatical text as seen in chats, IM, Twitter and so forth.  This proved a thorny issue and the most progress was made on the level of cleaning up ungrammatical formats of individual words. </p>
<p>All in all, we are very grateful to Google for creating the GSoC program and including us in it.   </p>
<p>Thanks to Google, and most of all to the students and mentors involved.</p>
<p>Onward!</p>
<p>Ben G</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.opencog.org/2008/09/05/opencog-google-summer-of-code-roundup/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Summer of Code</title>
		<link>http://blog.opencog.org/2008/05/05/google-summer-of-code/</link>
		<comments>http://blog.opencog.org/2008/05/05/google-summer-of-code/#comments</comments>
		<pubDate>Mon, 05 May 2008 20:36:11 +0000</pubDate>
		<dc:creator>David Hart</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[GSoC]]></category>
		<category><![CDATA[HyperGraphDB]]></category>
		<category><![CDATA[MOSES]]></category>
		<category><![CDATA[NLP]]></category>
		<category><![CDATA[OpenBiomind]]></category>
		<category><![CDATA[OpenCog]]></category>
		<category><![CDATA[OpenSim]]></category>
		<category><![CDATA[Pleasure]]></category>
		<category><![CDATA[RelEx]]></category>

		<guid isPermaLink="false">http://opencog.wordpress.com/?p=11</guid>
		<description><![CDATA[Crunch time is here! Our participation in Google's Summer of Code program has accelerated release schedules and shifted priorities. Ben is busy writing initial documentation, converting much of it from Novamente documentation. Gustavo, Senna and Linas are working to ...]]></description>
			<content:encoded><![CDATA[<p>Crunch time is here! Our participation in Google&#8217;s <a href="http://code.google.com/soc/2008/">Summer of Code</a> program has accelerated release schedules and shifted priorities. Ben is busy writing initial documentation, converting much of it from Novamente documentation. Gustavo, Senna and Linas are working to tidy OpenCog code, removing crufty and embarrassing bits and improving infrastructure and interfaces. Joel is working on the first collection of research-oriented MindAgents. You&#8217;ll hear more soon on this blog from these <a href="http://opencog.org/wiki/Community">team members</a>, and from GSoC students on the <a href="http://opencog.ning.com/">OpenCog Collective</a> blog and the new list <a href="http://groups.google.com/group/opencog-soc">opencog-soc@googlegroups.com</a>.</p>
<p>To quote Ben Goertzel&#8217;s post on <a href="http://groups.google.com/group/opencog/browse_thread/thread/aa7159e328f00406/63db705f7f518fb4">opencog@googlegroups.com</a>:<br />
<blockquote>The Google Summer of Code selection process is done, and 11 proposals were chosen.</p>
<p>It was a really painful process to go through, as we had more than 70 applications, and at least 25-30 of them were really quite good.</p>
<p>The accepted proposals span a fairly wide variety of areas, and the choices were ultimately made based on a number of factors including</p>
<p>&#8211; clarity and completeness of the proposal<br />
&#8211; background of the student<br />
&#8211; readiness of the OpenCog codebase for the project<br />
&#8211; critical-ness of the project for OpenCog</p>
<p>Of the 11 selected, 2 were for OpenBiomind projects, and the other 9 for OpenCog proper &#8230; including a bunch of stuff for the RelEx NLP toolkit.</p>
<p>There was a strong bias toward proposals dealing with improvements to OpenCog-related software components that already are moderately mature, like RelEx and MOSES.</p>
<p>Next year when OpenCog is more mature, if we are chosen to participate in GSoC again (as we hope, and have reason to somewhat expect), you can expect to see more explicitly, broadly, AGI-related proposals.</p>
<p>Anyway this list of selected projects is here for all who are curious:</p>
<p><b>OpenSim for OpenCog<br />
by Kino High Coursey, mentored by Andre Luiz de Senna</p>
<p>Implementing a SAT/SMT Based Link Grammar Parser<br />
by Filip Marić, mentored by Predrag Janicic</p>
<p>Bayesian and Causal Networks Inference using Indefinite Probabilities<br />
by Cesar Augusto Cavalheiro Marcondes, mentored by Cassio Pennachin</p>
<p>Java GUI for OpenBiomind<br />
by Bhavesh Sanghvi, mentored by Murilo Saraiva de Queiroz</p>
<p>MOSES: the Pleasure Algorithm<br />
by Alesis Novik, mentored by Nil Geisweiller</p>
<p>Graph Algorithms for HyperGraphDB<br />
by Guo Junfei, mentored by Ben Goertzel</p>
<p>Improved MOSES<br />
by ChenShuo, mentored by Moshe Looks</p>
<p>RelEx Web Crawler and HypergraphDB Manager<br />
by Rich Jones, mentored by David Hart</p>
<p>RelEx: Learning Simple Grammars<br />
by Elizabeth Dawn Alpert, mentored by Lukasz Kaiser</p>
<p>Distributed HyperGraphDB Version<br />
by Costa Ciprian, mentored by Borislav Iordanov</p>
<p>Recursive Feature Selection for Enhancing Genetic Disease Prediction<br />
by Paul Cao, mentored by Lucio de Souza Coelho</b></p>
<p>Many thanks to all who applied, all who agreed to help mentor &#8230; and especially to David Hart for coming up with the idea of applying for SIAI to be included as a mentoring organization in GSoC, with a focus on OpenCog work.</p></blockquote>
<p>We&#8217;d also like to thank the terrific Open Source team at Google, particularly Leslie Hawthorn, Dave Anderson and Chris DiBona, for their patience and good advice.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.opencog.org/2008/05/05/google-summer-of-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

