Putting Deep Perceptual Learning in OpenCog

This post presents some speculative ideas and plans, but I broadcast them here because I think they are of particular strategic importance for the OpenCog project.
The topic is: how OpenCog and “current-variety deep learning perception algorithms” can help each other.
Background: Modern Deep Learning Networks

“Deep learning” architectures have worked wonders on visual and auditory data in recent years, and have also shown limited but interesting results on other sorts of data such as natural language. The impressive applications have all involved training deep learning nets using a supervised learning methodology, on large training corpora; and the particulars of the network tend to be customized to the problem at hand. There is also work on unsupervised learning, though so far purely unsupervised learning has not yielded practically impressive results. There is not much new conceptually in the recent deep learning work, and nothing big that’s new mathematically; it’s mostly the availability of massive computing power and training data that has led to the recent, exciting successes.
These deep learning methods are founded on broad conceptual principles, such as:
  • intelligence consists largely of hierarchical pattern recognition: recognition of patterns within patterns within patterns
  • a mind should use both bottom-up and top-down dynamics to recognize patterns in a given data item, based both on its own properties and on experience from looking at other items
  • in many cases, the dimensional structure of spacetime can be used to guide hierarchical pattern recognition (so that patterns higher-up in the hierarchy pertain to larger regions of spacetime)
However, the tools normally labeled “deep learning” these days constitute a very, very particular way of embodying these general principles, using certain sorts of “formal neural nets” and related structures.  There are many other ways to manifest the general principles of “hierarchical pattern recognition via top-down and bottom-up learning, guided by spatiotemporal structure.”
The strongest advocates of the current deep learning methods claim that the deep networks currently used for perception can be taken as templates, or at least close inspirations, for creating deep networks to be used for everything else a human-level intelligence needs to do. The use of human-labeled training examples obviously doesn’t constitute a general-intelligence-capable methodology, but if one substitutes a reinforcement signal for a human label, then one has an in-principle workable methodology.
Weaker advocates claim that networks such as these may serve as a large part of a general intelligence architecture, but may ultimately need to be augmented by other components with (at least somewhat) different structures and dynamics.
It is sometimes suggested that the “right” deep learning network might serve the role of the “one crucial learning algorithm” underlying human and human-like general intelligence.   However, the deep learning paradigm does not rely on this idea… it might also be that a human-level intelligence requires a significant number of differently-configured deep networks, connected together in an appropriate architecture.
Deep Learning + OpenCog

My own intuition is that, given the realities of current (or near-future) computer hardware technology, deep learning networks are a great way to handle visual and auditory perception and some other sorts of data processing; but that for many other critical parts of human-like cognition, deep learning is best suited for a peripheral role (or no role at all). Based on this idea, Ted Sanders, Jade O’Neill and I did some prototype experiments a few years ago, connecting a deep learning vision system (DeSTIN) with OpenCog by extracting patterns from DeSTIN states over time and importing relations among these patterns into the OpenCog Atomspace. This prototype work served to illustrate a principle, but did not represent a scalable methodology (the example dataset used was very small, and the different components of the architecture were piped together using ad hoc specialized scripts).
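To give a feel for the shape of that pipeline, here is a minimal Python sketch. It is not the actual DeSTIN or OpenCog code; all names and the co-occurrence criterion are illustrative only.

```python
# Schematic sketch of the pipeline's shape (not the actual DeSTIN or OpenCog
# code; all names are illustrative): collect snapshots of which perception-
# network units are active over time, count which pairs co-occur frequently,
# and emit those pairs as relations for import into the Atomspace.
from collections import Counter
from itertools import combinations

def cooccurring_pairs(state_snapshots, min_count=5):
    """state_snapshots: a list of sets of active-unit labels, one per frame."""
    counts = Counter()
    for active in state_snapshots:
        for a, b in combinations(sorted(active), 2):
            counts[(a, b)] += 1
    return [(a, b, n) for (a, b), n in counts.items() if n >= min_count]

def to_atomese(a, b):
    # One co-occurrence relation, written as Scheme Atomese that the
    # Atomspace can load.
    return ('(EvaluationLink (PredicateNode "co-active") '
            '(ListLink (ConceptNode "%s") (ConceptNode "%s")))' % (a, b))

snapshots = [{"u3", "u7", "u9"}, {"u3", "u7"}, {"u3", "u7", "u12"}]
for a, b, n in cooccurring_pairs(snapshots, min_count=2):
    print(to_atomese(a, b))
```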
I’ve now started thinking seriously about how to resume this direction of work, but “doing it for real” this time.   What I’d like to do is build a deep learning architecture inside OpenCog, initially oriented toward vision and audition, with a goal of making it relatively straightforward to interface between deep learning perception networks and OpenCog’s cognitive mechanisms.
What cognitive mechanisms am I thinking of?
  1. The OpenCog Pattern Miner, written by Shujing Ke (in close collaboration with me on the conceptual and math aspects), can be used to recognize frequent or surprising patterns among the states of a deep learning network, provided the network’s states are represented as Atoms. Spatiotemporal patterns among these “generally common or surprising” patterns may then be recognized and stored in the Atomspace as well. Inference may be done, using PLN, on the links representing these spatiotemporal patterns. Clusters of spatiotemporal patterns may be formed, and inference may be done regarding these clusters.
  2. Having recognized common patterns within a set of states of a deep network, one can then annotate new deep-network states with the “generally common patterns” they contain. One may then use the links known in the Atomspace regarding these patterns to create new *features* associated with nodes in the deep network. These features may be used as inputs for the processing occurring within the deep network.
This would be a quite thorough and profound form of interaction between perceptual and cognitive algorithms.
This sort of interaction could be achieved without implementing deep learning networks in the Atomspace, but it will be much easier operationally if they are represented there. A minimal sketch of the representation and annotation steps follows.
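The sketch below illustrates points 1 and 2 above, assuming the OpenCog Python bindings; exact module paths and type names may differ across versions, and the mined pattern is stubbed in by hand rather than produced by the Pattern Miner.

```python
# Minimal sketch, assuming the OpenCog Python bindings; module paths and type
# names may differ across versions, and the pattern-mining step is stubbed out.
from opencog.atomspace import AtomSpace
from opencog.type_constructors import *
from opencog.utilities import initialize_opencog

atomspace = AtomSpace()
initialize_opencog(atomspace)  # let the type constructors below use this Atomspace

def record_network_state(frame_id, active_units):
    """Represent one snapshot of a deep network's state as Atoms."""
    frame = ConceptNode("frame-%d" % frame_id)
    for unit, activation in active_units:
        EvaluationLink(
            PredicateNode("activation"),
            ListLink(frame, ConceptNode(unit), NumberNode(str(activation))))
    return frame

# Hypothetical output of the Pattern Miner: a frequently co-occurring set of units.
mined_pattern = ConceptNode("pattern-37")
MemberLink(ConceptNode("layer2-unit-5"), mined_pattern)
MemberLink(ConceptNode("layer2-unit-11"), mined_pattern)

def annotate_with_pattern(frame, pattern):
    """Tag a new network state with a mined pattern, usable as an extra feature."""
    EvaluationLink(PredicateNode("exhibits-pattern"), ListLink(frame, pattern))

frame = record_network_state(0, [("layer2-unit-5", 0.91), ("layer2-unit-11", 0.77)])
annotate_with_pattern(frame, mined_pattern)
```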
A Specific Proposal
So I’ve put together a specific proposal for putting deep learning into OpenCog, for computer vision (at first) and audition.   In its initial version, this would let one build quite flexible deep learning networks in OpenCog, deferring the expensive number-crunching operations to the GPU via the Theano library developed by Yoshua Bengio’s group at U. Montreal.
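For concreteness, here is a minimal sketch of the number-crunching piece that would be deferred to the GPU via Theano. The layer sizes and names are illustrative, and the Atomspace-side wrapper, which would simply hold references to compiled functions like the one below, is not shown.

```python
# Minimal sketch of the GPU-offloaded piece: a small feedforward net whose
# forward pass is compiled by Theano. Shapes and names are illustrative only.
import numpy as np
import theano
import theano.tensor as T

rng = np.random.RandomState(0)

def shared_weights(n_in, n_out, name):
    w = rng.normal(0, 0.01, (n_in, n_out)).astype(theano.config.floatX)
    return theano.shared(w, name=name)

x = T.matrix('x')                        # a batch of input vectors
W1 = shared_weights(784, 256, 'W1')
W2 = shared_weights(256, 10, 'W2')

hidden = T.nnet.sigmoid(T.dot(x, W1))    # hidden layer
output = T.nnet.softmax(T.dot(hidden, W2))

# Compilation produces a function that Theano can run on the GPU if one is
# configured (e.g. via THEANO_FLAGS=device=gpu,floatX=float32).
forward = theano.function([x], output)

batch = rng.rand(32, 784).astype(theano.config.floatX)
print(forward(batch).shape)              # (32, 10)
```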
As it may get tweaked, improved or augmented by others, I’ve put it on the OpenCog wiki site instead of packing it into this blog post; you can read it there.

Responses to Putting Deep Perceptual Learning in OpenCog

  1. Blaine Whiteley says:

    I’m a newbie; however, I hope we have already started work on creating ‘artificial’ emotional centers, so that a basic understanding of, say, ‘pain’ has some grounding in an actual occurrence of pain.

    I’m thinking not just of “hunger” correlated with low battery life, but also of the AGI being aware that it should not engage in excessive computation until it has access to more energy (food), that it doesn’t like this restriction on its current abilities, and that it “suffers”. In the same manner, “too hot” could be associated with an awareness of a CPU’s diminished operation due to excess heat.

    I feel this awareness of diminished capabilities due to external factors will be necessary before intentional actions observed in the external world can be truly understood, and will help the AGI get a foothold in understanding the intentional stance of natural language.

    • Dustin Ezell says:

      OpenPsi is the “artificial emotion” component of the OpenCog project. Minsky’s “The Emotion Machine” is a good read on the topic.

      • Blaine Whiteley says:

        Hi Dustin,

        Thanks for the recommendation. Though Minsky’s ideas are interesting, they still rest on outdated notions of how our brain’s neurons are grouped and function, compared with what neuroscience research has discovered over the past 10 years. I’d recommend Anderson’s paper as a good starting place: https://www.cs.umd.edu/~anderson/papers/AI_Review.pdf

        • Dustin Ezell says:

          Yes, Minsky may be a bit dated with regard to neuroscience. However, what I find so refreshing about OpenCog is how little it follows the “copy the brain” paradigm of AI.

          • Blaine Whiteley says:

            Yes, it is very healthy for research to have sometimes-opposing theories running in parallel. There are, though, some distinct advantages of human-like AI, namely:

            Embodiment gives us the possibility of grounding meaning within hardware and may make NLU of general conversations much easier.

            Putting human-like emotional centers at the base of our inference engines may offset the problem of unintended consequences of AI that mostly follows optimization/maximization algorithms, and possible future risks to humanity 😉

  2. John Kintree says:

    I just read “AGI Revolution” as part of my introduction to the field of artificial general intelligence, and had the pleasure of being the first to review that book at Amazon.

    Considering the success being enjoyed with deep learning, such as AlphaGo, WaveNet speech generation, and language translation, it makes sense to integrate that approach into the cognitive synergy of OpenCog.

    I found the blog post at DeepMind about Decoupled Neural Interfaces using Synthetic Gradients to be interesting: https://deepmind.com/blog/decoupled-neural-networks-using-synthetic-gradients/.

    My understanding is that a gradient is the error correction that is backpropagated through a neural network. By attaching to each layer of a neural network a small auxiliary network that generates an approximation of that gradient (a synthetic gradient), a number of benefits are realized, compared with each layer having to wait for the signal to feed forward to the output and then backpropagate through the network.

    Anyway, according to the blog posting at DeepMind, one benefit of this decoupling may be to allow for an interface between multiple neural networks.
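To make the synthetic-gradient idea described in this comment concrete, here is a toy numpy sketch (purely illustrative, not DeepMind's implementation): each layer owns a small model that predicts the gradient of the loss with respect to the layer's output, so the layer can update immediately, and the predictor itself is trained toward the true gradient once it arrives.

```python
# Toy illustration of synthetic gradients: layer 1 updates immediately using a
# predicted gradient, instead of waiting for the true gradient from layer 2.
import numpy as np

rng = np.random.default_rng(0)

W1 = rng.normal(0, 0.1, (4, 8))   # input -> hidden
W2 = rng.normal(0, 0.1, (8, 1))   # hidden -> output
M = np.zeros((8, 8))              # predicts dL/dh from h (the "synthetic" gradient)
lr = 0.01

for step in range(1000):
    x = rng.normal(size=(1, 4))
    target = x.sum(keepdims=True)

    h = np.tanh(x @ W1)                         # forward through layer 1
    g_hat = h @ M                               # synthetic gradient: update W1 now
    W1 -= lr * x.T @ (g_hat * (1 - h ** 2))

    y = h @ W2                                  # forward through layer 2
    err = y - target                            # dL/dy for squared-error loss
    g_true = err @ W2.T                         # true dL/dh, available later
    W2 -= lr * h.T @ err

    # train the synthetic-gradient model toward the true gradient
    M -= lr * h.T @ (g_hat - g_true)
```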
