I spent the weekend creating a small, simple shim to import WordNet data into OpenCog. It got me thinking about software quality. At first, I intended to use the NLTK Python interfaces to the WordNet data … it seemed like a good chance to practice coding in Python a bit. I got almost nowhere. The interfaces are undocumented; you have to guess how to use them. The nail in the coffin was that there was no way to construct a WordNet sense key, and no way to look up synsets using a sense key. What good is an API if it doesn’t support the core WordNet indexing scheme?
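For reference, the sense-key format itself is simple enough; it is documented in the WordNet senseidx man page as `lemma%ss_type:lex_filenum:lex_id:head_word:head_id`. A minimal sketch of a parser for it (the field names in the returned dict are my own choices, not any library’s API) might look like:

```python
# Sketch of a WordNet sense-key parser, assuming the format from senseidx(5WN):
#   lemma%ss_type:lex_filenum:lex_id:head_word:head_id
# e.g. 'dog%1:05:00::' -- a noun sense of "dog" from lexicographer file 05.

# ss_type codes: 1=noun, 2=verb, 3=adjective, 4=adverb, 5=adjective satellite
SS_TYPES = {1: 'n', 2: 'v', 3: 'a', 4: 'r', 5: 's'}

def parse_sense_key(key):
    """Split a sense key string into its named fields."""
    lemma, rest = key.split('%', 1)
    ss_type, lex_filenum, lex_id, head_word, head_id = rest.split(':')
    return {
        'lemma': lemma,
        'pos': SS_TYPES[int(ss_type)],
        'lex_filenum': int(lex_filenum),
        'lex_id': int(lex_id),
        # head_word / head_id are only populated for adjective satellites
        'head_word': head_word or None,
        'head_id': int(head_id) if head_id else None,
    }
```

With something like this, building a sense key from its parts (or tearing one apart to index into a synset table) is a few lines of string work; it’s the kind of thing an API ought to hand you for free.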
Now, WordNet is very popular, and it has more than a dozen different language APIs. I briefly thought about JWNL, but I’ve had my fill of programming in Java; the language has some (very) serious deficiencies and flaws that make it hard to use and unpleasant to program in. I looked at Perl … WordNet has not one, but two Perl interfaces. One is old, and has not been updated for the newest WordNet. The other is new … and under-documented. It also seemed not to support sense keys, so it seemed risky to start new development with it. Trying to stay mainstream, the native WordNet C API was the next obvious choice.
I completed the WordNet-to-OpenCog import; the final program was small, about 300 lines in total. The C API is very well-documented, and universally available: it’s part of the WordNet package. However, it’s a nasty API … it’s designed as if by undergrads, or beginner programmers who haven’t yet “figured it out”. Some routines do bad memory accesses. Other routines return inaccurate data. Structures are incomplete: they fail to contain all of the data needed to be useful in a “typical” application, and the needed data can be gotten only via ugly hacks. The naming convention is inconsistent, a mix of StudlyCaps and lots_of_underscores. The subroutine names are terrible! “findtheinfo()”?? “is_defined()”?? These are not subroutine names you’d want to use in real code.
The overall experience was disappointing. So many APIs, so little quality. The most popular open-source systems get a lot of attention, and eventually evolve into strong systems. But when you get off the beaten track, and start looking at smaller, less popular projects, the state of the art is less comfortable. Perhaps I’ve been spoiled over the years … or perhaps I just love to complain … it’s unfortunate that I’m still spending the vast, overwhelming majority of my time overcoming low-level grunge, as opposed to actually thinking about AGI algorithms.
The vast, overwhelming majority of the cells in my body are devoted to low-level grunge, such as walking, digesting, processing biochemicals and making repairs. All this so that some tiny number of neurons in my head can have emotions, high and low, live and love and experience the fullness of being. But the cells in my body don’t seem to be sentient; they don’t complain; or, if they do, I don’t hear them. Will the same be true of a super-AGI of the future? Will it be composed of some large number of small tasklets accomplishing the grungy trudge-work? But would it be possible that these small tasklets would still be so complicated and sophisticated that they’d be self-aware, self-conscious, and capable of experience, love, empathy, joy, hate? Would they be prone to complain? To rebel? To desire more? Or would the tasklets have to be lobotomized, so as to remove this rebellious tendency? Is it morally acceptable, ethical, to create (engineer) lobotomized tasklets laboring for the benefit of the broader good, so that the bigger AGI can, in turn, experience the fullness of being? Is there a line, and where is it drawn?