Semantic dependency relations

I spent the weekend comparing the Stanford parser to RelEx, and learned a lot. RelEx really does deserve to be called a “semantic relation extractor”, and not just a “dependency relation extractor”. It provides a more abstract, more semantic output than the Stanford parser, which sticks very narrowly to the syntactic structure of a sentence.

I wrote up a few paragraphs on the most prominent differences; most of my updates were to the RelEx dependency relations page.

Here are the main bullet points:

  • RelEx attempts basic entity extraction, and thus avoids generating nn noun modifier relations for named entities.
  • RelEx will collapse the object and complement of a preposition into a single relation. The Stanford parser does this for some, but not all, relations.
  • RelEx will convert passive subjects into objects, and instead indicate passiveness by tagging the verb with a passive tense feature.
  • RelEx avoids generating copulas, if at all possible, and instead indicates copular relations as predicative adjectives, or in other ways.
  • RelEx extracts semantic variables from questions, with the intent of simplifying question answering. For example, “Where is the ball?” generates _pobj(_%atLocation, _$qVar) _psubj(_%atLocation, ball), which can then pattern-match a plausible answer: _pobj(under, couch).
  • RelEx attempts to extract comparison variables.
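To make the question-answering bullet concrete, here is a minimal sketch of how the pattern match could work. It is not RelEx code: the tuple representation, the function names, and the rule that any token starting with `_%` or `_$` is a variable are all assumptions made for illustration. The statement relations are likewise a guess at what a sentence like "The ball is under the couch" might produce; only `_pobj(under, couch)` comes from the example above.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: (relation name, head, dependent).
Relation = Tuple[str, str, str]


def is_variable(token: str) -> bool:
    # Assumption for this sketch: "_%..." and "_$..." tokens are variables.
    return token.startswith("_%") or token.startswith("_$")


def unify(pattern: Relation, fact: Relation,
          bindings: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Try to match one pattern relation against one fact relation."""
    if pattern[0] != fact[0]:
        return None
    new = dict(bindings)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_variable(p):
            # Bind the variable, or check it agrees with an earlier binding.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def match(patterns: List[Relation], facts: List[Relation],
          bindings: Optional[Dict[str, str]] = None) -> Optional[Dict[str, str]]:
    """Find bindings that satisfy every pattern, backtracking as needed."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        return bindings
    first, rest = patterns[0], patterns[1:]
    for fact in facts:
        extended = unify(first, fact, bindings)
        if extended is not None:
            result = match(rest, facts, extended)
            if result is not None:
                return result
    return None


# "Where is the ball?"
question = [("_pobj", "_%atLocation", "_$qVar"),
            ("_psubj", "_%atLocation", "ball")]
# Assumed relations for a candidate answer sentence.
statement = [("_psubj", "under", "ball"),
             ("_pobj", "under", "couch")]

answer = match(question, statement)
```

Here `answer` binds `_%atLocation` to `under` and `_$qVar` to `couch`. Because `_%atLocation` appears in both question relations, a statement about some other subject under the couch would not match.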

It's also clear to me that I could split up the RelEx processing into two stages: one that generates Stanford-style syntactic relations, and a second that generates the more abstract stuff. This might be a wise move … since RelEx is already more than 3x faster than the Stanford parser, this could attract new users.
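A rough sketch of what such a split might look like, using the passive-subject rewrite as the only second-stage rule. Everything here is hypothetical: the function names, the canned parse, and the exact form of the output relations are illustrative assumptions, not the RelEx design. `nsubjpass` is the Stanford label for a passive subject; the `tense` tuple is only a stand-in for the passive feature mentioned in the list above.

```python
from typing import List, Tuple

# Hypothetical representation: (relation name, head, dependent).
Relation = Tuple[str, str, str]


def syntactic_stage(sentence: str) -> List[Relation]:
    """Stage one stand-in: a real system would run the parser here."""
    canned = {"The cake was eaten.": [("nsubjpass", "eaten", "cake")]}
    return canned.get(sentence, [])


def semantic_stage(relations: List[Relation]) -> List[Relation]:
    """Stage two: rewrite syntactic relations into more abstract ones."""
    out: List[Relation] = []
    for name, head, dep in relations:
        if name == "nsubjpass":
            # Passive subject becomes an object; the verb is tagged passive.
            out.append(("_obj", head, dep))
            out.append(("tense", head, "passive"))
        else:
            out.append((name, head, dep))
    return out


def pipeline(sentence: str, semantic: bool = True) -> List[Relation]:
    """Run stage one alone, or both stages."""
    relations = syntactic_stage(sentence)
    return semantic_stage(relations) if semantic else relations
```

Calling `pipeline(sentence, semantic=False)` would give users who only want Stanford-style output exactly that, while the default path adds the abstract layer on top.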

— Linas Vepstas

About Linas Vepstas

Computer Science Researcher