If you’re a user of tgrep2, you may have noticed that it hasn’t been updated since 2005. That’s because the main guy behind it, Doug Rohde, has moved on to other things (outside of academia). The good news is that tgrep2 is standing the test of time pretty well (see this intro post if you’re new to tgrep2). But software that isn’t being actively developed usually falls into disrepair eventually. So what are the alternatives?
The chief alternative is Tregex, which is a home-grown Stanford tool.
The good news:
- Has a GUI, which makes exploring trees a lot better.
- This is actually more than just pretty trees, though. It works fine with different languages–like Chinese and Arabic. So you can type your queries in without needing some sort of transcription.
- Extra operators like “restricted dominance” (I want something that dominates something else, but only through a particular set of categories).
- Tgrep2 runs only on Unix, but Tregex is cross-platform (because it’s written in Java).
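To make “restricted dominance” a bit more concrete: in Tregex the operator is written along the lines of `VP <+(NP) NN`, meaning a VP that dominates an NN where every node in between is an NP. Here is a rough Python sketch of that idea, not Tregex’s actual implementation. The nested-tuple tree format and the function name `restricted_descendants` are my own assumptions for illustration.

```python
# Toy tree format (an assumption for this sketch): (label, child, child, ...),
# where leaves are plain strings.
tree = ("S",
        ("NP", ("PRP", "I")),
        ("VP", ("VBD", "saw"),
               ("NP", ("NP", ("DT", "the"), ("NN", "dog")),
                      ("PP", ("IN", "in"),
                             ("NP", ("DT", "the"), ("NN", "park"))))))


def restricted_descendants(node, target, via):
    """Return nodes labelled `target` that `node` dominates such that every
    node strictly between them has a label in `via` (hypothetical helper)."""
    matches = []
    for child in node[1:]:
        if not isinstance(child, tuple):
            continue  # skip leaf words
        if child[0] == target:
            matches.append(child)
        if child[0] in via:
            # Only keep descending while the chain stays inside `via`.
            matches.extend(restricted_descendants(child, target, via))
    return matches


vp = tree[2]
only_np = restricted_descendants(vp, "NN", {"NP"})
np_or_pp = restricted_descendants(vp, "NN", {"NP", "PP"})
print(only_np)
print(np_or_pp)
```

With only NP allowed in the chain, “dog” is found but “park” is not, because a PP sits between the VP and “park”. Allowing PP as well picks up both.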
The mixed news:
- Tregex doesn’t pre-index, so it scans the whole corpus, grep-style, every time you search. Tgrep2 requires a pre-indexed corpus: if no one has built the index yet, you have to figure out how to do it and spend the time, but after that all your searches are faster.
- So if you just have trees in a file, you’re ready to go with Tregex. If you’ve got a big corpus, you’re probably going to be frustrated by the speed.
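If the trade-off is unfamiliar, here is a toy Python sketch of scanning versus indexing. It is only meant to show the general idea (pay once up front, then look things up cheaply). It says nothing about how tgrep2’s index files actually work, and the tree format and the names `scan` and `build_index` are assumptions made up for the example.

```python
from collections import defaultdict

# Toy corpus (assumed format): each tree is (label, child, child, ...),
# with plain strings as leaves.
corpus = [
    ("S", ("NP", ("PRP", "I")), ("VP", ("VBD", "slept"))),
    ("S", ("NP", ("PRP", "We")),
          ("VP", ("VBD", "sat"), ("PP", ("IN", "in"), ("NP", ("NN", "silence"))))),
    ("NP", ("DT", "the"), ("NN", "end")),
]


def labels(tree):
    """Yield the label of every non-leaf node in a tree."""
    if isinstance(tree, tuple):
        yield tree[0]
        for child in tree[1:]:
            yield from labels(child)


def scan(corpus, label):
    """Grep-style search: walk every tree on every query."""
    return [i for i, t in enumerate(corpus) if label in labels(t)]


def build_index(corpus):
    """One-time cost: map each label to the ids of the trees containing it."""
    index = defaultdict(set)
    for i, t in enumerate(corpus):
        for lab in labels(t):
            index[lab].add(i)
    return index


index = build_index(corpus)      # slow step, done once
print(scan(corpus, "PP"))        # walks the whole corpus
print(sorted(index["PP"]))       # a single dictionary lookup
```

Both approaches give the same answer. The difference only starts to matter when the corpus is large and you run many queries against it.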
The bad news:
- Tgrep2 lets you define macros (“these are control verbs”) and reuse them in your queries. That’s not a current feature in Tregex.
So I was getting ready to tell you about TigerSearch and sum it up as “you have to specify even more of the syntax than you do in Tgrep2, so that seems too bulky”, but the point is moot because TigerSearch isn’t maintained any more: http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERSearch/. (Note that TIGERSearch searches XML markup rather than Penn Treebank-style trees.)
Know any other alternatives to Tgrep2?