
Yesterday I received in the mail from O'Reilly my copy of Natural Language Processing with Python, which should hit bookshelves soon. (From what I hear, it's not available on Amazon yet.) The book is an introduction to natural language processing using the Natural Language Toolkit, an open source code library written in Python.
I've been eagerly awaiting the publication of this book for a few reasons. First, I think it's a significant event for the field when a major tech publisher like O'Reilly publishes a book devoted to an NLP project. Second, I've been tangentially involved in the NLTK for some time now, having contributed some code to toolbox.py, the module that provides functionality for processing files in the format used by Shoebox/Toolbox. Finally, I've written about the NLTK for freshmeat.net ("Processing Corpora with Python and the NLTK") and for The Journal of Language Conservation and Documentation ("Managing Fieldwork Data with Toolbox and the Natural Language Toolkit").
I love the cover, by the way, and I think that whales are a pretty good animal for an O'Reilly book on NLP. Apparently, the ones on the cover are right whales, which are endangered. As long as we're on the topic, it's worth pointing out that the international commercial whaling moratorium is not as effective as it could and should be, given that there are still a handful of countries that work around it by exploiting technicalities (Japan) or simply ignore it (Norway, Iceland). The issue is in the public mind again, thanks to the TV show Whale Wars. Hopefully it will help galvanize public opinion against whale hunting.