The New York Times isn't just a good newspaper. It's also a smart media player that understands what it takes to stay in the game as technology changes the newspaper business. (That's why they're "elite".)
As pointed out by ReadWriteWeb (among others), part of the Times's strategy for staying ahead of the curve is to remain an indispensable content provider, and providing APIs is one way to do it.
As they announced on their blog a little more than a week ago, The New York Times has released an article search API that goes back 28 years (to 1981, if you don't feel like doing the math). It exposes a ton of article metadata: title, byline, publication date, descriptive terms, to name a few (go here for more info). What the API doesn't give you, however, is the body of the article. I point that out because the blog post doesn't say that explicitly, and I only figured it out after poking around a bit and reading some of the comments (including one from my former co-worker, Brendan O'Connor).
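To give a feel for what "metadata but no body" means in practice, here's a minimal Python sketch of building a search request and pulling out the fields mentioned above. Treat it as an illustration under assumptions, not the API as announced: the endpoint and the JSON field names (`headline.main`, `byline.original`, `pub_date`, `keywords`) follow the later v2 Article Search API as I understand it, the helper names are my own, and you'd need your own API key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the v2 Article Search endpoint, which may differ from the
# version described in the original announcement.
BASE_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"


def build_search_url(query, api_key, begin_date="19810101"):
    """Build a search URL; begin_date is YYYYMMDD (1981 is where the archive starts)."""
    params = {"q": query, "begin_date": begin_date, "api-key": api_key}
    return BASE_URL + "?" + urlencode(params)


def extract_metadata(doc):
    """Pick out title, byline, date, and descriptive terms from one result.

    The field names are assumptions based on the v2 response format.
    Note that there is no full article body to extract.
    """
    return {
        "title": (doc.get("headline") or {}).get("main"),
        "byline": (doc.get("byline") or {}).get("original"),
        "publication_date": doc.get("pub_date"),
        "descriptive_terms": [k.get("value") for k in doc.get("keywords") or []],
    }


def search(query, api_key):
    """Run one search over the network and return metadata for each hit."""
    with urlopen(build_search_url(query, api_key)) as resp:
        payload = json.load(resp)
    return [extract_metadata(d) for d in payload["response"]["docs"]]
```

Calling `search("natural language processing", "YOUR_API_KEY")` should hand back a list of small dictionaries, one per article, assuming the key is valid and the response has the shape sketched here.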
Not having the full article is a bummer, but you can always get it from the LDC's NY Times corpus. And if that doesn't feed your data hunger, there are other APIs: Congress, Best Sellers, Campaign Finance, and Movie Reviews.
So much data, so little time...
Comments (1)
so much too much data!! i still really wish the articles would be released apart from the LDC. i did once run my own scrape of NYT articles from their normal search interface and i have like a million of those suckers. but needs html cleanup and crap. kinda silly.
Posted by Brendan O'Connor | February 14, 2009 3:11 AM