paullb

Reputation: 4325

Determining the language of Twitter posts

What is the best way to determine the language of Twitter posts?

The streaming API returns a language parameter with each post, but it doesn't seem to be very accurate. Even many Japanese posts are labelled as English.
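
To illustrate the Japanese case, here is a minimal sketch of a cheap script check, assuming that the presence of kana is a good enough signal. The function name `looks_japanese` and the `min_kana` threshold are made up for this example, and it ignores half-width katakana.

```python
import re

# Hiragana (U+3040-U+309F) and Katakana (U+30A0-U+30FF) are used almost
# exclusively for Japanese. Kanji are deliberately left out because the
# same characters also appear in Chinese text.
KANA = re.compile(r"[\u3040-\u30ff]")


def looks_japanese(text: str, min_kana: int = 2) -> bool:
    """Return True if the text contains at least `min_kana` kana characters.

    `min_kana` is an arbitrary threshold chosen for this sketch.
    """
    return len(KANA.findall(text)) >= min_kana
```

A check like this only catches the Japanese-labelled-as-English case; it says nothing about telling apart languages that share the Latin alphabet.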

What have others done to sort out the languages?

Upvotes: 1

Views: 308

Answers (2)

Adam Green

Reputation: 1356

I've had very good results with this PHP package: http://pear.php.net/package/Text_LanguageDetect/

It is fast and open source. We use it to select English-only posts for a site we run at http://2012twit.com.
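
The package describes its approach as comparing ranked trigram frequencies of the sample against per-language trigram tables. The Python sketch below shows that general idea only; it is not the package's code or API. The names `trigram_ranks`, `out_of_place` and `detect`, the `top` cutoff, and the fixed penalty are all assumptions made for illustration.

```python
from collections import Counter


def trigram_ranks(text: str, top: int = 300) -> dict[str, int]:
    """Map the most frequent character trigrams to their rank (0 = most frequent)."""
    # Lowercase, collapse whitespace, and pad so word boundaries form trigrams too.
    padded = f" {' '.join(text.lower().split())} "
    counts = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    return {gram: rank for rank, (gram, _) in enumerate(counts.most_common(top))}


def out_of_place(sample: dict[str, int], profile: dict[str, int],
                 penalty: int = 300) -> int:
    """Sum rank differences; a trigram missing from the profile costs `penalty`."""
    return sum(
        abs(rank - profile[gram]) if gram in profile else penalty
        for gram, rank in sample.items()
    )


def detect(text: str, profiles: dict[str, dict[str, int]]) -> str:
    """Return the key of the language profile closest to the text."""
    sample = trigram_ranks(text)
    return min(profiles, key=lambda lang: out_of_place(sample, profiles[lang]))
```

Profiles would be built once by calling `trigram_ranks` on a large sample of each language and then passed to `detect` as a dict keyed by language name. Tweets are short, so any approach of this kind will be less reliable on them than on longer text.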

Upvotes: 2

trickwallett

Reputation: 2468

Google has language detection in its Translate API, if using evil external services is an option:

http://code.google.com/apis/language/translate/v1/reference.html#detectResult

Upvotes: 1
