Reputation: 406
Hi, I'm a newbie to data mining. My task is to automatically classify text documents using the n-gram method.
I could not find good resources on this topic. Could you suggest how to proceed, and where I can find tutorials on n-gram based classification?
I would also like some Java source code on this topic to help my understanding.
Thanks in advance.
Upvotes: 2
Views: 3339
Reputation: 406
I found a good tutorial with documentation at:
http://textcat.sourceforge.net/README.txt
http://textcat.sourceforge.net/doc/index.html
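For anyone who wants to see the idea in Java before reading the docs, here is a minimal sketch of rank-order n-gram classification in the style of Cavnar and Trenkle, which as far as I know is the approach TextCat-style libraries are built on. It is not TextCat's actual code: the class name `NGramClassifier`, the method names, the n-gram range 1..3, and the profile size 300 are all my own assumptions for illustration, and the training strings in `main` are toy data.

```java
import java.util.*;

// Illustrative sketch only; names and parameters are hypothetical, not from TextCat.
final class NGramClassifier {

    // Build a profile: character n-grams (length minN..maxN) ordered from most to least frequent.
    static List<String> profile(String text, int minN, int maxN, int limit) {
        // Lowercase and collapse every run of non-letters into a single '_' marker.
        String s = text.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}]+", "_");
        Map<String, Integer> counts = new HashMap<>();
        for (int n = minN; n <= maxN; n++) {
            for (int i = 0; i + n <= s.length(); i++) {
                counts.merge(s.substring(i, i + n), 1, Integer::sum);
            }
        }
        List<String> ranked = new ArrayList<>(counts.keySet());
        // Sort by descending count; break ties alphabetically so the result is deterministic.
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }

    // "Out-of-place" distance: sum of rank differences between the two profiles.
    // An n-gram missing from the category profile costs a fixed maximum penalty.
    static int outOfPlace(List<String> doc, List<String> category) {
        Map<String, Integer> rank = new HashMap<>();
        for (int i = 0; i < category.size(); i++) {
            rank.put(category.get(i), i);
        }
        int maxPenalty = category.size();
        int total = 0;
        for (int i = 0; i < doc.size(); i++) {
            Integer r = rank.get(doc.get(i));
            total += (r == null) ? maxPenalty : Math.abs(r - i);
        }
        return total;
    }

    // Return the label of the category profile closest to the document's profile.
    static String classify(String text, Map<String, List<String>> categories) {
        List<String> doc = profile(text, 1, 3, 300);
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (Map.Entry<String, List<String>> e : categories.entrySet()) {
            int d = outOfPlace(doc, e.getValue());
            if (d < bestDistance) {
                bestDistance = d;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy training texts; a real classifier needs far more text per category.
        Map<String, List<String>> categories = new HashMap<>();
        categories.put("english", profile("the quick brown fox jumps over the lazy dog and the cat", 1, 3, 300));
        categories.put("german", profile("der schnelle braune fuchs springt ueber den faulen hund und die katze", 1, 3, 300));
        System.out.println(classify("the dog and the fox", categories));
    }
}
```

To use it for document categories instead of languages, you would build one profile per category from a pile of training documents and call `classify` on each new document. With small training sets the result is unreliable, so treat this as a starting point for understanding, not a finished classifier.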
Upvotes: 2
Reputation: 637
I highly recommend Stanford's online NLP course by Dan Jurafsky & Chris Manning. Chapter 4 addresses n-grams, but all the chapters before it give a great background.
Stanford also has some great open-source software you can use for text classification, covering everything from tokenizing to part-of-speech tagging.
Upvotes: 3