user2882900

Text Mining with SVM Classifier

I want to apply SVM classification for a text-mining task using Python NLTK and obtain precision, recall, and accuracy measurements. To do this, I preprocessed my dataset and split it into two text files: pos_file.txt (positive label) and neg_file.txt (negative label). Now I want to apply an SVM classifier with random sampling, using 70% of the data for training and 30% for testing. I have looked at some scikit-learn documentation, but I am not sure exactly how to apply it.

Both pos_file.txt and neg_file.txt can be considered bags of words.

Sample file: pos_file.txt

stackoverflowerror restor default properti page string present
multiprocess invalid assert fetch process inform
folderlevel discoveri option page seen configur scope select project level

Sample file: neg_file.txt

class wizard give error enter class name alreadi exist
unabl make work linux
eclips crash
semant error highlight undeclar variabl doesnt work

Furthermore, it would be interesting to apply the same approach to unigrams, bigrams, and trigrams. Looking forward to your suggestions or sample code.

Upvotes: 3

Views: 7064

Answers (1)

Moses Xu

Reputation: 2160

Below is a rough guideline for applying SVM to text classification:

  1. Convert your texts into vector representations, i.e. numericalize the texts so that SVM (and most other machine learners) can be applied. This can be done quite easily with sklearn.feature_extraction.text.CountVectorizer/TfidfVectorizer, and you can freely select your n-gram range during vectorization, along with all the other options such as stop-word elimination and document-frequency thresholding.
  2. Perform feature selection, which is usually optional since SVMs have been shown to handle feature redundancy well. However, feature selection can shrink the dimensionality of the learning space and speed up training significantly. Common choices are sklearn.feature_selection.chi2 and SelectKBest, to name a few.
  3. Fit (train) an SVC on your training data. Various kernels are at your disposal, and for learner parameters such as C and gamma you can leave the defaults for initial experimentation. If your goal is the best possible performance, use grid search (sklearn.grid_search), which exhaustively tries the parameter combinations you specify and reports the combination that yields the best results. Grid search is usually performed on the evaluation data.
  4. Evaluate. After fine-tuning your learner parameters on the evaluation data, test the fitted SVM's performance on testing data that was unseen during the training and fine-tuning stages. Alternatively, you can use n-fold cross-validation (sklearn.cross_validation) to estimate your SVM's performance. If you have a limited amount of annotated text, n-fold cross-validation is recommended, since it makes use of all the data you have.
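Putting the steps above together, here is a minimal sketch using the sample lines from your question: a TfidfVectorizer with unigram-to-trigram features, a LinearSVC, and a stratified 70/30 random split. Feature selection and grid search are left out for brevity, and the module paths follow current scikit-learn, where the old sklearn.grid_search and sklearn.cross_validation functionality now lives in sklearn.model_selection.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Sample documents from the question; in practice, read one document
# per line from pos_file.txt and neg_file.txt.
pos_texts = [
    "stackoverflowerror restor default properti page string present",
    "multiprocess invalid assert fetch process inform",
    "folderlevel discoveri option page seen configur scope select project level",
]
neg_texts = [
    "class wizard give error enter class name alreadi exist",
    "unabl make work linux",
    "eclips crash",
    "semant error highlight undeclar variabl doesnt work",
]
texts = pos_texts + neg_texts
labels = [1] * len(pos_texts) + [0] * len(neg_texts)  # 1 = positive, 0 = negative

# Step 1 + 3: vectorize (unigrams through trigrams) and fit a linear SVM.
pipe = Pipeline([
    ("vect", TfidfVectorizer(ngram_range=(1, 3))),
    ("clf", LinearSVC()),
])

# Random 70% / 30% split; stratify keeps both labels in each partition.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.3, random_state=42, stratify=labels)
pipe.fit(X_train, y_train)

# Step 4: precision, recall, and F1 on the held-out 30%.
print(classification_report(y_test, pipe.predict(X_test), zero_division=0))
```

For step 3's tuning, the same pipeline can be dropped into GridSearchCV (searching over, e.g., `vect__ngram_range` and `clf__C`), and for step 4, cross_val_score can replace the single split when the annotated data is scarce.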

The following sklearn documentation page is a really good example of performing text classification in the sklearn framework, and I would recommend it as a starting point:

Classification of text documents using sparse features

Upvotes: 8
