Reputation: 703
I have a bunch of data harvested from a forum I own, and I would like to do some text mining or use a linguistic library to extract useful information.
Any text mining or data mining library, in any language, will do.
Thank you.
Upvotes: 0
Views: 1138
Reputation: 20
I would recommend the following Python libraries:
nltk
keras
tensorflow
Note: Before any text analysis, you should clean the data according to your requirements (for example, strip markup and normalise case).
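As a rough idea of what that cleaning step could look like for forum posts, here is a minimal sketch using only the Python standard library. The function names (`clean_post`, `top_terms`), the tiny stop-word list, and the sample posts are all made up for illustration; your own data will probably need different rules.

```python
import html
import re
from collections import Counter

TAG_RE = re.compile(r"<[^>]+>")        # crude HTML tag matcher, fine for simple markup
URL_RE = re.compile(r"https?://\S+")   # bare links add noise to word counts
TOKEN_RE = re.compile(r"[a-z0-9']+")   # lowercase words, digits and apostrophes

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "to", "of", "in", "for", "this"}


def clean_post(raw):
    """Return a list of lowercase tokens from one raw forum post."""
    text = TAG_RE.sub(" ", raw)    # drop HTML tags
    text = html.unescape(text)     # turn entities such as &amp; back into characters
    text = URL_RE.sub(" ", text)   # drop URLs
    tokens = TOKEN_RE.findall(text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def top_terms(posts, n=5):
    """Count tokens across all posts and return the n most frequent."""
    counts = Counter()
    for post in posts:
        counts.update(clean_post(post))
    return counts.most_common(n)


# Hypothetical sample data, standing in for the harvested forum posts.
sample_posts = [
    "<p>The new <b>router</b> firmware is great &amp; stable.</p>",
    "Router keeps rebooting, see http://example.com/log for the firmware log",
]

if __name__ == "__main__":
    print(top_terms(sample_posts, 3))
```

The cleaned token lists can then be passed to whichever library you settle on.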
Upvotes: 0
Reputation: 81288
You may like to have a look at the Python NLTK (Natural Language ToolKit): it's specifically designed for this kind of thing.
There is also a great book you can buy to get you started.
Upvotes: 2
Reputation: 76
Stanford CoreNLP is good for English text and includes features such as Named Entity Recognition. Take a look at: http://nlp.stanford.edu/software/corenlp.shtml
GATE, which Ehsan already recommended, is also good, but it can be a bit complicated if you need to write your own components. It's great for large-scale work, though.
UIMA is similar to GATE, but not as easy to use because it lacks an extensive GUI like GATE's. (http://uima.apache.org)
Upvotes: 0
Reputation: 171
Try GATE: it has a GUI, and of course you can use its Java API for more power: http://gate.ac.uk/family/developer.html
You can also use Weka for processing text and doing text mining; have a look at these useful lectures: http://sentimentmining.net/weka/
Upvotes: 0
Reputation: 100164
I recommend that you have a look at R. It has a large number of text mining packages: have a look at the CRAN Natural Language Processing task view. In particular, look at the tm package. Here are some relevant links:
Another useful package for this is Gary King's ReadMe package.
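To give a feel for the kind of structure text mining packages typically work with, here is a small sketch of a document-term matrix with TF-IDF weighting. It is written in plain Python rather than R and is not the API of tm or any other package; the function names and the sample documents are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def document_term_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts vocab[j] in docs[i]."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def tf_idf(matrix):
    """Weight raw counts by term frequency times log inverse document frequency."""
    n_docs = len(matrix)
    n_terms = len(matrix[0]) if matrix else 0
    # Number of documents in which each term appears at least once.
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    weighted = []
    for row in matrix:
        total = sum(row) or 1  # avoid dividing by zero for empty documents
        weighted.append(
            [(row[j] / total) * math.log(n_docs / df[j]) for j in range(n_terms)]
        )
    return weighted


# Hypothetical documents standing in for forum posts.
docs = ["spam spam eggs", "eggs and ham"]
vocab, matrix = document_term_matrix(docs)
weights = tf_idf(matrix)

if __name__ == "__main__":
    print(vocab)
    for row in weights:
        print([round(w, 3) for w in row])
```

Terms that appear in every document (here "eggs") get a weight of zero, which is the point of the weighting: it pushes down words that do not help tell documents apart.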
Upvotes: 4