Reputation: 21
I am working on sentiment analysis using CoreNLP, and I have some questions about training with my own dataset. It would be a great help if someone could give me some ideas.
According to https://nlp.stanford.edu/sentiment/code.html, the command to train a model on one's own dataset is:
java -mx8g edu.stanford.nlp.sentiment.SentimentTraining -numHid 25 -trainPath train.txt -devPath dev.txt -train -model model.ser.gz
What is dev.txt, and what data do I need to put in this file? I have also checked the PTBTokenizer class, but I couldn't find anything like a text2PTB converter that would put my data into the format needed for training.
Can someone tell me how I can train with my own data?
for example test data
Upvotes: 1
Views: 564
Reputation: 21
I found an answer that works for me. Call:
java -cp "*" -mx5g edu.stanford.nlp.sentiment.BuildBinarizedDataset -input sample.txt
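sample.txt contains the training data. The first line of each sentence block is the sentiment label followed by the full sentence; each following line is a label followed by a sub-phrase of that sentence. Sentences are separated by blank lines. Example:
1 Today is not a good day.
3 good
3 good day
3 a good day
This will generate: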
(1 (1 Today) (1 (1 (1 (1 is) (1 not)) (3 (1 a) (3 (3 good) (1 day)))) (1 .)))
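To connect this back to the train.txt and dev.txt in the question: both hold trees in the format shown above, and dev.txt is a held-out development set used to evaluate the model while it trains. The following is only a rough sketch in Python of how the input file could be written and how the generated trees could be split; the helper names (format_example, build_input, split_trees) and the 10% dev fraction are my own assumptions, not part of CoreNLP.

```python
import random


def format_example(sentence_label, sentence, phrase_labels=()):
    """Build one input block: labelled sentence first, then labelled sub-phrases."""
    lines = [f"{sentence_label} {sentence}"]
    lines += [f"{label} {phrase}" for label, phrase in phrase_labels]
    return "\n".join(lines)


def build_input(examples):
    """Join the blocks with a blank line between sentences."""
    return "\n\n".join(format_example(*ex) for ex in examples) + "\n"


def split_trees(tree_lines, dev_fraction=0.1, seed=0):
    """Shuffle the generated trees and hold out a fraction as the dev set.

    The 10% default is an arbitrary choice for illustration.
    """
    trees = [line.strip() for line in tree_lines if line.strip()]
    random.Random(seed).shuffle(trees)
    n_dev = int(len(trees) * dev_fraction)
    return trees[n_dev:], trees[:n_dev]


# Hypothetical data: the single example from this answer.
examples = [
    (1, "Today is not a good day.", [(3, "good"), (3, "good day"), (3, "a good day")]),
]
print(build_input(examples), end="")
```

Write the output of build_input to sample.txt, run BuildBinarizedDataset on it with its standard output redirected to a file, then pass the lines of that file to split_trees and save the two lists as train.txt and dev.txt for SentimentTraining.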
Upvotes: 1