Abdallah Sayed

Reputation: 105

Training Stanford POS tagger using multiple text files

I have a corpus of about 20,000 text files and I want to train the tagger on them. Which is better: merging them into a single text file (I don't know whether that would affect tagging accuracy), or listing all of them in the props file?

Upvotes: 0

Views: 121

Answers (1)

StanfordNLPHelp

Reputation: 8739

I don't think it matters. The code should just load all of the data in; splitting it across multiple files is purely a convenience. You can also specify different input formats for different files, but that won't affect the final model.
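If you go the props-file route, typing 20,000 paths by hand isn't practical, so here is a rough sketch of generating the entry with a script. It assumes the `trainFile` property takes several files separated by semicolons, which is my reading of the tagger documentation, so please check that against your version. The function names, the `*.txt` pattern, and the paths in the usage note are made up for illustration.

```python
from pathlib import Path


def build_train_file_value(corpus_dir, pattern="*.txt"):
    """Join all matching files in corpus_dir into one semicolon-separated string.

    Assumption: the tagger's trainFile property accepts a
    semicolon-separated list of files.
    """
    files = sorted(Path(corpus_dir).glob(pattern))
    # Forward slashes avoid backslash escaping inside a Java .props file.
    return ";".join(p.as_posix() for p in files)


def append_train_file_line(corpus_dir, props_path, pattern="*.txt"):
    """Append a 'trainFile = ...' line to an existing props file."""
    value = build_train_file_value(corpus_dir, pattern)
    with open(props_path, "a", encoding="utf-8") as fh:
        fh.write("trainFile = " + value + "\n")
```

For example, `append_train_file_line("corpus", "my-tagger.props")` would add one long `trainFile` line listing every `.txt` file in a hypothetical `corpus` directory. Merging everything into one file first should give the same model, so pick whichever is easier to maintain.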

Upvotes: 1
