Reputation: 4663
I'm trying to tweak this code: http://snipperize.todayclose.com/snippet/py/Use-NLTK-Toolkit-to-Classify-Documents--5671027/ to accept some additional features. It seems to be determining its class based on having separate files for separate classes of information, which is fine. But I'd like to also be able to add some additional data for it to look for. What needs to be modified? Any good resources? The book on NLTK/Python doesn't address this.
Upvotes: 0
Views: 453
Reputation: 5543
What do you mean by feature? It seems to me that you want to just add more data, not features.
If you want to consider new features, you have to modify extract words according to your needs.
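As a rough illustration of what "new features" could look like, here is a minimal sketch of a feature extractor that returns a dict, which is the shape NLTK classifiers expect for a feature set. The function name `extract_features`, the `extra_terms` parameter, the feature names, and the 100-word threshold are all made up for this example and are not taken from the linked snippet.

```python
import re


def extract_features(text, extra_terms=()):
    """Build a feature dict for one document (hypothetical helper)."""
    lowered = text.lower()
    # Baseline features: one boolean per distinct word in the document.
    words = set(re.findall(r"[a-z']+", lowered))
    features = {"contains(%s)" % word: True for word in words}
    # Additional features: does the document mention any of the extra terms?
    # Substring matching lets multi-word terms such as "interest rate" work.
    for term in extra_terms:
        features["has_term(%s)" % term] = term.lower() in lowered
    # Another example of a non-word feature (threshold chosen arbitrarily).
    features["long_document"] = len(text.split()) > 100
    return features
```

Each document would then become a `(extract_features(text, extra_terms), label)` pair in the training list.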
If you just need more data, which may be stored in different files, you should edit the main code so that it works with sets of file names rather than a single file per class. That of course implies modifying the loop at line 74: you have to add another inner loop that iterates over all the file names in the set.
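The nested loop could be sketched roughly like this. The function name `load_labeled_texts` and the `files_by_label` mapping are assumptions for illustration; the linked snippet may organize its file names differently.

```python
def load_labeled_texts(files_by_label):
    """Read every file of every class into (text, label) pairs.

    files_by_label is a hypothetical mapping such as
    {"sports": ["sports1.txt", "sports2.txt"], "politics": ["politics1.txt"]}.
    """
    labeled = []
    # Outer loop: one iteration per class label.
    for label, filenames in files_by_label.items():
        # Inner loop: every file that belongs to this class.
        for filename in filenames:
            with open(filename, encoding="utf-8") as handle:
                labeled.append((handle.read(), label))
    return labeled
```

The resulting pairs can be passed through whatever feature extraction the script already uses before training.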
Upvotes: 1