Reputation:
I know how to convert a set of text or web page files into an ARFF file using TextDirectoryLoader.
I want to know how to convert a single text file into an ARFF file.
Any help will be highly appreciated.
Upvotes: 0
Views: 2997
Reputation: 1061
Please be more specific. Anyway:
If the text in the file corresponds to a single document (that is, a
single instance), then all you need to do is replace every newline
with the escape code \n so that the full text sits on a single line,
then manually format it as an ARFF file with a single text (string)
attribute and a single instance.
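As a rough illustration of the single-document case, here is a minimal Python sketch that does not use WEKA at all. The function names, the relation name `single_document`, and the attribute name `text` are my own placeholders. The set of escaped characters (backslash, single quote, newline, carriage return, tab) is what I would expect WEKA's ARFF reader to accept inside a quoted string, but treat that as an assumption and check the result by opening it in WEKA.

```python
from pathlib import Path


def escape_arff_string(text: str) -> str:
    """Escape characters that would break a single-quoted ARFF string value."""
    # Backslash must be handled first so the escapes added below
    # are not escaped a second time.
    replacements = [
        ("\\", "\\\\"),
        ("'", "\\'"),
        ("\r", "\\r"),
        ("\n", "\\n"),
        ("\t", "\\t"),
    ]
    for raw, escaped in replacements:
        text = text.replace(raw, escaped)
    return text


def text_to_arff(text: str, relation: str = "single_document") -> str:
    """Build an ARFF document with one string attribute and one instance."""
    lines = [
        f"@relation {relation}",
        "",
        "@attribute text string",
        "",
        "@data",
        f"'{escape_arff_string(text)}'",
    ]
    return "\n".join(lines) + "\n"


def convert_file(src: str, dst: str) -> None:
    """Read a text file (assumed UTF-8) and write it out as a one-instance ARFF."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(text_to_arff(text), encoding="utf-8")
```

Calling `convert_file("input.txt", "output.arff")` (both paths hypothetical) should give a file whose `@data` section holds the whole document on one line.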
If the text corresponds to several instances (e.g. documents), then I
suggest writing a script to break it into several files and applying
TextDirectoryLoader. If there is any specific formatting (e.g.
instances are enclosed in XML tags), you can either do the same (by
taking advantage of the XML format) or write a custom Loader
class in WEKA to recognize your format and build an Instances object.
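For the several-instances case, the splitting script could look something like the sketch below. It assumes, purely for illustration, that documents are separated by a line containing only `---`; your file may well use a different convention, so adjust `delimiter` (or the whole splitting logic) accordingly. The function names, the `doc_0001.txt` naming scheme, and the `unlabeled` folder are hypothetical. The files go into a subdirectory because TextDirectoryLoader uses subdirectory names as class labels.

```python
from pathlib import Path


def split_documents(text: str, delimiter: str = "---") -> list[str]:
    """Split text into documents at lines consisting only of the delimiter."""
    docs, current = [], []
    for line in text.splitlines():
        if line.strip() == delimiter:
            docs.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    # Keep whatever follows the last delimiter as the final document.
    docs.append("\n".join(current).strip())
    # Drop empty chunks, e.g. from a trailing delimiter.
    return [d for d in docs if d]


def write_documents(docs: list[str], out_dir: str, label: str = "unlabeled") -> list[Path]:
    """Write each document to out_dir/label/doc_NNNN.txt and return the paths."""
    target = Path(out_dir) / label
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, doc in enumerate(docs, start=1):
        path = target / f"doc_{i:04d}.txt"
        path.write_text(doc + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

After running something like `write_documents(split_documents(Path("all.txt").read_text(encoding="utf-8")), "corpus")`, the `corpus` directory can be handed to TextDirectoryLoader the same way you already do for a set of files.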
If you post an example, it would be easier to get a more precise suggestion.
Upvotes: 3