user2007237

Convert a Text File into ARFF Format

I know how to convert a set of text or web page files into an ARFF file using TextDirectoryLoader.

I want to know how to convert a single text file into an ARFF file.

Any help will be highly appreciated.

Upvotes: 0

Views: 2997

Answers (1)

Jose Maria Gomez Hidalgo

Reputation: 1061

Please be more specific. Anyway:

  • If the text in the file corresponds to a single document (that is, a single instance), all you need to do is replace every newline with the escape code \n so that the full text sits on a single line, then manually format it as an ARFF file with a single text (string) attribute and a single instance.

  • If the text corresponds to several instances (e.g. documents), I suggest writing a script to break it into several files and then applying TextDirectoryLoader. If there is any specific formatting (e.g. instances are enclosed in XML tags), you can either do the same (taking advantage of the XML format) or write a custom Loader class in WEKA that recognizes your format and builds an Instances object.
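
As a rough illustration of both cases, here is a minimal Python sketch that does not use WEKA itself. The function names, the relation name, the blank-line separator, and the class label are all assumptions made for the example, not anything taken from your data. The escaping covers backslashes, single quotes, and newline, carriage return, and tab characters; your text may need more than that. For the multi-document case, the files are written into a subdirectory because TextDirectoryLoader takes the class label from the subdirectory name.

```python
from pathlib import Path


def arff_quote(text):
    """Quote a string for an ARFF string attribute, escaping special characters."""
    replacements = [
        ("\\", "\\\\"),  # backslash first, so later escapes are not doubled
        ("'", "\\'"),
        ("\r", "\\r"),
        ("\n", "\\n"),   # real newline becomes the two characters \ and n
        ("\t", "\\t"),
    ]
    for raw, escaped in replacements:
        text = text.replace(raw, escaped)
    return "'" + text + "'"


def text_to_arff(text, relation="single_document"):
    """Case 1: one document -> ARFF with one string attribute and one instance."""
    lines = [
        "@relation " + relation,  # relation name is an arbitrary placeholder
        "",
        "@attribute text string",
        "",
        "@data",
        arff_quote(text),
    ]
    return "\n".join(lines) + "\n"


def split_into_files(text, out_dir, label="unlabeled", separator="\n\n"):
    """Case 2: several documents -> one file each, under a class subdirectory.

    Assumes (hypothetically) that documents are separated by blank lines.
    The resulting out_dir can then be passed to TextDirectoryLoader.
    """
    class_dir = Path(out_dir) / label
    class_dir.mkdir(parents=True, exist_ok=True)
    chunks = [chunk.strip() for chunk in text.split(separator)]
    paths = []
    for i, chunk in enumerate((c for c in chunks if c), start=1):
        path = class_dir / "doc_{:04d}.txt".format(i)
        path.write_text(chunk, encoding="utf-8")
        paths.append(path)
    return paths
```

For a single document you would call something like `Path("out.arff").write_text(text_to_arff(Path("input.txt").read_text(encoding="utf-8")), encoding="utf-8")`, where both file names are placeholders.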

If you post an example, it would be easier to get a more precise suggestion.

Upvotes: 3
