Reputation: 41
Is there any simple way to do that? I don't know Java and I'm new to Python, so I would need another way. Thanks in advance!
Upvotes: 4
Views: 51033
Reputation: 1
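package WekaDemo;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
public class Txt2Arff {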
static ArrayList inList=new ArrayList();
static String colNames[];
static String colTypes[];
static String indata[][];
static ArrayList clsList=new ArrayList();
static ArrayList disCls=new ArrayList();
static String res="";
public String genTrain() throws IOException
{File fe=new File("input2.txt");
FileInputStream fis=new FileInputStream(fe);
byte bt[]=new byte[fis.available()];
fis.read(bt);
fis.close();
String st=new String(bt);
String s1[]=st.trim().split("\n");
String col[]=s1[0].trim().split("\t");
colNames=col;
colTypes=s1[1].trim().split("\t");
for(int i=2;i<s1.length;i++)
{
inList.add(s1[i]);
}
ArrayList at1=new ArrayList();
for(int i=0;i<inList.size();i++)
{
String g1=inList.get(i).toString();
if(!g1.contains("?"))
{
at1.add(g1);
res=res+g1+"\n";
}
}
indata=new String[at1.size()][colNames.length-1]; // remove cls
for(int i=0;i<at1.size();i++)
{
String s2[]=at1.get(i).toString().trim().split("\t");
for(int j=0;j<s2.length-1;j++)
{
indata[i][j]=s2[j].trim();
}
if(!disCls.contains(s2[s2.length-1].trim()))
disCls.add(s2[s2.length-1].trim());
clsList.add(s2[s2.length-1]);
}
String ar="@relation tra\n";
try
{
for(int i=0;i<colNames.length-1;i++) // every column except the last one (the class column)
{
// "con" marks a continuous (numeric) column; any other type is treated as nominal
if(colTypes[i].equals("con"))
ar=ar+"@attribute "+colNames[i].trim().replace(" ","_")+" real\n";
else
{
ArrayList vals=new ArrayList(); // distinct values seen in this nominal column
for(int j=0;j<indata.length;j++)
{
if(!vals.contains(indata[j][i].trim()))
vals.add(indata[j][i].trim());
}
String sg1="{";
for(int j=0;j<vals.size();j++)
{
sg1=sg1+vals.get(j).toString().trim()+",";
}
sg1=sg1.substring(0,sg1.lastIndexOf(","));
sg1=sg1+"}";
ar=ar+"@attribute "+colNames[i].trim().replace(" ", "_")+" "+sg1+"\n";
}
}
//end of attribute
// now adding a class Attribute
ArrayList dis=new ArrayList();
String c1="";
for(int i=0;i<clsList.size();i++)
{
String g=clsList.get(i).toString().trim();
if(!dis.contains(g))
{
dis.add(g);
c1=c1+g+",";
}
}
c1=c1.substring(0, c1.lastIndexOf(","));
ar=ar+"@attribute class {"+c1+"}\n"; //attribute name
//adding class attribute is done
//now data
ar=ar+"@data\n";
for(int i=0;i<indata.length;i++)
{
String g1="";
for(int j=0;j<indata[0].length;j++)
{
g1=g1+indata[i][j]+",";
}
g1=g1+clsList.get(i);
ar=ar+g1+"\n";
}
}
catch(Exception e)
{
e.printStackTrace();
}
return ar;
}
public static void main(String[] args) throws IOException {
// TODO Auto-generated method stub
Txt2Arff T2A=new Txt2Arff();
String ar1=T2A.genTrain();
File fe1=new File("tr.arff");
FileOutputStream fos1=new FileOutputStream(fe1);
fos1.write(ar1.getBytes());
fos1.close();
}}
Upvotes: -1
Reputation: 1
This solution assumes you have your data in .csv format - see kaz's solution.
One simple way to do this in version 3.6.11 (I'm on a Mac) is to open the Explorer and, in the Preprocess tab, select "Open file...", just as you would to open a .arff file. Where the dialog box asks for the file format at the bottom, change it to .csv. You can now load CSV files straight into Weka. If the first line of your CSV file is a header line, these names will be used as the attribute names.
On the right-hand side of the Preprocess tab is a "Save..." button. Click it and save your data as a .arff file.
This is a bit long-winded to explain, but takes only a few moments to perform and is very intuitive.
Upvotes: 0
Reputation: 1042
The TextDirectoryLoader command shown in another answer is missing the -dir argument specifier:
java weka.core.converters.TextDirectoryLoader -dir /directory/with/your/text/files > output.arff
Upvotes: 3
Reputation: 685
Do you perhaps mean a csv file that ends in .txt? If the data inside the file looks like this:
1,434,2236,5,569,some,value,other,value
4,347,2351,1,232,different,value,than,those
Then it has comma-separated values (CSV), and Weka has classes and functions which convert a CSV file into an ARFF file: http://weka.wikispaces.com/Converting+CSV+to+ARFF You can use these from the command line, like this:
java weka.core.converters.CSVLoader filename.csv > filename.arff
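If you are curious what such a conversion involves, here is a rough plain-Java sketch of the idea. It is not Weka's CSVLoader and does not use Weka at all; the class name CsvToArffSketch and its rules are my own assumptions: the first line is a header, a column is declared numeric only if every value parses as a number, anything else becomes a nominal attribute, and there are no quoted fields, embedded commas, or missing values.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Weka.
final class CsvToArffSketch {

    /** True if the value parses as a Java double (note: this also accepts e.g. "NaN"). */
    static boolean isNumeric(String value) {
        try {
            Double.parseDouble(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /**
     * Converts CSV text (first line = header) to ARFF text.
     * Assumes every row has as many fields as the header, with no quoting.
     */
    static String convert(String relation, String csv) {
        String[] lines = csv.trim().split("\\r?\\n");
        String[] header = lines[0].split(",", -1);
        List<String[]> rows = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            if (!lines[i].trim().isEmpty()) {
                rows.add(lines[i].trim().split(",", -1));
            }
        }

        StringBuilder arff = new StringBuilder("@relation " + relation + "\n\n");
        for (int col = 0; col < header.length; col++) {
            boolean numeric = true;
            Set<String> distinct = new LinkedHashSet<>(); // keeps first-seen order
            for (String[] row : rows) {
                String value = row[col].trim();
                distinct.add(value);
                numeric &= isNumeric(value);
            }
            String name = header[col].trim().replace(' ', '_');
            String type = numeric ? "numeric" : "{" + String.join(",", distinct) + "}";
            arff.append("@attribute ").append(name).append(' ').append(type).append('\n');
        }

        arff.append("\n@data\n");
        for (String[] row : rows) {
            List<String> cells = new ArrayList<>();
            for (String cell : row) {
                cells.add(cell.trim());
            }
            arff.append(String.join(",", cells)).append('\n');
        }
        return arff.toString();
    }

    public static void main(String[] args) {
        // Made-up sample data, only to show the call.
        String csv = "id,size,label\n1,2.5,yes\n2,3.0,no\n";
        System.out.print(convert("demo", csv));
    }
}
```

For real data I would still use Weka's own converter, since it deals with quoting and missing values that this sketch ignores.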
Otherwise, @D3mon-1stVFW's comment links to great documentation from Weka about turning text files (things like blog posts, books, or essays) into the ARFF format: http://weka.wikispaces.com/ARFF+files+from+Text+Collections This can also be called from the command line, like this:
java weka.core.converters.TextDirectoryLoader /directory/with/your/text/files > output.arff
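To give a feel for what a text collection looks like once it is in ARFF form, here is a rough plain-Java sketch. It is not Weka's TextDirectoryLoader. It assumes the layout I understand that loader to expect (one sub-directory per class, one document per file), and the class name TextDirToArffSketch, the attribute names text and class, and the simplified quoting are all my own assumptions, so Weka's real output may differ in detail.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; not part of Weka.
final class TextDirToArffSketch {

    /** Wraps a document in single quotes with simplified escaping (an assumption). */
    static String quote(String text) {
        String escaped = text.replace("\\", "\\\\")
                             .replace("'", "\\'")
                             .replace("\r", "")
                             .replace("\n", "\\n");
        return "'" + escaped + "'";
    }

    /** Builds ARFF text from class label -> documents. Labels are assumed to need no quoting. */
    static String toArff(String relation, Map<String, List<String>> docsByClass) {
        StringBuilder arff = new StringBuilder("@relation " + relation + "\n\n");
        arff.append("@attribute text string\n");
        arff.append("@attribute class {")
            .append(String.join(",", docsByClass.keySet()))
            .append("}\n\n@data\n");
        for (Map.Entry<String, List<String>> entry : docsByClass.entrySet()) {
            for (String doc : entry.getValue()) {
                arff.append(quote(doc)).append(',').append(entry.getKey()).append('\n');
            }
        }
        return arff.toString();
    }

    /** Reads root/<label>/<file> into a map sorted by label; files are read as UTF-8. */
    static Map<String, List<String>> readTree(Path root) throws IOException {
        Map<String, List<String>> docsByClass = new TreeMap<>();
        List<Path> labelDirs;
        try (Stream<Path> entries = Files.list(root)) {
            labelDirs = entries.filter(Files::isDirectory).collect(Collectors.toList());
        }
        for (Path labelDir : labelDirs) {
            List<Path> files;
            try (Stream<Path> entries = Files.list(labelDir)) {
                files = entries.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
            }
            List<String> docs = new ArrayList<>();
            for (Path file : files) {
                docs.add(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
            }
            docsByClass.put(labelDir.getFileName().toString(), docs);
        }
        return docsByClass;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java TextDirToArffSketch <root-directory>");
            return;
        }
        System.out.print(toArff("text_files", readTree(Paths.get(args[0]))));
    }
}
```

Each document ends up as one row with a string attribute and a class label, which is why such a file usually needs a further step (for example a word-vector filter) before most classifiers can use it.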
Upvotes: 4