Weka data load error

Question

I want to load the breast-cancer-wisconsin data through the Weka Explorer as a C4.5 data file, and I'm getting errors whether I choose to load the C4.5 .data file or the C4.5 .names file.

Any ideas?

chl · Accepted Answer

It looks like the C4.5 names file is not correct. Try replacing breast-cancer-wisconsin.names with this one:

2, 4.
clump: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
size: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
shape: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
adhesion: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
epithelial: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
nuclei: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
chromatin: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
nucleoli: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
mitoses: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.

Note that the class comes first, and that only its labels (2 and 4) are listed, without a name.

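Here I have removed the first column of the original dataset, which holds the subjects' ids, using

$ cut -d, -f2-11 breast-cancer-wisconsin.data > tmp.data && mv tmp.data breast-cancer-wisconsin.data

The output goes through a temporary file (tmp.data here) because redirecting straight back into the input file would truncate it before cut reads it. If you would rather keep the id column, it is not difficult to adapt the names file above.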
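For readers without cut at hand, the same column removal can be sketched in Python. This is only an illustration: the function name drop_first_column is made up for this example, and it assumes a plain comma-separated file with no quoting. It drops the first field of every row, which matches -f2-11 for the 11-field original, and it writes to a temporary file before swapping it in, so the input is never truncated while it is still being read.

```python
import csv
import os
import tempfile


def drop_first_column(path):
    """Rewrite a comma-separated file in place without its first column.

    Hypothetical helper for illustration; assumes simple, unquoted fields.
    """
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, newline="") as src, tempfile.NamedTemporaryFile(
        "w", newline="", dir=directory, delete=False
    ) as tmp:
        writer = csv.writer(tmp, lineterminator="\n")
        for row in csv.reader(src):
            if row:  # skip blank lines
                writer.writerow(row[1:])  # keep everything after the id
    # Replace the original only after the new file is complete and closed.
    os.replace(tmp.name, path)
```

It would be called as drop_first_column("breast-cancer-wisconsin.data"), after which the file should contain only the nine attributes and the class.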

Alternative solutions:

Generate a CSV file: you just need to add a header line to the *.data file and rename it to *.csv. E.g., replace breast-cancer-wisconsin.data with breast-cancer-wisconsin.csv, which should look like
```
clump,size,shape,adhesion,epithelial,nuclei,chromatin,nucleoli,mitoses,class
5,1,1,1,2,1,3,1,1,2
5,4,4,5,7,10,3,2,1,2
3,1,1,1,2,2,3,1,1,2
6,8,8,1,3,4,3,7,1,2
...
```
Construct an *.arff file directly by hand; that's not really complicated as there are only a few variables. An example file can be found here.
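Both alternatives can also be scripted rather than done by hand. The following is a minimal sketch, not a definitive recipe: the function names are invented for this example, it assumes the id column has already been removed (ten fields per row, class last), and it declares every attribute as nominal with levels 1 to 10 simply to mirror the names file above. Declaring them as numeric would be another reasonable choice.

```python
import csv

# Attribute names taken from the CSV header above; the class column is last.
FEATURES = ["clump", "size", "shape", "adhesion", "epithelial",
            "nuclei", "chromatin", "nucleoli", "mitoses"]


def read_rows(data_path):
    """Read the headerless .data file into a list of string rows."""
    with open(data_path, newline="") as src:
        return [row for row in csv.reader(src) if row]


def write_csv(rows, csv_path):
    """Write the rows preceded by a header line."""
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(FEATURES + ["class"])
        writer.writerows(rows)


def write_arff(rows, arff_path, relation="breast-cancer-wisconsin"):
    """Write the rows as an ARFF file with nominal attributes."""
    levels = ",".join(str(i) for i in range(1, 11))  # "1,2,...,10"
    with open(arff_path, "w") as out:
        out.write(f"@relation {relation}\n\n")
        for name in FEATURES:
            out.write(f"@attribute {name} {{{levels}}}\n")
        out.write("@attribute class {2,4}\n\n@data\n")
        for row in rows:
            out.write(",".join(row) + "\n")
```

A typical use would be rows = read_rows("breast-cancer-wisconsin.data"), followed by write_csv(rows, "breast-cancer-wisconsin.csv") or write_arff(rows, "breast-cancer-wisconsin.arff"); either output file can then be opened from the Weka Explorer.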
