Reputation: 101
I have a .csv file which consists of 10 columns. The first 9 are related to the properties of a particular item, while the 10th column has the "Class" which states which item it is.
I am trying to run the following classifiers -
I am having some trouble proceeding. I am supposed to divide my data so that the first half is used for training and the second half is used to test the results.
I begin by going to the "Explorer" and opening the .csv file. I select all the attributes, including "Class", and then go to the "Classify" tab.
From there, I select "Percentage split", set it to 50%, and simply "Start" each of the classifiers mentioned above.
So these are the questions -
Can anyone help me with this?
Thanks!
Upvotes: 1
Views: 341
Reputation: 756
Someone asked a similar question here: "How to build a good training data set for machine learning and predictions?" The two look like different questions, but they involve the same considerations.
Upvotes: 1
Reputation: 34
Your question is a little bit too general, but I will try to help:
Make sure that the "Class" column is selected as the class attribute in the "Classify" tab (in the drop-down below the "More options..." button).
You can use 2-fold cross-validation, which corresponds to a 50%/50% split (each half is used once for training and once for testing).
Increase the training set size: use an 80%/20% or even a 90%/10% percentage split instead of 50%/50% (these correspond to 5-fold and 10-fold cross-validation respectively). This may help if you have a small sample size.
Choose your classifiers wisely: depending on your problem, you can also use, for example, decision trees (such as J48) and Random Forest.
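To make the relationship between percentage splits and k-fold cross-validation concrete, here is a minimal sketch in plain Python (outside Weka). The helper names `split_in_order` and `k_fold_indices` and the 20-row toy data are made up for illustration; they only mirror the layout described in the question (9 property columns plus a class column). Note that an "in order" split like this keeps the first rows for training. As far as I know, Weka's percentage split shuffles the data by default unless "Preserve order for % Split" is ticked under "More options...", so check that setting if the first half must really be the training half.

```python
def split_in_order(rows, train_fraction):
    """Split rows without shuffling: the first part trains, the rest tests."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def k_fold_indices(n_rows, k):
    """Return a list of (train_idx, test_idx) pairs for k contiguous folds.

    Each fold holds out roughly 1/k of the rows for testing, which is why
    5 folds resemble an 80%/20% split and 10 folds a 90%/10% split.
    """
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_rows) if i < start or i >= stop]
        folds.append((train_idx, test_idx))
        start = stop
    return folds


# Hypothetical stand-in for the .csv: 9 property columns plus a class label.
rows = [[str(i)] * 9 + ["A" if i % 2 == 0 else "B"] for i in range(20)]

# 50%/50% split in file order: first half trains, second half tests.
train_rows, test_rows = split_in_order(rows, 0.5)

# Held-out fraction per fold for 2-, 5- and 10-fold cross-validation.
held_out = {k: len(k_fold_indices(len(rows), k)[0][1]) / len(rows) for k in (2, 5, 10)}
print(len(train_rows), len(test_rows), held_out)
```

The difference from a single percentage split is that cross-validation repeats the split k times, so every row is used for testing exactly once.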
Upvotes: 1