qing zhangqing

Reputation: 401

machine learning from sklearn

I am learning the sklearn module and how to split data.

I followed the example code from the instructions:

categories = ['alt.atheism', 'talk.religion.misc', 'comp.graphics', 'sci.space']
newsgroups_train = fetch_20newsgroups(subset='train',
                                      remove=('headers', 'footers', 'quotes'),
                                      categories=categories)
newsgroups_test = fetch_20newsgroups(subset='test',
                                     remove=('headers', 'footers', 'quotes'),
                                     categories=categories)

num_test = len(newsgroups_test.target)
test_data, test_labels = int(newsgroups_test.data[num_test/2:]), int(newsgroups_test.target[num_test/2:])
dev_data, dev_labels = int(newsgroups_test.data[:num_test/2]), int(newsgroups_test.target[:num_test/2])
train_data, train_labels = int(newsgroups_train.data), int(newsgroups_train.target)
print('training label shape:', train_labels.shape)
print('test label shape:', test_labels.shape)
print('dev label shape:', dev_labels.shape)
print('labels names:', newsgroups_train.target_names)

But I got an error like this:

TypeError                                 Traceback (most recent call last)
in ()
      8 
      9 num_test = len(newsgroups_test.target)
---> 10 test_data, test_labels = int(newsgroups_test.data[num_test/2:]), int(newsgroups_test.target[num_test/2:])
     11 dev_data, dev_labels = int(newsgroups_test.data[:num_test/2]), int(newsgroups_test.target[:num_test/2])
     12 train_data, train_labels = int(newsgroups_train.data), int(newsgroups_train.target)

TypeError: slice indices must be integers or None or have an __index__ method

Not sure what's wrong.

Thanks guys

Upvotes: 3

Views: 90

Answers (1)

Jacob Morton

Reputation: 92

I'm not very familiar with scikit-learn's data loaders, but if you are using Python 3 the error is probably unrelated to them. Slice indices inside [] must be integers, so you need integer division. In Python 3, the division operator / returns a float, not an integer, even when both operands are integers. Use the floor division operator // instead: it returns an integer if both operands are integers, which is basically math.floor(a/b).
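
To make the difference concrete, here is a minimal sketch. The value 7 and the list `items` are made-up stand-ins for `num_test` and `newsgroups_test.data`, not values from the question:

```python
import math

num_test = 7  # hypothetical length, odd on purpose to show the rounding

true_div = num_test / 2    # Python 3: always a float, here 3.5
floor_div = num_test // 2  # an int when both operands are ints, here 3

items = list(range(num_test))  # stand-in for newsgroups_test.data

# Slicing with a float reproduces the TypeError from the question.
try:
    items[true_div:]
    slice_error = None
except TypeError as exc:
    slice_error = str(exc)

# Slicing with the floor-divided index works.
second_half = items[floor_div:]

print(type(true_div).__name__, type(floor_div).__name__)
print(slice_error)
print(second_half)
```

For non-negative integers like a list length, `num_test // 2` and `math.floor(num_test / 2)` give the same result.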

Try to change

num_test/2

to

num_test//2

Example:

newsgroups_test.target[num_test//2:]

The // operator is also available in Python 2 (version 2.2 and later), so the same code works there too.
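
Putting it together, the dev/test split could look roughly like the sketch below. The helper `split_in_half` and the sample `docs` and `targets` are hypothetical; in your code you would pass `newsgroups_test.data` and `newsgroups_test.target` instead. The sketch also leaves out the `int(...)` wrappers, on the assumption that they are not needed: `int()` does not accept a list or a multi-element array, so they would likely raise a different TypeError once the slicing is fixed.

```python
def split_in_half(data, labels):
    """Split parallel sequences into a first half and a second half."""
    if len(data) != len(labels):
        raise ValueError("data and labels must have the same length")
    mid = len(labels) // 2  # floor division keeps the slice index an int
    return (data[:mid], labels[:mid]), (data[mid:], labels[mid:])


# Made-up stand-ins for newsgroups_test.data and newsgroups_test.target
docs = ["doc a", "doc b", "doc c", "doc d", "doc e"]
targets = [0, 1, 2, 3, 1]

(dev_data, dev_labels), (test_data, test_labels) = split_in_half(docs, targets)

print("dev size:", len(dev_data))
print("test size:", len(test_data))
```

With an odd number of items, the second half gets the extra element because `//` rounds down.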

Upvotes: 2
