Reputation: 867
My dataset format is as shown below:
8,2,1,1,1,0,3,2,6,2,2,2,2
8,2,1,2,0,0,15,2,1,2,2,2,1
5,5,4,4,0,0,6,1,6,2,2,1,2
8,2,1,3,0,0,2,2,6,2,2,2,2
8,2,1,2,0,0,3,2,1,2,2,2,1
8,2,1,4,0,1,3,2,1,2,2,2,1
8,2,1,2,0,0,3,2,1,2,2,2,1
8,2,1,3,0,0,2,2,6,2,2,2,2
8,2,1,12,0,0,5,2,2,2,2,2,1
3,1,1,2,0,0,3,2,1,2,2,2,1
It consists entirely of categorical data, with each feature coded numerically. I tried the following code:
monthly_income = tf.contrib.layers.sparse_column_with_keys(
    "monthly_income", keys=['1', '2', '3', '4', '5', '6'])
# Other columns are declared in the same way
m = tf.contrib.learn.LinearClassifier(
    feature_columns=[
        caste, religion, differently_abled, nature_of_activity, school,
        dropout, qualification, computer_literate, monthly_income,
        smoke, drink, tobacco, sex],
    model_dir=model_dir)
But I am getting the following error:
TypeError: Signature mismatch. Keys must be dtype <dtype: 'string'>, got <dtype: 'int64'>.
Upvotes: 2
Views: 798
Reputation: 4637
I think the problem lies outside the code you have shown. My guess is that the features in the CSV file were read as ints, whereas passing keys=['1', '2', ...] tells the column to expect strings.
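To make the mismatch concrete, here is a minimal plain-Python sketch (no TensorFlow involved). It assumes the CSV columns come in the same order as the feature_columns list in the question, and load_columns is a hypothetical helper written only for this illustration. It mimics a loader that infers integer columns and shows that integer values never match string keys unless they are cast to strings first.

```python
import csv
import io

# Two rows copied from the sample data in the question.
SAMPLE = "8,2,1,1,1,0,3,2,6,2,2,2,2\n8,2,1,2,0,0,15,2,1,2,2,2,1\n"

# Assumption: CSV column order matches the feature_columns list.
COLUMNS = ["caste", "religion", "differently_abled", "nature_of_activity",
           "school", "dropout", "qualification", "computer_literate",
           "monthly_income", "smoke", "drink", "tobacco", "sex"]

KEYS = ['1', '2', '3', '4', '5', '6']  # string keys, as in the question


def load_columns(text, as_string):
    """Hypothetical helper: return {column name: list of values}."""
    columns = {name: [] for name in COLUMNS}
    for row in csv.reader(io.StringIO(text)):
        for name, cell in zip(COLUMNS, row):
            value = int(cell)  # mimic a loader that infers integer columns
            columns[name].append(str(value) if as_string else value)
    return columns


int_features = load_columns(SAMPLE, as_string=False)
str_features = load_columns(SAMPLE, as_string=True)

# Integer values cannot match string keys; string values can.
print([v in KEYS for v in int_features["monthly_income"]])
print([v in KEYS for v in str_features["monthly_income"]])
```

So if you want to keep sparse_column_with_keys with string keys, the values you feed in have to be strings as well (with pandas, for example, that would mean casting the columns with astype(str) before building the input function).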
In any case, since your features are already integer codes, I recommend using sparse_column_with_integerized_feature:
monthly_income = tf.contrib.layers.sparse_column_with_integerized_feature("monthly_income", bucket_size=7)
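As far as I understand the contrib documentation, bucket_size has to be larger than the largest value in the column, so that every value falls in the range [0, bucket_size). Here is a rough plain-Python sketch for picking that number per column from the sample rows in the question. The column order is again an assumption, suggest_bucket_sizes is a hypothetical helper, and ten rows may not contain the real maximum, so treat the result as a lower bound.

```python
import csv
import io

# The ten sample rows from the question.
SAMPLE = """\
8,2,1,1,1,0,3,2,6,2,2,2,2
8,2,1,2,0,0,15,2,1,2,2,2,1
5,5,4,4,0,0,6,1,6,2,2,1,2
8,2,1,3,0,0,2,2,6,2,2,2,2
8,2,1,2,0,0,3,2,1,2,2,2,1
8,2,1,4,0,1,3,2,1,2,2,2,1
8,2,1,2,0,0,3,2,1,2,2,2,1
8,2,1,3,0,0,2,2,6,2,2,2,2
8,2,1,12,0,0,5,2,2,2,2,2,1
3,1,1,2,0,0,3,2,1,2,2,2,1
"""

# Assumption: CSV column order matches the feature_columns list.
COLUMNS = ["caste", "religion", "differently_abled", "nature_of_activity",
           "school", "dropout", "qualification", "computer_literate",
           "monthly_income", "smoke", "drink", "tobacco", "sex"]


def suggest_bucket_sizes(text):
    """Hypothetical helper: smallest bucket_size covering each column."""
    maxima = {name: 0 for name in COLUMNS}
    for row in csv.reader(io.StringIO(text)):
        for name, cell in zip(COLUMNS, row):
            value = int(cell)
            if value < 0:
                raise ValueError("negative code in column %s" % name)
            maxima[name] = max(maxima[name], value)
    # Values must lie in [0, bucket_size), hence max + 1.
    return {name: largest + 1 for name, largest in maxima.items()}


bucket_sizes = suggest_bucket_sizes(SAMPLE)
for name in COLUMNS:
    print(name, bucket_sizes[name])
```

For monthly_income the sample values go up to 6, which is where bucket_size=7 comes from.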
Upvotes: 5