Reputation: 15258
I have the following file:
Contract, FG
9896342,Y
11037874,Y
6912529,Y
9896652,N
363291,Y
7348524,Y
6078482,Y
7795457,N
2486242,Y
3297980,Y
9760560,Y
1200533,N
11033963,N
7861603,Y
8218268,Y
9760247,Y
I would like to create a pandas DataFrame from this file and use the Contract column as a string (or unicode) index. The values look like numbers, but technically they are strings.
I did this: DF = pd.read_csv('C:\\Users\\S.Benet\\Desktop\\test.txt', index_col='Contract', dtype=object, encoding = 'utf-8')
But the index is interpreted as INT.
>>> DF.index
Int64Index([ 9896342, 11037874, 6912529, 9896652, 363291, 7348524,
6078482, 7795457, 2486242, 3297980, 9760560, 1200533,
11033963, 7861603, 8218268, 9760247],
dtype='int64', name=u'Contract')
How can I force it to be a string index?
Upvotes: 0
Views: 44
Reputation: 879471
If you use set_index instead of index_col, then the index will contain strings:
df = pd.read_csv('data', dtype=object, encoding='utf-8')
df = df.set_index('Contract')
or, equivalently,
df = pd.read_csv('data', dtype=object, encoding='utf-8').set_index('Contract')
In [154]: df.info()
<class 'pandas.core.frame.DataFrame'>
Index: 16 entries, 9896342 to 9760247 # <-- a generic Index, not an Int64Index
Data columns (total 1 columns):
FG 16 non-null object
dtypes: object(1)
memory usage: 256.0+ bytes
In [155]: df.index[0]
Out[155]: '9896342'
In [156]: type(df.index[0])
Out[156]: str
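For anyone who wants to try this without the original file, here is a minimal self-contained sketch. It inlines a few rows from the question through io.StringIO (an assumption made for illustration; the header is written without the space after the comma so the second column is named FG). It also shows one possible alternative, converting an already-parsed integer index with Index.astype(str), which is not part of the approach above.

```python
import io

import pandas as pd

# A few rows from the question's file, inlined so the sketch runs on its own.
# Assumption: header written as "Contract,FG" (no space after the comma).
csv_text = """Contract,FG
9896342,Y
11037874,Y
363291,Y
9896652,N
"""

# Read every column as object (i.e. keep the raw strings),
# then promote Contract to the index afterwards.
df = pd.read_csv(io.StringIO(csv_text), dtype=object).set_index('Contract')

first_label = df.index[0]
print(type(first_label))

# Lookups now use string labels, not integers.
fg_value = df.loc['363291', 'FG']
print(fg_value)

# Alternative sketch: let pandas parse the index as integers,
# then convert the labels to strings after the fact.
df_alt = pd.read_csv(io.StringIO(csv_text), index_col='Contract')
df_alt.index = df_alt.index.astype(str)
alt_label = df_alt.index[0]
```

Note that with the second variant the values pass through an integer first, so anything like leading zeros would already be lost by the time the index is converted. Reading with dtype=object and then calling set_index avoids that.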
Upvotes: 1