Reputation: 10872
I have a Dataframe(table2) that looks something like
57 INVERNESS
361 INVERNESS
533 INVERNESS
535 INVERNESS KERRY DOWNS
758 INVERNESS GREEN
807 INVERNESS
970 INVERNESS POINT
971 INVERNESS
And so on..
And I need to map/replace the names using a Dict, (which I have in a Excel sheet) When I read the translate table into Pandas I get a DF that looks like
NSUBDIVISION
SUBDIVISION
*HUFFMAN**8MILES NE OTHER
0 OTHER
00 OTHER
000 OTHER
INVERNESS POINT INVERNESS
And so on.. When I convert it to a DICT using xlate=df.to_dict() I get a dict(xlate) that looks like:
{u'NSUBDIVISION': {u'*HUFFMAN**8MILES NE': u'OTHER',
u'0': u'OTHER',
u'00': u'OTHER',
u'000': u'OTHER',
u'0000': u'OTHER',
u'INVERNESS POINT': u'INVERNESS',
And so ..on (I mention this as I'm not sure the dict is Properly formed)
I want to do something like
table2['SUBDIVISION'].replace(to_replace=xlate,inplace=True)
I want to look up values in the 1st col of the xlate table match them to table2['SUBDIVISION'] and if found replace contents of SUBDIVISION with the values in xlate column 2 if not leave them alone (bonus..actually if col 2 is NAn I'd like to leave it alone as well) for instance above finding INVERNESS POINT will be replaced by INVERNESS
currently I just get TypeError: unhashable type: 'dict'
Upvotes: 2
Views: 1861
Reputation: 375695
I think you want to create a dictionary from the Series (rather than the DataFrame):
In [11]: translate_df['NSUBDIVISION'].to_dict()
Out[11]:
{'*HUFFMAN**8MILES NE': 'OTHER',
'0': 'OTHER',
'00': 'OTHER',
'000': 'OTHER',
'INVERNESS POINT': 'INVERNESS'}
And use this to replace
the column:
In [12]: df['SUBDIVISION'].replace(translate_df['NSUBDIVISION'].to_dict())
Out[12]:
0 INVERNESS
1 INVERNESS
2 INVERNESS
3 INVERNESS KERRY DOWNS
4 INVERNESS GREEN
5 INVERNESS
6 INVERNESS
7 INVERNESS
Name: SUBDIVISION, dtype: object
Upvotes: 4