Reputation: 331
I have a dataframe:
ga_deviceCategory_codes  ga_channelgrouping_codes  ga_sourceMedium_codes
1.0                      6.0                       9.0
1.0                      6.0                       9.0
I created these code columns from the original categorical columns using:
data['ga_deviceCategory_codes'] = data['ga_deviceCategory'].astype('category').cat.codes
data['ga_channelgrouping_codes'] = data['ga_channelgrouping'].astype('category').cat.codes
data['ga_sourceMedium_codes'] = data['ga_sourceMedium'].astype('category').cat.codes
How do I get back to the original categorical values now from the above codes?
Upvotes: 3
Views: 6285
Reputation: 8884
Assume you have a categorical series which you converted to a series of codes:
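import pandas as pd

ser = df['col'].astype('category')
categorical = ser.dtype    # CategoricalDtype: keeps the categories and their order
ser_codes = ser.cat.codes  # integer codes; missing values become -1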
Then you can convert back to the original categorical series as follows:
pd.Series(pd.Categorical.from_codes(ser_codes, dtype=categorical), index=ser_codes.index)
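To make the round trip above concrete, here is a minimal sketch. The dataframe contents are made-up sample values (the thread does not show the original categories), and the name `restored` is only illustrative.

```python
import pandas as pd

# Hypothetical sample data standing in for one categorical column.
df = pd.DataFrame({"col": ["mobile", "desktop", "tablet", "desktop"]})

ser = df["col"].astype("category")
categorical = ser.dtype    # CategoricalDtype with categories desktop, mobile, tablet
ser_codes = ser.cat.codes  # integer position of each value in the categories

# Rebuild the categorical series from the codes plus the saved dtype.
restored = pd.Series(
    pd.Categorical.from_codes(ser_codes, dtype=categorical),
    index=ser_codes.index,
)
print(restored.tolist())
```

The key point is that the codes alone are not enough: the saved dtype (or at least its list of categories) has to be kept alongside them.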
Upvotes: 0
Reputation: 63
You could do this as follows:
df['col'] = df['col'].astype('category')
my_list = df['col'].cat.codes
my_values = df['col'].tolist()
result = dict(zip(my_list, my_values))
First convert the target column to the 'category' type, then label-encode it with .cat.codes. Line 3 of the code builds a list of the column's original values, and line 4 zips each code with its corresponding original value and stores the pairs in a dict, so result maps every code back to its value.
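As a rough illustration of this dict-based approach, the sketch below builds the code-to-value mapping and then uses it to decode the codes. The sample values and the names `code_to_value` and `decoded` are assumptions for the example, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data; the real column values are not shown in the thread.
df = pd.DataFrame({"col": ["organic", "direct", "organic", "referral"]})
df["col"] = df["col"].astype("category")

codes = df["col"].cat.codes
# Pair every code with the value on the same row; duplicates collapse in the dict.
code_to_value = dict(zip(codes.tolist(), df["col"].tolist()))

# Decode the codes by looking each one up in the mapping.
decoded = codes.map(code_to_value)
print(code_to_value)
print(decoded.tolist())
```

Note that this only recovers categories that actually occur in the column, since the mapping is built from the rows rather than from the category list.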
Upvotes: 1
Reputation: 164783
Category mappings are stored internally by Pandas, but not as a regular Python dictionary. You can create such a dictionary yourself to map backwards:
df['mycol'] = df['mycol'].astype('category')
d = dict(enumerate(df['mycol'].cat.categories))
Then map backwards:
df['mycol_codes'] = df['mycol'].cat.codes
df['mycol_reversed'] = df['mycol_codes'].map(d)
Be careful with this method. Make sure you create the dictionary straight after you convert to categories. When concatenating dataframes with categorical series, you may find the mapping changes.
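The caveat above can be seen in a small sketch: two series with different sets of values infer different categories, so the same code points at different values. The sample values are made up, and sharing one explicit `pd.CategoricalDtype` is just one possible way to keep the codes stable, not something the answer prescribes.

```python
import pandas as pd

# Hypothetical sample values for two separate dataframes' columns.
raw_a = pd.Series(["direct", "organic"])
raw_b = pd.Series(["cpc", "direct", "organic"])

# Each series infers its own sorted categories, so the mappings differ.
map_a = dict(enumerate(raw_a.astype("category").cat.categories))
map_b = dict(enumerate(raw_b.astype("category").cat.categories))
print(map_a)  # code 0 is 'direct' here
print(map_b)  # code 0 is 'cpc' here

# Assumed workaround: convert both with one explicit dtype so codes agree.
shared = pd.CategoricalDtype(categories=["cpc", "direct", "organic"])
a_shared = raw_a.astype(shared)
b_shared = raw_b.astype(shared)
print(a_shared.cat.codes.tolist(), b_shared.cat.codes.tolist())
```

With the shared dtype, a single dictionary built from `shared.categories` decodes the codes of both series.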
Upvotes: 10
Reputation: 970
Not 100% sure what you are asking, but have you tried looking at the categories of the original column, e.g.:
data['ga_sourceMedium'].astype('category').cat.categories
Upvotes: 0