Reputation: 59
I have a dataset of a series with missing values that I want to replace by the index. The second column contains the same numbers than the first column, but in a different order.
here's an example:
>>> df
ind u v d
0 5 7 151
1 7 20 151
2 8 40 151
3 20 5 151
this should turn out to:
>>>df
ind u v d
0 1 2 151
1 2 4 151
2 3 5 151
3 4 1 151
i reindexed the values in row 'u' by creating a new column:
>>>df['new_index'] = range(1, len(numbers) + 1)
but how do I now replace values of the second column referring to the indexes?
Thanks for any advice!
Upvotes: 0
Views: 618
Reputation: 863166
You can use Series.rank
, but first need create Series
with unstack
and last create DataFrame
with unstack
again:
df[['u','v']] = df[['u','v']].unstack().rank(method='dense').astype(int).unstack(0)
print (df)
u v d
ind
0 1 2 151
1 2 4 151
2 3 5 151
3 4 1 151
If use only DataFrame.rank
, output in v
is different:
df[['u','v']] = df[['u','v']].rank(method='dense').astype(int)
print (df)
u v d
ind
0 1 2 151
1 2 3 151
2 3 4 151
3 4 1 151
Upvotes: 1