Reputation: 69
Id Freq ID2 PSC
a1 0 xy 33
a1 0 yz 35
a1 1 xz 60
a2 0 pq 70
a2 1 qr 75
a2 0 rs 80
The output should be:
Id  Freq  ID2  PSC
a1  0     xy   33
          yz   35
a1  1     xz   60
a2  0     pq   70
          rs   80
a2  1     qr   75
After that, check that for Id a1 and Freq = 0 the PSC values are unique.
Upvotes: 2
Views: 43
Reputation: 765
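import pandas as pd

def function1(dd: pd.DataFrame) -> pd.DataFrame:
    # Blank the first two columns (Id, Freq) on every row of the group except the first.
    dd.iloc[1:, :2] = ''
    return dd

# group_keys=False keeps the original row index, as in the output below.
df1.groupby(["Id", "Freq"], group_keys=False).apply(function1)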
output:
   Id Freq ID2  PSC
0  a1    0  xy   33
1           yz   35
2  a1    1  xz   60
3  a2    0  pq   70
5           rs   80
4  a2    1  qr   75
Upvotes: 0
Reputation: 494
You can have a look at the pandas functions set_index and sort_index. For your specific problem, let's say that you have a dataset called df:
df.set_index(['Id','Freq'])
will give you
         ID2  PSC
Id Freq
a1 0      xy   33
   0      yz   35
   1      xz   60
a2 0      pq   70
   1      qr   75
   0      rs   80
Then you can decide if you want to sort by the specific index of your choice (to get unique values).
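As a rough sketch of that idea, the snippet below rebuilds the sample data from the question, sorts the MultiIndex so equal (Id, Freq) pairs sit together, and then checks whether PSC is unique within each pair. The names indexed and unique_per_group are only illustrative, and reading "unique" as "no repeated PSC within one (Id, Freq) group" is an assumption about what the question is asking.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id": ["a1", "a1", "a1", "a2", "a2", "a2"],
    "Freq": [0, 0, 1, 0, 1, 0],
    "ID2": ["xy", "yz", "xz", "pq", "qr", "rs"],
    "PSC": [33, 35, 60, 70, 75, 80],
})

# Move Id and Freq into a MultiIndex, then sort so equal keys are adjacent.
indexed = df.set_index(["Id", "Freq"]).sort_index()
print(indexed)

# Assumed reading of "PSC should be unique": within each (Id, Freq) group,
# the number of distinct PSC values equals the number of rows.
psc_groups = df.groupby(["Id", "Freq"])["PSC"]
unique_per_group = psc_groups.nunique() == psc_groups.size()

# Look up the specific group mentioned in the question.
print(bool(unique_per_group.loc[("a1", 0)]))
```

If a group contained the same PSC twice, its entry in unique_per_group would be False, which makes it easy to filter for the offending (Id, Freq) pairs.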
Upvotes: 0
Reputation: 261924
What exactly do you want to do?
If you want to mask the duplicated values with empty string (or NA) to highlight the consecutive duplicates like it would appear on a MultiIndex you could use:
df = df.sort_values(by=['Id', 'Freq'])
m = df.duplicated(['Id', 'Freq'])
df.loc[m, ['Id', 'Freq']] = ''
output:
   Id Freq ID2  PSC
0  a1    0  xy   33
1           yz   35
2  a1    1  xz   60
3  a2    0  pq   70
5           rs   80
4  a2    1  qr   75
Note that this alters your data (the masked Id and Freq values are lost), so you should only do this for display purposes.
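One way to keep the original data intact is to apply the masking to a copy that is used only for printing. The sketch below assumes the sample data from the question; blank_repeated_keys and display_df are hypothetical names chosen for illustration, not part of pandas.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id": ["a1", "a1", "a1", "a2", "a2", "a2"],
    "Freq": [0, 0, 1, 0, 1, 0],
    "ID2": ["xy", "yz", "xz", "pq", "qr", "rs"],
    "PSC": [33, 35, 60, 70, 75, 80],
})


def blank_repeated_keys(frame, keys):
    """Return a sorted copy of frame with repeated key values blanked (hypothetical helper)."""
    # sort_values returns a new DataFrame, so the caller's frame is not modified.
    shown = frame.sort_values(by=keys)
    # Cast the key columns to object so '' can sit next to the integer values.
    shown = shown.astype({key: object for key in keys})
    # Every row whose key combination already appeared earlier gets blanked.
    repeated = shown.duplicated(keys)
    shown.loc[repeated, keys] = ""
    return shown


display_df = blank_repeated_keys(df, ["Id", "Freq"])
print(display_df)
```

The original df still holds every Id and Freq value, so later filtering or grouping keeps working, while display_df is only meant for showing the result.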
Another option, set the columns as MultiIndex:
df.set_index(['Id', 'Freq']).sort_index()
The display will hide the consecutive duplicates, not exactly the way you want though.
Upvotes: 1