Reputation: 1
I want to show the rows of a DataFrame that match a list of numbers a specific number of times.
All numbers in each row are sorted. I know how to show the rows where all numbers match, but I don't know how to show the rows that match exactly 2, 3, or 4 numbers, or none at all.
import random
import pandas as pd

check = [4, 5, 6, 8, 9]
my_list = list(range(0, 10))  # the distinct numbers 0 to 9
n = []
trials = 5
for i in range(trials):
    t = random.sample(my_list, k=5)  # draw 5 numbers without repeats
    t.sort()
    n.append(t)
df = pd.DataFrame(n, columns=['Num1', 'Num2', 'Num3', 'Num4', 'Num5'])
df
Num1 | Num2 | Num3 | Num4 | Num5 |
---|---|---|---|---|
4 | 5 | 6 | 8 | 9 |
0 | 3 | 5 | 8 | 9 |
1 | 2 | 3 | 5 | 7 |
5 | 6 | 7 | 8 | 9 |
0 | 1 | 2 | 4 | 6 |
If I want to show the rows where all five numbers match:
df_match5 = df[(df['Num1']==check[0]) & (df['Num2']==check[1]) & (df['Num3']==check[2]) & (df['Num4']==check[3]) & (df['Num5']==check[4])]
df_match5
Num1 | Num2 | Num3 | Num4 | Num5 |
---|---|---|---|---|
4 | 5 | 6 | 8 | 9 |
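The five chained comparisons above can probably be collapsed into one elementwise comparison. This is a minimal sketch, assuming the columns are in the same order as `check`; it uses the sample rows from the table instead of fresh random draws so the result is reproducible.

```python
import pandas as pd

check = [4, 5, 6, 8, 9]

# Sample rows copied from the table above (not regenerated randomly).
rows = [
    [4, 5, 6, 8, 9],
    [0, 3, 5, 8, 9],
    [1, 2, 3, 5, 7],
    [5, 6, 7, 8, 9],
    [0, 1, 2, 4, 6],
]
df = pd.DataFrame(rows, columns=['Num1', 'Num2', 'Num3', 'Num4', 'Num5'])

# Compare each column with the value at the same position in `check`,
# then keep only the rows where every column matched.
df_match5 = df[df.eq(check).all(axis=1)]
print(df_match5)
```

Note that this is a positional comparison, like the original expression: a row only counts if each value sits in the same column as in `check`.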
If I want to show only the rows that match exactly 1 number:
df_match1 = do something
df_match1
Num1 | Num2 | Num3 | Num4 | Num5 |
---|---|---|---|---|
1 | 2 | 3 | 5 | 7 |
If I want to show only the rows that match exactly 2 numbers:
df_match2 = do something
df_match2
Num1 | Num2 | Num3 | Num4 | Num5 |
---|---|---|---|---|
0 | 1 | 2 | 4 | 6 |
And so on for rows that match 3 or 4 numbers, or none.
Upvotes: 0
Views: 175
Reputation: 30022
You can check whether each DataFrame value is in the list, then sum the resulting booleans across each row and compare that count with the number of matches you want:
times = 1
m = df.applymap(lambda x: x in check).sum(axis=1).eq(times)
out = df[m]
print(out)
   Num1  Num2  Num3  Num4  Num5
2     1     2     3     5     7
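The same idea extends to every match count at once. The sketch below is one possible variant, not the only way to do it: it swaps `applymap` for `DataFrame.isin`, since newer pandas versions deprecate `applymap` in favour of `DataFrame.map`. The name `by_count` is a hypothetical helper introduced here for illustration, and the rows are copied from the question's sample table.

```python
import pandas as pd

check = [4, 5, 6, 8, 9]

# Sample rows copied from the question's table (not regenerated randomly).
rows = [
    [4, 5, 6, 8, 9],
    [0, 3, 5, 8, 9],
    [1, 2, 3, 5, 7],
    [5, 6, 7, 8, 9],
    [0, 1, 2, 4, 6],
]
df = pd.DataFrame(rows, columns=['Num1', 'Num2', 'Num3', 'Num4', 'Num5'])

# Number of values in each row that appear anywhere in `check`.
match_count = df.isin(check).sum(axis=1)

# Hypothetical helper: one sub-frame per match count, from 0 up to len(check).
by_count = {k: df[match_count.eq(k)] for k in range(len(check) + 1)}

print(match_count.tolist())
print(by_count[2])   # rows with exactly 2 matches
print(by_count[0])   # rows with no matches (empty for this sample)
```

Because each row holds distinct numbers, the membership count equals the number of matched values, regardless of which column they appear in.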
Upvotes: 0