Reputation: 51
I've below data: I need to delete all rows which have a combination of (ColX='F' and ColY='Y'). I'm trying the below different codes as suggested by below answers.
df = df[(df.ColX != 'F')&(df.ColY !='Y')] But this code is removing all rows with (ColX=P and ColY=Y) along with (ColX=F and ColY=Y). I need rows with (ColX=P and ColY=Y) in my data.
mask = (df.ColX == 'F') & (df.ColY == 'Y') df[~mask] This code is not removing any of the rows which I want to remove.
3.mask = (df.ColX == 'F') | (df.ColY == 'Y') df[~mask] This code is removing all rows with (ColX=P and ColY=Y) along with (ColX=F and ColY=Y). I need rows with (ColX=P and ColY=Y) in my data.
Any inputs are highly appreciated. Thank you!
line_date ColX ColY ColZ rating rw
2007-03-31 P Y 56 1.000000 56.000000
2007-03-10 P Y 67 1.000000 67.000000
2007-02-10 F Y 66 1.000000 66.000000
2007-01-13 2 83 0.880678 73.096278
2006-12-23 2 Y 88 0.793033 69.786942
2006-11-09 F 52 0.636655 33.106077
2006-10-22 P 66 0.581946 38.408408
2006-09-29 F Y 70 0.518825 36.317752
Upvotes: 2
Views: 2579
Reputation: 51
The below code gave me my expected dataset:
filter_df=df[ ~(df['ColY'].str.contains("Y") & df['ColX'].str.contains("F")) ]
Upvotes: 1
Reputation: 153510
Try this:
mask = (df.ColX == 'F') & (df.ColY == 'Y')
df[~mask]
Output:
line_date ColX ColY ColZ rating rw
0 2007-03-31 P Y 56 1.000000 56.000000
1 2007-03-10 P Y 67 1.000000 67.000000
3 2007-01-13 2 NaN 83 0.880678 73.096278
4 2006-12-23 2 Y 88 0.793033 69.786942
5 2006-11-09 F NaN 52 0.636655 33.106077
6 2006-10-22 P NaN 66 0.581946 38.408408
Or, because I am a little confused what your output should be.
mask = (df.ColX == 'F') | (df.ColY == 'Y')
df[~mask]
Output:
line_date ColX ColY ColZ rating rw
3 2007-01-13 2 NaN 83 0.880678 73.096278
6 2006-10-22 P NaN 66 0.581946 38.408408
Upvotes: 2
Reputation: 210932
you can use DataFrame.query()
method:
In [20]: df.query("ColX != 'F' or ColY != 'Y'")
Out[20]:
line_date ColX ColY ColZ rating rw
0 2007-03-31 P Y 56 1.000000 56.000000
1 2007-03-10 P Y 67 1.000000 67.000000
3 2007-01-13 2 NaN 83 0.880678 73.096278
4 2006-12-23 2 Y 88 0.793033 69.786942
5 2006-11-09 F NaN 52 0.636655 33.106077
6 2006-10-22 P NaN 66 0.581946 38.408408
Upvotes: 2