Reputation: 175
I'm trying to remove columns that have 0 in the 2nd row of the following dataframe snippet (there are many more columns than this, however):
   1st Year Gender CDK6  1st Year Gender GBP1  1st Year Gender LY9  Future All CCDC144B
0                     1                     1                    1                    0
1                     0                     1                    0                    1
I simply need to remove the columns where the 2nd row has a 0 in it. The result will be:
   1st Year Gender GBP1  Future All CCDC144B
0                     1                    0
1                     1                    1
I have code here that collects the column names and then tries to drop them, but I am getting a KeyError.
drop_columns = []
for x in percent_scoring:
    if percent_scoring[x][1] == 0:
        drop_columns.append(x)
for x in drop_columns:
    percent_scoring = percent_scoring.drop(columns=x)
but I get an unexpected KeyError:
KeyError: "['1st Year All CDK6', '1st Year Gender CDK6', '1st Year Gender LY9'] not in index"
Not sure why the key error, but an easy way to do this would be appreciated. I couldn't find any info on this task which seems to be simple. Thanks
Upvotes: 1
Views: 1153
Reputation: 496
I would use loc and iloc to just select all columns that do not have a 0 value in the second row.
import pandas as pd
# Create dummy DataFrame
d = {'col1': [0, 2], 'col2': [3, 0], 'col4': [3, 1], 'col5': [0, 0]}
df = pd.DataFrame(data=d)
   col1  col2  col4  col5
0     0     3     3     0
1     2     0     1     0
# Select all columns where the second row doesn't equal 0
new_df = df.loc[:, ~(df.iloc[1] == 0)]
print(new_df)
   col1  col4
0     0     3
1     2     1
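For what it's worth, here is a rough sketch of the same mask idea applied to the column names from the question. The data is just the small snippet posted above, and `keep` and `filtered` are names I made up for illustration, so treat it as a sketch rather than a drop-in fix.

```python
import pandas as pd

# Sample data copied from the question's snippet; the real frame has more columns.
percent_scoring = pd.DataFrame({
    "1st Year Gender CDK6": [1, 0],
    "1st Year Gender GBP1": [1, 1],
    "1st Year Gender LY9": [1, 0],
    "Future All CCDC144B": [0, 1],
})

# Boolean Series indexed by column name: True where the second row is non-zero.
# iloc[1] selects the second row by position, whatever the index labels are.
keep = percent_scoring.iloc[1] != 0

# Keep only the columns flagged True (hypothetical result name).
filtered = percent_scoring.loc[:, keep]
print(filtered)
```

Because the mask is built by position with iloc, this does not look up any column by name, so it should not depend on how the labels are spelled.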
Upvotes: 1
Reputation: 36
I just tried your code and got no error. Maybe compare my results with yours:
import pandas as pd
d = {'1st Year Gender CDK6': [1, 0], '1st Year Gender GBP1': [1, 1], '1st Year Gender LY9': [1, 0], 'Future All CCDC144B': [0, 1]}
df = pd.DataFrame(data=d)
drop_columns = []
for x in df:
    if df[x][1] == 0:
        drop_columns.append(x)
for x in drop_columns:
    df = df.drop(columns=x)
df first:
   1st Year Gender CDK6  1st Year Gender GBP1  1st Year Gender LY9  Future All CCDC144B
0                     1                     1                    1                    0
1                     0                     1                    0                    1
after:
   1st Year Gender GBP1  Future All CCDC144B
0                     1                    0
1                     1                    1
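If the loop works on this small frame but not on yours, one thing to try is dropping the whole list in a single call. This is only a sketch: I don't know what causes the KeyError in your case, and `errors="ignore"` just tells `drop` to skip labels that are not present instead of raising. The label '1st Year All CDK6' is taken from your error message; it is deliberately absent from this small frame to show that the call still succeeds.

```python
import pandas as pd

d = {
    "1st Year Gender CDK6": [1, 0],
    "1st Year Gender GBP1": [1, 1],
    "1st Year Gender LY9": [1, 0],
    "Future All CCDC144B": [0, 1],
}
df = pd.DataFrame(data=d)

# Second row selected by position, then the labels whose value there is 0.
second_row = df.iloc[1]
drop_columns = second_row[second_row == 0].index.tolist()

# Add a label that does not exist in this frame (from the question's error message).
# errors="ignore" makes drop skip missing labels rather than raise KeyError.
result = df.drop(columns=drop_columns + ["1st Year All CDK6"], errors="ignore")
print(result)
```

If this runs cleanly on your data, it may be worth comparing `drop_columns` against `percent_scoring.columns` to see which labels do not match.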
Upvotes: 0
Reputation: 321
Instead of
percent_scoring = percent_scoring.drop(columns=x)
try:
del percent_scoring[x]
Upvotes: 0