Reputation: 385
I have a dataframe that I am trying to drop some columns from, based on their content. If all of the rows in a column have the same value as one of the items in a list, then I want to drop that column. I am having trouble doing this without messing up the loops. Is there a better way to do this, or some error I can fix? I am getting an error that says:
IndexError: index 382 is out of bounds for axis 0 with size 382
Code:
def trimADAS(df):
    notList = ["Word Recall Test","Result"]
    print("START")
    print(len(df.columns))
    numCols = len(df.columns)
    for h in range(numCols): # for every column
        for i in range(len(notList)): # for every list item
            if df[df.columns[h]].all() == notList[i]: # if all column entries == list item
                print(notList[i]) # print list item
                print(df[df.columns[h]]) # print column
                print(df.columns[h]) # print column name
                df.drop([df.columns[h]], axis = 1, inplace = True) # drop this column
                numCols -= 1
    print("END")
    print(len(df.columns))
    print(df.columns)
    return()
Upvotes: 1
Views: 479
Reputation: 164773
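Explicit loops usually aren't the way to go with pandas. Your IndexError comes from the fact that range(numCols) is evaluated once, before the loop starts, so h keeps counting up to the original column count even after columns have been dropped and df.columns has shrunk; decrementing numCols inside the loop does not change the range. Also note that df[col].all() reduces the column to a single value before it is compared with the list item, so it does not test whether every entry equals that item; for that you want (df[col] == item).all(). Here's one solution.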
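import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 1],
                   'B': [2, 2, 2, 2],
                   'C': [3, 3, 3, 3],
                   'D': [4, 4, 4, 4],
                   'E': [5, 5, 5, 5]})

lst = [2, 3, 4]

# Drop every column whose entries all equal one of the items in lst.
# axis is passed by keyword; pandas 2.0+ no longer accepts it positionally.
df = df.drop([x for x in df if any((df[x] == i).all() for i in lst)], axis=1)

#    A  E
# 0  1  5
# 1  1  5
# 2  1  5
# 3  1  5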
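The same idea can be wrapped in a small function and applied to string values like the ones in the question. This is only a sketch: the function name drop_columns_matching and the sample column names ("Test", "Score", "Kind") are made up for illustration, and it returns a new dataframe rather than modifying the original in place.

```python
import pandas as pd


def drop_columns_matching(df, values):
    """Return a copy of df without the columns whose entries all equal one of `values`.

    Hypothetical helper, not part of pandas.
    """
    # Decide which columns to drop first, then drop them in one call,
    # so nothing is removed while the columns are being iterated over.
    to_drop = [
        col for col in df.columns
        if any((df[col] == v).all() for v in values)
    ]
    return df.drop(columns=to_drop)


# Made-up sample data resembling the question's list of unwanted values.
df = pd.DataFrame({
    "Test": ["Word Recall Test"] * 3,
    "Score": [4, 7, 5],
    "Kind": ["Result"] * 3,
})

trimmed = drop_columns_matching(df, ["Word Recall Test", "Result"])
print(list(trimmed.columns))
```

Because the list of columns to drop is built before anything is removed, the index never runs past the end of df.columns the way it does in the loop version.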
Upvotes: 1