Korzak

Reputation: 385

Drop columns if all entries in column match item from a list in Pandas

I have a dataframe that I am trying to drop some columns from, based on their content. If all of the rows in a column have the same value as one of the items in a list, then I want to drop that column. I am having trouble doing this without messing up the loops. Is there a better way to do this, or some error I can fix? I am getting an error that says:

IndexError: index 382 is out of bounds for axis 0 with size 382

Code:

def trimADAS(df):
    notList = ["Word Recall Test","Result"]
    print("START")
    print(len(df.columns))
    numCols = len(df.columns)
    for h in range(numCols): #  for every column
        for i in range(len(notList)):   #   for every list item
            if df[df.columns[h]].all() == notList[i]:   #   if all column entries == list item
                print(notList[i])   #   print list item
                print(df[df.columns[h]])    #   print column
                print(df.columns[h])    #   print column name
                df.drop([df.columns[h]], axis = 1, inplace = True)  #   drop this column
                numCols -= 1
    print("END")
    print(len(df.columns))
    print(df.columns)
    return()

Upvotes: 1

Views: 479

Answers (1)

jpp

Reputation: 164773

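Explicit loops are rarely the best approach in pandas. Your version fails because columns are dropped in place while range(numCols) still counts up to the original number of columns (decrementing numCols inside the loop does not change the range), so df.columns[h] eventually runs past the end. Also, .all() tests whether every entry is truthy; it does not compare each entry with the list item. Here's one solution.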

import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 1],
                   'B': [2, 2, 2, 2],
                   'C': [3, 3, 3, 3],
                   'D': [4, 4, 4, 4],
                   'E': [5, 5, 5, 5]})

lst = [2, 3, 4]

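# Drop every column whose entries all equal one single item of lst.
# axis is passed by keyword; recent pandas versions no longer accept it positionally.
df = df.drop([x for x in df if any((df[x]==i).all() for i in lst)], axis=1)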

#    A  E
# 0  1  5
# 1  1  5
# 2  1  5
# 3  1  5
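
Applied to the string values from the question, the same idea might look like the sketch below. The function name trim_adas, the not_list parameter, and the sample columns are made up for illustration; it also assumes column names are unique and that returning a new frame is acceptable instead of modifying df in place.

```python
import pandas as pd

def trim_adas(df, not_list):
    # Hypothetical helper: collect columns in which every entry equals
    # one single item of not_list.
    to_drop = [col for col in df.columns
               if any((df[col] == item).all() for item in not_list)]
    # Drop them all at once, so nothing changes while we iterate.
    return df.drop(columns=to_drop)

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "test": ["Word Recall Test"] * 3,                    # constant, in list
    "kind": ["Result"] * 3,                              # constant, in list
    "score": [4, 7, 5],                                  # numeric, kept
    "mixed": ["Result", "Word Recall Test", "Result"],   # not constant, kept
})

trimmed = trim_adas(df, ["Word Recall Test", "Result"])
print(list(trimmed.columns))
```

Because the list of columns to drop is built before anything is removed, the index never runs past the end of df.columns, which is what caused the IndexError in the original loop.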

Upvotes: 1
