How to delete DF rows based on multiple column conditions?

Question

Here's an example DataFrame:

        EC1     EC2     CDC      L1      L2      L3      L4      L5      L6      VNF
0    [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]   [1, 0]
1    [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]   [0, 1]
2    [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [-1, 0]
3    [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, -1]
4    [0, 0]  [0, 0]  [0, 1]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 1]  [0, 1]   [1, 0]
5    [0, 0]  [0, 0]  [0, 1]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 1]  [0, 1]   [0, 1]
6    [1, 0]  [0, 0]  [0, 1]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 1]  [0, 1]  [-1, 0]

How can I delete the rows where df['VNF'] is [-1, 0] or [0, -1] and df['EC1'], df['EC2'] and df['CDC'] all have a value of 0 at the same index position as the -1 in df['VNF']?

The expected result would be:

        EC1     EC2     CDC      L1      L2      L3      L4      L5      L6      VNF
0    [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]   [1, 0]
1    [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]   [0, 1]
2    [0, 0]  [0, 0]  [0, 1]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 1]  [0, 1]   [1, 0]
3    [0, 0]  [0, 0]  [0, 1]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 1]  [0, 1]   [0, 1]
4    [1, 0]  [0, 0]  [0, 1]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 1]  [0, 1]  [-1, 0]
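
To pin down the rule, here is a minimal sketch of one reading of it as a per-row check on plain Python lists. The function name should_drop and its argument names are hypothetical and only used for illustration; it assumes each cell is a two-element list as in the example.

```python
def should_drop(vnf, ec1, ec2, cdc):
    """Return True if the row matches one reading of the rule above."""
    # Only rows whose VNF is exactly [-1, 0] or [0, -1] are candidates.
    if vnf not in ([-1, 0], [0, -1]):
        return False
    pos = vnf.index(-1)  # position of the -1 inside VNF
    # Drop only if EC1, EC2 and CDC are all 0 at that same position.
    return ec1[pos] == 0 and ec2[pos] == 0 and cdc[pos] == 0


# Row 2 of the example: all three columns are 0 where VNF is -1.
print(should_drop([-1, 0], [0, 0], [0, 0], [0, 0]))  # True
# Row 6 of the example: EC1 is 1 where VNF is -1, so the row is kept.
print(should_drop([-1, 0], [1, 0], [0, 0], [0, 1]))  # False
```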

Here's the constructor for the DataFrame:

import pandas as pd

data = {'EC1': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [1, 0]],
 'EC2': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
 'CDC': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 1], [0, 1], [0, 1]],
 'L1': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
 'L2': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
 'L3': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
 'L4': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
 'L5': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 1], [0, 1], [0, 1]],
 'L6': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 1], [0, 1], [0, 1]],
 'VNF': [[1, 0], [0, 1], [-1, 0], [0, -1], [1, 0], [0, 1], [-1, 0]]}
df = pd.DataFrame(data)

user7864386 · Accepted Answer

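You can explode every column of df so that each list element gets its own row (exploding several columns at once requires pandas 1.3 or later). Then identify the elements that satisfy the first condition (the "VNF" values of the original row sum to -1) and the second condition ("VNF" is -1 while "EC1", "EC2" and "CDC" are all 0 at that position), and filter out the elements that satisfy both to create temp. Since every cell originally held two elements, a row that lost an element now has only one left. So count the elements per original index with transform('count'), keep only the indices that still have two, then group by the index and aggregate back to lists: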

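# One row per list element; the original row label is repeated in the index.
exploded = df.explode(df.columns.tolist())
# First condition: the VNF elements of the original row sum to -1.
first_cond = exploded.groupby(level=0)['VNF'].transform('sum').eq(-1)
# Second condition: VNF is -1 and EC1, EC2, CDC are all 0 at that position.
second_cond = exploded['VNF'].eq(-1) & exploded['EC1'].eq(0) & exploded['EC2'].eq(0) & exploded['CDC'].eq(0)

# Drop matching elements, then keep only rows that still have both elements.
temp = exploded[~(first_cond & second_cond)]
out = temp[temp.groupby(level=0)['VNF'].transform('count').gt(1)].groupby(level=0).agg(list).reset_index(drop=True)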

Output:

      EC1     EC2     CDC      L1      L2      L3      L4      L5      L6  \
0  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]   
1  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 0]   
2  [0, 0]  [0, 0]  [0, 1]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 1]  [0, 1]   
3  [0, 0]  [0, 0]  [0, 1]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 1]  [0, 1]   
4  [1, 0]  [0, 0]  [0, 1]  [0, 0]  [0, 0]  [0, 0]  [0, 0]  [0, 1]  [0, 1]   

       VNF  
0   [1, 0]  
1   [0, 1]  
2   [1, 0]  
3   [0, 1]  
4  [-1, 0]
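
As an alternative to exploding, the same filter can be sketched with a boolean mask built from NumPy arrays. This is not the approach from the answer above, just a minimal sketch under two assumptions: every cell holds a list of the same length, and the first condition is read as "VNF is exactly [-1, 0] or [0, -1]" rather than "VNF sums to -1". The helper to_array is a hypothetical name, and only the four columns the rule looks at are reproduced here.

```python
import numpy as np
import pandas as pd

# A reduced copy of the question's data: only the columns the rule inspects.
data = {
    'EC1': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [1, 0]],
    'EC2': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0], [0, 0]],
    'CDC': [[0, 0], [0, 0], [0, 0], [0, 0], [0, 1], [0, 1], [0, 1]],
    'VNF': [[1, 0], [0, 1], [-1, 0], [0, -1], [1, 0], [0, 1], [-1, 0]],
}
df = pd.DataFrame(data)


def to_array(col):
    # Stack a column of equal-length lists into an (n_rows, 2) array.
    return np.array(col.tolist())


vnf = to_array(df['VNF'])

# True where EC1, EC2 and CDC are all 0 at a given position.
all_zero = (
    (to_array(df['EC1']) == 0)
    & (to_array(df['EC2']) == 0)
    & (to_array(df['CDC']) == 0)
)

# First condition: VNF is exactly [-1, 0] or [0, -1].
is_target = (vnf == [-1, 0]).all(axis=1) | (vnf == [0, -1]).all(axis=1)
# Second condition: the other three columns are 0 where VNF is -1.
zero_at_minus_one = ((vnf == -1) & all_zero).any(axis=1)

# Keep every row that does not satisfy both conditions.
out = df[~(is_target & zero_at_minus_one)].reset_index(drop=True)
print(out)
```

Because the mask is applied to the original df, the list cells are never taken apart and reassembled; the same mask would work unchanged on the full ten-column frame.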
