Daniel

Reputation: 31

How to extract rows with conditions on multiple columns in Python

I would like to return all rows for each unique combination of Col1 and Col2 whose Col3 values never include 9. For example, for the dataframe df below, the returned dataframe should be selected:

import pandas as pd
df = pd.DataFrame.from_dict({
'Col1': [1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2], 
'Col2':[1,1,1,1,2,2,2,3,3,'A','A','A','B','B','AA',1,1,2,2,2,3,3,'A','A','A','B','B','AA','AA','BA'], 
'Col3': [1,2,3,9,1,5,9,6,9,2,6,7,8,9,5,3,9,7,8,9,3,7,3,7,9,7,9,5,9,5]
})
selected= pd.DataFrame.from_dict({
'Col1': [1,1,1,1,2,2,2], 
'Col2': ['A','A','A','AA',3,3,'BA'], 
'Col3': [2,6,7,5,3,7,5]
})

Here is the code I tried:

selected = {'Col1': [], 'Col2' : [], 'Col3':[]}
for i in df.Col1.unique():
    for val in df[df['Col1'] == i].Col2.unique():
        val2 = df[df['Col2'] == val].Col3.values
        if val2.any() == 9:
            pass
        else:
            selected['Col1'].append(i)
            selected['Col2'].append(val)
            selected['Col3'].append(val2)

selected =  pd.DataFrame.from_dict(selected)

Here is the error I get:

ValueError: arrays must all be same length

Upvotes: 1

Views: 200

Answers (1)

Uttasarga Singh

Reputation: 27

I looked at your sample data and tried to understand the question that you are trying to ask.

Dataset: looking at the expected data from the whole dataset, my reading is that Col3 is not supposed to contain any repeated values or duplicates.

You can filter your whole dataset using drop_duplicates().

df = df.drop_duplicates(subset=["Col3"])
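
If the goal is instead to drop every (Col1, Col2) group that contains a 9 in Col3, which is another way to read the expected selected frame, drop_duplicates() on its own will not do that. Here is a minimal sketch under that assumption, using groupby() with transform("any"). The name has_nine is mine, not something from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col1": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
             2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    "Col2": [1, 1, 1, 1, 2, 2, 2, 3, 3, "A", "A", "A", "B", "B", "AA",
             1, 1, 2, 2, 2, 3, 3, "A", "A", "A", "B", "B", "AA", "AA", "BA"],
    "Col3": [1, 2, 3, 9, 1, 5, 9, 6, 9, 2, 6, 7, 8, 9, 5,
             3, 9, 7, 8, 9, 3, 7, 3, 7, 9, 7, 9, 5, 9, 5],
})

# Assumption: a group is one unique (Col1, Col2) pair.
# has_nine is True for every row whose group has at least one 9 in Col3.
# sort=False skips ordering the group keys, since Col2 mixes ints and strings.
has_nine = (
    df["Col3"].eq(9)
    .groupby([df["Col1"], df["Col2"]], sort=False)
    .transform("any")
)

# Keep only the rows from groups that never contain a 9.
selected = df[~has_nine].reset_index(drop=True)
print(selected)
```

Because transform() returns a result aligned with the original index, the boolean mask can be applied directly to df, and the surviving rows keep their original order.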

Also, I would appreciate it if you stated the question explicitly in the post rather than leaving it implied, in the same manner you posted your dataset.

Welcome to Stack Overflow!

Upvotes: 1
