Reputation: 31
I would like to return all rows (with all columns) for each unique combination of Col1 and Col2, but only if Col3 is never 9 for that combination. For example, for the dataframe df below, the returned dataframe should be selected:
import pandas as pd
df = pd.DataFrame.from_dict({
'Col1': [1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2],
'Col2':[1,1,1,1,2,2,2,3,3,'A','A','A','B','B','AA',1,1,2,2,2,3,3,'A','A','A','B','B','AA','AA','BA'],
'Col3': [1,2,3,9,1,5,9,6,9,2,6,7,8,9,5,3,9,7,8,9,3,7,3,7,9,7,9,5,9,5]
})
selected= pd.DataFrame.from_dict({
'Col1': [1,1,1,1,2,2,2],
'Col2': ['A','A','A','AA',3,3,'BA'],
'Col3': [2,6,7,5,3,7,5]
})
Here is the code I tried:
selected = {'Col1': [], 'Col2' : [], 'Col3':[]}
for i in df.Col1.unique():
    for val in df[df['Col1'] == i].Col2.unique():
        val2 = df[df['Col2'] == val].Col3.values
        if val2.any() == 9:
            pass
        else:
            selected['Col1'].append(i)
            selected['Col2'].append(val)
            selected['Col3'].append(val2)
selected = pd.DataFrame.from_dict(selected)
Here is the error I get:
ValueError: arrays must all be same length
Upvotes: 1
Views: 200
Reputation: 27
I looked at your sample data and tried to understand the question that you are trying to ask.
Dataset: looking at the expected data compared with the whole dataset, Col3 doesn't seem to consist of any repeated values or duplicates.
You can filter your whole dataset using drop_duplicates(). Note that the column name is case-sensitive, so it has to be "Col3":
df = df.drop_duplicates(subset=["Col3"])
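If the goal is instead to drop every (Col1, Col2) group that contains a 9 anywhere in Col3, as the expected selected frame in the question suggests, a group-wise filter may be closer to what is needed. The following is only a sketch under that assumption; the names has_nine and selected are illustrative, and sort=False is passed just to avoid ordering the mixed int/str keys in Col2.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col1": [1] * 15 + [2] * 15,
    "Col2": [1, 1, 1, 1, 2, 2, 2, 3, 3, "A", "A", "A", "B", "B", "AA",
             1, 1, 2, 2, 2, 3, 3, "A", "A", "A", "B", "B", "AA", "AA", "BA"],
    "Col3": [1, 2, 3, 9, 1, 5, 9, 6, 9, 2, 6, 7, 8, 9, 5,
             3, 9, 7, 8, 9, 3, 7, 3, 7, 9, 7, 9, 5, 9, 5],
})

# For every row, flag whether its (Col1, Col2) group contains a 9 in Col3.
has_nine = (
    df["Col3"].eq(9)
    .groupby([df["Col1"], df["Col2"]], sort=False)
    .transform("any")
)

# Keep only rows whose group never hits 9; original row order is preserved.
selected = df[~has_nine].reset_index(drop=True)
print(selected)
```

Because transform broadcasts the per-group result back to every row, the boolean mask lines up with df and no explicit loop is required.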
Also, I would appreciate it if you posted your question as text rather than in a comment, in the same manner as you posted your dataset.
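On the loop in the question, two things stand out regardless of the reported ValueError. First, val2.any() returns a boolean, so val2.any() == 9 is never true and every group ends up in the else branch. Second, df[df['Col2'] == val] does not restrict the rows to the current Col1 value. A loop-based sketch that addresses both points could look like this; the names rows and selected_loop are illustrative, and the approach assumes the goal is to keep whole groups that contain no 9.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col1": [1] * 15 + [2] * 15,
    "Col2": [1, 1, 1, 1, 2, 2, 2, 3, 3, "A", "A", "A", "B", "B", "AA",
             1, 1, 2, 2, 2, 3, 3, "A", "A", "A", "B", "B", "AA", "AA", "BA"],
    "Col3": [1, 2, 3, 9, 1, 5, 9, 6, 9, 2, 6, 7, 8, 9, 5,
             3, 9, 7, 8, 9, 3, 7, 3, 7, 9, 7, 9, 5, 9, 5],
})

rows = []
for c1 in df["Col1"].unique():
    sub = df[df["Col1"] == c1]            # rows for this Col1 value only
    for c2 in sub["Col2"].unique():
        group = sub[sub["Col2"] == c2]    # rows for this (Col1, Col2) pair
        # Compare element-wise first, then reduce with any().
        if (group["Col3"] == 9).any():
            continue
        rows.append(group)

# Stitch the surviving groups back into one frame.
selected_loop = pd.concat(rows, ignore_index=True)
print(selected_loop)
```

Collecting whole sub-frames and concatenating them once avoids building three parallel lists that have to stay the same length.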
Welcome to Stack Overflow!
Upvotes: 1