cdub

Reputation: 25701

Figuring out if an entire column in a Pandas dataframe is the same value or not

I have a pandas dataframe that works just fine. I am trying to figure out how to tell whether a column, whose label I know is correct, does not contain all the same values.

The code below errors out for some reason when I want to see if the column contains -1 in each cell:

# column = "TheColumnLabelThatIsCorrect"
# df = "my correct dataframe"

# I get an error: "() takes 1 or 2 arguments but 3 is passed in"
if (not df.loc(column, estimate.eq(-1).all())):

I just learned about .eq() and .all(), and hopefully I am using them correctly.

Upvotes: 1

Views: 378

Answers (4)

wwnde

Reputation: 26676

The question is not entirely clear, but let's try the following.

Contains only -1 in each cell

df['estimate'].eq(-1).all()

Contains -1 in any cell

df['estimate'].eq(-1).any()

Select the rows where estimate is -1, keeping all columns

df.loc[df['estimate'].eq(-1),:]
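
The three expressions above can be tried together on a small frame. This is only a sketch: the data and the extra 'other' column are made up for illustration, and the 'estimate' column name is taken from the snippets above.

```python
import pandas as pd

# Hypothetical data; only the 'estimate' column name comes from the answer.
df = pd.DataFrame({"estimate": [-1, -1, 3], "other": ["a", "b", "c"]})

# Element-wise comparison gives a boolean Series: True, True, False
is_minus_one = df["estimate"].eq(-1)

all_minus_one = is_minus_one.all()        # every cell is -1?
any_minus_one = is_minus_one.any()        # at least one cell is -1?
rows_minus_one = df.loc[is_minus_one, :]  # rows where estimate is -1, all columns
```

Here the column is not entirely -1, so .all() is false while .any() is true, and the boolean mask selects the first two rows.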

Upvotes: 1

PreciXon

Reputation: 453

df['column'].value_counts() gives you a Series of all the unique values in a column and their counts. As for checking whether all the values are the same, you can do that by dropping duplicates and checking that the length is 1. Note that this only tells you the values are identical, not which value they are.

len(set(df['column'])) == 1
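
A uniqueness check on its own does not confirm that the shared value is -1, so it may need to be combined with a look at one element. The following is a rough sketch on made-up data; using Series.nunique() in place of len(set(...)) is my substitution, not part of the answer.

```python
import pandas as pd

# Hypothetical column in which every cell is -1.
s = pd.Series([-1, -1, -1], name="column")

# Series indexed by the unique values, holding how often each occurs.
counts = s.value_counts()

# Exactly one distinct value means the whole column is the same.
all_same = s.nunique() == 1

# To also confirm which value it is, compare the first element.
all_minus_one = all_same and s.iloc[0] == -1
```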

Upvotes: 0

onepan

Reputation: 954

It's a syntax issue: see the docs for .loc/indexing. Specifically, you want to be using [] instead of ().

You can do something like:

if not df[column].eq(-1).all():
    ...

If you want to use .loc specifically, you'd do something similar:

if not df.loc[:, column].eq(-1).all():
    ...

Also, note you don't need to use .eq(); you can just do (df[column] == -1).all() if you prefer.
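
For a complete, runnable version of the check, here is a minimal sketch. The helper name column_is_all and the sample data are hypothetical; only the column label and the .eq(...).all() pattern come from the thread.

```python
import pandas as pd


def column_is_all(df, column, value):
    """Return True if every cell in df[column] equals value (hypothetical helper)."""
    return bool(df[column].eq(value).all())


# Made-up frame using the label from the question.
column = "TheColumnLabelThatIsCorrect"
df = pd.DataFrame({column: [-1, -1, 2]})

if not column_is_all(df, column, -1):
    message = "column has values other than -1"
else:
    message = "column is all -1"
```

One thing to keep in mind: .all() on an empty column returns True, so an empty frame would count as "all -1" unless you check for that separately.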

Upvotes: 2

Franco Piccolo

Reputation: 7410

You could drop duplicates; if only one record remains, all the records are the same.

import pandas as pd
df = pd.DataFrame({'col': [1, 1, 1, 1]})
len(df['col'].drop_duplicates()) == 1
> True

Upvotes: 1
