prat_pad

Reputation: 41

Using value_counts in pandas with conditions

I have a column with around 20k values. I've used the following function in pandas to display their counts:

weather_data["snowfall"].value_counts()

weather_data is the dataframe and snowfall is the column.

My results are:

0.0     12683
M       7224
T       311
0.2     32
0.1     31
0.5     20
0.3     18
1.0     14
0.4     13

etc.

Is there a way to:

  1. Display the counts of only a single variable or number

  2. Use an if condition to display the counts of only those values which satisfy the condition?

Upvotes: 3

Views: 22863

Answers (2)

lubna_shereen

Reputation: 1

Use an if condition to display the counts of only those values which satisfy the condition?

First create a new column based on the condition you want. Then you can use groupby and sum.

For example, suppose you want to count rows only where a column has a non-null value. In my case, the condition is that ACTUAL_COMPLETION_DATE is non-null:

import numpy as np

dataset['Has_actual_completion_date'] = np.where(dataset['ACTUAL_COMPLETION_DATE'].isnull(), 0, 1)
dataset['Mitigation_Plans_in_progress'] = dataset['Has_actual_completion_date'].groupby(dataset['HAZARD_ID']).transform('sum')
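To make the flag-then-sum idea concrete, here is a minimal runnable sketch on made-up data. The column names follow the snippet above; the rows themselves are hypothetical and only there to show the shape of the result.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names come from the snippet above.
dataset = pd.DataFrame({
    "HAZARD_ID": [1, 1, 2, 2, 2],
    "ACTUAL_COMPLETION_DATE": ["2020-01-05", None, None, "2020-03-01", "2020-04-10"],
})

# Flag column: 1 where a completion date is present, 0 where it is missing.
dataset["Has_actual_completion_date"] = np.where(
    dataset["ACTUAL_COMPLETION_DATE"].isnull(), 0, 1
)

# Sum the flags within each HAZARD_ID and broadcast the total back to every row.
dataset["Mitigation_Plans_in_progress"] = (
    dataset["Has_actual_completion_date"]
    .groupby(dataset["HAZARD_ID"])
    .transform("sum")
)

print(dataset)
```

Because transform returns a result aligned with the original rows, every row of a given HAZARD_ID carries the same conditional count.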

Upvotes: 0

ysearka

Reputation: 3855

I'll be as clear as I can without the full example that piRSquared suggested you provide.

The output of value_counts is a Series, so the values of your original Series become the index of that result. Displaying the count for a single value is therefore just a matter of indexing that Series by label:

my_value_count = weather_data["snowfall"].value_counts()
my_value_count.loc['0.0']
output:
12683

If you want to display the counts for a list of values only:

my_value_count.loc[my_value_count.index.isin(['0.0','0.2','0.1'])]
output: 
0.0     12683
0.2     32
0.1     31
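The two lookups above can be tried on a small stand-in Series. This is only a sketch: the data is invented, the values are assumed to be stored as strings, and the names snowfall, counts, single, subset and missing_safe are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for weather_data["snowfall"], with values stored as strings.
snowfall = pd.Series(["0.0", "0.0", "0.0", "M", "M", "T", "0.2"])

counts = snowfall.value_counts()

# A single label returns the bare count for that value.
single = counts.loc["0.0"]

# A one-element list keeps the result as a Series with its label.
single_as_series = counts.loc[["0.0"]]

# Boolean mask on the index: counts for several chosen values.
subset = counts.loc[counts.index.isin(["0.0", "0.2"])]

# Series.get avoids a KeyError when the value never occurs in the column.
missing_safe = counts.get("9.9", 0)

print(single)
print(subset)
```

If the column actually holds floats, the labels would need to be floats as well (0.0 rather than '0.0').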

As you have M and T in your values, I suspect the other values are stored as strings rather than floats, which is why the labels above are quoted. If they were numeric, you could filter with a comparison on the index instead:

my_value_count.loc[my_value_count.index < 0.4]
output:
0.0     12683
0.2     32
0.1     31
0.3     18
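If the column really is stored as strings, one possible way to make the comparison work is to convert it with pd.to_numeric first. The sketch below assumes that M and T can simply be dropped from the count, which may not suit your data; the sample values and variable names are made up.

```python
import pandas as pd

# Hypothetical stand-in for weather_data["snowfall"], mixing numeric strings with M and T.
snowfall = pd.Series(["0.0", "0.0", "M", "T", "0.2", "0.1", "0.5", "0.3"])

# Non-numeric entries such as "M" and "T" become NaN.
numeric = pd.to_numeric(snowfall, errors="coerce")

# value_counts drops NaN by default, so only real numbers are counted.
counts = numeric.value_counts()

# Condition on the values themselves (the index of the counts).
small = counts.loc[counts.index < 0.4]

# Condition on the counts instead: values that occur at least twice.
frequent = counts[counts >= 2]

print(small)
print(frequent)
```

The same pattern covers both kinds of condition: compare against counts.index to filter by value, or against counts itself to filter by frequency.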

Upvotes: 4
