Reputation: 153
I have an example dataframe (df) like the one shown below, and I would like to use pandas to create a series whose labels are the colors and whose values are the number of entries with that color in the dataframe, i.e. a total for each color. I have tried the following, but instead I get the total number of rows as the count for every color:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('data_set.txt', index_col=0)
total_count = {_: len(df['color']) for _ in df['color'].unique()}
total_count
Current Output:
{'red': 12,
'green': 12,
'yellow': 12,
'blue': 12}
However, clearly there are not 12 entries for each of the 4 colors in the dataframe. What am I doing wrong?
number | date | color | weight | temperature | size |
---|---|---|---|---|---|
0 | 1/1/2021 | red | 0.2 | 0.2 | big |
1 | 1/1/2021 | red | 0.6 | 0.6 | small |
2 | 1/1/2021 | red | 0.4 | 0.6 | small |
3 | 1/1/2021 | green | 0.2 | 0.4 | big |
4 | 1/1/2021 | green | 1 | 1 | small |
5 | 1/1/2021 | yellow | 0.4 | 0.4 | big |
6 | 1/1/2021 | yellow | 0.1 | 0.2 | big |
7 | 1/1/2021 | yellow | 1.3 | 0.5 | big |
8 | 1/1/2021 | yellow | 1.5 | 0.5 | small |
9 | 1/1/2021 | yellow | 1.5 | 0.5 | small |
10 | 1/1/2021 | blue | 0.4 | 0.3 | big |
11 | 1/1/2021 | blue | 0.8 | 0.2 | small |
Upvotes: 0
Views: 148
Reputation: 24322
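The `len(...)` call in your comprehension returns the length of the whole column and never uses the loop variable, so every color gets the same total of 12. To count the occurrences of each distinct value, try `value_counts()`: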
df['color'].value_counts()
Output:
yellow 5
red 3
green 2
blue 2
Name: color, dtype: int64
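For comparison, here is a minimal self-contained sketch that puts the original attempt, a corrected comprehension, and `value_counts()` side by side. It assumes a small stand-in dataframe built inline with only the `color` column from the question's table (not the actual `data_set.txt`), and the names `buggy`, `fixed`, and `counts` are just illustrative.

```python
import pandas as pd

# Stand-in for the question's data: only the color column matters here.
df = pd.DataFrame({
    "color": ["red"] * 3 + ["green"] * 2 + ["yellow"] * 5 + ["blue"] * 2
})

# Original attempt: len(df["color"]) ignores the loop variable,
# so every key is assigned the total row count.
buggy = {c: len(df["color"]) for c in df["color"].unique()}

# Corrected comprehension: count only the rows matching each color.
fixed = {c: int((df["color"] == c).sum()) for c in df["color"].unique()}

# Built-in equivalent: a Series indexed by color, sorted by count.
counts = df["color"].value_counts()

print(buggy)   # {'red': 12, 'green': 12, 'yellow': 12, 'blue': 12}
print(fixed)   # {'red': 3, 'green': 2, 'yellow': 5, 'blue': 2}
print(counts)
```

If you need a dict rather than a series, `counts.to_dict()` should give the same mapping as `fixed`, just ordered by count instead of by first appearance.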
Upvotes: 1