Reputation: 53
I am trying to figure out how to built a table in pandas, having pandas count unique values, retreived from an excel sheet.
table:
|--------------|--------------------|
| location | signal |
|--------------|--------------------|
| New York | Vehicle 20 open |
| New York | Vehicle 22 open |
| Washington | Vehicle 20 open |
| Washington | Vehicle 21 open |
| New York | Vehicle 20 open |
| New York | Vehicle 22 open |
| Washington | Vehicle 20 open |
| Washington | Vehicle 21 open |
| New York | Vehicle 20 open |
| New York | Vehicle 22 open |
| Washington | Vehicle 20 closed |
| Washington | Vehicle 21 closed |
| New York | Vehicle 20 closed |
| New York | Vehicle 22 closed |
| Washington | Vehicle 20 closed |
| Washington | Vehicle 21 closed |
| New York | Vehicle 20 open |
| New York | Vehicle 20 open |
| New York | Vehicle 20 open |
|--------------|--------------------|
How I wan't it to be printed out (And exported into Excel)
|--------------|-------------------|------------------|
| Alarmtype | Vehicle open | Vehicle Closed |
|--------------|-------------------|------------------|
| New York | 9 | 2 |
| Washington | 4 | 4 |
|--------------|-------------------|------------------|
So I want to count the amount of times each event(group) happens within each location, and some them up into a table
This is what I have tried
top = df.groupby(['Location', 'Sign Descr']).count()
or
sorted = df.sort_values(["Location", "Sign Descr"]).groupby(['Location', 'Sign Descr']).nunique()
Upvotes: 1
Views: 1113
Reputation: 166
You can have the same with the groupby and pivot as well. To try this, please find the code below
import pandas as pd
data = pd.read_csv('c.csv')
print(data)
grp_data = data.groupby(by=['location','status']).count().reset_index()
print(grp_data)
grp_data.pivot(index='location',columns='status',values=['signal'])
Original Data:
location signal status
0 New York 20 open
1 New York 22 open
2 Washington 20 open
3 Washington 21 open
4 New York 20 open
5 New York 22 open
6 Washington 20 open
7 Washington 21 open
8 New York 20 open
9 New York 22 open
10 Washington 20 closed
11 Washington 21 closed
12 New York 20 closed
13 New York 22 closed
14 Washington 20 closed
15 Washington 21 closed
16 New York 20 open
17 New York 20 open
18 New York 20 open
Group By Output:
location status signal
0 New York closed 2
1 New York open 9
2 Washington closed 4
3 Washington open 4
Final Output:
signal
status closed open
location
New York 2 9
Washington 4 4
Upvotes: 0
Reputation: 75080
Another one with crosstab
:
pd.crosstab(df.location,df.signal.str.replace('\d+',''))
signal Vehicle closed Vehicle open
location
New York 2 9
Washington 4 4
Upvotes: 2
Reputation: 42886
First replace the numbers in your signal
column, then use pd.pivot_table
:
df['signal'] = df['signal'].str.replace('([0-9])', '')
pd.pivot_table(df, index='location', columns='signal', aggfunc='size')
signal Vehicle closed Vehicle open
location
New York 2 9
Washington 4 4
If you want Alarmtype
as index name. Add rename_axis
:
pd.pivot_table(df, index='location', columns='signal', aggfunc='size').rename_axis('Alarmtype')
signal Vehicle closed Vehicle open
Alarmtype
New York 2 9
Washington 4 4
Upvotes: 4