Reputation: 53

Pandas to sum unique values, and into a table

I am trying to figure out how to built a table in pandas, having pandas count unique values, retreived from an excel sheet.

table:

|--------------|--------------------|
|  location    |   signal           |
|--------------|--------------------|
|  New York    |  Vehicle 20 open   |
|  New York    |  Vehicle 22 open   |
|  Washington  |  Vehicle 20 open   |
|  Washington  |  Vehicle 21 open   |
|  New York    |  Vehicle 20 open   |
|  New York    |  Vehicle 22 open   |
|  Washington  |  Vehicle 20 open   |
|  Washington  |  Vehicle 21 open   |
|  New York    |  Vehicle 20 open   |
|  New York    |  Vehicle 22 open   |
|  Washington  |  Vehicle 20 closed |
|  Washington  |  Vehicle 21 closed |
|  New York    |  Vehicle 20 closed |
|  New York    |  Vehicle 22 closed |
|  Washington  |  Vehicle 20 closed |
|  Washington  |  Vehicle 21 closed |
|  New York    |  Vehicle 20 open   |
|  New York    |  Vehicle 20 open   |
|  New York    |  Vehicle 20 open   |
|--------------|--------------------|

How I wan't it to be printed out (And exported into Excel)

|--------------|-------------------|------------------|
|  Alarmtype   |   Vehicle open    |  Vehicle Closed  | 
|--------------|-------------------|------------------|
|  New York    |      9            |      2           |
|  Washington  |      4            |      4           |
|--------------|-------------------|------------------|

So I want to count the amount of times each event(group) happens within each location, and some them up into a table

This is what I have tried

top = df.groupby(['Location', 'Sign Descr']).count()

or

sorted = df.sort_values(["Location", "Sign Descr"]).groupby(['Location', 'Sign Descr']).nunique()

Upvotes: 1

Answers (3)

Himmat

Reputation: 166

You can have the same with the groupby and pivot as well. To try this, please find the code below

import pandas as pd

data = pd.read_csv('c.csv')
print(data)



grp_data = data.groupby(by=['location','status']).count().reset_index()
print(grp_data)
grp_data.pivot(index='location',columns='status',values=['signal'])

Original Data:

      location  signal  status
0     New York      20    open
1     New York      22    open
2   Washington      20    open
3   Washington      21    open
4     New York      20    open
5     New York      22    open
6   Washington      20    open
7   Washington      21    open
8     New York      20    open
9     New York      22    open
10  Washington      20  closed
11  Washington      21  closed
12    New York      20  closed
13    New York      22  closed
14  Washington      20  closed
15  Washington      21  closed
16    New York      20    open
17    New York      20    open
18    New York      20    open

Group By Output:

     location  status  signal
0    New York  closed       2
1    New York    open       9
2  Washington  closed       4
3  Washington    open       4

Final Output:

           signal
status  closed  open
location        
New York    2   9
Washington  4   4

Upvotes: 0

anky

Reputation: 75080

Another one with crosstab:

pd.crosstab(df.location,df.signal.str.replace('\d+',''))

signal      Vehicle  closed  Vehicle  open
location                                  
New York                  2              9
Washington                4              4

Upvotes: 2

Erfan

Reputation: 42886

First replace the numbers in your signal column, then use pd.pivot_table:

df['signal'] = df['signal'].str.replace('([0-9])', '')

pd.pivot_table(df, index='location', columns='signal', aggfunc='size')

signal      Vehicle  closed  Vehicle  open
location                                  
New York                  2              9
Washington                4              4

If you want Alarmtype as index name. Add rename_axis:

pd.pivot_table(df, index='location', columns='signal', aggfunc='size').rename_axis('Alarmtype')

signal      Vehicle  closed  Vehicle  open
Alarmtype                                 
New York                  2              9
Washington                4              4

Upvotes: 4

Pandas to sum unique values, and into a table

Answers (3)

Related Questions