Papouche Guinslyzinho

Reputation: 5458

How can I count the occurrences of each pair of values in 2 columns of a DataFrame?

I am trying to create a network graph. My desired output should have 3 columns: from, to, and value, where value is the number of times each (from, to) pair occurs.

import pandas as pd

data = [
    ['nyc', 'la'],
    ['nyc', 'atl'],
    ['nyc', 'la'],
    ['nyc', 'la'],
    ['nyc', 'mia'],
    ['nyc', 'wash'],
    ['nyc', 'la'],
    ['dtr', 'la'],
]

df = pd.DataFrame(data, columns=['from', 'to'])

Desired outcome:

pd.DataFrame({
    "from": ['nyc', 'nyc', 'nyc', 'nyc', 'dtr'],
    "to": ['la', 'atl', 'mia', 'wash', 'la'],
    "value": [4, 1, 1, 1, 1],
})

How can I get the number of occurrences of each pair of values in these 2 columns?

When I do df.groupby(['from', 'to']).count(), I get an empty DataFrame:

>>> df.groupby(['from', 'to']).count()
Empty DataFrame
Columns: []
Index: [(dtr, la), (nyc, atl), (nyc, la), (nyc, mia), (nyc, wash)]

Upvotes: 2

Views: 31

Answers (2)

Quang Hoang

Reputation: 150785

You can use groupby().value_counts():

df.groupby('from')['to'].value_counts().reset_index(name='value')

Output:

  from    to  value
0  dtr    la      1
1  nyc    la      4
2  nyc   atl      1
3  nyc   mia      1
4  nyc  wash      1
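
As a possible alternative, and assuming pandas 1.1 or later (where DataFrame.value_counts is available), the same pairs can be counted without an explicit groupby. This is only a sketch built on the question's data; the variable name result is chosen here for illustration, and the row order may differ from the output above because rows are sorted by count.

```python
import pandas as pd

data = [
    ['nyc', 'la'], ['nyc', 'atl'], ['nyc', 'la'], ['nyc', 'la'],
    ['nyc', 'mia'], ['nyc', 'wash'], ['nyc', 'la'], ['dtr', 'la'],
]
df = pd.DataFrame(data, columns=['from', 'to'])

# Count each distinct (from, to) pair; the result is a Series with a
# two-level index, so reset_index turns it back into three columns.
result = df.value_counts(['from', 'to']).reset_index(name='value')
print(result)
```

Passing name='value' to reset_index gives the count column the name the question asks for, regardless of the default name pandas assigns to it.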

Upvotes: 2

fibonachoceres

Reputation: 777

You're probably looking for df.groupby(['from', 'to']).size(), which returns the number of rows in each group.
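
A minimal sketch of why count() comes back empty while size() does not, using the question's data. count() tallies non-null values in the columns that are not grouping keys, and here both columns are keys, so nothing is left to count. The names empty and edges are chosen here for illustration.

```python
import pandas as pd

data = [
    ['nyc', 'la'], ['nyc', 'atl'], ['nyc', 'la'], ['nyc', 'la'],
    ['nyc', 'mia'], ['nyc', 'wash'], ['nyc', 'la'], ['dtr', 'la'],
]
df = pd.DataFrame(data, columns=['from', 'to'])

# count() reports non-null values per remaining column; with both columns
# used as keys there are no remaining columns, hence zero columns here.
empty = df.groupby(['from', 'to']).count()

# size() counts rows per group and returns a Series indexed by (from, to);
# reset_index(name='value') flattens it into from / to / value columns.
edges = df.groupby(['from', 'to']).size().reset_index(name='value')
print(edges)
```

The resulting frame has one row per distinct (from, to) pair, which is the shape an edge list for a network graph typically needs.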

Upvotes: 1
