Rajarshi Das
Rajarshi Das

Reputation: 12320

pandas: group by with nan and with multi column

I have below datasets

host          infra    domain  user   entity   bios  ip
 
abcd.com      Lake      Hov    bat     sos     nvm   x.x.x.x
kpm.com       Lake      Hov    bat     sos     nvm   x.x.x.x
ffger.com     Data      JOV    sat     sim     nvm   x.x.x.x
ffger.com     Data      TOV    sat     sim     nvm   NAN
kko.com       Lake      POV    et      som     nvm   NAN
spm.com       Lake      Hov    bat     sos     nvm   NAN

I want to group by infra and with domain include all NAN

so one df would be if df.group_by(infra,domain) like below

abcd.com      Lake      Hov    bat     sos     nvm   x.x.x.x
kpm.com       Lake      Hov    bat     sos     nvm   x.x.x.x
spm.com       Lake      Hov    bat     sos     nvm   NAN

Upvotes: 0

Views: 79

Answers (2)

Arpit Omprakash
Arpit Omprakash

Reputation: 118

I think you can use the dropna parameter of the groupby function for the same.

pd.__version__
# '1.1.0.dev0+2004.g8d10bfb6f'

# Example from the docs
df

   a    b  c
0  1  2.0  3
1  1  NaN  4
2  2  1.0  3
3  1  2.0  2

# without NA (the default)
df.groupby('b').sum()

     a  c
b        
1.0  2  3
2.0  2  5
# with NA
df.groupby('b', dropna=False).sum()

     a  c
b        
1.0  2  3
2.0  2  5
NaN  1  4

Upvotes: 0

jezrael
jezrael

Reputation: 862511

You can create dictionary of DataFrames:

dfs = {f'{"_".join(name)}':df for name, df in df.groupby(['infra','domain'])}

print (dfs['Lake_Hov'])
       host infra domain user entity bios       ip
0  abcd.com  Lake    Hov  bat    sos  nvm  x.x.x.x
1   kpm.com  Lake    Hov  bat    sos  nvm  x.x.x.x
5   spm.com  Lake    Hov  bat    sos  nvm      NAN

If need loops by groups:

for name, df in tuple(df.groupby(['infra','domain'], sort=False)):
    print (df)
           host infra domain user entity bios       ip
    0  abcd.com  Lake    Hov  bat    sos  nvm  x.x.x.x
    1   kpm.com  Lake    Hov  bat    sos  nvm  x.x.x.x
    5   spm.com  Lake    Hov  bat    sos  nvm      NAN
            host infra domain user entity bios       ip
    2  ffger.com  Data    JOV  sat    sim  nvm  x.x.x.x
            host infra domain user entity bios   ip
    3  ffger.com  Data    TOV  sat    sim  nvm  NAN
          host infra domain user entity bios   ip
    4  kko.com  Lake    POV   et    som  nvm  NAN

Upvotes: 1

Related Questions