a deluna
a deluna

Reputation: 23

Count the number of rows in a Pandas data frame that meet a particular criteria

I have a list of items that could have a variety of prefixes "Fa", "e", "Gi", "Te" etc.
This data is in a Pandas data frame.

I need to get a count of each in x below. So in this example, it would be Gi = 1 and "Fa" = 8. Ultimately I need to get to the max of that result (so "Fa"). Basically asking, for the various prefixes, which type do I have the most of. "Fa" in this example.

I've tried changing the axis but so far I am missing the mark. I could to this by looking through x below but I'd like to understand how to do it with pandas.

I can't do it on the data frame itself because counting that way gives me "Gi" as there are 14 "rows" with Gi (although all the same interface as you can see from the data below). I need the prefix of the the interfaces of which there are the most of that type.

amax = no_cpu['INT_PREFIX'].value_counts().argmax()

In [53]: x = df['DESTINATION_PORT'].value_counts().to_dict()                                                                                                                               

In [54]: x                                                                                                                                                                                     
Out[54]: 
{'Gi1/0/4': 14,
 'Fa1/0/18': 3,
 'Fa1/0/20': 2,
 'Fa1/0/17': 2,
 'Fa1/0/2': 1,
 'Fa1/0/11': 1,
 'Fa1/0/39': 1,
 'Fa1/0/4': 1,
 'Fa1/0/1': 1}

Here is an example of the data frame, if this helps.

In [59]: df.head(2)                                                                                                                                                                            
Out[59]: 
   DESTINATION_ADDRESS     TYPE VLAN DESTINATION_PORT             OUI INT_PREFIX PORT           IP
20      000b.7866.79f7  DYNAMIC    1          Fa1/0/1  TAIFATECH INC.         Fa    1      0.0.0.0
21      000c.294f.a20b  DYNAMIC    1         Fa1/0/17    VMware, Inc.         Fa   17  10.1.10.170

Ive also tried .gropuby but same thing. I don't need the maximum of the one with most items, "Gi" in this case, I need the maximum of the prefixes. There is one row of "Gi" and 8 of "Fa" so the answer is "Fa".

In [65]: df.groupby('INT_PREFIX').size()                                                                                                                                                       
Out[65]: 
INT_PREFIX
Fa    12
Gi    14
dtype: int64

In [66]: df.groupby('DESTINATION_PORT').size()                                                                                                                                                 
Out[66]: 
DESTINATION_PORT
Fa1/0/1      1
Fa1/0/11     1
Fa1/0/17     2
Fa1/0/18     3
Fa1/0/2      1
Fa1/0/20     2
Fa1/0/39     1
Fa1/0/4      1
Gi1/0/4     14
dtype: int64

Upvotes: 2

Views: 234

Answers (3)

Code Different
Code Different

Reputation: 93151

Since your ultimate goal is to get the prefix with the max count:

max_prefix = df['INT_PREFIX'].value_counts().index[0]

Upvotes: 1

BENY
BENY

Reputation: 323226

You may check with nunique

df.groupby(df.INT_PREFIX)['DESTINATION_PORT'].nunique()

Upvotes: 1

Georgina Skibinski
Georgina Skibinski

Reputation: 13387

IIUC Try:

df["DESTINATION_PORT"].str.replace(r"/.*", "").value_counts()

(That is if this prefix you want to group by is everything before first /)

Upvotes: 1

Related Questions