Reinaldo Chaves
Reinaldo Chaves

Reputation: 995

In pandas / numpy, how to make a PivotTable with count of string items?

In python3 and pandas I have this dataframe:

df_selecao_atual.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 63 entries, 2 to 72
Data columns (total 24 columns):
nome                     63 non-null object
nome_completo            63 non-null object
partido                  63 non-null object
uf                       63 non-null object
cargo_parlamentar        63 non-null object
tipo                     63 non-null object
classe                   63 non-null object
numero                   63 non-null object
único                    63 non-null object
assunto                  63 non-null object
data_inicial             63 non-null object
data_final               63 non-null object
andamento                63 non-null object
link                     63 non-null object
transparencia            63 non-null object
conferencia              63 non-null object
data_conferencia         63 non-null object
resumo                   62 non-null object
observacao               60 non-null object
link_noticia_tribunal    22 non-null object
interessa                62 non-null object
ministro_relator         63 non-null object
processo_conectado       20 non-null object
situacao                 63 non-null object
dtypes: object(24)
memory usage: 12.3+ KB

Column "nomes" has names of people. The column "tipos" has the types of processes these people have - "INQ" or "AP"

I made a pivot table to count how many types of processes each has:

df_selecao_atual.pivot_table('tipo','nome',aggfunc='count').fillna(0).astype(int)

                tipo
nome    
ARTHUR LIRA     10
BENEDITO DE LIRA    3
CÉSAR MESSIAS   2
CÍCERO ALMEIDA  4
FERNANDO COLLOR     12
FLAVIANO MELO   4
GLADSON CAMELI  3
JORGE VIANA     5
MARX BELTRÃO    1
MAURÍCIO QUINTELLA LESSA    1
PAULÃO  2
RENAN CALHEIROS     3
ROCHA   1
RONALDO LESSA   3
SIBÁ MACHADO    1
SÉRGIO PETECÃO  8

But the result just counts how many lines in the column "tipo"

I look for a result like this:

ARTHUR LIRA     
    INQ 9
    AP 1
BENEDITO DE LIRA
    INQ 3 
    AP 0
CÉSAR MESSIAS   
    INQ 1
    AP 1
...

That is, count how many types "INQ" and "AP" exist in each name

Please, does anyone know how I can do it?

Data sample:

df_selecao_atual[['nome','tipo']]
nome    tipo
2   CÉSAR MESSIAS   INQ
3   CÉSAR MESSIAS   AP
4   FLAVIANO MELO   INQ
5   FLAVIANO MELO   INQ
6   FLAVIANO MELO   AP
7   FLAVIANO MELO   INQ
10  ROCHA   AP
13  SIBÁ MACHADO    INQ
14  GLADSON CAMELI  INQ
15  GLADSON CAMELI  INQ
16  GLADSON CAMELI  INQ
17  JORGE VIANA     INQ
18  JORGE VIANA     INQ
19  JORGE VIANA     INQ
20  JORGE VIANA     INQ
21  JORGE VIANA     INQ
22  SÉRGIO PETECÃO  INQ
23  SÉRGIO PETECÃO  INQ
...

Upvotes: 1

Views: 125

Answers (1)

jezrael
jezrael

Reputation: 863166

You can use:

df_selecao_atual.pivot_table(index=['tipo','nome'],aggfunc='size')

Or:

df_selecao_atual.groupby(['tipo','nome']).size()

Upvotes: 1

Related Questions