dspractician

Reputation: 89

How to zip just a string with a list of values in Python?

I have IDs, each with multiple corresponding ages:

| ID  | AGE | SEX |
|-----|-----|-----|
| 25  | 11  | 1   |
| 25  | 12  | 1   |
| 18  | 11  | 1   |
| 18  | 12  | 2   |
| 18  | 13  | 2   |
| 199 | 11  | 1   |
| 409 | 11  | 1   |

I would like to plot the number of profiles for each ID and its corresponding age, and use hue to separate by gender. For example, age 11 occurs once for ID 25, once for ID 18, once for ID 199, and once for ID 409. It is not necessary to show the ID number; only the number of occurrences of each age across the different IDs should be shown. How can I achieve this?

I used the following approach. The problem is that when I use unique(), I get a list of ages for some ID numbers.

for an_id in df.ID.unique():
    if (len(df[df['ID'] == an_id]['AGE'].unique()))==1:
        print(an_id, df[df['ID'] == an_id]['AGE'].unique()[0])
    else:
        print(an_id,df[df['ID']==an_id]['AGE'].unique())

How can I plot the number of unique IDs for each age?

Upvotes: 0

Views: 89

Answers (1)

SimoN SavioR

Reputation: 604

Does this work?

import pandas as pd
import seaborn as sb

# Example data: one row per (ID, AGE, SEX) profile
df = pd.DataFrame(data={
    "AGE":[11,12,13,13,13,13,12,11,11,11,11],
    "ID":[1,2,3,4,5,6,7,8,1,2,1],
    "SEX":["MALE","FEMALE","FEMALE","FEMALE","FEMALE","MALE","MALE","MALE","MALE","MALE","MALE"]
    })

# Count how often each ID occurs within every (AGE, SEX) group
pivot = df.pivot_table(index=["AGE","SEX"], aggfunc="value_counts").reset_index().rename(columns={0:"COUNTS"})

# Point size encodes the count, colour encodes the sex
sb.scatterplot(x=pivot["AGE"], y=pivot["ID"], hue=pivot["SEX"], size=pivot["COUNTS"])
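
If the goal is only the number of distinct IDs per age, split by sex, a groupby with nunique() might be a simpler route. The following is a minimal sketch, not a definitive solution: it assumes the data from the table in the question, and the column name N_IDS and the helper plot_counts are names made up here for illustration.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID":  [25, 25, 18, 18, 18, 199, 409],
    "AGE": [11, 12, 11, 12, 13, 11, 11],
    "SEX": [1, 1, 1, 2, 2, 1, 1],
})

# Number of distinct IDs for each (AGE, SEX) combination.
# "N_IDS" is a hypothetical column name chosen for this sketch.
counts = (
    df.groupby(["AGE", "SEX"])["ID"]
      .nunique()
      .reset_index(name="N_IDS")
)


def plot_counts(counts):
    """Draw one bar per age, split by sex via hue.

    Hypothetical helper; needs seaborn and matplotlib installed.
    """
    import matplotlib.pyplot as plt
    import seaborn as sb

    ax = sb.barplot(data=counts, x="AGE", y="N_IDS", hue="SEX")
    ax.set_ylabel("Number of unique IDs")
    plt.show()
```

Calling plot_counts(counts) should then draw the bars. With the question's data, age 11 and sex 1 covers four distinct IDs (25, 18, 199, 409), and every other age/sex combination covers one.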

Upvotes: 1
