Reputation: 89
I have a DataFrame in which each ID can have multiple ages:
| ID  | AGE | SEX |
|-----|-----|-----|
| 25  | 11  | 1   |
| 25  | 12  | 1   |
| 18  | 11  | 1   |
| 18  | 12  | 2   |
| 18  | 13  | 2   |
| 199 | 11  | 1   |
| 409 | 11  | 1   |
I would like to plot the number of profiles (IDs) for each age, and use hue to separate by gender. For example, age 11 occurs once for ID 25, once for ID 18, once for ID 199 and once for ID 409. It is not necessary to show the ID numbers themselves; only the number of IDs in which each age occurs should be shown. How can I achieve this?
I used the following approach. The problem is that when I use unique(), I get a list of ages for some IDs.
for an_id in df.ID.unique():
    ages = df[df['ID'] == an_id]['AGE'].unique()
    if len(ages) == 1:
        print(an_id, ages[0])
    else:
        print(an_id, ages)
How can I plot the number of unique IDs for each age?
Upvotes: 0
Views: 89
Reputation: 604
Does this work?
import pandas as pd
import seaborn as sb

df = pd.DataFrame(data={
    "AGE": [11, 12, 13, 13, 13, 13, 12, 11, 11, 11, 11],
    "ID": [1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 1],
    "SEX": ["MALE", "FEMALE", "FEMALE", "FEMALE", "FEMALE", "MALE", "MALE", "MALE", "MALE", "MALE", "MALE"]
})
# Count how many rows share each (AGE, SEX, ID) combination
pivot = df.groupby(["AGE", "SEX", "ID"]).size().reset_index(name="COUNTS")
# Marker size encodes the count, colour encodes the sex
sb.scatterplot(data=pivot, x="AGE", y="ID", hue="SEX", size="COUNTS")
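If you only need the number of distinct IDs per age, split by sex, and do not care about the ID values themselves, something along these lines might be closer to what you asked for. This is only a sketch: it uses the data from the table in the question, and the names `N_IDS` and `plot_counts` are made up for illustration. A grouped bar chart is one possible choice, not the only one.

```python
import pandas as pd

# Sample data copied from the table in the question (SEX is coded 1/2 there).
df = pd.DataFrame({
    "ID":  [25, 25, 18, 18, 18, 199, 409],
    "AGE": [11, 12, 11, 12, 13, 11, 11],
    "SEX": [1, 1, 1, 2, 2, 1, 1],
})

# Number of distinct IDs for every (AGE, SEX) combination.
# "N_IDS" is a hypothetical column name chosen for this example.
counts = (
    df.groupby(["AGE", "SEX"])["ID"]
      .nunique()
      .reset_index(name="N_IDS")
)
print(counts)


def plot_counts(counts):
    """Draw a grouped bar chart: one group per age, one bar colour per sex."""
    # Imported here so the counting step above runs without plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    ax = sns.barplot(data=counts, x="AGE", y="N_IDS", hue="SEX")
    ax.set_ylabel("Number of unique IDs")
    plt.show()
    return ax
```

Calling `plot_counts(counts)` should then show the bars. For age 11 the count is 4, which matches the example in the question (IDs 25, 18, 199 and 409).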
Upvotes: 1