Plot the n most frequent values of one column against all values of another column

Question

I have the following data in a pandas DataFrame, loaded from a SQL query:

      latin_brands   group phone_brand_chinese_match only_latin_brands
0           xiaomi  M32-38                        小米            xiaomi
1           xiaomi  M32-38                        小米            xiaomi
2           xiaomi  M32-38                        小米            xiaomi
3           xiaomi  M29-31                        小米            xiaomi
4           xiaomi  M29-31                        小米            xiaomi
5             None  F24-26                      OPPO              OPPO
6          coolpad  M32-38                        酷派           coolpad
7           xiaomi  M32-38                        小米            xiaomi
8             None  M32-38                      vivo              vivo
9          samsung  F33-42                        三星           samsung
10          huawei  M29-31                        华为            huawei
11          huawei  F33-42                        华为            huawei
12         samsung  F27-28                        三星           samsung
13          huawei  M32-38                        华为            huawei
14         aiyouni    M39+                       艾优尼           aiyouni
15          huawei  F27-28                        华为            huawei
16          xiaomi  M32-38                        小米            xiaomi
17          xiaomi  M32-38                        小米            xiaomi
18           meizu    M39+                        魅族             meizu
19          xiaomi  M32-38                        小米            xiaomi
20         samsung  F33-42                        三星           samsung
21          xiaomi  M23-26                        小米            xiaomi
22          huawei  M23-26                        华为            huawei
23         samsung  M27-28                        三星           samsung
24          xiaomi  M29-31                        小米            xiaomi
25         samsung  M32-38                        三星           samsung
26         samsung  M32-38                        三星           samsung
27         samsung  F33-42                        三星           samsung
28         samsung  M32-38                        三星           samsung
29         samsung  M32-38                        三星           samsung
...            ...     ...                       ...               ...
74809       huawei  M27-28                        华为            huawei
74810         None  M29-31                       TCL               TCL

I want to count how often each combination of two columns occurs and plot the counts on a line chart. My approach:

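import matplotlib.pyplot as plt

# Count rows for each (brand, group) pair
phones = phones.groupby(['only_latin_brands', 'group']).size()
# Pivot the groups into columns; pairs that never occur become 0
phones = phones.unstack()
phones = phones.fillna(0)
phones.head()
phones.plot(kind='line')
plt.show()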

I want to plot the relation between the group and the only_latin_brands.

How can I plot only the 20 most frequent values of the only_latin_brands column, together with their groups?

Scott Boston · Accepted Answer

Building on @AndyHayden's start:

# nlargest(3) keeps the three most frequent brands; use nlargest(20) for the top 20
df[df.only_latin_brands.isin(df.groupby('only_latin_brands').size().nlargest(3).index)]\
  .groupby(['group','only_latin_brands']).size().unstack().fillna(0)\
  .plot(kind='line')

Edit: to show all groups, including those in which none of the top brands appear, reindex on the full set of groups:

df[df.only_latin_brands.isin(df.groupby('only_latin_brands').size().nlargest(3).index)]\
  .groupby(['group','only_latin_brands']).size().unstack()\
  .reindex(df.group.unique()).fillna(0).plot(kind='line')
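
For anyone who wants the same steps in a reusable form, here is a minimal sketch. The function name top_n_counts, its parameters, and the six-row sample frame are my own illustrations, not part of the original data; it also assumes the group column has no missing values, since the groups are sorted for a stable x-axis order.

```python
import pandas as pd

def top_n_counts(df, n, value_col="only_latin_brands", group_col="group"):
    """Count rows per (group, value) pair, keeping only the n most frequent values."""
    # value_counts() tallies each brand; nlargest(n) keeps the n biggest tallies
    top = df[value_col].value_counts().nlargest(n).index
    counts = (
        df[df[value_col].isin(top)]        # drop rows whose brand is not in the top n
        .groupby([group_col, value_col])
        .size()
        .unstack(fill_value=0)             # groups as rows, brands as columns
    )
    # Reindex so groups without any top-n brand still appear, as all-zero rows
    all_groups = sorted(df[group_col].unique())
    return counts.reindex(all_groups, fill_value=0)

# Hypothetical sample data, only to show the shape of the result
sample = pd.DataFrame({
    "only_latin_brands": ["xiaomi", "xiaomi", "xiaomi", "huawei", "huawei", "OPPO"],
    "group": ["M32-38", "M32-38", "M29-31", "M29-31", "F33-42", "F24-26"],
})
table = top_n_counts(sample, n=2)
print(table)
```

Calling table.plot(kind='line') followed by plt.show() should then draw one line per brand across the groups, as in the one-liners above. Passing n=20 on the full DataFrame gives the top 20 asked for in the question.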
