Khaine775

Reputation: 2765

Pandas subplot using two series

I have two series that contain the same labels but a different number of occurrences for each label. I want to compare the two series in a single bar chart. Below is what I've done so far.

import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

fig = plt.figure()

ax = fig.add_subplot(111)

width = 0.3

tree_amount15.plot(kind='bar', color='red', ax=ax, width=width, position=1, label='NYC')
queens_tree_types.plot(kind='bar', color='blue', ax=ax, width=width, position=0, label='Queens')
plt.legend(bbox_to_anchor=(0., 1.02, 1., .102), loc=3,
       ncol=2, mode="expand", borderaxespad=0.)

ax.set_ylabel('Total trees')
ax.set_xlabel('Tree names')

plt.show()

This gives me the following chart:

[Image: the resulting bar chart]

The problem is that, even though both series contain the same 'Tree names', the 'Total trees' values differ, so the ordering differs too. For example, Callery pear is #3 in 'tree_amount15' but #5 in 'queens_tree_types'. How can I order the series so that each bar corresponds to the label shown under it on the chart? Right now only the labels of the series plotted first are shown, so the bars of the second series sit under the wrong names and are misleading.

Any hints?

Here is how the two series look (each is the result of a value_counts() call).

tree_amount15:

London planetree     87014
honeylocust          64264
Callery pear         58931
pin oak              53185
Norway maple         34189
littleleaf linden    29742
cherry               29279
Japanese zelkova     29258
ginkgo               21024
Sophora              19338
red maple            17246
green ash            16251
American linden      13530
silver maple         12277
sweetgum             10657
northern red oak      8400
silver linden         7995
American elm          7975
maple                 7080
purple-leaf plum      6879

queens_tree_types:

London planetree     31111
pin oak              22610
honeylocust          20290
Norway maple         19407
Callery pear         16547
cherry               13497
littleleaf linden    11902
Japanese zelkova      8987
green ash             7389
silver maple          6116
ginkgo                5971
Sophora               5386
red maple             4935
American linden       4769
silver linden         4146
purple-leaf plum      3035
maple                 2992
northern red oak      2697
sweetgum              2489
American elm          1709

Upvotes: 0

Views: 139

Answers (1)

James

Reputation: 36598

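You can create a data frame from your two series, using the tree names as the index. pandas aligns the two series on that index, so each row holds the NYC and Queens counts for the same tree. Depending on the pandas version, the combined index may come out in a different order (older versions sorted it alphabetically when the two indexes were not identical), so we sort explicitly by the NYC values. With both series as columns, a single call to the plot method puts them on the same graph.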

import pandas as pd

df = pd.concat([tree_amount15, queens_tree_types], axis=1,
               keys=['NYC', 'Queens'])            # keys sets the column names

df = df.sort_values('NYC', ascending=False)       # sort_values returns a new frame, so reassign it

df.plot.bar(color=['red', 'blue'])                # one call draws both columns side by side
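To see the index alignment at work without the full dataset, here is a minimal sketch that uses a four-row subset of the counts from the question. The subset and the use of `keys=` to name the columns are choices made for illustration, not the only way to do it; plotting is left out so the snippet runs without a display.

```python
import pandas as pd

# Illustrative subset of the counts shown in the question.
# Note that the two series list the trees in different orders.
tree_amount15 = pd.Series({
    "London planetree": 87014,
    "honeylocust": 64264,
    "Callery pear": 58931,
    "pin oak": 53185,
})
queens_tree_types = pd.Series({
    "London planetree": 31111,
    "pin oak": 22610,
    "honeylocust": 20290,
    "Callery pear": 16547,
})

# concat along axis=1 matches rows by index label, not by position.
df = pd.concat([tree_amount15, queens_tree_types], axis=1,
               keys=["NYC", "Queens"])

# Order the rows by the NYC counts, largest first.
df = df.sort_values("NYC", ascending=False)

print(df)
```

Each row now carries both counts for the same tree name, so calling `df.plot.bar()` on this frame would draw every pair of bars over the correct label.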

Upvotes: 2
