Reputation: 2765
I have two series that contain the same labels, but with a different number of occurrences for each label. I want to compare the two series in a single bar chart with the bars side by side. Below is what I've done so far.
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(111)
width = 0.3
# position=1 and position=0 place the two sets of bars next to each other
tree_amount15.plot(kind='bar', color='red', ax=ax, width=width, position=1, label='NYC')
queens_tree_types.plot(kind='bar', color='blue', ax=ax, width=width, position=0, label='Queens')
plt.legend(bbox_to_anchor=(0., 1.02, 1., .102), loc=3,
           ncol=2, mode="expand", borderaxespad=0.)
ax.set_ylabel('Total trees')
ax.set_xlabel('Tree names')
plt.show()
Which gives me the following chart:
The problem is that, although the 'Tree names' are the same in both series, the 'Total trees' counts differ, so the ordering differs too. For example, Callery pear is #3 in 'tree_amount15' but #5 in 'queens_tree_types', and so on. How can I order the series so that each value is drawn against the correct label on the chart? Right now only the labels from the series plotted first are shown, which makes the values of the second series misleading.
Any hints?
Here's how the two series look when I call value_counts() on them.
tree_amount15:
London planetree 87014
honeylocust 64264
Callery pear 58931
pin oak 53185
Norway maple 34189
littleleaf linden 29742
cherry 29279
Japanese zelkova 29258
ginkgo 21024
Sophora 19338
red maple 17246
green ash 16251
American linden 13530
silver maple 12277
sweetgum 10657
northern red oak 8400
silver linden 7995
American elm 7975
maple 7080
purple-leaf plum 6879
queens_tree_types:
London planetree 31111
pin oak 22610
honeylocust 20290
Norway maple 19407
Callery pear 16547
cherry 13497
littleleaf linden 11902
Japanese zelkova 8987
green ash 7389
silver maple 6116
ginkgo 5971
Sophora 5386
red maple 4935
American linden 4769
silver linden 4146
purple-leaf plum 3035
maple 2992
northern red oak 2697
sweetgum 2489
American elm 1709
Upvotes: 0
Views: 139
Reputation: 36598
You can create a DataFrame from your two series, which aligns them on the tree-name index. pandas may sort the combined index alphabetically, so we tell it to sort by the NYC values instead. With both series as columns, a single call to the plot method puts them on the same graph.
import pandas as pd

df = pd.concat([tree_amount15, queens_tree_types], axis=1,
               keys=['NYC', 'Queens'])  # keys= sets the column names
df = df.sort_values('NYC', ascending=False)  # sort_values returns a new frame, so reassign it
df.plot.bar(color=['red', 'blue'])
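If you want to try the idea without the full dataset, here is a minimal sketch that uses only the first five rows of each series from the question. The helper names build_comparison and plot_comparison are made up for this illustration, and the columns are named through the keys argument of pd.concat, which is my own choice for setting the labels. Treat it as one way to do it under those assumptions, not the only one.

```python
import pandas as pd

# First five entries of each series, copied from the output in the question.
tree_amount15 = pd.Series({
    "London planetree": 87014,
    "honeylocust": 64264,
    "Callery pear": 58931,
    "pin oak": 53185,
    "Norway maple": 34189,
})
queens_tree_types = pd.Series({
    "London planetree": 31111,
    "pin oak": 22610,
    "honeylocust": 20290,
    "Norway maple": 19407,
    "Callery pear": 16547,
})


def build_comparison(nyc, queens):
    """Align both series on the tree-name index, ordered by the NYC counts.

    Hypothetical helper for this example only.
    """
    # concat with axis=1 matches rows by index label, not by position,
    # so each tree name ends up with its own NYC and Queens value.
    df = pd.concat([nyc, queens], axis=1, keys=["NYC", "Queens"])
    return df.sort_values("NYC", ascending=False)


def plot_comparison(df):
    """Draw grouped bars, one group per tree name (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported here so the table can be built without it

    ax = df.plot.bar(color=["red", "blue"])
    ax.set_xlabel("Tree names")
    ax.set_ylabel("Total trees")
    plt.tight_layout()
    plt.show()


df = build_comparison(tree_amount15, queens_tree_types)
print(df)
```

Calling plot_comparison(df) afterwards draws the chart. Because the rows are matched by label, the Queens bar for Callery pear sits next to the NYC bar for Callery pear even though the two series rank that tree differently.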
Upvotes: 2