sp0iler

Reputation: 33

How to merge two plots in Pandas?

I want to merge two plots. This is my dataframe, df_inc.head():

id  date    real_exe_time   mean    mean+30%    mean-30%
0   Jan     31              33.14   43.0        23.0
1   Jan     30              33.14   43.0        23.0
2   Jan     33              33.14   43.0        23.0
3   Jan     38              33.14   43.0        23.0
4   Jan     36              33.14   43.0        23.0

My first plot: df_inc.plot.scatter(x = 'date', y = 'real_exe_time')

[image: scatter]

My second plot: df_inc.plot(x='date', y=['mean','mean+30%','mean-30%'])

[image: lines]

When I try to merge them with:

fig=plt.figure()
ax = df_inc.plot(x='date', y=['mean','mean+30%','mean-30%']);
df_inc.plot.scatter(x = 'date', y = 'real_exe_time', ax=ax)

plt.show()

I got the following:

[image: fail]

How can I merge them the right way?

Upvotes: 3

Views: 10186

Answers (2)

Cadone

Reputation: 101

I'm guessing that you haven't transformed the date column to datetime objects, so the first thing you should do is this:

# Transform the date column to datetime objects
df_inc['date'] = pd.to_datetime(df_inc['date'], format='%b')
# Draw the lines first, then add the scatter to the same Axes
ax = df_inc.plot(x='date', y=['mean', 'mean+30%', 'mean-30%'])
df_inc.plot.scatter(x='date', y='real_exe_time', ax=ax)

plt.show()
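
To see what that conversion produces, here is a minimal sketch. The sample values are hypothetical (not the asker's data), and it assumes English month abbreviations. Since the format '%b' carries no year or day, every row of the same month ends up on the same timestamp:

```python
import pandas as pd

# Hypothetical month abbreviations shaped like the question's 'date' column
dates = pd.Series(["Jan", "Jan", "Feb"])

# '%b' parses abbreviated month names; the missing year and day
# fall back to 1900 and 1, so both "Jan" rows become 1900-01-01
converted = pd.to_datetime(dates, format="%b")
print(converted)
```

So the x axis becomes a real time axis, but all points of one month still share a single x position.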

Upvotes: 0

Mr. T

Reputation: 12410

You should not repeat your mean values as an extra column. With categorical data, df.plot() plots against the index, hence you see the original scatter plot (also plotted against the index) squeezed into the left corner. Instead, you could create an additional aggregation dataframe and then plot it into the same graph:

import matplotlib.pyplot as plt
import pandas as pd

#test data generation
import numpy as np
n=30
np.random.seed(123)
df = pd.DataFrame({"date": np.random.choice(list("ABCDEF"), n), "real_exe_time": np.random.randint(1, 100, n)})
df = df.sort_values(by="date").reset_index(drop=True)

#aggregate data for plotting
df_agg = df.groupby("date")["real_exe_time"].agg(mean="mean").reset_index()
df_agg["mean+30%"] = df_agg["mean"] * 1.3
df_agg["mean-30%"] = df_agg["mean"] * 0.7

#plot both into the same subplot
ax = df.plot.scatter(x = 'date', y = 'real_exe_time')
df_agg.plot(x='date', y=['mean','mean+30%','mean-30%'], ax=ax)

plt.show()

Sample output: [image]
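
If you prefer to control the x positions yourself, the same idea can be sketched with plain matplotlib. This is only an illustration under assumptions: the helper name plot_merged and the sample values are made up, and each category is mapped by hand to an integer position so that the scatter and the lines share the same x coordinates:

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_merged(df):
    """Scatter the raw values and overlay mean and +/-30% lines on one Axes."""
    # Category order as it appears in the data, mapped to positions 0, 1, 2, ...
    categories = list(df["date"].unique())
    position = {cat: i for i, cat in enumerate(categories)}

    # One mean per category, arranged in the same order as the positions
    mean = df.groupby("date")["real_exe_time"].mean().reindex(categories)

    fig, ax = plt.subplots()
    xs = list(range(len(categories)))
    ax.plot(xs, mean.to_numpy(), label="mean")
    ax.plot(xs, mean.to_numpy() * 1.3, label="mean+30%")
    ax.plot(xs, mean.to_numpy() * 0.7, label="mean-30%")
    ax.scatter(df["date"].map(position), df["real_exe_time"], label="real_exe_time")

    # Show the category names instead of the integer positions
    ax.set_xticks(xs)
    ax.set_xticklabels(categories)
    ax.legend()
    return ax


# Hypothetical data in the shape of the question's df_inc
df_inc = pd.DataFrame({
    "date": ["Jan", "Jan", "Jan", "Feb", "Feb", "Mar"],
    "real_exe_time": [31, 30, 33, 38, 36, 40],
})
ax = plot_merged(df_inc)
# Call plt.show() here when running interactively
```

Because both layers use the same integer positions, neither one gets squeezed against the other.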

You could also consider using seaborn, which has, for instance, pointplots for aggregating categorical data.
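
A rough sketch of that seaborn route, assuming seaborn is installed and using made-up sample data. Note that pointplot draws the per-category mean with an error interval by default, not the +/-30% lines from the question:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: two measurements for each of three categories
df = pd.DataFrame({
    "date": list("AABBCC"),
    "real_exe_time": [31, 30, 33, 38, 36, 40],
})

fig, ax = plt.subplots()
# Raw observations per category
sns.stripplot(data=df, x="date", y="real_exe_time", ax=ax)
# Per-category mean, drawn on the same Axes
sns.pointplot(data=df, x="date", y="real_exe_time", ax=ax)
# Call plt.show() here when running interactively
```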

Upvotes: 2
