Reputation: 340
I have three dataframes, each containing a single column, 'Time', with differing numbers of rows of pandas timedelta
values. Example:
     Time
0  3 days
1 16 days
2  6 days
3  4 days
4  4 days
5  4 days
I would like to create a single box plot (candlestick-style) figure with three boxes side by side, one per dataframe, each showing the distribution of the times in that dataframe. How do I go about accomplishing this?
Upvotes: 3
Views: 7007
Reputation: 539
You could accomplish this by adding a column to each dataframe that holds a label identifying which dataframe each value came from, and then grouping by that label. Here is a small example:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
For the example I just generate some random int values.
time1 = pd.DataFrame(np.random.randint(1,30,10), columns=['Time'] )
time2 = pd.DataFrame(np.random.randint(1,30,10), columns=['Time'] )
time3 = pd.DataFrame(np.random.randint(1,30,10), columns=['Time'] )
Instead of int values you can also use pandas Timedelta objects, but then you need to extract the days value.
import random
time1 = pd.DataFrame([pd.Timedelta(days=random.randint(0,30)).days for _ in range(10)], columns=['Time'] )
time2 = pd.DataFrame([pd.Timedelta(days=random.randint(0,30)).days for _ in range(10)], columns=['Time'] )
time3 = pd.DataFrame([pd.Timedelta(days=random.randint(0,30)).days for _ in range(10)], columns=['Time'] )
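If the dataframes already hold Timedelta values, as in the question, the days can be pulled out of the whole column at once instead of value by value. A minimal sketch, assuming the column is named 'Time' and has a timedelta dtype; the sample values are copied from the question and the new column names are only illustrative:

```python
import pandas as pd

# Sample values taken from the question
time1 = pd.DataFrame({"Time": pd.to_timedelta([3, 16, 6, 4, 4, 4], unit="D")})

# .dt.days gives the whole-day component of each Timedelta as an integer
time1["Days"] = time1["Time"].dt.days

# Dividing by one day gives fractional days, if sub-day precision matters
time1["DaysExact"] = time1["Time"] / pd.Timedelta(days=1)

print(time1)
```

Either numeric column can then be used for the box plot in place of the random ints.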
Then I added the column "Data" to each dataframe with a unique identifier.
time1["Data"] = "A"
time2["Data"] = "B"
time3["Data"] = "C"
Now I concatenate all the dataframes into one.
times = [time1, time2, time3]
allTimes = pd.concat(times)
With the boxplot method you can now group the data by the column "Data":
# boxplot creates its own figure here, so a separate plt.figure() call is not needed
allTimes.boxplot(column="Time", by="Data")
plt.show()
This results in a figure with three box plots side by side, one for each value of "Data".
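Putting the steps together for Timedelta data like that in the question, a sketch might look like the following. The sample values for B and C, the "Days" column, and the output file name are made up for illustration, and the Agg backend is selected only so the script also runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Three frames of different lengths holding Timedelta values (illustrative data)
frames = {
    "A": pd.DataFrame({"Time": pd.to_timedelta([3, 16, 6, 4, 4, 4], unit="D")}),
    "B": pd.DataFrame({"Time": pd.to_timedelta([1, 2, 9, 12], unit="D")}),
    "C": pd.DataFrame({"Time": pd.to_timedelta([7, 7, 8, 20, 25], unit="D")}),
}

# Label each frame in a "Data" column, then stack them into one frame
allTimes = pd.concat(
    [df.assign(Data=label) for label, df in frames.items()],
    ignore_index=True,
)

# Convert the Timedelta values to a numeric number of days for plotting
allTimes["Days"] = allTimes["Time"].dt.days

# Draw one box per label on a single set of axes
fig, ax = plt.subplots()
allTimes.boxplot(column="Days", by="Data", ax=ax)
ax.set_ylabel("Days")
fig.savefig("boxplot.png")
```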
Upvotes: 6