cjmaria

Reputation: 340

Creating a single boxplot from multiple dataframes

I have three dataframes, each containing a single column, 'Time', with a differing number of rows of pandas timedelta values. Ex:

      Time
0  3 days    
1  16 days   
2  6 days     
3  4 days     
4  4 days     
5  4 days     

I would like to create a single box plot (candlestick) that has three bars representing the distribution of the times in all dataframes side by side. How do I go about accomplishing this?

Upvotes: 3

Views: 7007

Answers (1)

Mr_Z

Reputation: 539

You could accomplish this by adding a column to each dataframe that holds a label identifying which dataframe each value came from, and then grouping by that label. Here is a small example:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

For this example I have only generated some random int values.

time1 = pd.DataFrame(np.random.randint(1,30,10), columns=['Time'] )
time2 = pd.DataFrame(np.random.randint(1,30,10), columns=['Time'] )
time3 = pd.DataFrame(np.random.randint(1,30,10), columns=['Time'] )

Instead of int values you can also use the pandas Timedelta, but then you need to extract the days value:

import random  # standard-library module needed for random.randint below

time1 = pd.DataFrame([pd.Timedelta(days=random.randint(0,30)).days for x in range(10)], columns=['Time'] )
time2 = pd.DataFrame([pd.Timedelta(days=random.randint(0,30)).days for x in range(10)], columns=['Time'] )
time3 = pd.DataFrame([pd.Timedelta(days=random.randint(0,30)).days for x in range(10)], columns=['Time'] )
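
If the 'Time' column already holds Timedelta values, as in the question, the days could probably be extracted with the `.dt.days` accessor. This is only a minimal sketch: the sample values and the names `raw` and `days` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample mirroring the question's data: a column of Timedelta values.
raw = pd.DataFrame({"Time": pd.to_timedelta([3, 16, 6, 4, 4, 4], unit="D")})

# .dt.days extracts the whole-day component as integers,
# which boxplot can treat as ordinary numeric data.
days = pd.DataFrame({"Time": raw["Time"].dt.days})
print(days["Time"].tolist())  # [3, 16, 6, 4, 4, 4]
```

Note that `.dt.days` drops any hours or minutes, so it only fits data measured in whole days.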

Then I added the column "Data" to each dataframe with a unique identifier.

time1["Data"] = "A"
time2["Data"] = "B"
time3["Data"] = "C"

Now I concatenate all the dataframes.

times = [time1, time2, time3]
allTimes = pd.concat(times)
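
To check that the concatenation kept every row together with its label, one could count the rows per group. This is a small sketch with made-up values and differing lengths, as in the question. The names `a`, `b`, `c`, `frames` and `combined` are hypothetical, and `ignore_index=True` is an optional addition that just renumbers the rows.

```python
import pandas as pd

# Made-up day counts; each frame has a different number of rows.
a = pd.DataFrame({"Time": [3, 16, 6]})
b = pd.DataFrame({"Time": [4, 4]})
c = pd.DataFrame({"Time": [1, 2, 8, 9]})

frames = []
for label, df in zip("ABC", [a, b, c]):
    # assign returns a labelled copy and leaves the original frame untouched
    frames.append(df.assign(Data=label))

# ignore_index=True gives the combined frame a fresh 0..n-1 index
combined = pd.concat(frames, ignore_index=True)

# Number of rows that ended up in each group
counts = combined.groupby("Data").size()
print(counts.to_dict())  # {'A': 3, 'B': 2, 'C': 4}
```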

With the boxplot method you can now group the data by the column "Data":

# boxplot(by=...) creates its own figure, so no separate plt.figure() call is needed
allTimes.boxplot(column="Time", by="Data")
plt.show()

This results in a single figure with three boxes side by side, one for each value of "Data" (A, B and C).
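
As an alternative that skips the extra column and the concatenation, matplotlib's own `boxplot` accepts a list of arrays of differing lengths and draws one box per array. This is a rough sketch with made-up day counts. The "Agg" backend line is an assumption for running without a display and can be dropped for interactive use.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive plotting
import matplotlib.pyplot as plt
import pandas as pd

# Made-up day counts with differing numbers of rows.
time1 = pd.DataFrame({"Time": [3, 16, 6, 4, 4, 4]})
time2 = pd.DataFrame({"Time": [1, 2, 8, 9]})
time3 = pd.DataFrame({"Time": [5, 7, 7, 12, 20]})

fig, ax = plt.subplots()

# One array per box; boxes are placed at x positions 1, 2, 3 by default.
result = ax.boxplot([
    time1["Time"].to_numpy(),
    time2["Time"].to_numpy(),
    time3["Time"].to_numpy(),
])

# Label the three boxes like the groups used in the answer.
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(["A", "B", "C"])
ax.set_ylabel("Time (days)")
```

Calling `plt.show()` or `fig.savefig(...)` afterwards displays or stores the figure.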

Upvotes: 6
