Dinosaurius

Reputation: 8628

Create intervals for a continuous variable

Given the following data frame df:

A      B
14.5   1
12.1   3
14.2   4
5.0    1
6.0    3
8.0    5
12.0   1

I want to plot the median of B for each interval of values in A, using intervals of width 3.

I can create this chart without using intervals.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Median of B for each distinct value of A (no binning yet)
grouped_df = df.groupby('A')['B'].median().reset_index()
plt.figure(figsize=(12,8))
# Pass x and y by keyword; recent seaborn versions reject positional data arguments
sns.pointplot(x=grouped_df.A.values, y=grouped_df.B.values)
plt.ylabel('Median B', fontsize=12)
plt.xlabel('A', fontsize=12)
plt.show()

But in this case the chart looks very messy, so I want to group the values of A into intervals of width 3. How can I do that?

Upvotes: 0

Views: 4072

Answers (1)

gereleth

Reputation: 2482

You can use pd.cut to cut a continuous variable into bins:

cut = pd.cut(df.A, bins=list(range(3,18,3)))
grouped_df = df.groupby(cut)['B'].median().reset_index()
#           A  B
# 0    (3, 6]  2
# 1    (6, 9]  5
# 2   (9, 12]  1
# 3  (12, 15]  3
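For reference, here is a self-contained sketch of the same idea that rebuilds the sample frame from the question and derives the bin edges from the data instead of hard-coding them. The names `step`, `lo`, `hi`, `edges`, `binned` and `medians` are illustrative, not from the original posts, and `observed=True` is an assumption added so that empty bins are left out of the result.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A": [14.5, 12.1, 14.2, 5.0, 6.0, 8.0, 12.0],
    "B": [1, 3, 4, 1, 3, 5, 1],
})

step = 3  # interval width requested in the question

# Round the data range outwards to multiples of the step size
lo = np.floor(df["A"].min() / step) * step
hi = np.ceil(df["A"].max() / step) * step
edges = np.arange(lo, hi + step, step)

# Intervals are right-closed by default, e.g. (3, 6]; a value that sits
# exactly on the lowest edge would fall outside unless include_lowest=True
binned = pd.cut(df["A"], bins=edges)

# Median of B within each interval of A; observed=True skips empty bins
medians = df.groupby(binned, observed=True)["B"].median()
print(medians)
```

To plot the result, the interval labels can be turned into strings first, for example with `medians.index.astype(str)`, and then passed as the x values to a plotting call.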

Upvotes: 2
