john johns

Reputation: 167

python: divide a dataframe into the same intervals as another dataframe

I divided the following dataframe into 4 intervals according to the 'ages' column. Now I want another dataframe to use exactly the same intervals. Is there a quick way to do that? In other words, the following lines

df1['age_groups'] = pd.cut(df1.ages, 4)
# Count the rows per interval, listed in interval order
print(df1['age_groups'].value_counts(sort=False))

divides the dataframe into the following intervals

(1.944, 16.0]    5
(16.0, 30.0]     3
(30.0, 44.0]     2
(44.0, 58.0]     2

but if I have a different dataframe with slightly different numbers in a column with the same name, the same code will produce different intervals. How do I make sure I can subdivide other dataframes into the same intervals?

import pandas as pd

ages = [35.0, 2.0, 27.0, 14.0, 4.0, 58.0,
        20.0, 39.0, 14.0, 55.0, 2.0, 29.699118]
values = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1]
df1 = pd.DataFrame({'ages': ages, 'values': values})

# Split 'ages' into 4 equal-width intervals
df1['age_groups'] = pd.cut(df1.ages, 4)

Upvotes: 0

Views: 967

Answers (1)

not_speshal

Reputation: 23156

  1. Save the bin edges from the first DataFrame using the retbins keyword of pd.cut
  2. Pass them as the bins argument for the second DataFrame:
df1['age_groups'], bins = pd.cut(df1["ages"], 4, retbins=True)
df2['age_groups'] = pd.cut(df2["ages"], bins=bins)
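If the first call was made without retbins, the intervals can also be read back from the existing column: a column produced by pd.cut is categorical, its .cat.categories is an IntervalIndex, and pd.cut accepts an IntervalIndex as bins. A minimal sketch using the ages from the question; the df2 values are made up for illustration:

```python
import pandas as pd

df1 = pd.DataFrame({"ages": [35, 2, 27, 14, 4, 58, 20, 39, 14, 55, 2, 29.699118]})
df2 = pd.DataFrame({"ages": [3.5, 17, 31, 45, 58]})  # hypothetical second DataFrame

# First DataFrame was binned without retbins
df1["age_groups"] = pd.cut(df1["ages"], 4)

# Recover the intervals from the categorical column and reuse them
intervals = df1["age_groups"].cat.categories
df2["age_groups"] = pd.cut(df2["ages"], bins=intervals)
```

The retbins approach shown above is still the more direct one when you control the first pd.cut call.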
Working example:
import numpy as np
import pandas as pd

np.random.seed(100)
df1 = pd.DataFrame({"ages": np.random.randint(10, 80, 20)})
df2 = pd.DataFrame({"ages": np.random.randint(10, 80, 20)})

df1['age_groups'], bins = pd.cut(df1["ages"], 4, retbins=True)
df2['age_groups'] = pd.cut(df2["ages"], bins=bins)

>>> df1.head()

    ages     age_groups
0   18  (11.935, 28.25]
1   34    (28.25, 44.5]
2   77    (60.75, 77.0]
3   58    (44.5, 60.75]
4   20  (11.935, 28.25]

>>> df2.head()

    ages     age_groups
0   11              NaN
1   23  (11.935, 28.25]
2   14  (11.935, 28.25]
3   69    (60.75, 77.0]
4   77    (60.75, 77.0]
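
Note the NaN in the first row of df2: 11 lies below the lowest saved edge (11.935), and pd.cut assigns NaN to any value outside the bins. If such values should fall into the outermost groups instead, one option is to widen the outer edges to infinity before reusing the bins. This is an assumption about the desired behaviour, not something the question asked for, and the data below is made up:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"ages": [12, 18, 34, 58, 77]})
df2 = pd.DataFrame({"ages": [11, 23, 69, 90]})  # 11 and 90 lie outside df1's range

df1["age_groups"], bins = pd.cut(df1["ages"], 4, retbins=True)

# Reusing the edges unchanged: out-of-range values become NaN
strict = pd.cut(df2["ages"], bins=bins)

# Widen the outer edges so every value falls into some interval
open_bins = bins.copy()
open_bins[0], open_bins[-1] = -np.inf, np.inf
df2["age_groups"] = pd.cut(df2["ages"], bins=open_bins)
```

The inner edges stay identical, but the two outer groups now carry open-ended labels that start at -inf and end at inf, so they no longer match df1's labels exactly.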

Upvotes: 2
