Reputation: 167
I divided the following dataframe into 4 intervals according to the 'ages' column. Suppose I want another dataframe to use exactly the same intervals: is there a quick way to do that? In other words, the following lines
df1['age_groups'] = pd.cut(df1.ages,4)
print(df1['age_groups'].value_counts(sort=False))
divide the dataframe into the following intervals (shown with the number of rows in each)
(1.944, 16.0]    5
(16.0, 30.0]     3
(30.0, 44.0]     2
(44.0, 58.0]     2
but if I have a different dataframe with slightly different numbers in a column with the same name, the same code will produce different intervals. How do I make sure I can subdivide other dataframes into the same intervals?
import pandas as pd

ages = [35.0, 2.0, 27.0, 14.0, 4.0, 58.0,
        20.0, 39.0, 14.0, 55.0, 2.0, 29.699118]
values = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1]
df1 = pd.DataFrame({'ages': ages, 'values': values})
# print(df1)
# Split 'ages' into 4 equal-width intervals
df1['age_groups'] = pd.cut(df1.ages, 4)
Upvotes: 0
Views: 967
Reputation: 23156
Use the retbins keyword to get the bin edges back from the first pd.cut call, then pass them as the bins argument for the second DataFrame:
df1['age_groups'], bins = pd.cut(df1["ages"], 4, retbins=True)
df2['age_groups'] = pd.cut(df2["ages"], bins=bins)
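To make the returned edges concrete, here is a minimal sketch using the ages from the question. The second frame and its values are made up for illustration; only pd.cut and retbins come from the answer itself.

```python
import pandas as pd

# Ages from the question
ages = [35.0, 2.0, 27.0, 14.0, 4.0, 58.0,
        20.0, 39.0, 14.0, 55.0, 2.0, 29.699118]
df1 = pd.DataFrame({"ages": ages})

# retbins=True makes pd.cut return a (result, bin_edges) pair
df1["age_groups"], bins = pd.cut(df1["ages"], 4, retbins=True)
print(bins)  # 5 edges delimiting the 4 intervals

# Hypothetical second frame (values invented for this sketch)
df2 = pd.DataFrame({"ages": [3.0, 16.0, 17.5, 44.0, 50.0]})

# Reusing the edges gives df2 the same intervals as df1
df2["age_groups"] = pd.cut(df2["ages"], bins=bins)
print(df2)
```

Because the intervals are closed on the right, an age of exactly 16.0 lands in the first interval and 44.0 in the third.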
A self-contained example with random data:
import numpy as np
import pandas as pd
np.random.seed(100)
df1 = pd.DataFrame({"ages": np.random.randint(10, 80, 20)})
df2 = pd.DataFrame({"ages": np.random.randint(10, 80, 20)})
df1['age_groups'], bins = pd.cut(df1["ages"], 4, retbins=True)
df2['age_groups'] = pd.cut(df2["ages"], bins=bins)
>>> df1.head()
   ages       age_groups
0    18  (11.935, 28.25]
1    34    (28.25, 44.5]
2    77    (60.75, 77.0]
3    58    (44.5, 60.75]
4    20  (11.935, 28.25]
>>> df2.head()
   ages       age_groups
0    11              NaN
1    23  (11.935, 28.25]
2    14  (11.935, 28.25]
3    69    (60.75, 77.0]
4    77    (60.75, 77.0]
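Row 0 of df2 is NaN because 11 lies below the lowest edge (11.935): values outside the range of the reused edges are not assigned to any interval. If that is not what you want, one possible workaround is to widen the two outer edges to infinity before reusing them. The sketch below uses small made-up frames, and open_bins is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Made-up data: df2 has values below and above df1's range
df1 = pd.DataFrame({"ages": [12, 20, 35, 58, 77]})
df2 = pd.DataFrame({"ages": [11, 23, 69, 90]})

# Only the edges are needed here, so the binned result is discarded
_, bins = pd.cut(df1["ages"], 4, retbins=True)

# Reusing the edges unchanged: out-of-range values become NaN
strict = pd.cut(df2["ages"], bins=bins)
print(strict.isna().tolist())

# Copy the edges and open both ends so every value gets an interval
open_bins = bins.copy()
open_bins[0], open_bins[-1] = -np.inf, np.inf
df2["age_groups"] = pd.cut(df2["ages"], bins=open_bins)
print(df2)
```

The trade-off is that the outer intervals are then labelled with -inf and inf, so they no longer match the labels in df1 exactly. If identical labels matter, apply open_bins to both frames.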
Upvotes: 2