Reputation: 676
I have a dataframe:
speciality_id speciality_name
1 Acupuncturist
2 Andrologist
3 Anaesthesiologist
4 Audiologist
5 Ayurvedic Doctor
6 Biochemist
7 Biophysicist
I want to copy the above dataframe for a range of value, year and month.
For example:
year = [2018]
Month = [1,2]
I want to produce a dataframe like the following:
Year Month speciality_id speciality_name
2018 1 1 Acupuncturist
2018 1 2 Andrologist
2018 1 3 Anaesthesiologist
2018 1 4 Audiologist
2018 1 5 Ayurvedic Doctor
2018 1 6 Biochemist
2018 1 7 Biophysicist
2018 2 1 Acupuncturist
2018 2 2 Andrologist
2018 2 3 Anaesthesiologist
2018 2 4 Audiologist
2018 2 5 Ayurvedic Doctor
2018 2 6 Biochemist
2018 2 7 Biophysicist
I am unable to think of an approach. What is the right way to do?
Upvotes: 2
Views: 561
Reputation: 71
I hope can help you with this.
# A: Create the new columns
df['Year'], df['Month'] = 2018, None
# A: Create the two new DataFrame
df1 = df.copy()
df2 = df.copy()
# A: Edith the month in both DataFrames
df1['Month'], df2['Month'] = 1, 2
Upvotes: 0
Reputation: 164613
You can calculate the Cartesian product via pd.MultiIndex.from_product
, then join with a tiled dataframe:
year = [2018]
month = [1, 2]
# calculate Cartesian product and repeat by number of rows in dataframe
cart_prod = pd.MultiIndex.from_product([year, month], names=['year', 'month'])
# tile dataframe and join year_month index
res = df.loc[np.tile(df.index, len(year) * len(month))]\
.set_index(cart_prod.repeat(df.shape[0])).reset_index()
print(res)
year month speciality_id speciality_name
0 2018 1 1 Acupuncturist
1 2018 1 2 Andrologist
2 2018 1 3 Anaesthesiologist
3 2018 1 4 Audiologist
4 2018 1 5 AyurvedicDoctor
5 2018 1 6 Biochemist
6 2018 1 7 Biophysicist
7 2018 2 1 Acupuncturist
8 2018 2 2 Andrologist
9 2018 2 3 Anaesthesiologist
10 2018 2 4 Audiologist
11 2018 2 5 AyurvedicDoctor
12 2018 2 6 Biochemist
13 2018 2 7 Biophysicist
Upvotes: 0
Reputation: 862406
Use product
for all combinations, create DataFrame
and merge
with left join:
year = [2018]
Month = [1,2]
from itertools import product
df1 = pd.DataFrame(list(product(year, Month, df['speciality_id'])),
columns=['Year','Month','speciality_id'])
print (df1)
Year Month speciality_id
0 2018 1 1
1 2018 1 2
2 2018 1 3
3 2018 1 4
4 2018 1 5
5 2018 1 6
6 2018 1 7
7 2018 2 1
8 2018 2 2
9 2018 2 3
10 2018 2 4
11 2018 2 5
12 2018 2 6
13 2018 2 7
df = df1.merge(df, on='speciality_id', how='left')
print (df)
Year Month speciality_id speciality_name
0 2018 1 1 Acupuncturist
1 2018 1 2 Andrologist
2 2018 1 3 Anaesthesiologist
3 2018 1 4 Audiologist
4 2018 1 5 Ayurvedic Doctor
5 2018 1 6 Biochemist
6 2018 1 7 Biophysicist
7 2018 2 1 Acupuncturist
8 2018 2 2 Andrologist
9 2018 2 3 Anaesthesiologist
10 2018 2 4 Audiologist
11 2018 2 5 Ayurvedic Doctor
12 2018 2 6 Biochemist
13 2018 2 7 Biophysicist
Upvotes: 2