Reputation:
I have a dataframe that looks like this
import pandas as pd
import numpy as np
# Create data set.
dataSet = {'id': ['A', 'A', 'B'],
'id_2': [1, 2, 1] ,
'number': [320, 169, 120],
'add_number' : [4,6,3]}
# Create dataframe with data set and named columns.
df = pd.DataFrame(dataSet, columns= ['id', 'id_2','number', 'add_number'])
id id_2 number add_number
0 A 1 320 4
1 A 2 169 6
2 B 1 120 3
I would like use number and add_number so that I can explode this dynamically, ie) 320 + 4 would have [320,321,322,323,324] (up to 324, and would like to explode on this)
DESIRED OUTPUT
id id_2 number
0 A 1 320
1 A 1 321
2 A 1 322
3 A 1 323
4 A 1 324
5 A 2 169
6 A 2 170
7 A 2 171
8 A 2 172
9 A 2 173
10 A 2 174
11 A 2 175
12 B 1 120
13 B 1 121
14 B 1 122
15 B 1 123
I looked over explode, wide_to_long pandas function, but I do not know where to start, any sense of direction would be appreciated!!
Upvotes: 6
Views: 264
Reputation: 153460
You try using np.arange
and explode
:
df['range'] = df.apply(lambda x: np.arange(x['number'], x['number']+x['add_number']+1), axis=1)
df.explode('range')
or
df['range'] = [np.arange(n, n+a+1) for n, a in zip(df['number'],df['add_number'])]
df.explode('range')
Output:
id id_2 number add_number range
0 A 1 320 4 320
0 A 1 320 4 321
0 A 1 320 4 322
0 A 1 320 4 323
0 A 1 320 4 324
1 A 2 169 6 169
1 A 2 169 6 170
1 A 2 169 6 171
1 A 2 169 6 172
1 A 2 169 6 173
1 A 2 169 6 174
1 A 2 169 6 175
2 B 1 120 3 120
2 B 1 120 3 121
2 B 1 120 3 122
2 B 1 120 3 123
Upvotes: 2
Reputation: 26676
Multiply columns based od number
column
new=pd.DataFrame(np.repeat(df.values,df['add_number']+1, axis=0), columns=df.columns)
#Use groupby cummulatively count the added rows and increment number by the same
new=new.assign(number=new['number']+new.groupby(['number','id'])['number'].transform('cumcount'))
Outcome
id id_2 number add_number
0 A 1 320 4
1 A 1 321 4
2 A 1 322 4
3 A 1 323 4
4 A 1 324 4
5 A 2 169 6
6 A 2 170 6
7 A 2 171 6
8 A 2 172 6
9 A 2 173 6
10 A 2 174 6
11 A 2 175 6
12 B 1 120 3
13 B 1 121 3
14 B 1 122 3
15 B 1 123 3
Upvotes: 3
Reputation: 2202
Here is an alternative approach, in case you wanted to use explode specifically:
import pandas as pd
import numpy as np
# Create data set.
dataSet = {'id': ['A', 'A', 'B'],
'id_2': [1, 2, 1] ,
'number': [320, 169, 120],
'add_number' : [4,6,3]}
df = pd.DataFrame(dataSet, columns= ['id', 'id_2','number', 'add_number'])
new_numbers = []
for row in df.iterrows():
new_numbers.append([row[1]['number']+ i for i in range(row[1]['add_number']+1)])
df['number'] = new_numbers
df = df.drop(['add_number'], axis=1)
df = df.explode('number').reset_index().drop(['index'], axis=1)
print(df)
Output:
id id_2 number
0 A 1 320
1 A 1 321
2 A 1 322
3 A 1 323
4 A 1 324
5 A 2 169
6 A 2 170
7 A 2 171
8 A 2 172
9 A 2 173
10 A 2 174
11 A 2 175
12 B 1 120
13 B 1 121
14 B 1 122
15 B 1 123
Upvotes: 1
Reputation: 323226
In your case doing repeat
then groupby
cumcount
add the value to number
out = df.reindex(df.index.repeat(df.add_number+1))
out['number'] = out['number'].add(out.groupby(out.index).cumcount())
out
Out[22]:
id id_2 number add_number
0 A 1 320 4
0 A 1 321 4
0 A 1 322 4
0 A 1 323 4
0 A 1 324 4
1 A 2 169 6
1 A 2 170 6
1 A 2 171 6
1 A 2 172 6
1 A 2 173 6
1 A 2 174 6
1 A 2 175 6
2 B 1 120 3
2 B 1 121 3
2 B 1 122 3
2 B 1 123 3
Upvotes: 2