Reputation: 2085
I have the following DataFrame:
import numpy as np
import pandas as pd
example_df = pd.DataFrame({'id': {0: 0, 1: 1, 2: 2, 3: 3, 4: 4},
                           'seq_start': {0: 0.0, 1: 2800.0, 2: 6400.0, 3: 8400.0, 4: 9800.0},
                           'seq_end': {0: 1400.0, 1: 4700.0, 2: 8400.0, 3: 9800.0, 4: 11400.0}})
I'd like to obtain a DataFrame that contains, for each id, the sequence of values from example_df['seq_start'] to example_df['seq_end'] (inclusive, in steps of 100), so that I can later use the newly created column in a join.
The expected output would look like this:
out_df = pd.DataFrame({'id': np.concatenate([[0] * 15, [1] * 20, [2] * 21]),
                       'expected_output': np.concatenate([np.arange(0, 1500, 100),
                                                          np.arange(2800, 4800, 100),
                                                          np.arange(6400, 8500, 100)])})
id expected_output
0 0 0
1 0 100
2 0 200
3 0 300
4 0 400
5 0 500
...
12 0 1200
13 0 1300
14 0 1400
15 1 2800
16 1 2900
17 1 3000
...
31 1 4400
32 1 4500
33 1 4600
34 1 4700
35 2 6400
36 2 6500
37 2 6600
...
54 2 8300
55 2 8400
How can I approach this?
Upvotes: 0
Views: 273
Reputation: 29732
Using pandas.DataFrame.explode:
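def listify(x, step=100, right_closed=True):
    # sorted() makes the result independent of the column order passed in
    lower, upper = sorted(x)
    # right_closed=True pushes the stop one step further, so `upper` is included
    # whenever it falls on the step grid
    return range(lower, upper + step * right_closed, step)

example_df['expected'] = example_df[['seq_end', 'seq_start']].astype(int).apply(listify, axis=1)
new_df = example_df[['id', 'expected']].explode('expected')
print(new_df)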
Output:
id expected
0 0 0
0 0 100
0 0 200
0 0 300
0 0 400
.. .. ...
4 4 11000
4 4 11100
4 4 11200
4 4 11300
4 4 11400
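Since the goal is to use the new column in a join, it may help to tidy the exploded frame first. The following is a minimal, self-contained sketch of the same explode idea, not a definitive implementation. It assumes the step of 100 seen in the expected output, builds the per-row lists with a comprehension instead of apply, and uses the hypothetical names STEP and expanded. explode leaves the column with object dtype and repeats the original row labels, so the sketch casts to int64 and resets the index.

```python
import pandas as pd

# Same input as in the question.
example_df = pd.DataFrame({
    'id': [0, 1, 2, 3, 4],
    'seq_start': [0.0, 2800.0, 6400.0, 8400.0, 9800.0],
    'seq_end': [1400.0, 4700.0, 8400.0, 9800.0, 11400.0],
})

STEP = 100  # assumption: spacing taken from the expected output in the question

# One list per row, running from seq_start to seq_end inclusive.
example_df['expected'] = [
    list(range(int(start), int(end) + STEP, STEP))
    for start, end in zip(example_df['seq_start'], example_df['seq_end'])
]

expanded = (
    example_df[['id', 'expected']]
    .explode('expected')            # one output row per list element
    .astype({'expected': 'int64'})  # explode leaves an object column
    .reset_index(drop=True)         # explode repeats the original row labels
)

print(expanded.head())
```

Note that with this data 8400 appears for both id 2 and id 3, and 9800 for both id 3 and id 4, so a join on expected alone would match those values to two ids.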
Upvotes: 2