Reputation: 415
I have a pandas DataFrame:
import pandas as pd
import numpy as np
d = pd.DataFrame({
    'col': ['A', 'B', 'C', 'D'],
    'start': [1, 4, 6, 8],
    'end': [4, 9, 10, 12]
})
I'm trying to compute a range column from the start and end columns, with both endpoints included, so that its values are:
[1, 2, 3, 4]
[4, 5, 6, 7, 8, 9]
[6, 7, 8, 9, 10]
[8, 9, 10, 11, 12]
I have tried the following two options:
d['range_'] = np.arange(d.start, d.end, 1)
d['range_'] = range(d['start'], d['end'])
but I get the following errors:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
TypeError: 'Series' object cannot be interpreted as an integer <- second attempt
Any help would be appreciated.
Thanks
Upvotes: 1
Views: 249
Reputation: 323326
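If I understand correctly, you can build the lists with a comprehension over the paired start and end values (y + 1 makes the end inclusive):
l = [list(range(x, y + 1)) for x, y in zip(d.start, d.end)]
[[1, 2, 3, 4], [4, 5, 6, 7, 8, 9], [6, 7, 8, 9, 10], [8, 9, 10, 11, 12]]
d['range_'] = l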
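For completeness, here is a self-contained sketch of this approach using the DataFrame from the question. It assumes the end value should be included, as in the expected output. The single-step assignment is just a convenience and not part of the original answer.

```python
import pandas as pd

d = pd.DataFrame({
    'col': ['A', 'B', 'C', 'D'],
    'start': [1, 4, 6, 8],
    'end': [4, 9, 10, 12],
})

# zip pairs each start with its end as plain scalars, which range() accepts.
# The + 1 makes the end value inclusive.
d['range_'] = [list(range(x, y + 1)) for x, y in zip(d['start'], d['end'])]
```

Each cell of range_ then holds an ordinary Python list, so the column has dtype object.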
Upvotes: 1
Reputation: 153500
Try this:
d.apply(lambda x: np.arange(x['start'], x['end']+1), axis=1)
Output:
0 [1, 2, 3, 4]
1 [4, 5, 6, 7, 8, 9]
2 [6, 7, 8, 9, 10]
3 [8, 9, 10, 11, 12]
dtype: object
Note: np.arange and range are not designed to accept a pd.Series, so you can use apply row-wise to create the ranges instead.
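To illustrate the note, the sketch below contrasts the failing call from the question with the row-wise version. It reuses the question's data. The names failed and ranges are only illustrative, and assigning the result back to d['range_'] is an assumption about what the asker wants.

```python
import numpy as np
import pandas as pd

d = pd.DataFrame({
    'col': ['A', 'B', 'C', 'D'],
    'start': [1, 4, 6, 8],
    'end': [4, 9, 10, 12],
})

# range() expects integer bounds, so passing whole Series raises TypeError.
try:
    range(d['start'], d['end'])
    failed = False
except TypeError:
    failed = True

# With axis=1, the lambda receives one row at a time, so x['start'] and
# x['end'] are scalars. The + 1 makes the end value inclusive.
ranges = d.apply(lambda x: np.arange(x['start'], x['end'] + 1), axis=1)
d['range_'] = ranges
```

Unlike the list comprehension, each cell here holds a NumPy array rather than a Python list.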
Upvotes: 2