elajdsha

Reputation: 95

Convert pandas DataFrame of strings to NumPy arrays of int

My input is a pandas dataframe with strings inside:

>>> data
218.0                 
221.0                 
222.0                 
224.0    71,299,77,124
227.0     50,283,81,72
229.0              
231.0           84,349
233.0                 
235.0                 
240.0           53,254
Name: Q25, dtype: object

Now I want, for every row, a NumPy array of ints reshaped with .reshape(-1, 2), like this:

>>> data
218.0                      [] 
221.0                      []
222.0                      []
224.0    [[71,299], [77,124]]
227.0     [[50,283], [81,72]]
229.0                      []
231.0              [[84,349]]
233.0                      []
235.0                      []
240.0              [[53,254]]
Name: Q25, dtype: object

I don't know how to get there with vectorized operations. Can someone help?

Upvotes: 1

Views: 984

Answers (2)

piRSquared

Reputation: 294516

Not very cool, but accurate.

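def f(x):
    # Non-empty cell: parse the comma-separated integers
    if x != '':
        x = list(map(int, x.split(',')))
        # Pair consecutive values: [a, b, c, d] -> [[a, b], [c, d]]
        return list(map(list, zip(x[::2], x[1::2])))
    else:
        return []

# s is the Series of strings from the question
s.apply(f)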

0                        []
1    [[71, 299], [77, 124]]
2               [[84, 349]]
dtype: object
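
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same pairing idea. The sample Series, its index values, and the function name `to_pairs` are made up for illustration and only mirror a few rows of the question's data; the `strip()` call is an extra guard for whitespace-only cells that the original function does not have.

```python
import pandas as pd

def to_pairs(text):
    """Turn 'a,b,c,d' into [[a, b], [c, d]]; blank input gives []."""
    text = text.strip()
    if not text:
        return []
    values = [int(v) for v in text.split(',')]
    # Pair up the even-indexed values with the odd-indexed ones.
    return [list(p) for p in zip(values[::2], values[1::2])]

# Hypothetical sample mirroring a few rows of the question's data.
s = pd.Series(['', '71,299,77,124', '84,349'],
              index=[218.0, 224.0, 231.0], name='Q25')
result = s.apply(to_pairs)
print(result)
```

Note that this yields nested Python lists rather than NumPy arrays, and `zip` silently drops a trailing unpaired value if a row has an odd number of entries.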

Upvotes: 0

Zero

Reputation: 77027

You can use apply, though this isn't a vectorized operation:

In [277]: df.val.fillna('').apply(
         lambda x: np.array(x.split(','), dtype=int).reshape(-1, 2) if x else [])
Out[277]:
0                        []
1                        []
2                        []
3    [[71, 299], [77, 124]]
4     [[50, 283], [81, 72]]
5                        []
6               [[84, 349]]
7                        []
8                        []
9               [[53, 254]]
Name: val, dtype: object
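
A possible variation, sketched here under a few assumptions: if every row should hold an actual ndarray (so downstream code can rely on `.shape`), empty or missing cells can return an empty array of shape (0, 2) instead of a plain `[]`. The helper name `to_pair_array` and the sample Series are hypothetical and not part of the answer above.

```python
import numpy as np
import pandas as pd

def to_pair_array(text):
    """Parse 'a,b,c,d' into an int array of shape (n, 2)."""
    if not isinstance(text, str) or not text.strip():
        # Empty or missing cell: an empty (0, 2) array keeps the type uniform.
        return np.empty((0, 2), dtype=int)
    # reshape(-1, 2) raises ValueError if the count of values is odd.
    return np.array(text.split(','), dtype=int).reshape(-1, 2)

# Hypothetical sample: a blank cell, a missing value, and two populated rows.
s = pd.Series(['', np.nan, '71,299,77,124', '84,349'], name='Q25')
result = s.apply(to_pair_array)

for idx, arr in result.items():
    print(idx, arr.shape)
```

Unlike the list-based pairing, `reshape(-1, 2)` fails loudly on a row with an odd number of values, which may or may not be what you want.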

Upvotes: 3
