piRSquared

Reputation: 294228

Turn pandas series to series of lists or numpy array to array of lists

I have a Series s:

s = pd.Series([1, 2])

What is an efficient way to turn s into a Series of one-element lists, like this?

0    [1]
1    [2]
dtype: object

Upvotes: 2

Views: 1432

Answers (4)

Divakar

Reputation: 221524

Here's one approach: extract the underlying NumPy array, extend it to 2D by adding a new axis with None/np.newaxis, and convert the result to a list of lists:

pd.Series(s.values[:,None].tolist())

Here's a similar one that extends to 2D by reshaping instead:

pd.Series(s.values.reshape(-1,1).tolist())
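For anyone who wants to see the intermediate steps, here is a minimal sketch of both approaches on the two-element Series from the question. The variable names (col, col2, out, out2) are just for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2])

# Approach 1: add a trailing axis, so shape (n,) becomes (n, 1)
col = s.values[:, None]

# Approach 2: same shape change via reshape
col2 = s.values.reshape(-1, 1)

# tolist() on an (n, 1) array gives a list of one-element lists
out = pd.Series(col.tolist())
out2 = pd.Series(col2.tolist())

print(col.shape, col2.shape)
print(out)
```

Keep in mind that both variants build a new Series from a plain list, so the result gets a default integer index rather than the index of s.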

Runtime test using @P-robot's setup:

In [43]: s = pd.Series(np.random.randint(1,10,1000))

In [44]: %timeit pd.Series(np.vstack(s.values).tolist()) # @Nickil Maveli's soln
100 loops, best of 3: 5.77 ms per loop

In [45]: %timeit pd.Series([[a] for a in s]) # @P-robot's soln
1000 loops, best of 3: 412 µs per loop

In [46]: %timeit s.apply(lambda x: [x]) # @mgc's soln
1000 loops, best of 3: 551 µs per loop

In [47]: %timeit pd.Series(s.values[:,None].tolist()) # Approach1
1000 loops, best of 3: 307 µs per loop

In [48]: %timeit pd.Series(s.values.reshape(-1,1).tolist()) # Approach2
1000 loops, best of 3: 306 µs per loop
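If you are not in IPython, the comparison above could be reproduced roughly like this with the standard timeit module. This is only a sketch: the candidates dict and its labels are my own naming, and the absolute numbers will depend on your machine and library versions.

```python
import timeit

import numpy as np
import pandas as pd

s = pd.Series(np.random.randint(1, 10, 1000))

# Labels are illustrative; each entry is one of the suggestions timed above
candidates = {
    "vstack": lambda: pd.Series(np.vstack(s.values).tolist()),
    "list comprehension": lambda: pd.Series([[a] for a in s]),
    "apply": lambda: s.apply(lambda x: [x]),
    "newaxis": lambda: pd.Series(s.values[:, None].tolist()),
    "reshape": lambda: pd.Series(s.values.reshape(-1, 1).tolist()),
}

# Run each candidate once so the outputs can be compared
results = {name: fn() for name, fn in candidates.items()}

for name, fn in candidates.items():
    # Best of 3 repeats, 100 calls each, reported per call
    best = min(timeit.repeat(fn, number=100, repeat=3)) / 100
    print(f"{name}: {best * 1e6:.0f} µs per loop")
```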

Upvotes: 4

p-robot

Reputation: 4904

Adjusting atomh33ls' answer, here's a series of lists:

output = pd.Series([[a] for a in s])
type(output)
>> pandas.core.series.Series
type(output[0])
>> list

Timings for a selection of the suggestions:

import numpy as np, pandas as pd
s = pd.Series(np.random.randint(1,10,1000))

>> %timeit pd.Series(np.vstack(s.values).tolist())
100 loops, best of 3: 3.2 ms per loop

>> %timeit pd.Series([[a] for a in s])
1000 loops, best of 3: 393 µs per loop

>> %timeit s.apply(lambda x: [x])
1000 loops, best of 3: 473 µs per loop

Upvotes: 1

mgc

Reputation: 5443

If you want the result to still be a pandas Series, you can use the apply method:

In [1]: import pandas as pd

In [2]: s = pd.Series([1, 2])

In [3]: s.apply(lambda x: [x])
Out[3]: 
0    [1]
1    [2]
dtype: object
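One practical difference worth knowing: apply keeps the original index, while building a new Series from a list does not unless you pass the index yourself. A small sketch, assuming a Series with string labels (the labels "a" and "b" are made up for the example):

```python
import pandas as pd

s = pd.Series([1, 2], index=["a", "b"])

# apply keeps the labels of s
wrapped = s.apply(lambda x: [x])

# a Series built from a plain list gets a default 0..n-1 index
rebuilt = pd.Series([[a] for a in s])

# passing index=s.index restores the original labels
rebuilt_indexed = pd.Series([[a] for a in s], index=s.index)

print(wrapped)
print(rebuilt)
print(rebuilt_indexed)
```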

Upvotes: 2

Lee

Reputation: 31040

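This builds a NumPy object array (note that the result is 2D, with shape (2, 1), rather than a 1D array of lists):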

import numpy as np

np.array([[a] for a in s], dtype=object)
array([[1],
       [2]], dtype=object)
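The array shown above is 2D, because NumPy unpacks the nested lists into a second axis even with dtype=object. If a 1D array whose elements are actual list objects is wanted, one possible way (a sketch, not the only option) is to allocate an empty object array and fill it element by element:

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2])

# Nested lists become a second axis: shape is (2, 1)
two_d = np.array([[a] for a in s], dtype=object)

# Allocate a 1D object array, then store one list per slot
arr = np.empty(len(s), dtype=object)
for i, a in enumerate(s):
    arr[i] = [a]

print(two_d.shape)
print(arr.shape, type(arr[0]))
```

The explicit loop is slower than the vectorized variants, so it is probably only worth it when an array of list objects is really required.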

Upvotes: 1
