Reputation: 17164

Create stacked pandas series from series with list elements

I have a pandas Series whose elements are lists:

import pandas as pd
s = pd.Series([ ['United States of America'],['China', 'Hong Kong'], []])
print(s)

0    [United States of America]
1            [China, Hong Kong]
2                            []

How can I get a Series like the following?

0 United States of America
1 China
1 Hong Kong

I am not sure what should happen to index 2 (the empty list).

Upvotes: 4

Answers (4)

Calebe Piacentini

Reputation: 21

There is a simpler, and probably much less computationally expensive, way to do this: the pandas method explode (see the pandas documentation for Series.explode). In your case, the answer would be:

s.explode()

It is as simple as that! For a DataFrame with several columns, pass the name of the column you would like to "explode" as a string, for example df.explode('country').
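As a minimal sketch of how explode treats the sample data from the question: the empty list becomes a single NaN row, and dropping it leaves the index labels 0, 1, 1 that the question asks for. The DataFrame `df` and its `id` and `country` columns are made up for illustration.

```python
import pandas as pd

s = pd.Series([['United States of America'], ['China', 'Hong Kong'], []])

# explode() gives each list element its own row and repeats the index label;
# the empty list at index 2 turns into a single NaN row.
exploded = s.explode()

# Drop that NaN row to keep only real values (index labels: 0, 1, 1).
stacked = exploded.dropna()
print(stacked)

# Hypothetical DataFrame: name the column to explode.
df = pd.DataFrame({'id': [10, 20, 30], 'country': s})
print(df.explode('country'))
```

Whether to keep or drop the NaN row depends on whether the empty entry at index 2 should remain visible in the result.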

Upvotes: 1

cs95

Reputation: 403248

The following options all return a Series. The first builds a new DataFrame from the lists and stacks it.

pd.DataFrame(s.tolist()).stack()

0  0    United States of America
1  0                       China
   1                   Hong Kong
dtype: object

To reset the index, use

pd.DataFrame(s.tolist()).stack().reset_index(drop=True)

0    United States of America
1                       China
2                   Hong Kong
dtype: object

To convert to DataFrame, call to_frame()

pd.DataFrame(s.tolist()).stack().reset_index(drop=True).to_frame('countries')

                  countries
0  United States of America
1                     China
2                 Hong Kong
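If the original index labels (0, 1, 1) should be kept rather than reset, a possible variation on the stacking approach is to drop only the inner index level. This is a sketch, not part of the original answer; the explicit dropna() is there on the assumption that some pandas versions keep missing values when stacking.

```python
import pandas as pd

s = pd.Series([['United States of America'], ['China', 'Hong Kong'], []])

# One row per original element; shorter lists are padded with missing values.
wide = pd.DataFrame(s.tolist(), index=s.index)

# stack() moves the columns into an inner index level. dropna() is called
# explicitly in case the pandas version in use keeps missing values.
stacked = wide.stack().dropna()

# Drop the inner (position-in-list) level to keep only the original labels.
result = stacked.droplevel(1)
print(result)
```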

If you're trying to code golf, use

sum(s, [])
# ['United States of America', 'China', 'Hong Kong']

pd.Series(sum(s, []))

0    United States of America
1                       China
2                   Hong Kong
dtype: object

Or even,

import numpy as np
pd.Series(np.sum(s))

0    United States of America
1                       China
2                   Hong Kong
dtype: object

However, like most operations that sum lists, this performs poorly: every concatenation copies the list accumulated so far, so the total cost grows quadratically with the number of elements.

Faster operations are possible using chaining with itertools.chain:

from itertools import chain
pd.Series(list(chain.from_iterable(s)))

0    United States of America
1                       China
2                   Hong Kong
dtype: object

pd.DataFrame(list(chain.from_iterable(s)), columns=['countries'])

                  countries
0  United States of America
1                     China
2                 Hong Kong
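The chain-based versions above produce a fresh 0..n-1 index. A minimal sketch of one way to keep the original labels instead (0, 1, 1, as in the question) is to repeat each index label once per list element; the names `lengths` and `flat` are illustrative.

```python
from itertools import chain

import pandas as pd

s = pd.Series([['United States of America'], ['China', 'Hong Kong'], []])

# Number of elements in each list: 1, 2, 0.
lengths = s.map(len).to_numpy()

# Repeat each index label once per element, so the empty list contributes nothing.
flat = pd.Series(list(chain.from_iterable(s)), index=s.index.repeat(lengths))
print(flat)
```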

Upvotes: 4

BENY

Reputation: 323396

Assuming the elements are lists:

pd.Series(s.sum())
0    United States of America
1                       China
2                   Hong Kong
dtype: object

Upvotes: 2

U13-Forward

Reputation: 71620

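Or, for this specific input (it relies on the second column holding exactly one value and on a single empty row to fill), use: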

df = pd.DataFrame(s.tolist())
print(df[0].fillna(df[1].dropna().item()))

Output:

0    United States of America
1                       China
2                   Hong Kong
Name: 0, dtype: object

Upvotes: 2
