Winand

Reputation: 2433

Split Series by string length

I have more than 1M rows and want to split a Series of strings like 123456789 (length=9) into 3 Series (like MS Excel can do):

c1  c2  c3
123 456 789
... ... ...

I see the .str.split function, which needs a separator, and .str.slice, which returns only one Series at a time. Is there something better than this?

s21 = s11.str.slice(0,3)
s22 = s11.str.slice(3,6)
s23 = s11.str.slice(6,9)

Upvotes: 2

Views: 6183

Answers (2)

behzad.nouri

Reputation: 77971

You may use str.extract:

>>> df
         s11
0  123456789
1  987654321
>>> df['s11'].str.extract('(.{3,3})' * 3)
     0    1    2
0  123  456  789
1  987  654  321
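
If named columns are wanted rather than 0, 1, 2, one possible variation is to use named groups in the pattern. This is only a sketch: the column names c1, c2, c3 are borrowed from the layout in the question, and the sample Series is made up for illustration.

```python
import pandas as pd

# Illustrative data, assumed to contain only 9-character strings.
s11 = pd.Series(["123456789", "987654321"])

# Each named group becomes a column of the resulting DataFrame.
pattern = r"(?P<c1>.{3})(?P<c2>.{3})(?P<c3>.{3})"
parts = s11.str.extract(pattern, expand=True)
print(parts)
```

Rows that do not match the pattern (for example, strings shorter than 9 characters) come back as NaN in every column.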

That said, when something as simple as str.slice works, it tends to be faster than an unnecessary regex, even if you have to call it a few times, either manually or in a for loop.

You can apply str.slice to all three ranges in one line:

>>> df['a'], df['b'], df['c'] = map(df['s11'].str.slice, [0, 3, 6], [3, 6, 9])
>>> df
         s11    a    b    c
0  123456789  123  456  789
1  987654321  987  654  321
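
The same idea can be wrapped in a small helper when the field widths are not all equal. The following is a minimal sketch, not part of pandas: split_fixed_width and its widths and names parameters are hypothetical names chosen for this example.

```python
import pandas as pd

def split_fixed_width(series, widths, names):
    """Slice each string into consecutive fields of the given widths.

    Hypothetical helper: it just calls Series.str.slice once per field.
    """
    columns = {}
    start = 0
    for name, width in zip(names, widths):
        columns[name] = series.str.slice(start, start + width)
        start += width
    return pd.DataFrame(columns)

# Illustrative data matching the question.
s11 = pd.Series(["123456789", "987654321"])
parts = split_fixed_width(s11, widths=[3, 3, 3], names=["c1", "c2", "c3"])
print(parts)
```

Because it only loops over the fields, not over the rows, the number of str.slice calls stays small even for a Series with more than a million rows.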

Upvotes: 2

Tom Karzes

Reputation: 24052

If all you need to do is split fixed-length strings into smaller, equal-sized fixed-length strings, you can do:

s = "123456789"
x = [s[i:i+3] for i in range(0, 9, 3)]
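
To apply this to a whole pandas Series rather than a single string, the comprehension can be run once per element. This is a rough sketch that assumes every value is a 9-character string with no missing values; the column names c1, c2, c3 follow the question's layout.

```python
import pandas as pd

# Illustrative data, assumed to hold only 9-character strings.
s11 = pd.Series(["123456789", "987654321"])

# One list of three chunks per string, built with the same slicing as above.
rows = [[s[i:i + 3] for i in range(0, 9, 3)] for s in s11]
parts = pd.DataFrame(rows, index=s11.index, columns=["c1", "c2", "c3"])
print(parts)
```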

Upvotes: 1
