Reputation: 2433
I have more than 1M rows and want to split a Series of strings like 123456789
(length=9) into 3 Series (like MS Excel can do):
c1 c2 c3
123 456 789
... ... ...
I see the .str.split function, which needs a separator, and .str.slice, which gives only one Series at a time. Is there something better than this?
s21 = s11.str.slice(0,3)
s22 = s11.str.slice(3,6)
s23 = s11.str.slice(6,9)
Upvotes: 2
Views: 6183
Reputation: 77971
You may use str.extract:
>>> df
s11
0 123456789
1 987654321
>>> df['s11'].str.extract('(.{3,3})' * 3)
0 1 2
0 123 456 789
1 987 654 321
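If you want the resulting columns to carry names such as c1, c2, c3 rather than 0, 1, 2, named capture groups are one option. The following is only a sketch: the column names and the two-row sample frame are made up for illustration, and `expand=True` is passed explicitly so that a DataFrame is returned.

```python
import pandas as pd

# Hypothetical sample data mirroring the frame shown above.
df = pd.DataFrame({"s11": ["123456789", "987654321"]})

# Each named group (?P<name>...) becomes a column named after the group.
pattern = r"(?P<c1>.{3})(?P<c2>.{3})(?P<c3>.{3})"
parts = df["s11"].str.extract(pattern, expand=True)

print(parts)
```

Rows that do not match the pattern (for example, strings shorter than 9 characters) come back as NaN in every column.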
However, when something simple like str.slice works, it tends to be faster than an unnecessary regex, even if you need to call it a few times manually or in a for loop.
You can apply str.slice in a one-liner:
>>> df['a'], df['b'], df['c'] = map(df['s11'].str.slice, [0, 3, 6], [3, 6, 9])
>>> df
s11 a b c
0 123456789 123 456 789
1 987654321 987 654 321
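If the field widths differ or there are more than a few of them, the same str.slice idea can be wrapped in a small helper. This is a rough sketch, not a pandas API: `split_fixed`, `widths`, and `names` are hypothetical names introduced here.

```python
import pandas as pd

def split_fixed(series, widths, names):
    """Slice each string in `series` into consecutive fields of the given widths."""
    columns = {}
    start = 0
    for name, width in zip(names, widths):
        # One vectorized str.slice call per output column.
        columns[name] = series.str.slice(start, start + width)
        start += width
    return pd.DataFrame(columns)

# Hypothetical sample data.
s11 = pd.Series(["123456789", "987654321"])
parts = split_fixed(s11, widths=[3, 3, 3], names=["c1", "c2", "c3"])
print(parts)
```

This still makes one str.slice call per column, so it should behave like the one-liner above while building a separate DataFrame instead of assigning columns in place.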
Upvotes: 2
Reputation: 24052
If all you need to do is split fixed-length strings into smaller, equal-sized fixed-length strings, you can do:
s = "123456789"
x = [s[i:i+3] for i in range(0, 9, 3)]
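The snippet above handles a single string. One way to carry it over to a whole Series is sketched below, assuming every string has a length that is a multiple of the chunk width; `chunks` is a hypothetical helper name. Note that Series.map calls the function once per row in Python, so on 1M rows it will likely be slower than the vectorized str.slice approach.

```python
import pandas as pd

def chunks(s, width):
    """Return consecutive substrings of `s`, each `width` characters long."""
    if len(s) % width != 0:
        raise ValueError("string length must be a multiple of width")
    return [s[i:i + width] for i in range(0, len(s), width)]

# Hypothetical sample data.
s11 = pd.Series(["123456789", "987654321"])

# Apply the helper row by row, then turn the lists into columns.
parts = pd.DataFrame(s11.map(lambda v: chunks(v, 3)).tolist(), index=s11.index)
print(parts)
```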
Upvotes: 1