PointXIV

Reputation: 1288

Splitting members of a series in a pandas dataframe

I feel like I've found the answer to this before, but I haven't been able to track it down again.

Is there a quick, painless way to split strings in a specific series in a dataframe?

For example, the series df['a'] looks like this:

df['a'] = ['abc 123', 'bcd 2344456jlkj6', 'dfe 456jklj34534', 'akg bg23534535']

What I want at the end is just

df['a'] = ['abc', 'bcd', 'dfe', 'akg']

I originally tried using df['a'] = df['a'].str.split(' ')[0] but that just gave me index errors.

Upvotes: 0

Views: 69

Answers (3)

heinst

Reputation: 8786

This should work for you:

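import pandas as pd

df = pd.DataFrame({"a": ['abc 123', 'bcd 2344456jlkj6', 'dfe 456jklj34534', 'akg bg23534535']})
print(df['a'])

# Keep only the text before the first space in each string
first_words = []
for value in df['a']:
    first_words.append(value.split(' ')[0])

df['a'] = first_words

print(df['a'])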

Which yields:

0             abc 123
1    bcd 2344456jlkj6
2    dfe 456jklj34534
3      akg bg23534535
Name: a, dtype: object
0    abc
1    bcd
2    dfe
3    akg
Name: a, dtype: object
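
The same loop can be written more compactly as a list comprehension. This is only a sketch, and it assumes every value in the column is a string: a missing value (NaN) is a float and has no split method, so it would raise an AttributeError here.

```python
import pandas as pd

df = pd.DataFrame({"a": ['abc 123', 'bcd 2344456jlkj6', 'dfe 456jklj34534', 'akg bg23534535']})

# Take the text before the first space in each string.
# Assumes every element is a str (no NaN values in the column).
df['a'] = [value.split(' ')[0] for value in df['a']]
```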

Upvotes: 0

DSM

Reputation: 353099

You were very close; you just need an extra .str accessor in there:

>>> import pandas as pd
>>> df = pd.DataFrame({"a": ['abc 123', 'bcd 2344456jlkj6', 'dfe 456jklj34534', 'akg bg23534535']})
>>> df["a"].str.split().str[0]
0    abc
1    bcd
2    dfe
3    akg
Name: a, dtype: object
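
To see why the original attempt went wrong, it may help to compare what a plain [0] and .str[0] each select from the split result. A minimal sketch, assuming the default integer index; the names parts, first_row and first_tokens are illustrative and not from the question:

```python
import pandas as pd

df = pd.DataFrame({"a": ['abc 123', 'bcd 2344456jlkj6', 'dfe 456jklj34534', 'akg bg23534535']})

# Each element of this Series is a list of whitespace-separated tokens.
parts = df["a"].str.split()

# Plain [0] looks up the row labelled 0, so it returns one list, not a column.
first_row = parts[0]

# .str[0] indexes into every element, giving the first token of each row.
first_tokens = parts.str[0]

# Assign the result back to overwrite the original column.
df["a"] = first_tokens
```

Because first_row is a single two-item list rather than a four-row Series, it cannot be assigned back to the four-row column, whereas first_tokens lines up with the original index.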

Upvotes: 2

unutbu

Reputation: 879621

In [158]: df
Out[158]: 
                  a
0           abc 123
1  bcd 2344456jlkj6
2  dfe 456jklj34534
3    akg bg23534535

In [159]: df['a'].str.extract(r'^(\w+)', expand=False)
Out[159]: 
0    abc
1    bcd
2    dfe
3    akg
Name: a, dtype: object
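
One thing to watch with the regex approach: in recent pandas versions, str.extract returns a DataFrame by default, and passing expand=False with a single capture group gives back a Series instead. Rows where the pattern does not match come back as NaN. A small sketch, using made-up sample values (not the data from the question) to show both behaviours:

```python
import pandas as pd

# Hypothetical sample values chosen to include rows that do not match.
s = pd.Series(['abc 123', ' leading space', '!!!'], name='a')

# ^(\w+) captures the run of word characters at the very start of the string.
# expand=False returns a Series for a single capture group.
first_word = s.str.extract(r'^(\w+)', expand=False)

# Rows that start with a non-word character have no match and become NaN.
unmatched = first_word.isna()
```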

Upvotes: 0
