learner22

Reputation: 39

split before first comma in pandas and output

I want to produce the following table in pandas. So far I only have the description column. I want to split each value on the comma and put the part before the comma in a new commondescrip column.

description    commondescrip
00001          00001
00002          00002
00003,Area01   00003
00004          00004
00005,Area02   00005

I tried

splitword = df2["description"].str.split(",", n=1, expand = True)
df2["commondescrip"] = splitword[0]

but it gives me NaN for the rows that contain an Area value.

How can I fix this so that I get the table above, with commondescrip holding only the text before the comma?

Upvotes: 1

Views: 1629

Answers (2)

mozway

Reputation: 260680

Don't split: splitting makes you handle several parts when you are only interested in one. Instead, either remove the part you don't want or extract the part you do.

removing the first comma and everything after it:

df['commondescrip'] = df['description'].str.replace(',.*', '', regex=True)

or extracting everything before the first comma:

df['commondescrip'] = df['description'].str.extract('([^,]+)')

output:

    description commondescrip
0         00001         00001
1         00002         00002
2  00003,Area01         00003
3         00004         00004
4  00005,Area02         00005
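
For anyone who wants to check both approaches side by side, here is a minimal self-contained sketch. The sample frame is rebuilt from the question's table, and the variable names `removed` and `extracted` are only illustrative. It also passes `expand=False` to `str.extract` so that the result is a Series rather than a one-column DataFrame, which is an optional tweak and not part of the original answer.

```python
import pandas as pd

# Sample data rebuilt from the question; values are strings so the
# leading zeros are preserved.
df = pd.DataFrame(
    {"description": ["00001", "00002", "00003,Area01", "00004", "00005,Area02"]}
)

# Approach 1: delete the first comma and everything after it.
removed = df["description"].str.replace(",.*", "", regex=True)

# Approach 2: capture the first run of non-comma characters.
# expand=False returns a Series instead of a one-column DataFrame.
extracted = df["description"].str.extract("([^,]+)", expand=False)

df["commondescrip"] = removed
print(df)
```

On this data the two approaches agree. They can differ on edge cases: for a value that starts with a comma, the replace version yields an empty string, while the extract version picks up the text after the comma.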

Upvotes: 2

Naveed

Reputation: 11650

Here is one way to do it, taking the first piece of each split string:

df['description'].apply(lambda x: x.strip().split(',')[0])
0    00001
1    00002
2    00003
3    00004
4    00005
Name: description, dtype: object
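
A possible variation on the same idea, sketched here under the assumption that the column may contain missing values: the `.str` accessor skips missing entries, whereas the `lambda` above would raise an error on them because a missing value has no `strip` method. The sample Series `s` and the name `first` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample with one missing value.
s = pd.Series(["00001", "00003,Area01", None], name="description")

# Strip whitespace, split once on the first comma, keep the first piece.
# Missing values pass through as missing instead of raising.
first = s.str.strip().str.split(",", n=1).str[0]
print(first)
```

To store the result, assign it back, for example `df['commondescrip'] = ...` with the same chain applied to `df['description']`.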

Upvotes: 1
