Alexis Eggermont

Reputation: 8145

Split pandas column into two

There are other similar questions, but the difference here is that my dataframe already has a lot of columns, only one of which needs to be split.

I have a large dataframe (hundreds of columns, millions of rows). I would like to split one of these columns at the position where a character ("|") is found in the string.

All values have only one "|".

For a fixed length I would do this: df['StateInitial'] = df['state'].str[:2]

I would like to replace the 2 with string.index("|"), but how do I refer to the string in each row?

Upvotes: 3

Views: 11543

Answers (4)

Jimmy Le

Reputation: 1

If you have a column of strings with a '|' delimiter, you can use the following line to split it into two columns:

df[['left', 'right']] = df['combined'].str.split('|', n=1, expand=True)

LeoRochael has a great in-depth explanation of how this works over on a separate thread: https://stackoverflow.com/a/39358924/11688667
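
A minimal sketch of this kind of split on a toy frame, using str.split with n=1 and expand=True. The column names 'combined' and 'other' and the sample values are assumptions for illustration; the point is that only the one column is split and the rest of the frame is left alone.

```python
import pandas as pd

# Toy data: 'combined' and 'other' are illustrative column names.
df = pd.DataFrame({'combined': ['a|b', 'c|d'], 'other': [1, 2]})

# Split on the first '|' only; expand=True returns one column per piece,
# which is then assigned to the two new column names.
df[['left', 'right']] = df['combined'].str.split('|', n=1, expand=True)

print(df)
```

Passing n=1 caps the result at two pieces, so the assignment to two columns still lines up even if a value were to contain more than one '|'.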

Upvotes: 0

khammel

Reputation: 2127

Here is a one-liner that builds on the answer provided by @santon:

df['left'], df['right'] = zip(*df[0].apply(lambda x: x.split('|')))

>>> df 
     0 left right
0  a|b    a     b
1  c|d    c     d

Upvotes: 2

Alexander

Reputation: 109546

First, set your new column values equal to the old column values.

Next, create a new column with values initially equal to None.

Now, update the new column with valid values of the first.

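df['new_col1'] = df['old_col']
df['new_col2'] = None
# Each x is a plain Python string, so call x.split directly (no .str accessor).
# Keep the part after '|' only when the value splits into exactly two pieces.
df['new_col2'] = df['new_col1'].apply(lambda x: x.split('|')[1]
                      if len(x.split('|')) == 2 else None)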
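
As a sketch of the conditional split described in this answer, here it is on a toy frame that includes one value without a delimiter. The sample values and the helper name right_part are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Toy data: the last value has no '|' to show the fallback case.
df = pd.DataFrame({'old_col': ['a|b', 'c|d', 'no_delimiter']})

def right_part(value):
    """Return the text after '|' if the value splits into exactly two parts."""
    parts = value.split('|')
    return parts[1] if len(parts) == 2 else None

df['new_col1'] = df['old_col']
df['new_col2'] = df['new_col1'].apply(right_part)

print(df)
```

Rows that do not split into exactly two pieces end up with a missing value in new_col2 rather than raising an IndexError.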

Upvotes: 1

santon

Reputation: 4605

How about:

import pandas as pd
df = pd.DataFrame(['a|b', 'c|d'])
s = df[0].apply(lambda x: x.split('|'))
df['left'] = s.apply(lambda x: x[0])
df['right'] = s.apply(lambda x: x[1])

Output:

     0 left right
0  a|b    a     b
1  c|d    c     d
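
The same apply pattern also covers the question's original idea of slicing at string.index("|"): inside the lambda, each value is an ordinary Python string, so its index method can be called directly. A minimal sketch, assuming a 'state' column as in the question; the sample values and the 'StateName' column are made up for illustration.

```python
import pandas as pd

# Toy data: the values are hypothetical, the 'state' column name is from the question.
df = pd.DataFrame({'state': ['NY|New York', 'CA|California']})

# s is a plain str here, so s.index('|') gives the delimiter position per row.
df['StateInitial'] = df['state'].apply(lambda s: s[:s.index('|')])
df['StateName'] = df['state'].apply(lambda s: s[s.index('|') + 1:])

print(df)
```

Note that str.index raises ValueError when the delimiter is missing, which is acceptable here only because every value is stated to contain exactly one "|".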

Upvotes: 7
