Reputation: 8145
There are other similar questions, but the difference here is that my dataframe already has a lot of columns, only one of which needs to be split.
I have a large dataframe(hundreds of columns, millions of rows). I would like to split one of these columns when a character ("|") is found in the string.
All values have only one "|".
For a fixed length I would do this: df['StateInitial'] = df['state'].str[:2]
I wish I could replace the 2 by string.index("|"), but how do I call the string?
Upvotes: 3
Views: 11543
Reputation: 1
If you have a column of strings, with a delimiter '|' you can use the following line to split the columns:
df['left'], df['right'] = df['combined'].str.split('|', 1).str
LeoRochael has a great in-depth explanation of how this works over on a separate thread: https://stackoverflow.com/a/39358924/11688667
Upvotes: 0
Reputation: 2127
Here is a one liner that builds on the answer provided by @santon:
df['left'],df['right'] = zip(*df[0].apply(lambda x: x.split('|')))
>>> df
0 left right
0 a|b a b
1 c|d c d
Upvotes: 2
Reputation: 109546
First, set you new column values equal to the old column values.
Next, create a new column with values initially equal to None.
Now, update the new column with valid values of the first.
df['new_col1'] = df['old_col']
df['new_col2'] = None
df['new_col2'].update(df.new_col1.apply(lambda x: x.str.split('|')[1]
if len(x.str.split()) == 2 else None))
Upvotes: 1
Reputation: 4605
How about:
df = pd.DataFrame(['a|b', 'c|d'])
s = df[0].apply(lambda x: x.split('|'))
df['left'] = s.apply(lambda x: x[0])
df['right'] = s.apply(lambda x: x[1])
Output:
0 left right
0 a|b a b
1 c|d c d
Upvotes: 7