Reputation: 107
I want to split a column into 3 based on a delimiter ":"
and I am able to do
that using the code below. But now want to change the names of the split
columns from the default of 1,2,3,4. Please advice how I can do that.
from pandas import *
df = DataFrame(
{'CustomerName' : ["Paul", "John"], 'Seatblocks' : ["2:218:10:4,6","2:218:10:4,6"]});
df
df.join(df.Seatblocks.apply(lambda x: Series(x.split(':'))))
Upvotes: 4
Views: 13804
Reputation: 193
Here is a small function for automatically adding column names
def col_splitter(df, orig_col, deli):
#Split the column on the delimiter provided
splitted=df[orig_col].str.split(deli, expand=True)
# make a dictionary of new column names based on the number of splits
col_names={k:f'{orig_col}_{k}' for k in splitted.columns.values}
# Lastly, rename columns ==> 0:'Seatblocks_0', 1:'Seatblocks_1' etc
splitted.rename(columns=col_names,inplace=True)
return splitted
col_splitter(df,'Seatblocks',':')
Upvotes: 0
Reputation: 11
import pandas as pd
import numpy as np
df = pd.DataFrame(
{'CustomerName' : ["Paul", "John"], 'Seatblocks' : ["2:218:10:4,6","2:218:10:4,6"]});
#df
#1.split, rename
df_extraction = df['Seatblocks'].str.split(':', expand=True)
df_extraction.columns = ['seatBlock'+str(i) for i in df_extraction.columns]
# 2.merge extracted to df
df = pd.concat([df, df_extraction], axis=1)
df
CustomerName Seatblocks seatBlock0 seatBlock1 seatBlock2 seatBlock3
0 Paul 2:218:10:4,6 2 218 10 4,6
1 John 2:218:10:4,6 2 218 10 4,6
Upvotes: 1
Reputation: 35109
You can modify df.column
to get the new column names
In [1]: from pandas import *
In [2]: df = DataFrame(
...: {'CustomerName' : ["Paul", "John"], 'Seatblocks' : ["2:218:10:4,6","2:218:10:4,6"]});
In [3]: df2 = df.join(df.Seatblocks.apply(lambda x: Series(x.split(':'))))
In [4]: names = ['foo', 'bar', 'baz', 'booz']
In [5]: df2.columns = [x if str(x).isalpha() else names.pop() for x in df2.columns]
In [6]: df2.columns
Out[6]: Index([u'CustomerName', u'Seatblocks', u'booz', u'baz', u'bar', u'foo'], dtype='object')
In [7]:
Upvotes: 0
Reputation: 353059
People have already given the rename
approach, but I find these things are easier if you avoid the temptation to cram everything into one line. Once you have a frame, you can simply assign to .columns
:
>>> sb = df.Seatblocks.str.split(":").apply(pd.Series)
>>> sb.columns = ["a", "Three Digit", "??", "coord"]
>>> pd.concat([df, sb], axis=1)
CustomerName Seatblocks a Three Digit ?? coord
0 Paul 2:218:10:4,6 2 218 10 4,6
1 John 2:218:10:4,6 2 218 10 4,6
The first line is simply a version of your (df.Seatblocks.apply(lambda x: Series(x.split(':'))))
which takes advantage of the vectorized string operation accessor .str
(docs).
Upvotes: 3
Reputation: 394021
Just rename
them:
df.rename(columns={0:'col_1', 1:'col_2', 2:'col_3', 3:'col_4'},inplace=True)
A more obscure method is to form a union of the new names to the first 2 elements of your columns and assign directly:
In [14]:
df.columns = df.columns[:2] | pd.Index(['col_1', 'col_2', 'col_3', 'col_4'])
df
Out[14]:
CustomerName Seatblocks col_1 col_2 col_3 col_4
0 Paul 2:218:10:4,6 2 218 10 4,6
1 John 2:218:10:4,6 2 218 10 4,6
Upvotes: 3
Reputation: 3871
Columns were renamed to A, B, C, D.
from pandas import *
df = DataFrame(
{'CustomerName' : ["Paul", "John"], 'Seatblocks' : ["2:218:10:4,6","2:218:10:4,6"]});
df = df.join(df.Seatblocks.apply(lambda x: Series(x.split(':'))))
df.rename(columns={0: 'A', 1: 'B', 2: 'C', 3: 'D'}, inplace=True)
df
Upvotes: 1