Reputation: 64084
I have a data frame constructed this way:
import pandas as pd
import io
temp=u"""probegenes,sample2,sample3,sample4
1415777_at Pnliprp1,20,10,11
1415805_at Clps,17,13,55"""
df = pd.read_csv(io.StringIO(temp))
It looks like this:
In [19]: df
Out[19]:
            probegenes  sample2  sample3  sample4
0  1415777_at Pnliprp1       20       10       11
1      1415805_at Clps       17       13       55
What I want to do is split the probegenes column into two columns, probe and genes, so that it becomes like this:
     probe     genes  sample2  sample3  sample4
1415777_at  Pnliprp1       20       10       11
1415805_at      Clps       17       13       55
How can I achieve that?
Upvotes: 1
Views: 900
Reputation: 863801
You can split the column probegenes, create the columns probe and genes, and finally drop probegenes:
df[['probe', 'genes']] = df['probegenes'].str.split(" ", expand=True)
df = df.drop('probegenes', axis=1)
# change the ordering of the columns
print(df[['probe', 'genes', 'sample2', 'sample3', 'sample4']])
        probe     genes  sample2  sample3  sample4
0  1415777_at  Pnliprp1       20       10       11
1  1415805_at      Clps       17       13       55
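A minimal, self-contained sketch of the same approach, for anyone who wants to run it end to end. Two details here are my own assumptions rather than part of the answer: n=1 limits the split to the first space (useful if a gene name could itself contain a space), and the reordering is built from df.columns instead of listing every sample column by hand.

```python
import io
import pandas as pd

temp = u"""probegenes,sample2,sample3,sample4
1415777_at Pnliprp1,20,10,11
1415805_at Clps,17,13,55"""
df = pd.read_csv(io.StringIO(temp))

# Split on the first space only (n=1 is an assumption, not in the answer).
df[['probe', 'genes']] = df['probegenes'].str.split(" ", n=1, expand=True)
df = df.drop('probegenes', axis=1)

# Move the new columns to the front without naming each sample column.
new_cols = ['probe', 'genes']
df = df[new_cols + [c for c in df.columns if c not in new_cols]]
print(df)
```

Building the column order from df.columns keeps the snippet working if more sample columns are added later.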
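Another solution, starting again from the original df (before probegenes was dropped), is to split into a new DataFrame and concat it with the remaining columns:
df1 = df['probegenes'].str.split(" ", expand=True)
df1.columns = ['probe', 'genes']
print(df1)
        probe     genes
0  1415777_at  Pnliprp1
1  1415805_at      Clps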
print(df.iloc[:, 1:])
   sample2  sample3  sample4
0       20       10       11
1       17       13       55
print(pd.concat([df1, df.iloc[:, 1:]], axis=1))
        probe     genes  sample2  sample3  sample4
0  1415777_at  Pnliprp1       20       10       11
1  1415805_at      Clps       17       13       55
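As a hedged variation on the concat approach, the sketch below uses DataFrame.pop to take probegenes out of the frame by name, so the remaining columns do not have to be selected by position with iloc. The variable name result is only illustrative.

```python
import io
import pandas as pd

temp = u"""probegenes,sample2,sample3,sample4
1415777_at Pnliprp1,20,10,11
1415805_at Clps,17,13,55"""
df = pd.read_csv(io.StringIO(temp))

# pop() removes 'probegenes' from df and returns it as a Series.
df1 = df.pop('probegenes').str.split(" ", expand=True)
df1.columns = ['probe', 'genes']

# df now holds only the sample columns; put the split columns in front.
result = pd.concat([df1, df], axis=1)
print(result)
```

Note that pop modifies df in place, so this suits cases where the original probegenes column is no longer needed.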
Upvotes: 2