How to split a column in a Pandas data frame into two while keeping the rest intact

Question

I have a data frame constructed this way:

import pandas as pd
import io

temp=u"""probegenes,sample2,sample3,sample4
1415777_at Pnliprp1,20,10,11
1415805_at Clps,17,13,55"""
df = pd.read_csv(io.StringIO(temp))

It looks like this:

In [19]: df
Out[19]:
            probegenes  sample2  sample3  sample4
0  1415777_at Pnliprp1       20       10       11
1      1415805_at Clps       17       13       55

What I want to do is split the probegenes column into two columns, so that the data frame looks like this:

       probe     genes     sample2  sample3  sample4
  1415777_at     Pnliprp1   20       10       11
  1415805_at     Clps       17       13       55

How can I achieve that?

jezrael · Accepted Answer

You can split the probegenes column on the space, assign the two parts to new columns probe and genes, and then drop probegenes:

df[['probe', 'genes']] = df['probegenes'].str.split(" ", expand=True)
df = df.drop('probegenes', axis=1)
# reorder the columns so that probe and genes come first
print(df[['probe', 'genes', 'sample2', 'sample3', 'sample4']])
        probe     genes  sample2  sample3  sample4
0  1415777_at  Pnliprp1       20       10       11
1  1415805_at      Clps       17       13       55
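If there are many sample columns, listing them all by name gets tedious. As a minimal sketch (the names new_cols and rest are illustrative, not part of the original answer), the column order can be built from df.columns instead:

```python
import io
import pandas as pd

temp = """probegenes,sample2,sample3,sample4
1415777_at Pnliprp1,20,10,11
1415805_at Clps,17,13,55"""
df = pd.read_csv(io.StringIO(temp))

# Split on the space and assign both parts to new columns.
df[['probe', 'genes']] = df['probegenes'].str.split(' ', expand=True)
df = df.drop('probegenes', axis=1)

# Put the new columns first; the remaining columns keep their original order.
new_cols = ['probe', 'genes']  # illustrative helper name
rest = [c for c in df.columns if c not in new_cols]
df = df[new_cols + rest]
print(df)
```

This should behave the same as the explicit column list above, but it does not need to change when more sample columns are added.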

Or, starting again from the original df (with probegenes still as the first column), use concat and iloc:

df1 = df['probegenes'].str.split(" ", expand=True)
df1.columns = ['probe', 'genes']
print(df1)
        probe     genes
0  1415777_at  Pnliprp1
1  1415805_at      Clps

print(df.iloc[:, 1:])
   sample2  sample3  sample4
0       20       10       11
1       17       13       55

print(pd.concat([df1, df.iloc[:, 1:]], axis=1))
        probe     genes  sample2  sample3  sample4
0  1415777_at  Pnliprp1       20       10       11
1  1415805_at      Clps       17       13       55
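Both approaches assume each probegenes value contains exactly one space. If a gene label could itself contain a space, the split would produce more than two columns and the assignment would fail. A possible safeguard, sketched here on hypothetical data (the "Clps variant" label is made up for illustration), is to limit the split to the first space with n=1:

```python
import io
import pandas as pd

# Hypothetical data: the second row's gene label contains a space.
temp = """probegenes,sample2
1415777_at Pnliprp1,20
1415805_at Clps variant,17"""
df = pd.read_csv(io.StringIO(temp))

# n=1 splits only at the first space, so at most two parts are produced.
parts = df['probegenes'].str.split(' ', n=1, expand=True)
parts.columns = ['probe', 'genes']

# Join the two new columns with everything except the original column.
result = pd.concat([parts, df.drop(columns='probegenes')], axis=1)
print(result)
```

Here everything after the first space is kept together in genes, so the probe ID is still isolated correctly.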
