JamesHudson81

Reputation: 2273

How can I split a column into 2?

I have the following df:

                    0
0    Fuerte venta (0,00)*
1   Infraponderar (0,00)*
2        Neutral (14,00)*
3   Sobreponderar (2,00)*
4  Fuerte compra (11,00)*

How could I split the column into 2 columns in order to obtain the following output:

               0         1
0   Fuerte venta   (0,00)*
1  Infraponderar   (0,00)*
2        Neutral  (14,00)*
3  Sobreponderar   (2,00)*
4  Fuerte compra  (11,00)*

Upvotes: 3

Views: 77

Answers (3)

piRSquared

Reputation: 294586

Option 1
list comprehension and str.rsplit
pir2

pd.DataFrame(
    [x.rsplit(' ', 1) for x in df['0'].values.tolist()]
)

               0         1
0   Fuerte venta   (0,00)*
1  Infraponderar   (0,00)*
2        Neutral  (14,00)*
3  Sobreponderar   (2,00)*
4  Fuerte compra  (11,00)*
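
A minimal plain-Python sketch of why the split is taken from the right: some labels contain a space themselves, so only the last space is a safe separator. The sample strings are copied from the question; the variable names are illustrative and not part of the original answer.

```python
# Sample values copied from the question; the names below are illustrative only.
rows = ["Fuerte venta (0,00)*", "Neutral (14,00)*"]

# split(' ', 1) cuts at the FIRST space, which breaks two-word labels.
left_split = [s.split(' ', 1) for s in rows]

# rsplit(' ', 1) cuts at the LAST space, which keeps the label whole.
right_split = [s.rsplit(' ', 1) for s in rows]

print(left_split)
print(right_split)
```

With split, 'Fuerte venta' would be cut after 'Fuerte'; with rsplit, the label and the parenthesised value end up in separate fields.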

Option 2
Using np.char.rsplit (the public name for np.core.defchararray.rsplit) pir1

pd.DataFrame(
    np.char.rsplit(df['0'].values.astype(str), ' ', 1).tolist()
)

               0         1
0   Fuerte venta   (0,00)*
1  Infraponderar   (0,00)*
2        Neutral  (14,00)*
3  Sobreponderar   (2,00)*
4  Fuerte compra  (11,00)*

Timing
The list comprehension is the fastest for both small and large datasets. The benchmark code follows.


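from timeit import timeit

import numpy as np
import pandas as pd

# np.char.rsplit is the public name for np.core.defchararray.rsplit
pir1 = lambda d: pd.DataFrame(np.char.rsplit(d['0'].values.astype(str), ' ', 1).tolist())
pir2 = lambda d: pd.DataFrame([x.rsplit(' ', 1) for x in d['0'].values.tolist()])
bos = lambda d: d['0'].str.rsplit(' ', n=1, expand=True)
vai = lambda d: pd.DataFrame(d['0'].str.rsplit(' ', n=1).tolist())

# One row per dataset multiplier, one column per method; float dtype so the frame can be plotted
results = pd.DataFrame(
    index=pd.Index([10, 30, 100, 300, 1000, 3000]),
    columns='pir1 pir2 bos vai'.split(),
    dtype=float
)

for i in results.index:
    # Repeat the sample frame i times to scale the input
    d = pd.concat([df] * i, ignore_index=True)
    for j in results.columns:
        stmt = '{}(d)'.format(j)
        setp = 'from __main__ import d, {}'.format(j)
        # DataFrame.set_value was removed in pandas 1.0; .at is its replacement
        results.at[i, j] = timeit(stmt, setp, number=100)

results.plot(loglog=True)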

Upvotes: 2

Scott Boston

Reputation: 153560

Use .str.rsplit with expand=True:

df['0'].str.rsplit(' ', n=1, expand=True)

Output:

               0         1
0   Fuerte venta   (0,00)*
1  Infraponderar   (0,00)*
2        Neutral  (14,00)*
3  Sobreponderar   (2,00)*
4  Fuerte compra  (11,00)*
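
For anyone who wants to run this end to end, here is a self-contained sketch. It assumes the single column is named with the string '0', as in the answer; if the column label is actually the integer 0, use df[0] instead. The column names 'label' and 'value' are hypothetical and chosen only for this example.

```python
import pandas as pd

# Rebuild the question's frame; the column name '0' (a string) is an assumption.
df = pd.DataFrame({'0': [
    'Fuerte venta (0,00)*',
    'Infraponderar (0,00)*',
    'Neutral (14,00)*',
    'Sobreponderar (2,00)*',
    'Fuerte compra (11,00)*',
]})

# Split once at the last space; expand=True returns a DataFrame with columns 0 and 1.
parts = df['0'].str.rsplit(' ', n=1, expand=True)

# 'label' and 'value' are hypothetical names, not part of the original answer.
parts.columns = ['label', 'value']
print(parts)
```

Renaming the columns is optional; it only makes the two pieces easier to refer to afterwards.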

Upvotes: 4

Vaishali

Reputation: 38425

You can use str.rsplit:

pd.DataFrame(df['0'].str.rsplit(' ', n=1).tolist())

You get:

    0               1
0   Fuerte venta    (0,00)*
1   Infraponderar   (0,00)*
2   Neutral         (14,00)*
3   Sobreponderar   (2,00)*
4   Fuerte compra   (11,00)*

Upvotes: 3
