Rohit Sharma

Reputation: 6500

Split and type-cast column values using pandas

How do I add an extra column to a DataFrame that splits each value on "|" and converts the pieces to integers, but holds np.nan for values that are plain strings?

Col1   
1|2|3
"string"

so that I get

Col1      ExtraCol
1|2|3     [1,2,3]
"string"  nan

I tried a long, contorted approach, but it failed:

df['extracol'] = df["col1"].str.strip().str.split("|").str[0].apply(lambda x: x.astype(np.float) if x.isnumeric() else np.nan).astype("Int32")

Upvotes: 0

Views: 119

Answers (2)

PaulS

Reputation: 25353

Another possible solution:

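import re

import numpy as np

# Split on '|' and convert to ints when the value is numeric once the '|' separators are removed; otherwise NaN.
df['ExtraCol'] = df['Col1'].apply(lambda x: [int(y) for y in re.split(
    r'\|', x)] if x.replace('|', '').isnumeric() else np.nan)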

Output:

     Col1   ExtraCol
0   1|2|3  [1, 2, 3]
1  string        NaN
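
One caveat with the isnumeric() check is that a value such as "1||2" passes it and then fails on int(''). A minimal sketch of a more defensive variant follows; the helper name to_int_list and the extra sample rows are hypothetical, not from the question. It lets int() do the validation, which is more permissive than isnumeric(): int() also accepts surrounding whitespace and a leading sign.

```python
import numpy as np
import pandas as pd


def to_int_list(value):
    """Return a list of ints for values like '1|2|3', otherwise NaN."""
    try:
        # int() raises ValueError for '', 'x', etc.;
        # non-string values (e.g. NaN) have no .split and raise AttributeError.
        return [int(part) for part in value.split("|")]
    except (AttributeError, ValueError):
        return np.nan


# Hypothetical sample data: the first two rows mirror the question.
df = pd.DataFrame({"Col1": ["1|2|3", "string", "4|x", "1||2", " 7 | 8 "]})
df["ExtraCol"] = df["Col1"].apply(to_int_list)
print(df)
```

Because the column mixes lists and NaN, it ends up with object dtype; a nullable integer dtype such as "Int32" cannot hold lists.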

Upvotes: 1

SomeDude

Reputation: 14238

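You can use a regex with Series.str.match to select the rows whose value can be split into a list of integers. The pattern is anchored so that partly numeric values such as "12abc" are not selected, and rows that do not match are left as NaN when the result is assigned back:

df['ExtraCol'] = df.loc[df['Col1'].str.match(r'^\d+(?:\|\d+)*$'), 'Col1'].str.split('|').apply(lambda parts: [int(p) for p in parts])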
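
To see the index alignment at work, here is a self-contained sketch of this masking approach on sample data. The rows "12abc" and "4|5" are made-up test values, and the anchored pattern plus the int conversion are assumptions added so that partly numeric strings are rejected and the list elements are integers rather than strings.

```python
import pandas as pd

# Hypothetical sample data: the first two rows mirror the question.
df = pd.DataFrame({"Col1": ["1|2|3", "string", "12abc", "4|5"]})

# Anchored pattern: one or more digit groups separated by '|', nothing else.
# na=False treats missing values as non-matching.
mask = df["Col1"].str.match(r"^\d+(?:\|\d+)*$", na=False)

# Split only the matching rows and convert each piece to int.
int_lists = (
    df.loc[mask, "Col1"]
    .str.split("|")
    .apply(lambda parts: [int(p) for p in parts])
)

# Assignment aligns on the index, so rows absent from int_lists become NaN.
df["ExtraCol"] = int_lists
print(df)
```

Splitting the one-liner into mask, int_lists, and the assignment makes it easier to inspect which rows were selected before the column is written.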

Upvotes: 1
