How to extract all numbers from a pandas string column into lists?

Question

How to do that?

I have a pandas DataFrame that looks like this:

Column_A
11.2 some text 17 some text 21
some text 25.2 4.1 some text 53 17 78
121.1 bla bla bla 14 some text
12 some text

I need to extract the numbers from each row into a separate list:

listA[0] = 11.2 listA[1] = 17 listA[2] = 21
listB[0] = 25.2 listB[1] = 4.1 listB[2] = 53 listB[3] = 17 listB[4] = 78
listC[0] = 121.1 listC[1] = 14
listD[0] = 12

ThePyGuy · Accepted Answer

You can use the re module to find every number, integer or float, in each row:

import re

df['Column_A'].apply(lambda x: re.findall(r"[-+]?\d*\.\d+|\d+", x)).tolist()

OUTPUT:

[['11.2', '17', '21'], ['25.2', '4.1', '53', '17', '78'], ['121.1', '14'], ['12']]
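For anyone who wants to run this end to end, here is a minimal sketch. The DataFrame is rebuilt by hand from the rows shown in the question, and the name NUMBER_PATTERN is only an illustrative label for the regex, not something from the original code.

```python
import re

import pandas as pd

# Sample data copied from the question (construction is an assumption).
df = pd.DataFrame({
    "Column_A": [
        "11.2 some text 17 some text 21",
        "some text 25.2 4.1 some text 53 17 78",
        "121.1 bla bla bla 14 some text",
        "12 some text",
    ]
})

# Same regex as in the answer: a float alternative first, then plain digits.
NUMBER_PATTERN = r"[-+]?\d*\.\d+|\d+"

# One list of matched strings per row.
result = df["Column_A"].apply(lambda x: re.findall(NUMBER_PATTERN, x)).tolist()
print(result)
```

Note that the matches are still strings at this point.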

If you want, you can cast them to int or float by checking whether each extracted string contains a '.', something like this:

df['Column_A'].apply(lambda x: re.findall(r"[-+]?\d*\.\d+|\d+", x)).map(lambda x: [int(i) if '.' not in i else float(i) for i in x]).tolist()

OUTPUT:

[[11.2, 17, 21], [25.2, 4.1, 53, 17, 78], [121.1, 14], [12]]
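The same conversion can be spelled out with a small named helper, which may be easier to read than the nested lambdas. This is only a sketch: parse_number is a hypothetical helper name, and the DataFrame is rebuilt by hand from the question's rows.

```python
import re

import pandas as pd

NUMBER_PATTERN = r"[-+]?\d*\.\d+|\d+"  # same regex as in the answer


def parse_number(token):
    """Hypothetical helper: float if the token contains '.', otherwise int."""
    return float(token) if "." in token else int(token)


# Sample data copied from the question (construction is an assumption).
df = pd.DataFrame({
    "Column_A": [
        "11.2 some text 17 some text 21",
        "some text 25.2 4.1 some text 53 17 78",
        "121.1 bla bla bla 14 some text",
        "12 some text",
    ]
})

# Extract the number strings per row, then convert each one.
converted = [
    [parse_number(tok) for tok in re.findall(NUMBER_PATTERN, text)]
    for text in df["Column_A"]
]
print(converted)
```

Each inner list then mixes int and float values, depending on how the number was written in the text.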

As pointed out by @Uts, you can also call findall directly through Series.str and unpack one list per row:

listA, listB, listC, listD = df.Column_A.str.findall(r"[-+]?\d*\.\d+|\d+")
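A runnable sketch of the Series.str.findall variant follows, again with the DataFrame rebuilt by hand from the question. It also shows one caveat of the regex: the optional sign belongs only to the float alternative, so a negative integer loses its sign. SIGNED_PATTERN is a suggested variation of mine, not part of the original answer, and the "-5 and -2.5 and +3" string is made-up test input.

```python
import pandas as pd

# Sample data copied from the question (construction is an assumption).
df = pd.DataFrame({
    "Column_A": [
        "11.2 some text 17 some text 21",
        "some text 25.2 4.1 some text 53 17 78",
        "121.1 bla bla bla 14 some text",
        "12 some text",
    ]
})

NUMBER_PATTERN = r"[-+]?\d*\.\d+|\d+"        # regex from the answer
SIGNED_PATTERN = r"[-+]?(?:\d*\.\d+|\d+)"    # assumed variant: sign applies to both alternatives

# Unpacking works because the Series has exactly four rows.
listA, listB, listC, listD = df["Column_A"].str.findall(NUMBER_PATTERN)
print(listA, listD)

# Made-up input with signed numbers to compare the two patterns.
signed = pd.Series(["-5 and -2.5 and +3"])
original_matches = signed.str.findall(NUMBER_PATTERN).tolist()
grouped_matches = signed.str.findall(SIGNED_PATTERN).tolist()
print(original_matches)
print(grouped_matches)
```

The unpacking only works when the number of names matches the number of rows; for a DataFrame of unknown length, calling .tolist() on the result is the safer choice.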
