Remove empty item from list in a pandas column

Question

I have a pandas df that looks like this:

import pandas as pd

df = pd.DataFrame(data={
    'colA': ["12.456.", "......7", "..34..7"],
    'ID': ["idx1", "idx1", "idx2"]})

I'm trying to build a column that holds each colA value as a list of its characters, so that I can do operations on it later. I create it like this:

df['colB'] = df['colA'].str.split("")

But the result looks like this:

['', '1', '2', '.', '4', '5', '6', '.', '']
['', '.', '.', '.', '.', '.', '.', '7', '']
['', '.', '.', '3', '4', '.', '.', '7', '']

As you can see, each row now holds the list I wanted, but there is an empty element at the beginning and at the end of every list.

I tried several ways to get rid of these using filter or a list comprehension, but I haven't succeeded. Maybe I did it wrong? Do you have an idea of how to do this efficiently?

ThePyGuy · Accepted Answer

You don't need to split the values at all. Just use .apply and pass list: each string is converted directly to a list of its characters, so the empty values are never created.

>>> df['colB'] = df['colA'].apply(list)
>>> df
      colA    ID                   colB
0  12.456.  idx1  [1, 2, ., 4, 5, 6, .]
1  ......7  idx1  [., ., ., ., ., ., 7]
2  ..34..7  idx2  [., ., 3, 4, ., ., 7]
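
If a column already holds split lists with empty strings in them, a list comprehension can drop those entries afterwards. The following is only a minimal sketch: the column names colSplit and colC are hypothetical, and colSplit is built by hand here to imitate the output shown in the question rather than by calling str.split("").

```python
import pandas as pd

df = pd.DataFrame(data={
    'colA': ["12.456.", "......7", "..34..7"],
    'ID': ["idx1", "idx1", "idx2"]})

# Option 1 (the approach from the answer): build the list directly from the string.
df['colB'] = df['colA'].apply(list)

# Option 2: clean up a column that already contains '' entries.
# 'colSplit' is a hypothetical column that imitates the question's output
# by padding each character list with an empty string on both ends.
df['colSplit'] = df['colA'].apply(lambda s: [''] + list(s) + [''])

# Keep only the non-empty elements of each list ('colC' is also hypothetical).
df['colC'] = df['colSplit'].apply(lambda chars: [c for c in chars if c != ''])

# Both options should give the same lists.
print(df['colC'].tolist() == df['colB'].tolist())
```

When the original strings are still available, Option 1 is simpler because nothing has to be filtered out afterwards.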
