Reputation: 231
I have a pandas df that looks like this:
import pandas as pd

df = pd.DataFrame(data={
    'colA': ["12.456.", "......7", "..34..7"],
    'ID': ["idx1", "idx1", "idx2"]})
I'm trying to build a column that contains, for each row, a list of the characters in colA, so that I can do operations on it later. I create it like this:
df['colB'] = df['colA'].str.split("")
But the result looks like this:
['', '1', '2', '.', '4', '5', '6', '.', '']
['', '.', '.', '.', '.', '.', '.', '7', '']
['', '.', '.', '3', '4', '.', '.', '7', '']
As you can see, each row now has the list I wanted, but there is an empty element at the beginning and at the end of every list.
I tried several ways to get rid of them, using filter or a list comprehension, but I haven't succeeded. Maybe I did it wrong? Do you have an idea of how to do this efficiently?
Upvotes: 1
Views: 169
Reputation: 18406
You don't need to split the values. Just use .apply and pass list: each string is converted to a list of its characters, so the empty elements are never created.
>>> df['colB'] = df['colA'].apply(list)
>>> df
      colA    ID                   colB
0  12.456.  idx1  [1, 2, ., 4, 5, 6, .]
1  ......7  idx1  [., ., ., ., ., ., 7]
2  ..34..7  idx2  [., ., 3, 4, ., ., 7]
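To see where the empty elements probably come from, here is a plain-Python sketch that doesn't use pandas. It assumes the empty pattern ends up being treated as a regular expression, which is consistent with the output in the question but isn't confirmed here. The names rows, split_result, filtered and direct are only for illustration, and re.split with an empty pattern needs Python 3.7 or later.

```python
import re

# Same strings as in colA (plain list used here instead of a DataFrame).
rows = ["12.456.", "......7", "..34..7"]

# An empty regex pattern matches at every position, including before the
# first character and after the last one, hence the leading/trailing ''.
split_result = [re.split("", s) for s in rows]

# Dropping the empty strings afterwards works, but it is an extra pass.
filtered = [[ch for ch in parts if ch] for parts in split_result]

# list() simply iterates over the string, one character per element.
direct = [list(s) for s in rows]

print(split_result[0])     # list with '' at both ends
print(filtered == direct)  # both approaches give the same lists
```

Filtering after the split gives the same lists as list(), so passing list to .apply skips the cleanup step entirely.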
Upvotes: 1