Mohammadreza Riahi

Reputation: 602

Convert each row of a dataframe to list

I have a dataframe like this:

df = pd.DataFrame({'A': ['1', '2', '3'], 'B': ['aa', 'b', 'c']})
   A  B
0  1  aa
1  2  b
2  3  c

I want to convert each row of column B to a list. For example, my desired output is something like this:

   df_new
   A  B
0  1  [aa]
1  2  [b]
2  3  [c]

Upvotes: 4

Views: 3825

Answers (3)

jezrael

Reputation: 863531

I think the solution from the comments is very fast:

df['B'] = df['B'].map(lambda i: [i])

A list comprehension is faster still:

df['B'] = [[i] for i in df['B']]
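For reference, here is a minimal self-contained sketch of the list comprehension applied to the data from the question. The variable name `wrapped` is only illustrative and does not appear in the original post.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'A': ['1', '2', '3'], 'B': ['aa', 'b', 'c']})

# Wrap each value of column B in a one-element list
df['B'] = [[value] for value in df['B']]

# Plain Python view of the column, for inspection
wrapped = df['B'].tolist()
print(df)
```

Each cell of B now holds a Python list, so the column has dtype object.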

Performance:

df = pd.DataFrame({'A': ['1', '2', '3'], 'B': ['as', 'b', 'c']})

#30k rows
df = pd.concat([df] * 10000, ignore_index=True)


In [93]: %timeit df['B'].apply(lambda x: x.split(','))
11.1 ms ± 963 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [94]: %timeit df['B'].str.split()
13.1 ms ± 788 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [96]: %timeit df['B'].map(lambda i: [i])
7.15 ms ± 54.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [97]: %timeit df['B'].apply(lambda i: [i])
7.21 ms ± 48.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [98]: %timeit df['B'].str.split(',')
13.9 ms ± 1.46 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [99]: %timeit [[i] for i in df['B']]
5.84 ms ± 73.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Upvotes: 8

user6223604

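You can use str.split, which returns a list. When a value contains no comma, the result is a one-element list holding the original string: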

import pandas as pd
df = pd.DataFrame({'A': ['1', '2', '3'], 'B': ['a', 'b', 'c']})
df['B'] = df['B'].apply(lambda x: x.split(','))
print(df)
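One caveat worth illustrating: split(',') wraps a value in a one-element list only when the value contains no comma. The sketch below uses a made-up Series (the value 'b,c' is not from the question) to show the difference.

```python
import pandas as pd

# Hypothetical data: the second value contains the separator
s = pd.Series(['a', 'b,c'])

# split(',') breaks 'b,c' into two elements instead of wrapping it
split_result = s.apply(lambda x: x.split(',')).tolist()

# Wrapping explicitly keeps every value intact
wrapped_result = s.apply(lambda x: [x]).tolist()

print(split_result)
print(wrapped_result)
```

If the column may contain commas and each value should stay whole, wrapping with `[x]` is the safer choice.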

Upvotes: 6

sophocles

Reputation: 13841

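You could use apply with list. Note that list() splits a string into its individual characters, so this gives a one-element list only for single-character values such as 'a':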

df['B'] = df['B'].apply(list)

   A    B
0  1  [a]
1  2  [b]
2  3  [c]
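A short sketch of that behaviour on the data from the question, where B contains the two-character value 'aa'. The names `as_chars` and `as_wrapped` are illustrative only.

```python
import pandas as pd

# Column B from the question, including the two-character value 'aa'
s = pd.Series(['aa', 'b', 'c'])

# list('aa') iterates over the string, yielding one element per character
as_chars = s.apply(list).tolist()

# Wrapping in a list literal keeps the whole string as a single element
as_wrapped = s.apply(lambda x: [x]).tolist()

print(as_chars)
print(as_wrapped)
```

For the output requested in the question ([aa] rather than [a, a]), the explicit `[x]` form is the one that matches.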

Upvotes: 2
