Reputation: 1858
I have the following pandas DataFrame:
import numpy as np
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2, 3, 4, 47, 27], 'B': [5, 6, 7, 8, 21, 40],
'C': [9, 10, 11, 12, 45, 33], 'D': [3, 4, 1, 2, 27, 47], 'E': [7, 8, 5, 6, 40, 21],
'F': [[[11, 35], [36, 37]], [[12, 42], [14, 11]], [[9, 37], [10, 43], [12, 28]], [[105, 27]], [], [[45, 2]]]})
print(df1)
## A B C D E F
## 0 1 5 9 3 7 [[11, 35], [36, 37]]
## 1 2 6 10 4 8 [[12, 42], [14, 11]]
## 2 3 7 11 1 5 [[9, 37], [10, 43], [12, 28]]
## 3 4 8 12 2 6 [[105, 27]]
## 4 47 21 45 27 40 []
## 5 27 40 33 47 21 [[45, 2]]
##
Each entry of column F is a list of lists. I would like to convert each entry to a list of tuples.
Normally, the way to convert a list of lists into a list of tuples is via a simple list comprehension, e.g.
foo = [[9, 37], [10, 43], [12, 28]]
foo = [tuple(lst) for lst in foo]
print(foo)
## [(9, 37), (10, 43), (12, 28)]
However, I do not know how to efficiently do this row-by-row in pandas. My first thought was to create a new column as follows:
df1['new_col'] = [tuple(lst) for lst in df1.F]
However, this gives the wrong result: each entry of new_col is a tuple of lists, not a list of tuples:
df1
A B C D E F new_col
0 1 5 9 3 7 [[11, 35], [36, 37]] ([11, 35], [36, 37])
1 2 6 10 4 8 [[12, 42], [14, 11]] ([12, 42], [14, 11])
2 3 7 11 1 5 [[9, 37], [10, 43], [12, 28]] ([9, 37], [10, 43], [12, 28])
3 4 8 12 2 6 [[105, 27]] ([105, 27],)
4 47 21 45 27 40 [] ()
5 27 40 33 47 21 [[45, 2]] ([45, 2],)
I'm sorry if this is obvious; my pandas is rusty.
Upvotes: 0
Views: 1356
Reputation: 91
The loop for lst in df1.F iterates over the rows, so tuple is applied to each row's outer list, not to the inner lists as you desire.
A second, nested for over the inner lists of each row does the job. Try this:
df1['new_col'] = [[tuple(lst_in) for lst_in in lst] for lst in df1.F]
Output:
A B C D E F new_col
0 1 5 9 3 7 [[11, 35], [36, 37]] [(11, 35), (36, 37)]
1 2 6 10 4 8 [[12, 42], [14, 11]] [(12, 42), (14, 11)]
2 3 7 11 1 5 [[9, 37], [10, 43], [12, 28]] [(9, 37), (10, 43), (12, 28)]
3 4 8 12 2 6 [[105, 27]] [(105, 27)]
4 47 21 45 27 40 [] []
5 27 40 33 47 21 [[45, 2]] [(45, 2)]
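For anyone who wants to run this in isolation, here is a minimal self-contained sketch of the same nested comprehension. It assumes a reduced three-row copy of column F (the frame name df and the loop names row and inner are just illustrative), and it shows that an empty row stays an empty list.

```python
import pandas as pd

# Reduced copy of column F from the question, including the empty row.
df = pd.DataFrame({'F': [[[11, 35], [36, 37]], [[105, 27]], []]})

# Outer loop walks the rows; inner loop turns each inner list into a tuple.
df['new_col'] = [[tuple(inner) for inner in row] for row in df['F']]

print(df['new_col'].tolist())
# [[(11, 35), (36, 37)], [(105, 27)], []]
```

The original F column is left untouched, so the two columns can be compared side by side.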
Upvotes: 1
Reputation: 742
Try this:
In [8]: df1['new_col'] = [list(map(tuple, lst)) for lst in df1.F]
In [9]: print(df1)
A B C D E F new_col
0 1 5 9 3 7 [[11, 35], [36, 37]] [(11, 35), (36, 37)]
1 2 6 10 4 8 [[12, 42], [14, 11]] [(12, 42), (14, 11)]
2 3 7 11 1 5 [[9, 37], [10, 43], [12, 28]] [(9, 37), (10, 43), (12, 28)]
3 4 8 12 2 6 [[105, 27]] [(105, 27)]
4 47 21 45 27 40 [] []
5 27 40 33 47 21 [[45, 2]] [(45, 2)]
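The same conversion can also be written with Series.apply, which some people find easier to read. This is only a sketch on a reduced copy of column F; the helper name to_tuples is made up for illustration. Since F is an object column, this is probably not faster than the list comprehension, just a different spelling.

```python
import pandas as pd

# Reduced copy of column F from the question, including the empty row.
df = pd.DataFrame({'F': [[[12, 42], [14, 11]], [], [[45, 2]]]})

def to_tuples(cell):
    """Convert one cell (a list of lists) into a list of tuples."""
    return list(map(tuple, cell))

# apply calls to_tuples once per row of the column.
df['new_col'] = df['F'].apply(to_tuples)

print(df['new_col'].tolist())
# [[(12, 42), (14, 11)], [], [(45, 2)]]
```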
Upvotes: 1