Reputation: 35

Convert lists of tuples from rows in a pandas dataframe into one list of tuples

I have a pandas DataFrame in which each row of one column holds a list of tuples, and I want to merge those lists into a single list of tuples. The dataset has 10,000+ rows.


InvoiceNo      Description    
534            [(AB, AC), (ACBO, PPK)]
415            [(AD, AT), (CBO, PKD), (CBO, PKA)]
315            [(FDC, ATO), (VBO, IKD), (CVB, PKD)]

Desired output:

Edges = [(AB, AC), (ACBO, PPK), (AD, AT), (CBO, PKD), (CBO, PKA), (FDC, ATO), (VBO, IKD), (CVB, PKD)]

Upvotes: 1

Answers (3)

Derek Eden

Reputation: 4638

For pandas 0.25 and later, you can also use the explode method:

df['Description'].explode().tolist()

Output:

[('AB', 'AC'), ('ACBO', 'PPK'), ('AD', 'AT'), ('CBO', 'PKD'), ('CBO', 'PKA'), ('FDC', 'ATO'), ('VBO', 'IKD'), ('CVB', 'PKD')]
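
For anyone who wants to try this end to end, here is a minimal sketch. The DataFrame is rebuilt by hand from the sample in the question, and it assumes the Description column holds real Python lists of tuples (not their string representation); the lowercase name `edges` is just an illustrative choice.

```python
import pandas as pd

# Sample data rebuilt from the question; assumes each cell is a real list of tuples.
df = pd.DataFrame({
    "InvoiceNo": [534, 415, 315],
    "Description": [
        [("AB", "AC"), ("ACBO", "PPK")],
        [("AD", "AT"), ("CBO", "PKD"), ("CBO", "PKA")],
        [("FDC", "ATO"), ("VBO", "IKD"), ("CVB", "PKD")],
    ],
})

# explode() turns every list element into its own row; the tuples themselves stay intact.
edges = df["Description"].explode().tolist()
print(len(edges))
```

One thing to watch: explode turns an empty list into NaN, so if some rows can be empty you may want to call .dropna() before .tolist().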

Upvotes: 2

Elton Clark

Reputation: 156

With that many rows, do duplicate edges cause problems for your application?

If they do, consider using a set instead of a list. Then you can use jezrael's comprehension one-liner with {} instead of []:

Edges = {y for x in df.Description for y in x}
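
A set does not keep the original order of the edges. If order matters, one possible alternative is dict.fromkeys, which keeps the first occurrence of each edge (dicts preserve insertion order in Python 3.7+). A small sketch, where the plain list `rows` is a made-up stand-in for the values of df.Description:

```python
# `rows` is a hypothetical stand-in for the values of df.Description.
rows = [
    [("AB", "AC"), ("ACBO", "PPK")],
    [("AD", "AT"), ("CBO", "PKD"), ("AB", "AC")],  # ("AB", "AC") is repeated on purpose
]

# Set comprehension: duplicates removed, order not guaranteed.
unique_unordered = {edge for row in rows for edge in row}

# dict.fromkeys: duplicates removed, first-seen order kept.
unique_ordered = list(dict.fromkeys(edge for row in rows for edge in row))
print(unique_ordered)
```

Both approaches rely on the tuples being hashable, which holds for tuples of strings like the ones in the question.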

Upvotes: 0

jezrael

Reputation: 863801

Use a list comprehension to flatten the nested lists of tuples:

Edges = [y for x in df.Description for y in x]
print(Edges)
[('AB', 'AC'), ('ACBO', 'PPK'), ('AD', 'AT'), ('CBO', 'PKD'), 
 ('CBO', 'PKA'), ('FDC', 'ATO'), ('VBO', 'IKD'), ('CVB', 'PKD')]

Or use chain.from_iterable, which is usually faster on large inputs:

from itertools import chain

Edges = list(chain.from_iterable(df.Description))
print(Edges)
[('AB', 'AC'), ('ACBO', 'PPK'), ('AD', 'AT'), ('CBO', 'PKD'), 
 ('CBO', 'PKA'), ('FDC', 'ATO'), ('VBO', 'IKD'), ('CVB', 'PKD')]
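
If you want to check that both approaches agree on your own data, here is a small self-contained sketch. The DataFrame is rebuilt by hand from the sample in the question, and the names `edges_comp` and `edges_chain` are only illustrative.

```python
from itertools import chain

import pandas as pd

# Sample data rebuilt from the question; assumes each cell is a real list of tuples.
df = pd.DataFrame({
    "InvoiceNo": [534, 415, 315],
    "Description": [
        [("AB", "AC"), ("ACBO", "PPK")],
        [("AD", "AT"), ("CBO", "PKD"), ("CBO", "PKA")],
        [("FDC", "ATO"), ("VBO", "IKD"), ("CVB", "PKD")],
    ],
})

# Nested comprehension: outer loop over rows, inner loop over tuples in each row.
edges_comp = [edge for row in df["Description"] for edge in row]

# chain.from_iterable lazily concatenates the per-row lists; list() materializes the result.
edges_chain = list(chain.from_iterable(df["Description"]))

print(edges_comp == edges_chain)
```

Both keep duplicates and preserve row order, so which one to use mostly comes down to readability and speed; timing them with timeit on your real 10,000+ rows is the safest way to compare.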

Upvotes: 6
