sam

Reputation: 494

Converting a DataFrame column into list of pairs

   col_1
A      2
B      8
C      4
D      3
E      1

I would like to convert this into something like:

[[1,2],[3,4],...]

I have tried a for loop:

def get_pair(col):
    lst = sorted(list(col))
    pairs = []
    for i in range(len(lst)):
        for j in range(i+1, len(lst)):
            pair = [lst[i], lst[j]]
            pairs.append(pair)
    return pairs

Is there an efficient way of doing this in Pandas?

Upvotes: 1

Views: 160

Answers (3)

sammywemmy

Reputation: 28644

The other answers should work fine; this is an alternative using zip_longest:

from itertools import zip_longest

box = df.col_1.array

list(zip_longest(box[::2], box[1::2]))

[(2, 8), (4, 3), (1, None)]
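If lists of plain Python ints are preferred over tuples, the same idea can be sketched as below. This is only a variation under assumptions: the sample frame is rebuilt from the question's data, and `values` and `pairs` are names made up for the illustration.

```python
from itertools import zip_longest

import pandas as pd

# Sample data copied from the question (index labels A..E).
df = pd.DataFrame({"col_1": [2, 8, 4, 3, 1]}, index=list("ABCDE"))

# to_list() returns plain Python ints rather than NumPy scalars.
values = df["col_1"].to_list()

# Pair items at even positions with items at odd positions;
# zip_longest pads the shorter side with None by default.
pairs = [list(p) for p in zip_longest(values[::2], values[1::2])]
print(pairs)  # [[2, 8], [4, 3], [1, None]]
```

zip_longest also accepts a fillvalue argument if something other than None should pad the last pair.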

Upvotes: 1

Joe Ferndz

Reputation: 8508

Solution for an even or odd number of items, using numpy

Expanding on the wonderful idea that @sammywemmy provided, I am using numpy. The solution below handles both even and odd numbers of items:

import pandas as pd
import numpy as np

df = pd.DataFrame({'col_1':[1,2,3,4,5,6,7,8,9]})

#check the length of df. If the length is odd, take only the first n-1 items
x = len(df)
y = x if x%2 == 0 else x-1

#reshape the first y items into rows of two
z = np.reshape(df.col_1.to_numpy()[:y], (-1, 2)).tolist()

#if n is odd, append the nth item paired with None
if x != y: z.append([df.values.tolist()[-1][0],None])

#print result
print(z)

The output will be:

[[1, 2], [3, 4], [5, 6], [7, 8], [9, None]]

If you want the result set to look like this:

[[1, 2], [3, 4], [5, 6], [7, 8], [9]]

then change the z.append line to:

if x != y: z.append(df.values.tolist()[-1])
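The two variants above could be folded into one function. The following is a rough sketch, not a definitive implementation: `pair_up` and its `pad` parameter are hypothetical names introduced here, and it assumes the column has a numeric dtype so that the last element is a NumPy scalar.

```python
import numpy as np
import pandas as pd


def pair_up(series, pad=True):
    """Hypothetical helper: split a numeric Series into consecutive pairs."""
    values = series.to_numpy()
    # Largest even number of items that can be reshaped into rows of two.
    even_len = len(values) - len(values) % 2
    pairs = values[:even_len].reshape(-1, 2).tolist()
    if even_len != len(values):
        # Odd length: convert the leftover NumPy scalar to a Python value.
        last = values[-1].item()
        pairs.append([last, None] if pad else [last])
    return pairs


df = pd.DataFrame({"col_1": [1, 2, 3, 4, 5, 6, 7, 8, 9]})
padded = pair_up(df["col_1"])
unpadded = pair_up(df["col_1"], pad=False)
print(padded)    # [[1, 2], [3, 4], [5, 6], [7, 8], [9, None]]
print(unpadded)  # [[1, 2], [3, 4], [5, 6], [7, 8], [9]]
```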

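Solution if the list has an even number of items

Assuming that your DataFrame has an even number of rows, you can use iterrows() and a list comprehension to get what you want. Note that i+1 is used as a row position, so this relies on the default integer index (0, 1, 2, ...); it will not work as written with a labelled index such as A, B, C.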

import pandas as pd
df = pd.DataFrame({'col_1':[1,2,3,4,5,6,7,8]})
print(df)

a = [[v['col_1'], df.iloc[i+1]['col_1']] for i, v in df.iloc[::2].iterrows()]
print(a)

This will give you:

[[1, 2], [3, 4], [5, 6], [7, 8]]
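For a frame whose index is not 0, 1, 2, ... (as in the question, where the rows are labelled with letters), one possible variation is to work on the column values directly instead of going through iterrows(). This is a minimal sketch; the lettered index and the name `values` are assumptions made for the illustration.

```python
import pandas as pd

# Same values as above, but with a lettered index like the question's.
df = pd.DataFrame({"col_1": [1, 2, 3, 4, 5, 6, 7, 8]}, index=list("ABCDEFGH"))

# Pull the column out as a plain list so index labels play no role.
values = df["col_1"].to_list()

# zip stops at the shorter slice, so with an odd number of rows
# the last item would simply be dropped.
a = [[first, second] for first, second in zip(values[::2], values[1::2])]
print(a)  # [[1, 2], [3, 4], [5, 6], [7, 8]]
```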

Upvotes: 2

Mayank Porwal

Reputation: 34046

You can do this using a list comprehension:

In [644]: df
Out[644]: 
   col_1
A      1
B      2
C      3
D      4

In [656]: l = df.T.values.tolist()[0]

In [672]: pairs = [l[:c][-2:] for c, i in enumerate(l, 1) if c % 2 == 0]

In [673]: pairs
Out[673]: [[1, 2], [3, 4]]
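A shorter comprehension that steps through the list two items at a time should give the same result. This is a sketch under the assumption that the frame looks like the one shown above; it is a swapped-in slicing technique, not the enumerate-based approach from the session.

```python
import pandas as pd

# Rebuild the small frame from the session above.
df = pd.DataFrame({"col_1": [1, 2, 3, 4]}, index=list("ABCD"))

l = df["col_1"].to_list()

# Take slices of length two starting at every second position.
# With an odd number of items the final slice holds a single element.
pairs = [l[i:i + 2] for i in range(0, len(l), 2)]
print(pairs)  # [[1, 2], [3, 4]]
```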

Upvotes: 1
