yingnan liu

Reputation: 519

Python group two lists

I have two lists:

             A = ['T', 'D', 'Q', 'D', 'D']
             sessionid = [1, 1, 1, 2, 2]

Is there any way I could group the items in A that share the same sessionid, so that I could print out the following:

              1: ["T", "D","Q"]
              2: ["D","D"]

Upvotes: 5

Views: 6149

Answers (6)

PM 2Ring

Reputation: 55499

The itertools groupby function is designed to do this sort of thing. Some of the other answers here create a dictionary, which is very sensible, but if you don't actually want a dict then you can do this:

from itertools import groupby
from operator import itemgetter

A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]    

for k, g in groupby(zip(sessionid, A), itemgetter(0)):
    print('{}: {}'.format(k, list(list(zip(*g))[1])))

output

1: ['T', 'D', 'Q']
2: ['D', 'D']

operator.itemgetter(0) returns a callable that fetches the item at index 0 of whatever object you pass it; groupby uses this as the key function to determine what items can be grouped together.
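
To make that concrete, here is a small sketch of what the key function sees. The names get_key, pairs and keys are only for illustration and are not part of the answer's code:

```python
from operator import itemgetter

A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]

# itemgetter(0) behaves like: lambda pair: pair[0]
get_key = itemgetter(0)

# zip pairs each session id with its item
pairs = list(zip(sessionid, A))

# These are the keys groupby compares to decide where a group ends
keys = [get_key(p) for p in pairs]
print(keys)  # [1, 1, 1, 2, 2]
```

groupby starts a new group each time the key changes from one pair to the next.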

Note that this and similar solutions assume that sessionid is already sorted, because groupby only groups consecutive items that share a key. If it isn't sorted, you need to sort the list of tuples returned by zip(sessionid, A) with the same key function before passing it to groupby.
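
A minimal sketch of the unsorted case, assuming made-up input where the session ids are interleaved (this data is hypothetical, not from the question):

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unsorted input, for illustration only
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [2, 1, 2, 1, 1]

key = itemgetter(0)

# Sort by session id first; sorted() is stable, so items keep
# their original relative order within each session
pairs = sorted(zip(sessionid, A), key=key)

# Now equal keys are adjacent and groupby yields one group per session
grouped = {k: [item for _, item in g] for k, g in groupby(pairs, key)}
print(grouped)  # {1: ['D', 'D', 'D'], 2: ['T', 'Q']}
```

Without the sort, groupby would yield session 2 and session 1 more than once each, since it only looks at neighbouring items.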


edited to work correctly on Python 2 and Python 3

Upvotes: 4

malbarbo

Reputation: 11197

One-liner (after import itertools, operator):

{k: list(i for (i, _) in v) for k, v in itertools.groupby(zip(A, sessionid), operator.itemgetter(1))}

Without the nested loop:

{k: list(map(operator.itemgetter(0), v)) for k, v in itertools.groupby(zip(A, sessionid), operator.itemgetter(1))}

Upvotes: 2

Farhan.K

Reputation: 3485

You could use a dictionary and zip:

A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]

result = {i:[] for i in sessionid}
for i,j in zip(sessionid,A):
    result[i].append(j)

Or you can use defaultdict:

from collections import defaultdict
result = defaultdict(list)
for k, v in zip(sessionid, A):
    result[k].append(v)

Output:

>>> result
{1: ['T', 'D', 'Q'], 2: ['D', 'D']}

Upvotes: 3

JP1

Reputation: 733

You could also convert them into numpy arrays and use np.where to get the indices of the session ids you need:

import numpy as np

A = np.asarray(['T', 'D', 'Q', 'D', 'D'])
sessionid = np.asarray([1, 1, 1, 2, 2])

Ind_1 = np.where(sessionid == 1)
Ind_2 = np.where(sessionid == 2)

print(A[Ind_1])

This should print ['T' 'D' 'Q'].

You could of course turn this into a function that takes N, the desired session, and returns the matching values of A.

Hope this helps!

Upvotes: 1

zaphodef

Reputation: 373

Not using itertools, you can use a dictionary:

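index = 0
groups = {}  # named groups to avoid shadowing the built-in dict
for i in sessionid:
    if i not in groups:
        groups[i] = []
    # append on every iteration, including the first item of each session
    groups[i].append(A[index])
    index += 1

print(groups) # {1: ['T', 'D', 'Q'], 2: ['D', 'D']}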

And based on the remarks below:

from collections import defaultdict
groups = defaultdict(list)  # named groups to avoid shadowing the built-in dict
for i, item in enumerate(sessionid):
    groups[item].append(A[i])

Upvotes: 3

Colonel Beauvel

Reputation: 31181

You can do this with pandas:

import pandas as pd

A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]

pd.DataFrame({'A':A, 'id':sessionid}).groupby('id')['A'].apply(list).to_dict()

#Out[10]: {1: ['T', 'D', 'Q'], 2: ['D', 'D']}

Upvotes: 1
