Reputation: 519
I have two lists:
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]
Is there any way I could group the items in A that share the same sessionid, so that I could print out the following:
1: ["T", "D","Q"]
2: ["D","D"]
Upvotes: 5
Views: 6149
Reputation: 55499
The itertools groupby function is designed to do this sort of thing. Some of the other answers here create a dictionary, which is very sensible, but if you don't actually want a dict then you can do this:
from itertools import groupby
from operator import itemgetter
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]
for k, g in groupby(zip(sessionid, A), itemgetter(0)):
    print('{}: {}'.format(k, list(list(zip(*g))[1])))
output
1: ['T', 'D', 'Q']
2: ['D', 'D']
operator.itemgetter(0) returns a callable that fetches the item at index 0 of whatever object you pass it; groupby uses this as the key function to determine which items can be grouped together.
Note that this and similar solutions assume that the sessionid values are sorted, because groupby only groups consecutive items with equal keys. If they aren't sorted, then you need to sort the list of tuples returned by zip(sessionid, A) with the same key function before passing them to groupby.
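To illustrate the sorting note, here is a minimal sketch. The unsorted sessionid list is made up for the example and is not from the question. Since sorted is stable, items keep their original relative order within each session.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unsorted input, chosen only to illustrate the point
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [2, 1, 2, 1, 1]

key = itemgetter(0)
# Sort the (id, item) pairs by id first; the sort is stable, so the
# order of items inside each session is preserved
pairs = sorted(zip(sessionid, A), key=key)

# Collect each group as (id, list of items)
grouped = [(k, [item for _, item in g]) for k, g in groupby(pairs, key)]

for k, items in grouped:
    print('{}: {}'.format(k, items))
```

Without the sorted call, groupby would start a new group every time the id changes, so the same id could show up several times.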
edited to work correctly on Python 2 and Python 3
Upvotes: 4
Reputation: 11197
One liner
import itertools
import operator
{k: list(i for (i, _) in v) for k, v in itertools.groupby(zip(A, sessionid), operator.itemgetter(1))}
Without nested loop
{k: list(map(operator.itemgetter(0), v)) for k, v in itertools.groupby(zip(A, sessionid), operator.itemgetter(1))}
Upvotes: 2
Reputation: 3485
You could use a dictionary and zip:
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]
result = {i:[] for i in sessionid}
for i,j in zip(sessionid,A):
    result[i].append(j)
Or you can use defaultdict:
from collections import defaultdict
result = defaultdict(list)
for k, v in zip(sessionid, A):
    result[k].append(v)
Output:
>>> result
{1: ['T', 'D', 'Q'], 2: ['D', 'D']}
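If you want the exact "id: items" lines from the question rather than the dict repr, something like the following should work. This is just a sketch built on the defaultdict version above; the lines variable is a name introduced here for illustration.

```python
from collections import defaultdict

A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]

# Group the items of A by their session id
result = defaultdict(list)
for k, v in zip(sessionid, A):
    result[k].append(v)

# Build one "id: items" line per session, in ascending id order
lines = ['{}: {}'.format(k, result[k]) for k in sorted(result)]
print('\n'.join(lines))
```

Unlike the groupby approach, this does not need sessionid to be sorted, because each item is appended to its own key regardless of position.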
Upvotes: 3
Reputation: 733
You could also convert them into numpy arrays, and use np.where to get the indices of the session ids you need:
import numpy as np
A = np.asarray(['T', 'D', 'Q', 'D', 'D'])
sessionid = np.asarray([1, 1, 1, 2, 2])
Ind_1 = np.where(sessionid == 1)
Ind_2 = np.where(sessionid == 2)
print(A[Ind_1])
should print ['T' 'D' 'Q']
You could of course turn this into a function which takes N, the desired session, and returns the matching A values.
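A rough sketch of such a function, assuming numpy is installed. The name items_for_session and its parameter names are made up for this example.

```python
import numpy as np

def items_for_session(items, session_ids, n):
    """Return the entries of items whose session id equals n.

    Hypothetical helper, not part of numpy.
    """
    items = np.asarray(items)
    session_ids = np.asarray(session_ids)
    # np.where returns a tuple of index arrays, which can index items directly
    return items[np.where(session_ids == n)]

A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]
print(items_for_session(A, sessionid, 2))
```

The result is a numpy array; call .tolist() on it if you need a plain Python list.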
Hope this helps!
Upvotes: 1
Reputation: 373
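Not using itertools, you can use a dictionary:
index = 0
groups = {}  # not named dict, which would shadow the built-in
for i in sessionid:
    # Create the list the first time this session id is seen
    if i not in groups:
        groups[i] = []
    # Always append, including the first item of each session
    groups[i].append(A[index])
    index += 1
print(groups) # {1: ['T', 'D', 'Q'], 2: ['D', 'D']}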
And based on the remarks below:
from collections import defaultdict
groups = defaultdict(list)  # not named dict, which would shadow the built-in
for i, item in enumerate(sessionid):
    groups[item].append(A[i])
Upvotes: 3
Reputation: 31181
You can do:
import pandas as pd
A = ['T', 'D', 'Q', 'D', 'D']
sessionid = [1, 1, 1, 2, 2]
pd.DataFrame({'A':A, 'id':sessionid}).groupby('id')['A'].apply(list).to_dict()
#Out[10]: {1: ['T', 'D', 'Q'], 2: ['D', 'D']}
Upvotes: 1