Reputation: 77
I have a list of customer information, and each item in the customers list is itself a list holding one customer's details. So:
customers = [
    ['customerID1', 'NameOfCustomer1', 'etc.', '01 02 03'],
    ['customerID2', 'NameOfCustomer2', 'etc.', '02 05'],
    # ...
]
The two-digit codes at the end of each customer's entry are the categories I need to assign that customer to. I have a dictionary with n keys, one for each category:
dict = {
01: [],
02: [],
03: [],
04: [],
05: []
}
Now I need to allocate the customers to their respective categories, so that customer 1 ends up in categories 01, 02 and 03, while customer 2 goes into 02 and 05. Of course I could write n if-statements, one for each existing category, but that scales badly as the number of categories grows. What I wanted to do instead is get a list of categories from each customer:
for customer in customers:
    categories = re.findall(r'[0-9]{2}', customer[3])
So much for the easy part. Now I am looking for a way to basically loop through this 'categories'-list:
for category in categories:
    dict[category].append(customer)
However, Python doesn't seem to like me using a variable to select a key. There's probably a stupidly easy solution for this one; I am just not aware of it.
Thank you very much everyone!
Upvotes: 1
Views: 193
Reputation: 353059
Step #1 is to turn those flat lists into dictionaries, which are more useful for accessing properties by name. I've had to imagine what your data actually looks like, but you should get the idea:
>>> customers = [
...     ['customerID1', 'NameOfCustomer1', 'e','t','c', '01 02 03'],
...     ['customerID2', 'NameOfCustomer2', 'e','t','c', '02 05']
... ]
>>>
>>> cust_keys = ('id', 'name', 'q1','q2','q3','categories')
>>> cdicts = [dict(zip(cust_keys, cust_vals)) for cust_vals in customers]
>>> cdicts
[{'q1': 'e', 'q3': 'c', 'q2': 't', 'name': 'NameOfCustomer1', 'id': 'customerID1', 'categories': '01 02 03'}, {'q1': 'e', 'q3': 'c', 'q2': 't', 'name': 'NameOfCustomer2', 'id': 'customerID2', 'categories': '02 05'}]
Better would be to have the categories as lists of codes, and we don't need regex for that:
>>> for cdict in cdicts:
...     cdict['categories'] = cdict['categories'].split()
...
>>> cdicts
[{'q1': 'e', 'q3': 'c', 'q2': 't', 'name': 'NameOfCustomer1', 'id': 'customerID1', 'categories': ['01', '02', '03']}, {'q1': 'e', 'q3': 'c', 'q2': 't', 'name': 'NameOfCustomer2', 'id': 'customerID2', 'categories': ['02', '05']}]
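To check that `split()` really is enough here, the sketch below compares it with the regex from the question on the sample strings. It assumes the codes are always two digits separated by whitespace, which is how I imagined the data; the name `samples` is just for illustration.

```python
import re

# Sample category strings, as imagined from the question (an assumption).
samples = ['01 02 03', '02 05']

# Split on whitespace, no regex needed.
split_results = [s.split() for s in samples]

# The question's approach: pull out every run of exactly two digits.
regex_results = [re.findall(r'[0-9]{2}', s) for s in samples]

print(split_results == regex_results)
```

If the real data uses another separator (commas, say) or codes of varying length, the two approaches will no longer agree, and you would have to pick whichever matches the actual format.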
Now, in order to append to a bunch of category lists, we can either check each time whether the key exists and make an empty list if not, or we can use a defaultdict, which handles that for us:
>>> from collections import defaultdict
>>> by_categories = defaultdict(list)
>>> for customer in cdicts:
...     for category in customer['categories']:
...         by_categories[category].append(customer)
...
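For comparison, the other option mentioned above (creating the empty list yourself when the key is missing) might look roughly like this with a plain dict and `setdefault`. This is only a sketch on a trimmed-down version of the sample data, not a replacement for the defaultdict version.

```python
# Trimmed-down sample data (the q1/q2/q3 fields are left out for brevity).
cdicts = [
    {'id': 'customerID1', 'name': 'NameOfCustomer1', 'categories': ['01', '02', '03']},
    {'id': 'customerID2', 'name': 'NameOfCustomer2', 'categories': ['02', '05']},
]

by_categories = {}
for customer in cdicts:
    for category in customer['categories']:
        # setdefault returns the list already stored under this key,
        # or stores a new empty list first if the key is missing.
        by_categories.setdefault(category, []).append(customer)

print(sorted(by_categories))
```

Both versions build the same mapping; defaultdict just saves you from repeating the "empty list if missing" step at every lookup.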
which produces
>>> for k in sorted(by_categories):
...     print 'category', k, 'contains:'
...     for v in by_categories[k]:
...         print v
...
category 01 contains:
{'q1': 'e', 'q3': 'c', 'q2': 't', 'name': 'NameOfCustomer1', 'id': 'customerID1', 'categories': ['01', '02', '03']}
category 02 contains:
{'q1': 'e', 'q3': 'c', 'q2': 't', 'name': 'NameOfCustomer1', 'id': 'customerID1', 'categories': ['01', '02', '03']}
{'q1': 'e', 'q3': 'c', 'q2': 't', 'name': 'NameOfCustomer2', 'id': 'customerID2', 'categories': ['02', '05']}
category 03 contains:
{'q1': 'e', 'q3': 'c', 'q2': 't', 'name': 'NameOfCustomer1', 'id': 'customerID1', 'categories': ['01', '02', '03']}
category 05 contains:
{'q1': 'e', 'q3': 'c', 'q2': 't', 'name': 'NameOfCustomer2', 'id': 'customerID2', 'categories': ['02', '05']}
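As for why the lookup in the question failed: I can't see your traceback, so this is a guess, but using a variable as a key is fine in itself. A likely culprit is that the keys were written as numbers while `re.findall` returns strings, so `'01'` never matches an integer key. The sketch below reconstructs that situation with hypothetical names (`int_keyed`, `str_keyed`) and integer keys written as plain `1`, `2`, and so on.

```python
import re

# Hypothetical reconstruction: a dict keyed by integers...
int_keyed = {1: [], 2: [], 3: [], 4: [], 5: []}
# ...versus one keyed by the same strings that re.findall produces.
str_keyed = {'01': [], '02': [], '03': [], '04': [], '05': []}

customer = ['customerID1', 'NameOfCustomer1', 'etc.', '01 02 03']
categories = re.findall(r'[0-9]{2}', customer[3])  # a list of strings

try:
    int_keyed[categories[0]].append(customer)
    lookup_failed = False
except KeyError:
    # The string '01' is not the same key as the integer 1.
    lookup_failed = True

# With string keys, using the loop variable as the key works as intended.
for category in categories:
    str_keyed[category].append(customer)

print(lookup_failed)
```

So either quote the keys when you build the dictionary or convert each code with `int()` before the lookup; the defaultdict approach sidesteps the issue because the keys are created from the codes themselves.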
Upvotes: 1