Reputation: 11423
I have a three-column dataframe that I want to convert into a dictionary in the format given below:
dataframe:
user_id item_id ratings
3 2 3
3 3 4
1 3 1
2 1 4
Number of users = 3
Number of items = 3
Ratings range from 0 to 5
dictionary=
{user_id1:[rating_for_item1, rating_for_item2, rating_for_item3],
user_id2:[.same as previous.],
user_id3:[..same as prev..]}
e.g.,
{1:[0,0,1], 2:[4,0,0], 3:[0,3,4]}
So far, all I have managed is output like this:
{1:{3:1}, 2:{1:4}, 3:{2:3, 3:4}} #{user_id:{item_id:rating}.....}
The code that produces the above output is:
import pandas as pd
data = {}
cols = ['user_id', 'item_id', 'ratings']
pf = pd.read_csv('filename', sep='\t', names= cols)
for user, item, rate in pf.values:
    data.setdefault(user, {})[item] = rate
print data
What is missing in my code, or am I on completely the wrong path? Please help.
Upvotes: 3
Views: 5044
Reputation: 353059
I would pivot and then build the dict. For example:
pdf = df.pivot(index="user_id", columns="item_id").fillna(0)
d = {k: v.tolist() for k,v in pdf.iterrows()}
produces
>>> d
{1: [0.0, 0.0, 1.0], 2: [4.0, 0.0, 0.0], 3: [0.0, 3.0, 4.0]}
First, the frame:
>>> df
   user_id  item_id  ratings
0        3        2        3
1        3        3        4
2        1        3        1
3        2        1        4
Pivot:
>>> pdf = df.pivot(index="user_id", columns="item_id")
>>> pdf
        ratings
item_id       1    2    3
user_id
1           NaN  NaN    1
2             4  NaN  NaN
3           NaN    3    4
Replace the NaNs by 0:
>>> pdf = df.pivot(index="user_id", columns="item_id").fillna(0)
>>> pdf
        ratings
item_id       1  2  3
user_id
1             0  0  1
2             4  0  0
3             0  3  4
And build a row-wise dictionary using a dictionary comprehension:
>>> d = {k: v.tolist() for k,v in pdf.iterrows()}
>>> d
{1: [0.0, 0.0, 1.0], 2: [4.0, 0.0, 0.0], 3: [0.0, 3.0, 4.0]}
There are lots of ways to do this last step, including dict(zip(pdf.index, pdf.values.tolist())), but many of them don't generalize as easily when you want to tweak it a little.
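As a rough sketch of the dict(zip(...)) variant just mentioned, the following builds the sample frame inline rather than reading it from a file, pivots with keyword arguments, and casts to int so the lists hold integers instead of floats. The helper name ratings_to_dict and its n_items parameter are made up for illustration, and the reindex step assumes item ids run from 1 to n_items.

```python
import pandas as pd

# Sample data from the question, built inline instead of read from a file.
df = pd.DataFrame({
    "user_id": [3, 3, 1, 2],
    "item_id": [2, 3, 3, 1],
    "ratings": [3, 4, 1, 4],
})

def ratings_to_dict(frame, n_items):
    """Hypothetical helper: map each user_id to a list of n_items ratings (0 = unrated)."""
    wide = frame.pivot(index="user_id", columns="item_id", values="ratings")
    # Assumption: item ids are 1..n_items; reindex adds columns for items nobody rated.
    wide = wide.reindex(columns=range(1, n_items + 1))
    # Unrated cells are NaN after the pivot; replace them and go back to integers.
    wide = wide.fillna(0).astype(int)
    return dict(zip(wide.index.tolist(), wide.values.tolist()))

d = ratings_to_dict(df, n_items=3)
print(d)
```

With the sample data this should give {1: [0, 0, 1], 2: [4, 0, 0], 3: [0, 3, 4]}. Note that a user with no ratings at all would not appear as a key, since the pivot only sees users present in the frame.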
Upvotes: 2
Reputation: 6543
How about processing what you have into what you want like so:
from collections import defaultdict
processed_data = defaultdict(list)
for k, v in data.items():
    for idx in range(1, 4):  # Make sure we check each item
                             # from (1 to 3 inclusive) for each iteration
                             # of the dictionary
        val = v.get(idx, 0)
        processed_data[k].append(val)
processed_data
yields:
defaultdict(<type 'list'>, {1: [0, 0, 1], 2: [4, 0, 0], 3: [0, 3, 4]})
If you would like to convert this back to a regular dictionary (from a defaultdict), then do the following:
dict(processed_data)
which yields
{1: [0, 0, 1], 2: [4, 0, 0], 3: [0, 3, 4]}
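The same idea can be sketched without the hard-coded range(1, 4) and without the defaultdict, using a dictionary comprehension. The function name fill_missing_items and the n_items parameter are hypothetical, and the sketch assumes item ids are the integers 1 to n_items.

```python
def fill_missing_items(nested, n_items):
    """Hypothetical helper: turn {user: {item: rating}} into {user: [rating per item]}."""
    # Assumption: item ids are the integers 1..n_items; unrated items become 0.
    return {
        user: [items.get(item, 0) for item in range(1, n_items + 1)]
        for user, items in nested.items()
    }

# The nested dictionary produced by the code in the question.
data = {1: {3: 1}, 2: {1: 4}, 3: {2: 3, 3: 4}}
result = fill_missing_items(data, n_items=3)
print(result)
```

For the question's data this should produce {1: [0, 0, 1], 2: [4, 0, 0], 3: [0, 3, 4]} as a plain dict, so no conversion step is needed afterwards.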
Upvotes: 1