Reputation: 2014
I have data (the output of an LDA model in Gensim) that looks like this:
[(1, 0.97456828373415116)]
[(0, 0.91883125256489728), (1, 0.020225186991467976), (2, 0.020314851937259213), (3, 0.020382294889184499), (4, 0.020246413617191008)]
[(0, 0.93783520386426555), (1, 0.015481826214088806), (2, 0.015545735781026492), (3, 0.015535246185968628), (4, 0.015601987954650424)]
[(2, 0.98493696818505228)]
[(3, 0.99067359305252778)]
[(0, 0.73578249201070511), (3, 0.25197028613750805)]
I would like to convert it to the following format, where every row lists all five topics and any missing topic gets a value of 0:
[(0, 0), (1, 0.97456828373415116), (2, 0), (3, 0), (4, 0)]
[(0, 0.91883125256489728), (1, 0.020225186991467976), (2, 0.020314851937259213), (3, 0.020382294889184499), (4, 0.020246413617191008)]
[(0, 0.93783520386426555), (1, 0.015481826214088806), (2, 0.015545735781026492), (3, 0.015535246185968628), (4, 0.015601987954650424)]
[(0, 0), (1, 0), (2, 0.98493696818505228), (3, 0), (4, 0)]
[(0, 0), (1, 0), (2, 0), (3, 0.96747728928637211), (4, 0)]
[(0, 0), (1, 0), (2, 0), (3, 0.99067359305252778), (4, 0)]
[(0, 0.73578249201070511), (1, 0), (2, 0), (3, 0.25197028613750805), (4, 0)]
Upvotes: 1
Views: 57
Reputation: 13164
One very simple way to do this is to build a dict holding your defaults, and then update it with the values you actually have:
>>> d = dict([(0,0),(1,0),(2,0),(3,0)])
>>> print(d)
{0: 0, 1: 0, 2: 0, 3: 0}
>>> d.update([(0, 0.73578249201070511), (3, 0.25197028613750805)])
>>> print(d)
{0: 0.7357824920107051, 1: 0, 2: 0, 3: 0.25197028613750805}
Edit
Incorporating hgwell's suggestion to output a list of tuples, here is a complete function (which could probably be done better somehow, but this works anyway):
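def listify(l):
    res = []
    for j in l:
        # Start each row with a default of 0 for topic ids 0 through 4.
        d = dict([(0,0),(1,0),(2,0),(3,0),(4,0)])
        # Overwrite the defaults with the values present in this row.
        d.update(j)
        # Sort so the tuples come back ordered by topic id.
        res.append(sorted(d.items()))
    return res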
and in action...
>>> from pprint import pprint
>>> z = listify([[(1, 0.97456828373415116)],
...              [(0, 0.91883125256489728), (1, 0.020225186991467976), (2, 0.020314851937259213), (3, 0.020382294889184499), (4, 0.020246413617191008)],
...              [(2, 0.98493696818505228)]])
>>> pprint(z)
[[(0, 0), (1, 0.9745682837341512), (2, 0), (3, 0), (4, 0)],
 [(0, 0.9188312525648973),
  (1, 0.020225186991467976),
  (2, 0.020314851937259213),
  (3, 0.0203822948891845),
  (4, 0.020246413617191008)],
 [(0, 0), (1, 0), (2, 0.9849369681850523), (3, 0), (4, 0)]]
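If the number of topics is not always five, the same idea can be written with the topic count as a parameter. This is only a sketch: the names densify and n_topics are made up for illustration, and it assumes topic ids run from 0 to n_topics - 1.

```python
def densify(rows, n_topics):
    """Return each sparse row with a 0 filled in for every missing topic id."""
    dense = []
    for row in rows:
        # dict.fromkeys builds {0: 0, 1: 0, ..., n_topics - 1: 0} in one call.
        filled = dict.fromkeys(range(n_topics), 0)
        # Overwrite the defaults with the probabilities actually present.
        filled.update(row)
        # Sort by topic id so the order does not depend on dict ordering.
        dense.append(sorted(filled.items()))
    return dense


rows = [[(1, 0.97456828373415116)],
        [(0, 0.73578249201070511), (3, 0.25197028613750805)]]
dense = densify(rows, n_topics=5)
```

Building a fresh dict per row matters here; reusing one dict across rows would leak values from earlier rows into later ones.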
Upvotes: 1
Reputation: 49330
You can change each sublist into a dict with the map() function:
data = [[(1, 0.97456828373415116)],
        [(0, 0.91883125256489728), (1, 0.020225186991467976), (2, 0.020314851937259213), (3, 0.020382294889184499), (4, 0.020246413617191008)],
        [(0, 0.93783520386426555), (1, 0.015481826214088806), (2, 0.015545735781026492), (3, 0.015535246185968628), (4, 0.015601987954650424)],
        [(2, 0.98493696818505228)],
        [(3, 0.99067359305252778)],
        [(0, 0.73578249201070511), (3, 0.25197028613750805)]]
results = list(map(dict, data))
Then use the dict.get method and specify a default of 0 for keys that are not present in the dictionary:
for i in range(5):
    print(results[0].get(i, 0))
Result of the above:
0
0.9745682837341512
0
0
0
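To get the full list-of-tuples format asked for in the question, the same dict.get lookup can be applied to every row at once. A minimal sketch, assuming five topics with ids 0 to 4 (n_topics and full are names chosen here for illustration):

```python
data = [[(1, 0.97456828373415116)],
        [(2, 0.98493696818505228)],
        [(0, 0.73578249201070511), (3, 0.25197028613750805)]]

n_topics = 5  # assumption: topic ids are 0, 1, 2, 3, 4

# One dict per row, mapping topic id -> probability.
results = list(map(dict, data))

# For every row, look up each topic id and fall back to 0 when it is missing.
full = [[(i, d.get(i, 0)) for i in range(n_topics)] for d in results]

for row in full:
    print(row)
```

Because the inner loop walks range(n_topics) in order, each row comes out sorted by topic id without an explicit sort.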
Upvotes: 1