Reputation: 2140
This is the dictionary I have:
docs = {'computer': {'1': 1, '3': 5, '8': 2},
'politics': {'0': 2, '1': 2, '3': 1}}
I want to create a 9 * 2 tensor
like this:
[
[0, 1, 0, 5, 0, 0, 0, 0, 2],
[2, 2, 0, 1, 0, 0, 0, 0, 0, 0]
]
Here, because the max item is 8 so we have 9 rows. But, the number of rows and columns can increase based on the dictionary.
I have tried to implement this using for-loop
though as the dictionary is big it's not efficient at all and also it implemented using the list I need that to be a tensor
.
maxr = 0
for i, val in docs.items():
for j in val.keys():
if int(j) > int(maxr):
maxr = int(j)
final_lst = []
for val in docs.values():
lst = [0] * (maxr+1)
for j, val2 in sorted(val.items()):
lst[int(j)] = val2
final_lst.append(lst)
print(final_lst)
Upvotes: 4
Views: 4787
Reputation: 30605
If you are ok with using pandas
and numpy
, here's how you can do it.
import pandas as pd
import numpy as np
# Creates a dataframe with keys as index and values as cell values.
df = pd.DataFrame(docs)
# Create a new set of index from min and max of the dictionary keys.
new_index = np.arange( int(df.index.min()),
int(df.index.max())).astype(str)
# Add the new index to the existing index and fill the nan values with 0, take a transpose of dataframe.
new_df = df.reindex(new_index).fillna(0).T.astype(int)
# 0 1 2 3 4 5 6 7
#computer 0 1 0 5 0 0 0 0
#politics 2 2 0 1 0 0 0 0
If you just want the array, you can call array = new_df.values
.
#[[0 1 0 5 0 0 0 0]
# [2 2 0 1 0 0 0 0]]
If you want tensor, then you can use tf.convert_to_tensor(new_df.values)
Upvotes: 2