Reputation: 1655
I am asking this question because I am working with very large data sets.
In my algorithm, I basically need something like this:
users_per_document = {}
documents_per_user = {}
As the names of the dictionaries suggest, I need the users that clicked a specific document and the documents that were clicked by a specific user.
This means I store the same data twice, and the two dictionaries together overflow the memory, so my script gets killed after a while. Because the data sets are very large, I have to do this in an efficient way.
I suspect it is not possible, but I have to ask: is there a way to get all keys for a specific value from a dictionary?
If there is, I will no longer need one of the dictionaries.
For example:
users_per_document["document1"]
obviously returns the appropriate users.
What I need is users_per_document.getKeys("user1"),
because this would return the same thing as documents_per_user["user1"].
If that is not possible, any suggestions are welcome.
Upvotes: 0
Views: 60
Reputation: 1318
If you are using Python 3.x, you can do the following. In 2.x, use .iteritems() instead.
user1_documents = [key for key, value in users_per_document.items() if "user1" in value]
Note: This iterates over the whole dictionary. A dictionary isn't an ideal data structure for getting all keys for a specific value: each lookup is O(n), so it will be O(n^2) if you have to perform this operation n times.
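As a minimal sketch of the scan described above, the lookup can be wrapped in a small helper. The function name documents_for_user and the sample data are hypothetical, and the sketch assumes each value in users_per_document is a set of user names, so the membership test stays cheap.

```python
def documents_for_user(users_per_document, user):
    """Return every document whose user collection contains `user`.

    This scans the whole dictionary, so one call is O(n) in the
    number of documents.
    """
    return [doc for doc, users in users_per_document.items() if user in users]


# Hypothetical sample data; sets make `user in users` O(1) on average.
users_per_document = {
    "document1": {"user1", "user2"},
    "document2": {"user2"},
    "document3": {"user1", "user3"},
}

user1_documents = documents_for_user(users_per_document, "user1")
print(sorted(user1_documents))
```

This trades time for memory: no second dictionary is kept, but every lookup walks all documents.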
Upvotes: 1
Reputation:
I am not very familiar with Python, but in general computer science terms you can solve the problem as follows.
Basically, you can use a two-dimensional array: the first index is for users, the second index is for documents, and each cell holds a boolean value.
The boolean value indicates whether there is a relation between the specific user and the specific document.
PS: If the matrix is really sparse, you can make it much more efficient, but that is another story.
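The matrix idea above could look roughly like this in Python. It is only a sketch: the click log, the index dictionaries, and the helper names documents_of and users_of are all assumptions made for illustration. A dense matrix needs one cell per user/document pair, so it is only practical when the data is small or not sparse.

```python
# Hypothetical click log of (user, document) pairs.
clicks = [
    ("user1", "document1"),
    ("user2", "document1"),
    ("user1", "document3"),
]

# Assign every user and document a fixed integer position.
users = sorted({u for u, _ in clicks})
documents = sorted({d for _, d in clicks})
user_index = {u: i for i, u in enumerate(users)}
doc_index = {d: j for j, d in enumerate(documents)}

# matrix[i][j] is True when users[i] clicked documents[j].
matrix = [[False] * len(documents) for _ in users]
for u, d in clicks:
    matrix[user_index[u]][doc_index[d]] = True


def documents_of(user):
    """Read one row: all documents clicked by `user`."""
    row = matrix[user_index[user]]
    return [documents[j] for j, hit in enumerate(row) if hit]


def users_of(document):
    """Read one column: all users who clicked `document`."""
    j = doc_index[document]
    return [users[i] for i, row in enumerate(matrix) if row[j]]


print(documents_of("user1"))
print(users_of("document1"))
```

Both directions are answered from the same structure, so the relation is stored only once.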
Upvotes: 0