AlbeyAmakiir
AlbeyAmakiir

Reputation: 2247

How do I get unique values of a column in AWS Dynamo?

Say, in AWS Dynamo, I have a table like this:

ID (HKey)   Date (RKey)   BoxName
0           1/1/2014      Box-1
1           2/1/2014      Box-1
2           3/1/2014      Box-2
3           4/1/2014      Box-3
4           5/1/2014      Box-3
5           5/1/2014      Box-1

I want to, in a single query, get the first row for each unique Box. There could be hundreds of boxes I need the first entry for at once, making individual requests inefficient.

I can't seem to find anything in the API that would allow me to do this. Is it possible? How would I do this?

Upvotes: 12

Views: 19794

Answers (2)

Shimon
Shimon

Reputation: 172

There's no way to query just for the first appearance of each box without creating an index for the boxes as suggested above. However, if you don't mind reading the whole table and then picking the right lines, then read the whole table into an array, and then make it unique by some simple piece of code. For example, suppose you've read the table into an array (note that you might have to make several calls to scan or query until you get them all), and the array is something like this:

l = [
    {"ID": "0", "Date": "1/1/2014", "BoxName": "Box-1"},
    {"ID": "1", "Date": "2/1/2014", "BoxName": "Box-1"},
    {"ID": "2", "Date": "3/1/2014", "BoxName": "Box-2"},
    {"ID": "3", "Date": "4/1/2014", "BoxName": "Box-3"},
    {"ID": "4", "Date": "5/1/2014", "BoxName": "Box-3"},
    {"ID": "5", "Date": "5/1/2014", "BoxName": "Box-1"}
]

Then, a simple code like this in python will give you the list in the variable "out":

    out = []
    seen = []
    for line in l:
        if line["BoxName"] not in seen:
            seen.append(line["BoxName"])
            out.append(line)

Upvotes: 0

Swami Sivasubramanian
Swami Sivasubramanian

Reputation: 51

You might want to consider creating a Global secondary index (GSI) on Boxname (hash key) and date as your range key. This will enable you to use Query API on the secondary index where you can query "Find all IDs with Boxname = $box".

See the documentation for GSI.

Hope this helps, Swami

Upvotes: 4

Related Questions