Abir Chokraborty

Reputation: 1765

HBase storing data for a particular column with 2 or more values for the same row-key in Scala/Java API

I have a file with the following contents:

UserID   Email             
1001     [email protected]     
1001     [email protected]     
1002     [email protected]
1002     [email protected]

I want to store the data like this:

ROW          COLUMN+CELL                                                                                   
1001         column=cf:Email, timestamp=1487917201278, [email protected] 
1001         column=cf:Email, timestamp=1487917201279, [email protected]                                                                                                
1002         column=cf:Email, timestamp=1487917201286, [email protected]
1002         column=cf:Email, timestamp=1487917201287, [email protected]

I am using put, for example: put 'table', '1001', 'cf:Email', '[email protected]', but it gives me:

ROW          COLUMN+CELL                                                                                    
1001         column=cf:Email, timestamp=1487917201279, [email protected]                                                                                                
1002         column=cf:Email, timestamp=1487917201286, [email protected]

It is overwriting the previous value. But HBase is supposed to store multiple values for a particular column, distinguished by timestamp. Is there any way I can store both email addresses for a particular UserID?

Upvotes: 1

Views: 1168

Answers (2)

Joe Pallas

Reputation: 2155

You may want to take a closer look at the HBase documentation on versions. Note especially where it says:

By default, i.e. if you specify no explicit version, when doing a get, the cell whose version has the largest value is returned

But I wouldn't pursue using multiple versions to store multiple values this way. You have to explicitly specify the maximum number of versions, and it will apply to every column in that family. I would be more inclined to use distinct column names (such as Email1, Email2, ...).
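
As a rough sketch of the distinct-column approach, the puts could look like the following in the HBase shell. The qualifiers Email1 and Email2 follow the naming suggested above, and the addresses are placeholders rather than values from the question. Column qualifiers do not need to be declared in advance, so no alter is required for this.

```
# Placeholder addresses; substitute the real values from the input file
put 'table', '1001', 'cf:Email1', 'first@example.com'
put 'table', '1001', 'cf:Email2', 'second@example.com'

# Both cells come back from a plain get, since they are different columns
get 'table', '1001'
```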

Upvotes: 1

Ashu Pachauri

Reputation: 1403

You need to specify the number of versions for the "cf" column family. By default, the number of versions is 1. Run the following in the HBase shell to modify an existing table:

alter 'table', {NAME => 'cf', VERSIONS => 2147483647}

Read more about versions in the HBase documentation.
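
Note that raising VERSIONS only controls how many cells are retained; a plain get or scan still returns just the newest one. A minimal sketch of reading the older values back, assuming the table was altered as above and both puts for the row were made afterwards (the limit of 2 is only an illustrative choice):

```
# Return up to 2 versions of cf:Email for row 1001
get 'table', '1001', {COLUMN => 'cf:Email', VERSIONS => 2}

# Or scan the whole table, returning up to 2 versions per cell
scan 'table', {VERSIONS => 2}
```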

Upvotes: 1
