Reputation: 6466
I have 1.2M records in my MongoDB database, and I want to store all of this data in HBase programmatically. Basically, I put each retrieved record into HBase in a loop. After the operation finished, I had only 39912 records in HBase.
Here's what I've tried:
Configuration config = HBaseConfiguration.create();
String tableName = "storedtweet";
String familyName = "msg";
String qualifierName = "msg";
HTable table = new HTable(config, tableName);
// using Spring Data MongoDB to interact with MongoDB
List<StoredTweet> storedTweetList = mongoDAO.getMongoTemplate().findAll(StoredTweet.class);
for (StoredTweet storedTweet : storedTweetList) {
    // the tweet id is used as the HBase row key
    Put p = new Put(Bytes.toBytes(storedTweet.getTweetId()));
    p.add(Bytes.toBytes(familyName), Bytes.toBytes(qualifierName), Bytes.toBytes(storedTweet.getMsg()));
    table.put(p);
    table.flushCommits();
}
Upvotes: 0
Views: 95
Reputation: 20826
If a row key already exists and you put it again, the new Put overwrites the previous value for the same column family and qualifier. I suspect some of your records share the same tweet id, which you use as the row key. Those records collapse into a single row, which is why some of them seem to disappear.
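To check whether this explains the gap, you could count the distinct tweet ids before writing anything to HBase. Below is a minimal sketch that uses a plain `HashMap` as a stand-in for a table keyed by row key; it does not touch HBase at all. The class name `DuplicateKeyCheck`, the method `storeByKey`, and the assumption that the tweet id is a `long` are all hypothetical and only for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DuplicateKeyCheck {

    // Stores each message under its id. A later entry with the same id
    // replaces the earlier one, which mimics what a reader sees after
    // two Puts to the same row key and column.
    // Assumption: ids are longs; the real tweet id type may differ.
    public static Map<Long, String> storeByKey(List<Long> ids, List<String> msgs) {
        Map<Long, String> store = new HashMap<>();
        for (int i = 0; i < ids.size(); i++) {
            store.put(ids.get(i), msgs.get(i));
        }
        return store;
    }

    public static void main(String[] args) {
        // Five input records, but only three distinct ids.
        List<Long> ids = Arrays.asList(1L, 2L, 2L, 3L, 1L);
        List<String> msgs = Arrays.asList("a", "b", "c", "d", "e");

        Map<Long, String> store = storeByKey(ids, msgs);

        System.out.println("input records: " + ids.size());
        System.out.println("stored rows: " + store.size());
        System.out.println("collapsed duplicates: " + (ids.size() - store.size()));
    }
}
```

If the number of distinct ids in your MongoDB collection comes out close to 39912, duplicate row keys are the likely cause. In that case you would need a row key that is unique per record, for example the MongoDB document id, assuming that fits your read pattern.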
Upvotes: 2