Reputation: 305
https://cloud.google.com/bigtable/docs/go/cbt-reference
Following this reference, I tried the command
cbt count <table>
for three different tables.
For one of them I got what I expected: the number of rows, a bit shy of 1M.
For the second table, I got the following error:
[~]$ cbt count prod.userprofile
2016/10/23 22:47:48 Reading rows: rpc error: code = 4 desc = Error while reading table 'projects/focal-elf-631/instances/campaign-stat/tables/prod.userprofile'
[~]$ cbt count prod.userprofile
2016/10/23 23:00:23 Reading rows: rpc error: code = 4 desc = Error while reading table 'projects/focal-elf-631/instances/campaign-stat/tables/prod.userprofile'
I tried it several times, but I got the same error every time.
For the last one, I got a different error (the error code is the same as above, but its description is different):
[~]$ cbt count prod.appprofile
2016/10/23 22:45:17 Reading rows: rpc error: code = 4 desc = Error while reading table 'projects/focal-elf-631/instances/campaign-stat/tables/prod.appprofile' : Response was not consumed in time; terminating connection. (Possible causes: row size > 256MB, slow client data read, and network problems)
[~]$ cbt count prod.appprofile
2016/10/23 23:11:10 Reading rows: rpc error: code = 4 desc = Error while reading table 'projects/focal-elf-631/instances/campaign-stat/tables/prod.appprofile' : Response was not consumed in time; terminating connection. (Possible causes: row size > 256MB, slow client data read, and network problems)
I also tried this one several times, and nothing changed.
I googled and searched Stack Overflow using 'rpc error code 4' as keywords, but did not find anything useful.
I'm really curious why this command fails and what I can do to resolve it (by the way, these two tables are used in production 24/7, and we have several dozen Bigtable nodes working just fine, so I don't think it's a bandwidth or QPS issue).
Upvotes: 2
Views: 7959
Reputation: 87
You can try the following command:
cbt -project <name_of_project> -instance <name_of_instance> count <name_of_table>
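If you would rather not pass these flags every time, cbt can also read them from a ~/.cbtrc file; the values below are placeholders for your own project and instance:

project = name_of_project
instance = name_of_instance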
Upvotes: 0
Reputation: 2774
As an alternative, a possible approach (although not the best one) is to use atomic counters. That is:
If you design a second table as a secondary index of counters, it can perform well under certain conditions (as long as you don't blast the counters with simultaneous reads and writes, or run into heavy counter r/w because of hotspotting).
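To make that concrete, here is a minimal sketch of such a counter in Go, using the cloud.google.com/go/bigtable client's atomic ReadModifyWrite increment. The project, instance, table, and row/column names are all hypothetical:

package main

import (
	"context"
	"fmt"
	"log"

	"cloud.google.com/go/bigtable"
)

func main() {
	ctx := context.Background()

	// Hypothetical project and instance names.
	client, err := bigtable.NewClient(ctx, "my-project", "my-instance")
	if err != nil {
		log.Fatalf("NewClient: %v", err)
	}
	defer client.Close()

	// Hypothetical secondary-index table holding the counters.
	tbl := client.Open("counters")

	// Atomically increment a 64-bit counter cell. Bigtable serializes
	// these per row, which is where hotspotting can hurt.
	rmw := bigtable.NewReadModifyWrite()
	rmw.Increment("stats", "row_count", 1)

	row, err := tbl.ApplyReadModifyWrite(ctx, "prod.userprofile#count", rmw)
	if err != nil {
		log.Fatalf("ApplyReadModifyWrite: %v", err)
	}
	fmt.Printf("updated counter row: %v\n", row)
}

With this design, every write to the indexed table would also bump the counter, so reading the "count" becomes a single-row lookup instead of a full scan.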
Nevertheless, MapReduce is definitely a more robust solution, as @solomon-duskis proposed.
Upvotes: 0
Reputation: 2711
Getting a count on a large table requires reading something from every single row in Bigtable. There isn't a notion of just getting a single value that represents a count.
This type of problem requires something like a map/reduce, unfortunately. Fortunately, it's quite straightforward to do a count with Dataflow.
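For a sense of what that scan involves, here is a rough single-client sketch in Go with the cloud.google.com/go/bigtable package (project, instance, and table names are placeholders). It strips cell values so only row keys travel over the wire, but it still has to touch every row, which is exactly why a distributed Dataflow job scales better on large tables:

package main

import (
	"context"
	"fmt"
	"log"

	"cloud.google.com/go/bigtable"
)

func main() {
	ctx := context.Background()

	// Placeholder project and instance names.
	client, err := bigtable.NewClient(ctx, "my-project", "my-instance")
	if err != nil {
		log.Fatalf("NewClient: %v", err)
	}
	defer client.Close()

	tbl := client.Open("prod.userprofile")

	// Scan every row, keeping at most one cell per row and dropping
	// values, so only row keys are streamed back.
	var count int64
	err = tbl.ReadRows(ctx, bigtable.InfiniteRange(""),
		func(r bigtable.Row) bool {
			count++
			return true // keep reading
		},
		bigtable.RowFilter(bigtable.ChainFilters(
			bigtable.CellsPerRowLimitFilter(1),
			bigtable.StripValueFilter(),
		)))
	if err != nil {
		log.Fatalf("ReadRows: %v", err)
	}
	fmt.Printf("rows: %d\n", count)
}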
Upvotes: 4