GK89
GK89

Reputation: 664

GCP - BigTable to BigQuery

I am trying to query Bigtable data in BigQuery using the external table configuration. I have the following SQL command that I am working with. However, I get an error stating invalid bigtable_options for format CLOUD_BIGTABLE. The code works when I remove the columns field. For context, the raw data looks like this (running query without column field):

rowkey aAA.column.name aAA.column.cell.value
4271 xxx 30
yyy 25

But I would like the table to look like this:

rowkey xxx
4271 30
CREATE EXTERNAL TABLE dev_test.telem_test
OPTIONS (
  format = 'CLOUD_BIGTABLE',
  uris = ['https://googleapis.com/bigtable/projects/telem/instances/dbb-bigtable/tables/db1'],
  bigtable_options =  
    """
    {
      bigtableColumnFamilies: [
        {
          "familyId": "aAA",
          "type": "string",
          "encoding": "string",
          "columns": [
            {
              "qualifierEncoded": string,
              "qualifierString": string,
              "fieldName": "xxx",
              "type": string,
              "encoding": string,
              "onlyReadLatest": false
            }
          ]
        }
      ],
      readRowkeyAsString: true
    }
    """
    );

Upvotes: 0

Views: 571

Answers (3)

Hsing Min Wang
Hsing Min Wang

Reputation: 1

"encoding": "string" ==> "encoding": "TEXT" or "encoding": "BINARY"

https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#bigtableoptions

encoding

Optional. The encoding of the values when the type is not STRING. Acceptable encoding values are: TEXT - indicates values are alphanumeric text strings. BINARY - indicates values are encoded using HBase Bytes.toBytes family of functions. This can be overridden for a specific column by listing that column in 'columns' and specifying an encoding for it.

Upvotes: 0

Kokerboom
Kokerboom

Reputation: 36

The error here is in this part:

bigtableColumnFamilies: [

It should be:

  "columnFamilies": [

Concerning adding columns for string you will only add:

     "columns": [{
              "qualifierString": "name_of_column_from_bt",
              "fieldName": "if_i_want_rename",

               }],

fieldName is not required. However to access your field value you will still have to use such SQL code:

SELECT
aAA.xxx.cell.value as xxx
FROM dev_test.telem_test

Upvotes: 1

guillaume blaquiere
guillaume blaquiere

Reputation: 75705

I think you let the default value for each column attribute. the string is the type of the value to provide, but not the raw value to provide. It makes no sense in JSON here. Try to add double quote like that

CREATE EXTERNAL TABLE dev_test.telem_test
OPTIONS (
  format = 'CLOUD_BIGTABLE',
  uris = ['https://googleapis.com/bigtable/projects/telem/instances/dbb-bigtable/tables/db1'],
  bigtable_options =  
    """
    {
      bigtableColumnFamilies: [
        {
          "familyId": "aAA",
          "type": "string",
          "encoding": "string",
          "columns": [
            {
              "qualifierEncoded": "string",
              "qualifierString": "string",
              "fieldName": "xxx",
              "type": "string",
              "encoding": "string",
              "onlyReadLatest": false
            }
          ]
        }
      ],
      readRowkeyAsString: true
    }
    """
    );

The false is correct because the type is a boolean. More details here. The encoding "string" will be erroneous (use a real encoding type).

Upvotes: 1

Related Questions