Alexander Chandra

Reputation: 669

Google BigQuery - Streaming Data Into BigQuery

I'm using Google BigQuery for my project. Right now I'm trying to insert a new row into BQ based on this: https://cloud.google.com/bigquery/streaming-data-into-bigquery#bigquery-stream-data-java

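import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryError;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.InsertAllRequest;
import com.google.cloud.bigquery.InsertAllResponse;
import com.google.cloud.bigquery.TableId;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

private void insertRowsToBQ(MyCustomObject data) {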
    String datasetName = "mydatasetname";
    String tableName = "mytablename";
    Map<String, Object> rowContent = new HashMap<>();
    rowContent.put("field_1", data.getdata1());
    rowContent.put("field_2", data.getdata2());
    rowContent.put("field_3", data.getdata3());
    rowContent.put("field_4", data.getdata4());

    try {
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
        TableId tableId = TableId.of(datasetName, tableName);
        InsertAllResponse response =
                bigquery.insertAll(
                        InsertAllRequest.newBuilder(tableId)
                                .addRow(rowContent)
                                .build());

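        if (response.hasErrors()) {
            // Per-row errors are keyed by the row's index in the request.
            for (Map.Entry<Long, List<BigQueryError>> entry : response.getInsertErrors().entrySet()) {
                Logger.error("Response error: \n" + entry.getValue());
            }
        } else {
            // Log success only when no row was rejected.
            Logger.info("Rows successfully inserted into table");
        }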
    } catch (BigQueryException e) {
        Logger.error("Insert operation not performed \n" + e.toString());
    }
}

The code works fine and logs no errors, but when I try to view the data in the Google Cloud console (https://console.cloud.google.com/bigquery?project=myprojectname) with this query:

select * from `myprojectname.mydatasetname.mytablename` where DATE(_PARTITIONTIME) = "2021-03-24"

the data is not displayed. It turns out that the data is delayed by more than 1 hour before it can be viewed in BQ.

Is this expected, or is there some issue?

I've tried recreating the dataset and table, but still no luck.

Upvotes: 1

Views: 745

Answers (2)

Tomasz Kubat

Reputation: 168

As @Sunandini Ravada mentioned, data loaded via streaming inserts goes to the streaming buffer first, and then after a couple of minutes it is redistributed to the proper partition. Until that happens, _PARTITIONTIME is NULL for those rows, so a filter on DATE(_PARTITIONTIME) excludes them.

To get the data from the buffer, try:

select *
from `myprojectname.mydatasetname.mytablename`
where _PARTITIONTIME is null
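
If you want the rows for a given day together with whatever is still sitting in the buffer, the two filters can be combined with OR. The following is only a minimal sketch of building such a query string in Java; BufferAwareQuery and its build method are hypothetical names, not part of the BigQuery client library, and the table name is assumed to come from a trusted source since it is concatenated directly into the SQL.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not part of the BigQuery client library.
final class BufferAwareQuery {

    // Builds a query that matches one daily partition plus rows still in the
    // streaming buffer (those have a NULL _PARTITIONTIME).
    // The table name is concatenated as-is, so pass only trusted identifiers.
    static String build(String fullyQualifiedTable, LocalDate partitionDate) {
        String day = partitionDate.format(DateTimeFormatter.ISO_LOCAL_DATE); // e.g. 2021-03-24
        return "SELECT * FROM `" + fullyQualifiedTable + "` "
                + "WHERE DATE(_PARTITIONTIME) = \"" + day + "\" "
                + "OR _PARTITIONTIME IS NULL";
    }

    public static void main(String[] args) {
        // Prints the combined query for the table and date used in the question.
        System.out.println(build("myprojectname.mydatasetname.mytablename", LocalDate.of(2021, 3, 24)));
    }
}
```

The resulting string can be pasted into the console or passed to the client library's query API. Note that buffered rows are not tied to a date yet, so the OR branch returns every row currently in the buffer, not only those that will end up in that day's partition.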

Upvotes: 1

Sunandini Ravada

Reputation: 142

As per the documentation on data availability, when streaming to a partitioned table, data in the streaming buffer has a NULL value for the _PARTITIONTIME pseudo-column. The table metadata also has fields, such as streamingBuffer.oldestEntryTime, that you can use to check the state of the streaming buffer.

For a table that receives streaming inserts, the documentation also mentions that data can take up to 90 minutes to become available for copy operations.

https://cloud.google.com/bigquery/streaming-data-into-bigquery#dataavailability

Upvotes: 2
