Jake Slattery
Jake Slattery

Reputation: 87

big query python API insert null values through json

I have a big query table with 4 columns : name (string) , age(int) , score(float) , dob (datetime).

#!/usr/bin/env python

import json

from google.cloud import bigquery


def stream_data(json_data):
    bigquery_client = bigquery.Client("myproject")
    dataset = bigquery_client.dataset("ComputedData")
    table = dataset.table("test")
    data = json.loads(json_data)
    table.reload()
    rows = [data]
    errors = table.insert_data(rows)

    if not errors:
        print('Loaded 1 row ')
    else:
        print('Errors: {}'.format(errors))


if __name__ == '__main__':
    mynam = 'mike'
    mage = 212
    mydob='1983-09-01 00:00:00'
    mydob=None
    ds=str(mydob) if mydob else None
    myscore = 0;
    stream_data('["' + str(mynam) + '",' + str(mage) + ',"' + ds + '",'+ str(myscore) +']')

The above is a sample to test whether I can insert null values . For eg. I actually compute the score and date of birth (assume) and insert it. But if the computation does not work I want to insert null into the big query table as datetime supports null.

By default , null is None in python. However I cannot insert this into the as I cannot concat Nonetype as string . If I try to stringify the null statement as :

 ds=str(mydob) if mydob else 'null'

I get 'Invalid datetime string "null"'

I am not sure if I am generating my json the wrong way. enter image description here

I do get nulls but for that I have to leave out the field in the json.

   stream_data('["' + str(mynam) + '",' + str(mage) + ']')

I cannot do this as if I want to enter score but leave dob as null , if I insert

stream_data('["' + str(mynam) + '",' + str(mage) + ',' + str(myscore)+']')

This inserts or tries to insert score in the datetime column. So I have to insert a value for every column in the json. but want to be able to specify nulls while inserting.

Upvotes: 2

Views: 7628

Answers (1)

Willian Fuks
Willian Fuks

Reputation: 11787

Maybe the problem here is that the insert_data method works with a list of tuples and not json.

I just created a table like yours in BQ and used the same code as yours but instead of converting to json I used the function as it's described in the docs, like so:

error = table.insert_data([('name1', 200, None, '1990-09-09 00:00:00'),
                           ('name2', 201, 67.3,)])

Among the tests I did I also tried:

error = table.insert_data([('name3', 301, 35.4,None)])

And also:

error = table.insert_data([(None, None, None,None)])

It all worked:

enter image description here

Maybe what you can do is (if your code receives for some reason the json) to first transform this json into a list of tuples and then send this list to the insert_data method with NULL values as None.

Upvotes: 3

Related Questions