lima

Reputation: 293

How to use insert_rows_from_dataframe? (Python to BigQuery)

Here is my example DataFrame:

df = pd.DataFrame({'a': [4,5,4], 'b': ['121233', '45a56', '0000'], 'c':['sas53d1', '1asdf23', '2asdf456']})

My BigQuery table path:

table = 'tutorial.b'

I created the table with:

client.load_table_from_dataframe(df,table)

That created the BigQuery table tutorial.b. Now I want to insert another DataFrame into it:

df1 = pd.DataFrame({'a': [1,2,3], 'b': ['aaa', 'bbb', 'ccc'], 'c':['casd33', 'dfsf12', 'lsdfkj3']})

So I used insert_rows_from_dataframe:

client.insert_rows_from_dataframe(table = table_path, dataframe = df1)
client.insert_rows_from_dataframe(df1, table_path)

I tried both of these calls, but I always get the same error:

The table argument should be a table ID string, Table, or TableReference

How can I insert my DataFrame into a table that already exists?

Upvotes: 0

Views: 1709

Answers (1)

Mazlum Tosun

Reputation: 6572

You can use the same method, load_table_from_dataframe, for both DataFrames, just with different options:

  • If you want to load the DataFrame into a table and create the table if it does not exist (WRITE_TRUNCATE overwrites any existing rows):
    job_config = bigquery.LoadJobConfig(
        schema=[
            bigquery.SchemaField("title", bigquery.enums.SqlTypeNames.STRING),
            bigquery.SchemaField("wikidata_id", bigquery.enums.SqlTypeNames.STRING),
        ],
        create_disposition="CREATE_IF_NEEDED",
        write_disposition="WRITE_TRUNCATE",
    )

    job = client.load_table_from_dataframe(
        dataframe, table_id, job_config=job_config
    )  # Make an API request.
    job.result()  # Wait for the job to complete.
  • If you want to load the DataFrame into an existing table without recreating it:
    job_config = bigquery.LoadJobConfig(
        schema=[
            bigquery.SchemaField("title", bigquery.enums.SqlTypeNames.STRING),
            bigquery.SchemaField("wikidata_id", bigquery.enums.SqlTypeNames.STRING),
        ],
        create_disposition="CREATE_NEVER",
        write_disposition="WRITE_APPEND",
    )

    job = client.load_table_from_dataframe(
        dataframe, table_id, job_config=job_config
    )  # Make an API request.
    job.result()  # Wait for the job to complete.

I used a made-up schema in my examples, but specifying a schema is not mandatory in your case.

The job_config lets you specify the options for the ingestion.
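
Applied to the names in your question, the append case could look roughly like the sketch below. The helper name append_dataframe is made up for illustration, and it assumes client is a google.cloud.bigquery.Client with working credentials and that pyarrow is installed (load_table_from_dataframe needs it). No schema is passed, so the column types are taken from the DataFrame and must be compatible with the existing table.

```python
def append_dataframe(client, dataframe, table_id):
    """Append the rows of `dataframe` to an existing BigQuery table.

    `append_dataframe` is a hypothetical helper name, not part of the
    library. `client` is assumed to be a google.cloud.bigquery.Client.
    """
    # Imported inside the function so the helper can be defined even
    # before the BigQuery client library is loaded.
    from google.cloud import bigquery

    job_config = bigquery.LoadJobConfig(
        create_disposition="CREATE_NEVER",  # fail if the table does not exist
        write_disposition="WRITE_APPEND",   # keep existing rows, add the new ones
    )
    job = client.load_table_from_dataframe(
        dataframe, table_id, job_config=job_config
    )  # Starts the load job.
    job.result()  # Wait for the job to complete (raises if the load failed).
    return job
```

With your variables, the call would be append_dataframe(client, df1, 'tutorial.b').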

Upvotes: 1
