CHRD

Reputation: 1957

GCP Cloud Functions to BigQuery - parquet support error

I am trying to run a simple test Cloud Function that creates a BigQuery table and inserts a value. The error suggests that I need to import pyarrow, so I tried that, but I keep getting the same error. When I run an equivalent script locally, there are no issues: the table is created, and I don't even need to import pyarrow. What am I missing here?

The error:

ImportError: Unable to find a usable engine; tried using: 'pyarrow', 'fastparquet'. pyarrow or fastparquet is required for parquet support

The main.py:

import pandas as pd
from google.cloud import bigquery
import pyarrow

def main_func(data, context):
    df = pd.DataFrame({'Test': ['Success']})

    client = bigquery.Client()

    dataset_id = #removed here but specified in the real code
    dataset = bigquery.Dataset(dataset_id)
    dataset.location = #removed here but specified in the real code
    dataset = client.create_dataset(dataset, exists_ok=True)
    print("Created dataset {}.{}".format(client.project, dataset.dataset_id))

    table_id = #removed here but specified in the real code

    job_config = bigquery.LoadJobConfig(
        schema=[
            bigquery.SchemaField("Test", bigquery.enums.SqlTypeNames.STRING),
        ],
        write_disposition="WRITE_TRUNCATE",
    )

    job = client.load_table_from_dataframe(
        df, table_id, job_config=job_config
    )

    job.result()

The requirements.txt:

pandas
google-cloud-bigquery
pyarrow

Upvotes: 1

Views: 1546

Answers (1)

marian.vladoi

Reputation: 8074

The problem is the pyarrow version. Pandas does not detect pyarrow<0.4 because of compatibility issues, so try specifying pyarrow>=0.4 in your requirements.txt.
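To check which versions are actually installed in the deployed function, something like the following could be logged at startup. This is only a sketch: `installed_version` and `report_versions` are hypothetical helper names, not part of pandas, pyarrow, or the BigQuery client, and it assumes a runtime with Python 3.8+ so that `importlib.metadata` is available.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report_versions(packages=("pandas", "pyarrow", "fastparquet")):
    """Map each package name to its installed version (None if not installed)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Print one line per package so the result shows up in the function logs.
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Calling `report_versions()` at the top of `main_func` and printing the result would show whether the version that got installed in the cloud differs from the one on your local machine.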

See also the related question: Pyarrow is not properly detected after importing ray

Upvotes: 1
