user546298

Reputation: 61

Replacement index 1 out of range for positional args tuple on Astro sql load_file package

I recently installed the Astro CLI and tried to load a CSV file into GCP BigQuery. I get the following error:

File "/usr/local/lib/python3.11/site-packages/astro/databases/base.py", line 788, in create_schema_if_applicable
    statement = self._create_schema_statement.format(schema)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: Replacement index 1 out of range for positional args tuple

According to the Astro documentation, the statement takes only one argument, but for some reason it throws an "index 1 out of range" error. I attached a screenshot from the Astro documentation for reference.
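For context, this IndexError is what Python's str.format raises when the template contains more positional placeholders than the arguments supplied. The sketch below reproduces the message outside Airflow. The template string and the render helper are assumptions made up for illustration; they are not copied from the Astro SDK source.

```python
# Hypothetical template with two positional placeholders. This is an
# assumption for illustration, not the statement used by the Astro SDK.
template = "CREATE SCHEMA IF NOT EXISTS {} OPTIONS (location='{}')"


def render(template: str, *args: str) -> str:
    """Format the template, returning the error text instead of raising."""
    try:
        return template.format(*args)
    except IndexError as exc:
        return f"IndexError: {exc}"


# One argument for two placeholders: format() cannot fill index 1.
one_arg = render(template, "retail")

# Two arguments: both placeholders are filled.
two_args = render(template, "retail", "US")

print(one_arg)
print(two_args)
```

If the statement used for BigQuery expects a second value (for example a location) while the calling code passes only the schema name, the traceback above is the expected result.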

My code below

from airflow.decorators import dag, task
from datetime import datetime
from airflow.operators.python_operator import PythonOperator
from airflow.models.baseoperator import chain
from airflow.providers.google.cloud.transfers.local_to_gcs import LocalFilesystemToGCSOperator
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryCheckOperator,
    BigQueryCreateEmptyDatasetOperator,
    BigQueryCreateEmptyTableOperator,
    BigQueryDeleteDatasetOperator,
)

from astro import sql as aql
from astro.files import File
from astro.sql.table import Table, Metadata
from astro.constants import FileType

@dag(
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
    tags=['retail'],
)
def retail():
    gcs_to_raw = aql.load_file(
        task_id='gcs_to_raw',
        input_file=File(
            'gs://cloudbuild_3445455665/raw/online_Retail.csv',
            conn_id='gcp',
            filetype=FileType.CSV,
        ),
        output_table=Table(
            name='raw_invoices',
            conn_id='gcp',
            metadata=Metadata(schema="retail"),
        ),
        use_native_support=False,
    )

The command I use to execute the DAG task

airflow tasks test retail gcs_to_raw 2023-01-01

My error screenshot: [image]

Any help is appreciated.

Upvotes: 2

Views: 60

Answers (1)

Cadu Magalhães

Reputation: 11

I got the same error, but when trying to run a transform, not a load.

I managed to make it work by doing the following:

  1. Add assume_schema_exists=True to the function call (I think this might work for you).
  2. Since I was doing a transform with a temp table, I had to manually set an output_table with the schema (in our case, the BigQuery dataset) and make sure the dataset already exists in the project.
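To illustrate why the first step can help, here is a simplified model of the method named in the traceback. FakeDatabase, its template, and the assume_exists parameter are hypothetical stand-ins written for this sketch, not the Astro SDK implementation. The assumption is that the flag makes the SDK skip schema creation, so the failing format call is never reached.

```python
class FakeDatabase:
    """Hypothetical stand-in for the database class; not Astro SDK code."""

    # Assumed template with two placeholders, so formatting with only the
    # schema name fails in the same way as the traceback in the question.
    _create_schema_statement = (
        "CREATE SCHEMA IF NOT EXISTS {} OPTIONS (location='{}')"
    )

    def __init__(self):
        # Statements that would have been sent to the warehouse.
        self.executed = []

    def create_schema_if_applicable(self, schema, assume_exists=False):
        if assume_exists:
            # Trust that the dataset already exists; build no statement.
            return
        self.executed.append(self._create_schema_statement.format(schema))


db = FakeDatabase()

# With the flag set, nothing is formatted or executed.
db.create_schema_if_applicable("retail", assume_exists=True)
skipped = db.executed == []

# Without the flag, the single-argument format call raises IndexError.
try:
    db.create_schema_if_applicable("retail")
    failed = False
except IndexError:
    failed = True

print(skipped, failed)
```

In the question's DAG this would correspond to passing assume_schema_exists=True to aql.load_file, with the retail dataset created in BigQuery beforehand.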

With this I got it running. I hope it helps you.

Upvotes: 1
