Chaouki

Reputation: 465

How to force BigQuery to load all CSV columns as STRING without a schema and with autodetect=False

I have multiple CSV files stored in GCS, and I want to load them into BigQuery using Cloud Run.

The problem is that I don't know the schema, and it changes from file to file. I also don't want to use the autodetect option when loading the files. I want to load the CSV files into BigQuery through the BigQuery API load job configuration, without supplying a schema and with autodetect=False, so that every column is treated as type STRING.

Is that possible?

I tried using a pandas DataFrame, but the files are too large, so I always run into memory problems.

Upvotes: 0

Views: 1667

Answers (1)

Aalok Kamble

Reputation: 1

Use the following function to generate a schema in which every column has type STRING. It reads only the header row of the CSV file, so the file is never loaded into memory.

from csv import DictReader

from google.cloud import bigquery


def getschema(file_path):
    '''Get schema from CSV with all columns as string'''
    schema = []
    # newline='' lets the csv module handle line endings itself
    with open(file_path, 'r', newline='') as read_obj:
        # pass the file object to DictReader() to get the DictReader object
        csv_dict_reader = DictReader(read_obj)
        # get column names from the header row of the csv file
        column_names = csv_dict_reader.fieldnames or []
    for c in column_names:
        # declare every column as STRING
        schema.append(bigquery.SchemaField(c, "STRING"))
    return schema
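
Since the files in the question live in GCS rather than on local disk, here is a rough sketch of how the same idea might be wired into a load job. It is not tested against a real project. The names header_to_string_fields and load_csv_as_strings, and the parameters bucket_name, blob_name and table_id, are made up for illustration. It assumes the header fits on the first line of the file and that the header names are already valid BigQuery column names.

```python
import csv
import io


def header_to_string_fields(header_line):
    """Return BigQuery-style field dicts, all typed STRING, for one CSV header line."""
    # csv.reader handles quoted names; an empty line yields no fields.
    names = next(csv.reader(io.StringIO(header_line)), [])
    return [{"name": n.strip(), "type": "STRING", "mode": "NULLABLE"} for n in names]


def load_csv_as_strings(bucket_name, blob_name, table_id):
    """Load gs://bucket_name/blob_name into table_id with every column as STRING."""
    # Imported here so the helper above works without the Google client libraries.
    from google.cloud import bigquery, storage

    # Read only the first line of the object instead of downloading the whole file.
    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    with blob.open("r") as fh:
        header_line = fh.readline()

    schema = [
        bigquery.SchemaField(f["name"], f["type"], mode=f["mode"])
        for f in header_to_string_fields(header_line)
    ]
    job_config = bigquery.LoadJobConfig(
        schema=schema,
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # the header row is not data
        autodetect=False,
    )
    uri = f"gs://{bucket_name}/{blob_name}"
    load_job = bigquery.Client().load_table_from_uri(uri, table_id, job_config=job_config)
    return load_job.result()  # blocks until the load job finishes
```

The load itself runs inside BigQuery straight from the gs:// URI, so the Cloud Run instance only ever holds the header line in memory, which should avoid the pandas memory problem.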

Upvotes: 0
