billiout

Reputation: 705

Exporting from Cloud SQL to CSV with column headers

I'm trying to export a table from Google Cloud SQL into a CSV file using the gcloud sql export csv command from the Cloud SDK, but there is no option to also write the column names at the top of the file. Is there any workaround for this?

Thanks

Upvotes: 7

Views: 5986

Answers (2)

Stephan Schielke

Reputation: 2843

Building on top of @philipp-sh's answer. Please upvote the linked ticket for a native solution.

If you are using Python with SQLAlchemy (and PostgreSQL in this example), you can convert a Selectable to a raw SQL string. The following helper function casts all columns to varchar so that the data rows can be unioned with the header row; the first row of the result will be the selected column names.

from sqlalchemy import String, cast
from sqlalchemy.dialects.postgresql import psycopg2 as pg_psycopg2
from sqlalchemy.sql import Select


def convert_query_to_raw_gcp_csv_export_sql(query: Select) -> str:
    # Get the column names from the query
    column_names = [c.name for c in query.selected_columns]

    # Create a SQL string that selects the column names as the first row
    column_names_sql = "SELECT " + ", ".join(f"'{c}'" for c in column_names)

    # Cast all columns to varchar so they union cleanly with the header row
    casted_query = query.with_only_columns(
        *[cast(c, String) for c in query.selected_columns]
    )

    # Compile the casted query to a raw SQL string with literal parameters
    casted_sql = str(
        casted_query.compile(
            dialect=pg_psycopg2.dialect(),
            compile_kwargs={"literal_binds": True},
        )
    )

    # Use UNION ALL to combine the column-name row with the original query
    return f"{column_names_sql} UNION ALL {casted_sql}"

Note: The order of result rows of a UNION ALL is not guaranteed! This means the row containing the column names might not end up as the first row. To bypass this in Postgres, you need to disable the enable_parallel_append flag.
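To illustrate the shape of the SQL the helper produces, here is a stdlib-only sketch; the table name users and the columns id and email are hypothetical examples, not part of the helper above:

```python
# Illustrative only: builds the same UNION ALL shape the helper above
# produces, for a hypothetical table "users" with columns "id" and "email".
column_names = ["id", "email"]

# Header row: the column names selected as string literals
header_row = "SELECT " + ", ".join(f"'{c}'" for c in column_names)

# Data rows: every column cast to varchar so the union is type-compatible
data_rows = (
    "SELECT "
    + ", ".join(f"CAST({c} AS VARCHAR)" for c in column_names)
    + " FROM users"
)

export_sql = f"{header_row} UNION ALL {data_rows}"
print(export_sql)
```

Passing the resulting string to the --query flag of gcloud sql export csv then writes the header as the first line of the CSV (subject to the row-ordering caveat above).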

I hope this helps someone.

Upvotes: 0

Philipp Sh

Reputation: 997

I believe that this command should produce the necessary functionality:

gcloud sql export csv instance_name gs://bucket_name/ --query="SELECT 'columnname1' , 'columnname2', 'columnname3' UNION SELECT columnname1, columnname2, columnname3 FROM table_name" --database=database_name

One downside of doing it this way is that you have to specify all columns. If there are many of them, it might be easier to write a small script that generates the SQL query for you.
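A minimal sketch of such a script, using the same placeholder names (table_name, columnname1, …) as the command above:

```python
def build_export_query(table_name: str, column_names: list[str]) -> str:
    """Build a SQL query whose first result row is the column names,
    suitable for the --query flag of gcloud sql export csv."""
    # Column names as string literals form the header row
    header = ", ".join(f"'{c}'" for c in column_names)
    # The actual columns form the data rows
    data = ", ".join(column_names)
    return f"SELECT {header} UNION SELECT {data} FROM {table_name}"


query = build_export_query("table_name", ["columnname1", "columnname2"])
print(query)
```

The generated string can then be passed directly to --query.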

A feature request has been created on your behalf. Please star it so that you receive updates, and do not hesitate to add comments describing the desired implementation. You can track the feature request by following this link.

Upvotes: 5
