Reputation: 27
I exported my BigQuery data to CSV but can't figure out how to clean up the data as the headers are all appended on the backend in the same row.
Here's my code:
from google.cloud import bigquery
import pandas as pd
project = 'project1'
client = bigquery.Client(project=project)
defineQuery = """
SELECT *
FROM table
LIMIT 5;
"""
df = pd.DataFrame(client.query(defineQuery))
df.to_csv(r'C:/file-name.csv')
The BigQuery CSV file export came out with just 2 columns like the following:-
| | 0 |
|-----|-------|
| 1|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
| 2|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
| 3|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
| 4|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
| 5|Row((123,'data','data2'), {'Field 1':0, 'Field 2':1, 'Field 3':2})
How do I clean it up so it's like the following?
| | Field 1 | Field 2 | Field 3 |
|-----|---------|---------|---------|
| 1| 123 | data | data2 |
| 2| 123 | data | data2 |
| 3| 123 | data | data2 |
| 4| 123 | data | data2 |
| 5| 123 | data | data2 |
Thanks in advance!
Upvotes: 1
Views: 197
Reputation: 12254
Try this:
df = client.query(defineQuery).to_dataframe()
instead of
df = pd.DataFrame(client.query(defineQuery))
Upvotes: 1