Reputation: 293
I want to save my BigQuery table data to GCS as an Arrow file (.arrow):
import pyarrow as pa
from google.cloud import bigquery

b_client = bigquery.Client()
query = f"""
SELECT * FROM `{table_path}.{table_id}`
"""
query_results = b_client.query(query).result()
table = query_results.to_arrow()
serialized_table = pa.serialize(table).to_buffer().to_pybytes()
This is my original method, but after upgrading my pyarrow version I get an error for serialize():

AttributeError: module 'pyarrow' has no attribute 'serialize'
How can I resolve this?
Also, my .arrow file in GCS has 130,000 rows and 30 columns, and the file is 60 MB. When I receive a request, I read the data from GCS and return it, but it is too slow: it takes about 30 seconds. How can I make it faster, e.g. around 10 seconds?
Upvotes: 0
Views: 2164
Reputation: 33
The custom serialization functionality was deprecated in pyarrow 2.0 and removed in a later release. I solved this problem by downgrading the version:

conda install -c conda-forge pyarrow=1.0.1

Note: the pyarrow version should be compatible with your Python version.
Upvotes: 1