lima

Reputation: 293

Latest pyarrow version can't serialize or deserialize? (module 'pyarrow' has no attribute 'serialize')

I want to save my BigQuery table data to GCS as an Arrow file (.arrow):

import pyarrow as pa

query = f"""
    SELECT * FROM `{table_path}.{table_id}`
    """
query_results = b_client.query(query).result()

table = query_results.to_arrow()
serialized_table = pa.serialize(table).to_buffer().to_pybytes()

This was my original method, but after upgrading pyarrow I get an error from serialize():

AttributeError: module 'pyarrow' has no attribute 'serialize'

How can I resolve this?

Also, my Arrow file in GCS has 130,000 rows and 30 columns, and the .arrow file is 60 MB. When I receive a request, I read the file from GCS and return the data, but it is too slow: it takes about 30 seconds. How can I make it faster, for example around 10 seconds?

Upvotes: 0

Views: 2164

Answers (1)

Zhiwei Fang

Reputation: 33

The custom serialization functionality (pa.serialize and pa.deserialize) has been deprecated since pyarrow 2.0. I solved this problem by downgrading to an earlier version:

conda install -c conda-forge pyarrow=1.0.1

Note: the pyarrow version you install must be compatible with your Python version.

Upvotes: 1
