Chipmunk_da

Reputation: 507

SQLAlchemy read_sql() into Pandas dataframe - large column value gets truncated

I am trying to read data from a MySQL table, and one of the columns contains large varchar values (e.g., of length 49085). When I read the results of the query into a dataframe, the column value is truncated at 87 characters. Please see the code and output below. Does anyone know how I can read the entire string without truncation?

In the code below table test contains a column description where one of the rows has a string of length 49085.

Code:

import sys
import os
from sqlalchemy import create_engine
import pandas as pd

db_connection_str = 'mysql+pymysql://username:password@host/db_name'
db_connection = create_engine(db_connection_str)

#this returns 1 row where the value in the description field is of length 49085
df = pd.read_sql("select id, description, length(description) as len from myTable where length(description) = 49085", con=db_connection)

#this returns the truncated value of length 87
print(df)
len(str(df['description']))

Output:

   id                                             description    len
0  1  This document is for the testing Team.\n\nThe attach...  49085
87

Upvotes: 0

Views: 637

Answers (1)

Gord Thompson

Reputation: 123654

You are being misled by len(str(df['description'])). df['description'] returns a <class 'pandas.core.series.Series'> object, and calling str() on it produces the Series' *repr*, not the underlying string:

'0    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...\nName: description, dtype: object'

The length of that repr string will be 87 regardless of how long the string stored in the Series actually is, because pandas truncates column values for display. To test the actual length of the string, use

print(len(df['description'][0]))

or similar.
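A minimal sketch reproducing the effect with synthetic data (no database needed; the column name and length match the question, but the content is made up):

```python
import pandas as pd

# One row whose "description" value is 49085 characters long,
# mimicking the table in the question
s = pd.Series(["x" * 49085], name="description")

# str() on the Series yields its truncated display repr,
# e.g. "0    xxxx...\nName: description, dtype: object"
print(len(str(s)))       # a small, fixed-size number, not 49085

# The element itself is stored in full
print(len(s.iloc[0]))    # 49085
```

So the data was never truncated by read_sql; only the printed representation was shortened.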

Upvotes: 1
