Reputation: 399
Hey, I am pulling a date field from an Oracle DB using the cx_Oracle module. The redacted query and connection code are:
def getInitialData():
    print("Gathering... ")
    dsn_tns = cx_Oracle.makedsn('xyz.com', '1234', service_name='DB')
    conn = cx_Oracle.connect(user=r'me', password='password', dsn=dsn_tns)
    SQLquery = ("""
        SELECT REPORTDATE,
        FROM LONGDESCRIPTION
        WHERE
        REPORTDATE > TO_DATE('01/01/2015 0:00:00', 'MM/DD/YYYY HH24:MI:SS'))""")
    datai = pd.read_sql(SQLquery, conn)
    datai['REPORTDATE'] = pd.to_datetime(datai['REPORTDATE'], format='%m-%d-%Y')
    print("Data Retrieved")
    return datai
However, when I try to manipulate this later via:
writer = index.writer()
print("Adding Data, this may take a moment... ")
for i in range(len(initialData)):
    writer.add_document(docId=initialData.iloc[i]['CONTENTUID'],
                        content=initialData.iloc[i]['LOWER(LDTEXT)'],
                        date=initialData.iloc[i]['REPORTDATE'])
writer.commit()
I get:
ValueError: <cx_Oracle.LOB object at 0x000001CB4819E5A0> is not unicode or sequence
Has anyone seen this error? I found nothing in the documentation or on Google about it. How does it happen? It is weird to me because I am able to get this to work using a different date field. Both show a dtype of datetime64[ns].
Upvotes: 0
Views: 1462
Reputation: 10506
Since this is a data conversion issue, knowing the character sets in use would have been useful info.
Some thoughts:
Set the character set when you connect, using the appropriate character set for your data:
connection = cx_Oracle.connect(connectString, encoding="UTF-8", nencoding="UTF-8")
You only need to use nencoding if you have NCHAR / NVARCHAR / NCLOB columns.
For 'small' LOBs (that are < 1GB and fit in cx_Oracle memory), you probably want to fetch them directly as strings, since this is faster. Add a type handler:
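def OutputTypeHandler(cursor, name, defaultType, size, precision, scale):
    # Fetch CLOBs as Python strings instead of LOB locators
    if defaultType == cx_Oracle.CLOB:
        return cursor.var(cx_Oracle.LONG_STRING, arraysize=cursor.arraysize)
    # Fetch BLOBs as Python bytes instead of LOB locators
    if defaultType == cx_Oracle.BLOB:
        return cursor.var(cx_Oracle.LONG_BINARY, arraysize=cursor.arraysize)

# Attach the handler so it applies to cursors created from this connection
connection.outputtypehandler = OutputTypeHandler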
Check if you have corrupted data that can't be handled in the character sets.
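If changing how the data is fetched is not an option, a fallback is to convert any LOB values to plain strings before handing them to the index writer. The error message names a cx_Oracle.LOB object, which suggests a text column (likely the LDTEXT one rather than the date) is arriving as a LOB. The following is only a sketch under that assumption: lob_to_text and FakeLob are hypothetical names, and FakeLob merely stands in for cx_Oracle.LOB (which exposes a read() method) so the example runs without a database.

```python
def lob_to_text(value):
    """Return a str for LOB-like values; leave everything else untouched."""
    read = getattr(value, "read", None)
    if callable(read):
        data = read()
        if isinstance(data, bytes):
            # Assumption: binary content is UTF-8 encoded text
            return data.decode("utf-8")
        return data
    return value


class FakeLob:
    """Hypothetical stand-in for cx_Oracle.LOB, used only for this demo."""

    def __init__(self, text):
        self._text = text

    def read(self):
        return self._text


# Rows shaped like the values passed to writer.add_document() in the question
rows = [
    {"docId": "1", "content": FakeLob("pump failed")},
    {"docId": "2", "content": "plain string"},
]

# Convert the content field, leaving values that are already strings alone
cleaned = [{**row, "content": lob_to_text(row["content"])} for row in rows]
```

With the DataFrame from the question, the same helper could probably be applied with something like initialData['LOWER(LDTEXT)'].map(lob_to_text) before the indexing loop. LOB locators generally need the connection to still be open when read() is called, so the conversion would have to happen before the connection is closed.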
Upvotes: 2