Fogo Fortitude
Fogo Fortitude

Reputation: 47

How to load LOB files from Teradata's DBC.QRYLOGSQL into a Pandas Dataframe

Problem: When running a Teradata query against DBC.QRYLOGSQL via Python code to then load into a pandas dataframe; I encounter the following issues:

  1. only appears to display the first few lines of the LOB's contained within the DBC.QRYLOGSQL SqlTextInfo column.

  2. it looks to riddled with regex

An Example of a df.SqlTextInfo value:

SEL\rA.*,\r--B.ACCT_KEY AS CUST_ACCT_KEY....

Question:

  1. How do I validate that I have loaded the entirety of each LOB from Teradata's DBC.QRYLOGSQL.SqlTextInfo into a pandas data frame?
  2. How do I validate that it is readable for a normal person; i.e. it's not represented in RegEx?
  3. How do I view the entirety of a LOB recorded as long string into my pandas dataframe column?

My Code

import teradata;
 
#Establish the connection to the Teradata database
udaExec = td.UdaExec (appName="HelloWorld", version="1.0", logConsole=False)
  
session = udaExec.connect(method="odbc",
                          system="HOSTNAME",
                          username="USERNAME",
                          password="PASSWORD",
                          driver="Teradata")

#query to fetch the row of data from the DBC.QRYLOGSQL table in Teradata
query = "SELECT QueryID, SqlTextInfo FROM DBC.QRYLOGSQL WHERE QueryID = 12345"

#Fetch the data from Teradata using Pandas Dataframe
df = pd.read_sql(query, session)
counter = len(df)

Upvotes: 1

Views: 154

Answers (1)

Fogo Fortitude
Fogo Fortitude

Reputation: 47

STEP 1 - Validate the DBC.QRYLOGSQL.SqlTextInfo LOB's full string length was loaded, not just what is seen from running df.head().

Run the following:

df['SqlTextInfo'].str.len()

Results:

enter image description here

NOTE: The length of strings is far longer than what had appeared in the column after running df.head()

STEP 2: Remove the RegEx from df['SqlTextInfo']

# Take dataframe column as a series and convert 
# to String Data Type the load as a new column to the existing data frame

df['SQL_Cleaned'] = pd.Series(df['SqlTextInfo'], dtype="string")

STEP 3: Show the full text of SQL when viewing the data frame:

pd.options.display.max_colwidth = 300

Upvotes: 1

Related Questions