Reputation: 65
I'm pretty new to uploading data to Teradata. The method I know works is inserting row by row with INSERT statements, but I would like to avoid that. I am trying to upload my pandas DataFrame directly to Teradata but have not been successful yet. My preference is to get method 1 to work, but I want a working solution first.
I've tried 2 methods:
1. teradataml module - copy_to_sql
2. teradata module - using an insert statement
method 1: Create table using copy_to_sql function
from teradataml.dataframe.copy_to import copy_to_sql
from teradataml import create_context, remove_context
df_new # some pandas DataFrame
table_name="db.table"
copy_to_sql(df = df_new, table_name = "db.table", primary_index="index", if_exists="replace")
method 2: Add to already created table using insert statement
import numpy as np
import teradata
udaExec = teradata.UdaExec (appName=appname, version="1.0", logConsole=False)
connect = udaExec.connect(method="odbc",system=host, username=user,
password=passwrd)
num_of_chunks=100
table_name="db.table"
query='INSERT INTO '+table_name+' values(?,?,?,?,?);'
df_chunks=np.array_split(df_new2, num_of_chunks)
for i,_ in enumerate(df_chunks):
    data = [tuple(x) for x in df_chunks[i].to_records(index=False)]
    connect.executemany(query, data,batch=True)
**method 1**: I get the following access-related error. I'm not sure why the SQL statement adds quotes around the bolded table name below:
OperationalError: (teradatasql.OperationalError) [Version 16.20.0.48] [Session 5229096] [Teradata Database] [Error 3524] The user does not have CREATE TABLE access to database U378597.
[SQL:
CREATE multiset TABLE **"db.table"** (
"PBP" VARCHAR(1024) CHAR SET UNICODE,
recon VARCHAR(1024) CHAR SET UNICODE,
date2 TIMESTAMP(6),
"CF" FLOAT,
"index" VARCHAR(1024) CHAR SET UNICODE
)
primary index( "index" )
]
**method 2**: I get an error about inserting dates. I assume the datetime values need to be converted in some way to work in the Teradata table, but I'm unsure how:
DatabaseError: (6760, '[HY000] [Teradata][ODBC Teradata Driver][Teradata Database] Invalid timestamp. ')
Upvotes: 2
Views: 7653
Reputation: 11
Here is my preferred way to connect to Teradata:
import getpass
import teradataml as tdml # Teradata Python library
conn = tdml.create_context(host = "hostname:port", username="USERNAME", password = getpass.getpass('Password:'), logmech='LDAP')
Use copy_to_sql for small datasets and fastload() for large ones: https://docs.teradata.com/r/Teradata-Package-for-Python-User-Guide/May-2022/teradataml-General-Functions/Data-Transfer-Utility/Saving-DataFrame-to-Vantage/fastexport
tdml.copy_to_sql(df, table_name='TableName', if_exists='replace')
from teradataml.dataframe.fastload import fastload
fastload(df = df, table_name = 'TableName')
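If you want to switch between the two automatically, something like the sketch below might work. The 100,000-row cutoff and the helper names `choose_loader` and `save` are my own assumptions for illustration, not part of teradataml, and the code assumes `create_context` has already been called.

```python
# Assumed cutoff for "large"; this is a guess to tune, not an official constant.
FASTLOAD_MIN_ROWS = 100_000


def choose_loader(n_rows, threshold=FASTLOAD_MIN_ROWS):
    """Return the name of the loader to use for a DataFrame with n_rows rows."""
    return "fastload" if n_rows >= threshold else "copy_to_sql"


def save(df, table_name):
    """Hypothetical wrapper: load df with whichever function choose_loader picks."""
    # Imported lazily so choose_loader can be used without teradataml installed.
    from teradataml import copy_to_sql
    from teradataml.dataframe.fastload import fastload

    if choose_loader(len(df)) == "fastload":
        fastload(df=df, table_name=table_name)
    else:
        copy_to_sql(df=df, table_name=table_name, if_exists="replace")
```

Keeping the decision in a separate function makes the cutoff easy to adjust once you know how each loader behaves on your system.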
Upvotes: 1
Reputation: 2080
The table_name is an unqualified name. To specify the Teradata "database" in which the table should be created, use the separate schema_name parameter.
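As a rough sketch of what that could look like for the question's "db.table" string: the helper names `split_qualified_name` and `upload` are made up for illustration, and the code assumes a context has already been created with `create_context`.

```python
def split_qualified_name(qualified_name):
    """Split 'db.table' into ('db', 'table'); the schema is None if unqualified."""
    schema, sep, table = qualified_name.rpartition(".")
    return (schema if sep else None), table


def upload(df, qualified_name, primary_index=None):
    """Hypothetical wrapper: pass database and table to copy_to_sql separately."""
    # Imported lazily so the helper above can be used without teradataml installed.
    from teradataml import copy_to_sql

    schema, table = split_qualified_name(qualified_name)
    copy_to_sql(df=df, table_name=table, schema_name=schema,
                primary_index=primary_index, if_exists="replace")
```

With this, `upload(df_new, "db.table", primary_index="index")` would ask for table `table` in database `db` instead of a table literally named "db.table" in your default database.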
And for "method 2", consider using the teradatasql package instead of teradata. Or I suppose you could .isoformat(' ') the timestamp.
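A minimal sketch of the `.isoformat(' ')` route, assuming the rows hold `datetime` objects. As far as I know, `to_records` yields NumPy `datetime64` values instead, so you would probably build the rows with `df_new2.itertuples(index=False, name=None)`, which yields pandas Timestamps (a `datetime` subclass). The helper name `timestamps_to_text` is made up, and missing values are not handled here.

```python
from datetime import datetime


def timestamps_to_text(rows):
    """Return rows as tuples, with each datetime replaced by 'YYYY-MM-DD HH:MM:SS' text."""
    return [
        tuple(v.isoformat(" ") if isinstance(v, datetime) else v for v in row)
        for row in rows
    ]


rows = [("A", datetime(2021, 3, 4, 5, 6, 7), 1.5)]
print(timestamps_to_text(rows))
# prints [('A', '2021-03-04 05:06:07', 1.5)]
```

The converted rows can then be passed to `executemany` in place of the original tuples.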
Upvotes: 1