Reputation: 6835
The following command:
from odo import odo
odo(target='postgresql://{user}:{pass}@localhost/{server}::odo_dest_table',source='/home/username/Downloads/large_csv.csv')
Produces the following error:
File "/home/username/anaconda3/envs/odosimple/lib/python3.8/site-packages/odo/odo.py", line 91, in odo
return into(target, source, **kwargs)
File "/home/username/anaconda3/envs/odosimple/lib/python3.8/site-packages/multipledispatch/dispatcher.py", line 278, in __call__
return func(*args, **kwargs)
File "/home/username/anaconda3/envs/odosimple/lib/python3.8/site-packages/odo/into.py", line 43, in wrapped
return f(*args, **kwargs)
File "/home/username/anaconda3/envs/odosimple/lib/python3.8/site-packages/odo/into.py", line 149, in into_string_string
return into(a, resource(b, **kwargs), **kwargs)
File "/home/username/anaconda3/envs/odosimple/lib/python3.8/site-packages/multipledispatch/dispatcher.py", line 278, in __call__
return func(*args, **kwargs)
File "/home/username/anaconda3/envs/odosimple/lib/python3.8/site-packages/odo/into.py", line 43, in wrapped
return f(*args, **kwargs)
File "/home/username/anaconda3/envs/odosimple/lib/python3.8/site-packages/odo/into.py", line 138, in into_string
dshape = discover(b)
File "/home/username/anaconda3/envs/odosimple/lib/python3.8/site-packages/multipledispatch/dispatcher.py", line 278, in __call__
return func(*args, **kwargs)
File "/home/username/anaconda3/envs/odosimple/lib/python3.8/site-packages/odo/backends/csv.py", line 377, in discover_csv
df = csv_to_dataframe(c, nrows=nrows, **kwargs)
File "/home/username/anaconda3/envs/odosimple/lib/python3.8/site-packages/odo/backends/csv.py", line 295, in csv_to_dataframe
return _csv_to_dataframe(c, dshape=dshape, chunksize=chunksize,
File "/home/username/anaconda3/envs/odosimple/lib/python3.8/site-packages/odo/backends/csv.py", line 346, in _csv_to_dataframe
kwargs = keyfilter(keywords(pd.read_csv).__contains__, kwargs)
File "/home/username/anaconda3/envs/odosimple/lib/python3.8/site-packages/odo/utils.py", line 130, in keywords
return inspect.getargspec(func).args
File "/home/username/anaconda3/envs/odosimple/lib/python3.8/inspect.py", line 1083, in getargspec
raise ValueError("Function has keyword-only parameters or annotations"
ValueError: Function has keyword-only parameters or annotations, use inspect.signature() API which can support them
Process finished with exit code 1
I installed odo into a conda env from the git repository using pip (git clone -> pip install .).
Upvotes: 2
Views: 1118
Reputation: 33
It is likely that your CSV file has too few lines for odo to make an educated guess about the data types.
https://odo.readthedocs.io/en/latest/datashape.html:
When odo loads this file into a new container (DataFrame, new SQL Table, etc.) it needs to know the datatypes of the source so that it can create a matching target. If the CSV file is large then it looks only at the first few hundred lines and guesses a datatype from that. In this case it might incorrectly guess that the balance column is of integer type because it doesn’t see a decimal value until very late in the file with the line Zelda,100.25. This will cause odo to create a target with the wrong datatypes which will foul up the transfer.
Try providing the dshape keyword. E.g., if your table has two columns of type varchar and bigint, then:
odo(target='postgresql://{user}:{pass}@localhost/{server}::odo_dest_table',source='/home/username/Downloads/large_csv.csv', dshape='var * {a: string, b: int64}')
The table and the CSV must have the same number of columns, and the columns must match, so you may have to create the table with the structure of the CSV file in mind.
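If the CSV has more than a couple of columns, typing the dshape string by hand gets error-prone. Here is a minimal sketch of a helper that builds the string from a column-to-type mapping. build_dshape is a hypothetical name of my own, not part of odo, and the column names and types are just the ones from the example above.

```python
def build_dshape(columns):
    """Build a datashape string such as 'var * {a: string, b: int64}'.

    `columns` maps column name -> datashape type name, in CSV column order.
    Hypothetical helper for illustration; not part of odo.
    """
    fields = ", ".join(f"{name}: {dtype}" for name, dtype in columns.items())
    # Doubled braces are literal braces inside an f-string.
    return f"var * {{{fields}}}"


# Same two columns as in the example above.
print(build_dshape({"a": "string", "b": "int64"}))
# var * {a: string, b: int64}
```

The result can then be passed as the dshape keyword in the odo call. Dicts keep insertion order, so list the columns in the same order as they appear in the CSV.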
Took me a few hours to make that discovery :P
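Separately, note that the last frames of the traceback in the question end in inspect.getargspec (called from odo/utils.py, keywords), which refuses functions that have keyword-only parameters or annotations. The error message itself points to inspect.signature(). Below is a minimal sketch of what a signature-based version of such a helper might look like. It is an assumption for illustration only, not odo's actual code and not a tested patch, and read_table is a hypothetical stand-in for pd.read_csv.

```python
import inspect


def keywords(func):
    """Return the names of parameters that can be passed by keyword.

    Sketch of a signature-based replacement for
    inspect.getargspec(func).args; unlike getargspec, it also
    accepts functions with keyword-only parameters or annotations.
    """
    by_keyword = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    params = inspect.signature(func).parameters.values()
    return [p.name for p in params if p.kind in by_keyword]


# Hypothetical stand-in for a reader with a keyword-only parameter
# and an annotation, the two things getargspec rejects.
def read_table(path: str, sep=",", *, nrows=None):
    return path, sep, nrows


print(keywords(read_table))
# ['path', 'sep', 'nrows']
```

Unlike the getargspec version, this also returns keyword-only names, which seems to be what the caller wants when it filters kwargs before handing them to the reader.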
Upvotes: 0