ls666

Reputation: 1

Spark SQL: load Parquet with special characters in path

I am using PySpark SQL to load files into a table, with a statement like the following:

LOAD DATA LOCAL INPATH '/user/hive/warehouse/students' OVERWRITE INTO TABLE test_load;

https://spark.apache.org/docs/latest/sql-ref-syntax-dml-load.html

It fails when the path has a timestamp in the directory structure, like XX/XX/2021-03-02T20:04:27+00:00/file.parquet:

pyspark.sql.utils.AnalysisException: load data input path does not exist

It works with a path that has no timestamp. How can I work around this?

Upvotes: 0

Views: 269

Answers (1)

eshirvana

Reputation: 24633

I haven't seen any file system that supports '2021-03-02T20:04:27+00:00' as a folder or file name. Usually the ":" and "+" signs are treated as reserved characters, and you can't use them in file or folder names.

Check the manual of the file system you are using for its reserved characters.

Change your datetime format to something the operating system's file system supports, like 'yyyy-mm-ddThhMMSS', e.g. '2021-03-02T200427'.
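
As a minimal sketch of that renaming, assuming the directory names are generated in Python and that dropping the UTC offset is acceptable, the timestamp can be reformatted without ":" or "+". The helper name `safe_timestamp_dirname` is hypothetical, not part of Spark or any library:

```python
from datetime import datetime


def safe_timestamp_dirname(iso_timestamp: str) -> str:
    """Turn an ISO 8601 timestamp into a name without ':' or '+'.

    Hypothetical helper for illustration. Note that the UTC offset
    is discarded, so this assumes all timestamps share one offset.
    """
    parsed = datetime.fromisoformat(iso_timestamp)
    # %H%M%S writes the time without ':' separators.
    return parsed.strftime("%Y-%m-%dT%H%M%S")


print(safe_timestamp_dirname("2021-03-02T20:04:27+00:00"))
```

The resulting name can then replace the timestamp directory before the path is passed to LOAD DATA.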

Upvotes: 1
