Reputation: 1
I am using PySpark SQL to load files into a table with the following statement:
LOAD DATA LOCAL INPATH '/user/hive/warehouse/students' OVERWRITE INTO TABLE test_load;
https://spark.apache.org/docs/latest/sql-ref-syntax-dml-load.html
It fails with the error below when the path contains a timestamp in the directory structure, such as XX/XX/2021-03-02T20:04:27+00:00/file.parquet:
pyspark.sql.utils.AnalysisException: load data input path does not exist
It works with a path that has no timestamp. How can I work around this?
Upvotes: 0
Views: 269
Reputation: 24633
I haven't seen any file system that supports '2021-03-02T20:04:27+00:00' as a folder or file name. Characters such as ":" and "+" are usually treated as reserved, so you can't use them in file or folder names.
Check the documentation of the file system you are using for its reserved characters.
Change your datetime format to something the file system supports, such as 'yyyy-mm-ddThhMMSS', for example '2021-03-02T200427'.
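As a rough sketch of that renaming, the snippet below converts an ISO 8601 timestamp into the colon-free form suggested above, using only the Python standard library. The function name `safe_timestamp_dirname` is hypothetical, and the sketch assumes that dropping the UTC offset is acceptable once the time has been normalized to UTC. It only produces the new directory name; renaming the existing directories is left to whatever tool manages your storage.

```python
from datetime import datetime, timezone


def safe_timestamp_dirname(iso_timestamp: str) -> str:
    """Return a directory-safe name for an ISO 8601 timestamp.

    Hypothetical helper: removes ':' and '+' by normalizing to UTC
    and formatting the time as 'YYYY-MM-DDTHHMMSS'.
    """
    parsed = datetime.fromisoformat(iso_timestamp)
    # Assumption: timestamps carrying an offset are normalized to UTC
    # so that the offset can be dropped without losing information.
    if parsed.tzinfo is not None:
        parsed = parsed.astimezone(timezone.utc)
    return parsed.strftime("%Y-%m-%dT%H%M%S")


if __name__ == "__main__":
    print(safe_timestamp_dirname("2021-03-02T20:04:27+00:00"))
```

For the timestamp in the question this yields '2021-03-02T200427', which contains neither ":" nor "+".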
Upvotes: 1