Reputation: 1099
I am getting a FileNotFoundException when trying to move a file using a * wildcard in DBFS. Both the source and destination directories are in DBFS. The source file "test_sample.csv" exists in the DBFS directory, and I am running the following command from a notebook cell:
dbutils.fs.mv("dbfs:/usr/krishna/sample/test*.csv", "dbfs:/user/abc/Test/Test.csv")
Error:
java.io.FileNotFoundException: dbfs:/usr/krishna/sample/test*.csv
I appreciate any help. Thanks.
Upvotes: 19
Views: 50198
Reputation: 744
If you run your code on a Databricks cluster, you can access DBFS through the node's local file system (the /dbfs fuse mount). I'm not sure whether it fetches all the objects and then filters in the background, but at least wildcards work. E.g., from a Databricks notebook:
%sh
ls /dbfs/cluster-logs/*/driver/log4j-2021-09-01*
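The same fuse mount also works from Python with the standard glob module. A minimal sketch (the DBFS paths are placeholders based on the question, and this assumes the code runs on the driver node where /dbfs is mounted):

```python
import glob
import os
import shutil

def move_matching(src_pattern, dst_dir):
    """Move every file whose path matches a shell-style glob into dst_dir."""
    for path in glob.glob(src_pattern):
        shutil.move(path, os.path.join(dst_dir, os.path.basename(path)))

# On a cluster, DBFS is exposed at /dbfs, so the question's move would be e.g.:
# move_matching("/dbfs/usr/krishna/sample/test*.csv", "/dbfs/user/abc/Test")
```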
Upvotes: 1
Reputation: 1305
Since wildcards are not allowed, we need to make it work this way: list the files, then move or copy them (a slightly more traditional approach).
import os

def db_list_files(file_path, file_prefix):
    # Keep only files whose name starts with the given prefix
    file_list = [file.path for file in dbutils.fs.ls(file_path)
                 if os.path.basename(file.path).startswith(file_prefix)]
    return file_list

files = db_list_files('dbfs:/your/src_dir', 'foobar')
for file in files:
    dbutils.fs.cp(file, os.path.join('dbfs:/your/tgt_dir', os.path.basename(file)))
Upvotes: 5
Reputation: 82
dbutils.fs.mv("file:/<source>", "dbfs:/<destination>", recurse=True)
Use the above command to move a local folder to DBFS (the file:/ scheme refers to the driver's local file system).
Upvotes: -1
Reputation: 3182
Wildcards are currently not supported with dbutils. You can move the whole directory:
dbutils.fs.mv("dbfs:/tmp/test", "dbfs:/tmp/test2", recurse=True)
or just a single file:
dbutils.fs.mv("dbfs:/tmp/test/test.csv", "dbfs:/tmp/test2/test2.csv")
As mentioned in the comments, you can use Python to implement this wildcard logic yourself. See also the code examples in my other answer.
Upvotes: 22