Reputation: 2781
-put and -copyFromLocal are documented as identical, while most examples use the verbose variant -copyFromLocal. Why?
Same thing for -get and -copyToLocal.
Upvotes: 61
Views: 60189
Reputation: 13539
Latest
There is no difference between the -copyFromLocal and -put commands.
Reference: Hadoop's documentation.
Earlier
-copyFromLocal is similar to the -put command, except that the source is restricted to a local file reference.
So basically, you can do with -put everything that you do with -copyFromLocal, but not vice versa.
Similarly, -copyToLocal is similar to the -get command, except that the destination is restricted to a local file reference.
Hence, you can use -get instead of -copyToLocal, but not the other way round.
Upvotes: 70
Reputation: 29307
They're the same. This can be seen by printing the usage for hdfs (or hadoop) on a command line:
$ hadoop fs -help
# Usage: hadoop fs [generic options]
# [ . . . ]
# -copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst> :
# Identical to the -put command.
# -copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst> :
# Identical to the -get command.
Same for hdfs (the hadoop command specific to HDFS filesystems):
$ hdfs dfs -help
# [ . . . ]
# -copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] <localsrc> ... <dst> :
# Identical to the -put command.
# -copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst> :
# Identical to the -get command.
Upvotes: 1
Reputation: 794
-copyFromLocal is restricted to copying from the local file system, while -put can take a file from any source (another HDFS, the local filesystem, ...).
Upvotes: 4
Reputation: 1
The -put and -copyFromLocal commands work exactly the same. You cannot use the -put command to copy files from one HDFS directory to another.
Let's see this with an example: say your root has two directories, named 'test1' and 'test2'. If 'test1' contains a file 'customer.txt' and you try copying it to the 'test2' directory:
$ hadoop fs -put /test1/customer.txt /test2
It will result in a 'no such file or directory' error, since 'put' will look for the file in the local file system and not in HDFS.
They are both meant to copy files (or directories) from the local file system to HDFS, only.
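For the HDFS-to-HDFS case described above, the usual tool is -cp rather than -put. Here is a minimal sketch, assuming the same /test1 and /test2 directories, a customer.txt in the current local directory, and a default file system (fs.defaultFS) that points at HDFS. The run helper is a hypothetical wrapper, not part of Hadoop: it only prints each command unless RUN=1 is set.

```shell
# Hypothetical helper (not part of Hadoop): prints the command by default and
# executes it only when RUN=1, so the sketch is safe to try without a cluster.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

# Local file system -> HDFS: the source is read from the local disk.
run hadoop fs -put customer.txt /test1

# HDFS -> HDFS: with an HDFS default file system (assumed here),
# -cp resolves both paths on HDFS.
run hadoop fs -cp /test1/customer.txt /test2
```

On a machine with a configured Hadoop client, running the same script with RUN=1 should execute the two commands instead of printing them.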
Upvotes: 0
Reputation: 51990
Despite what is claimed by the documentation, as of now (Oct. 2015), -copyFromLocal and -put are the same.
From the online help:
[cloudera@quickstart ~]$ hdfs dfs -help copyFromLocal
-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst> :
Identical to the -put command.
And this is confirmed by looking at the sources, where you can see that the CopyFromLocal class extends the Put class, but without adding any new behavior:
public static class CopyFromLocal extends Put {
  public static final String NAME = "copyFromLocal";
  public static final String USAGE = Put.USAGE;
  public static final String DESCRIPTION = "Identical to the -put command.";
}

public static class CopyToLocal extends Get {
  public static final String NAME = "copyToLocal";
  public static final String USAGE = Get.USAGE;
  public static final String DESCRIPTION = "Identical to the -get command.";
}
As you might notice, this is exactly the same for get/copyToLocal.
Upvotes: 21
Reputation: 20969
Let's make an example:
If your HDFS contains the path: /tmp/dir/abc.txt
And if your local disk also contains this path, then the HDFS API won't know which one you mean unless you specify a scheme like file:// or hdfs://. It might pick the path you did not want to copy.
Therefore you have -copyFromLocal, which prevents you from accidentally copying the wrong file by limiting the parameter you give to the local filesystem.
Put is for more advanced users who know which scheme to put in front.
It is always a bit confusing to new Hadoop users which filesystem they are currently in and where their files actually are.
Upvotes: 41