Reputation: 3784
I am a little confused about the HDFS Java API, especially the role of the Hadoop Configuration object versus the config we put on the Hadoop server installation (/etc/hadoop/core-site.xml, etc.).
Upvotes: 1
Views: 1218
Reputation: 1
Example:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsTest {
    // download a file from HDFS to the local filesystem
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // overrides fs.defaultFS from core-site.xml, if one is on the classpath
        conf.set("fs.defaultFS", "hdfs://yourHadoopIP:9000/");
        // the value takes a size suffix; a bare "64" would mean 64 bytes
        conf.set("dfs.blocksize", "64m");
        // get a client handle for the HDFS filesystem
        FileSystem fs = FileSystem.get(conf);
        fs.copyToLocalFile(new Path("hdfs://yourHadoopIP:9000/jdk-7u65-linux-i586.tar.gz"),
                new Path("/root/jdk.tgz"));
        fs.close();
    }
}
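To see where each value actually came from, here is a minimal sketch (the class name is hypothetical; it assumes core-site.xml is on the client classpath and a Hadoop 2.x+ client, which provides Configuration.getPropertySources):

import org.apache.hadoop.conf.Configuration;

public class ConfigSourceTest {
    public static void main(String[] args) {
        // new Configuration() automatically loads core-default.xml and
        // core-site.xml from the classpath
        Configuration conf = new Configuration();
        System.out.println("from XML: " + conf.get("fs.defaultFS"));

        // a programmatic set() overrides the XML value
        conf.set("fs.defaultFS", "hdfs://yourHadoopIP:9000/");
        System.out.println("after set(): " + conf.get("fs.defaultFS"));

        // reports the origin of the value, e.g. "core-site.xml" or "programmatically"
        String[] sources = conf.getPropertySources("fs.defaultFS");
        System.out.println("source: " + (sources == null ? "unset" : String.join(", ", sources)));
    }
}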
Upvotes: 0
Reputation: 2221
You can set values for your parameters either in core-site.xml or through the Configuration object in your driver code. A value set in the program overrides the one set in the XML file. So, for example, if you need to enable map-output compression, you could either add these properties to core-site.xml:
<property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
</property>
<property>
    <name>mapred.map.output.compression.codec</name>
    <value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
or add these lines to your driver code:
Configuration conf = new Configuration();
conf.set("mapred.compress.map.output", "true");
conf.set("mapred.map.output.compression.codec", "org.apache.hadoop.io.compress.GzipCodec");
And you don't need to set up Hadoop by hand on every machine/node. Just install it on your master node and add DataNodes by listing their IPs in the slaves file, as shown below. That should also help you understand how a multi-node cluster has to be set up.
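For reference, the list in question is the slaves file that sits alongside your other config files such as core-site.xml (renamed to "workers" in Hadoop 3.x); the IPs below are placeholders:

# /etc/hadoop/slaves -- one DataNode host/IP per line
192.168.1.101
192.168.1.102
192.168.1.103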
Upvotes: 1