Reputation: 399
I am trying to create a writable external table in Greenplum (which is built on PostgreSQL) and use it to unload data from a Greenplum table into HDFS. Here is the code:
CREATE WRITABLE EXTERNAL TABLE test_writable
( LIKE awc_merged.delivery )
LOCATION ('gphdfs://10.63.33.201-1:8081/path')
FORMAT 'TEXT' (DELIMITER ',')
DISTRIBUTED RANDOMLY;
INSERT INTO test_writable SELECT * FROM awc_merged.delivery;
However, I'm getting the following error:
ERROR: could not write to external resource: Broken pipe (fileam.c:1386) (seg3 sdw2:40001 pid=21676) (cdbdisp.c:1457)
SQL state: XX000
The Greenplum database and HDFS are on different servers, and I believe the command should at least include a username and password for the HDFS server. Can anyone help me out with the correct command for this task?
Regards,
Jones
Upvotes: 0
Views: 1400
Reputation: 5018
First, try to set up a readable external table. Here is a guide on how it can be done: https://support.pivotal.io/hc/en-us/articles/202635496-How-to-access-HDFS-data-via-GPDB-external-table-with-gphdfs-protocol
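For instance, a minimal readable table over the same HDFS path might look like this (the NameNode host, the default port 8020, and the path are placeholders to adjust for your cluster):

CREATE EXTERNAL TABLE test_readable
( LIKE awc_merged.delivery )
LOCATION ('gphdfs://namenode-host:8020/path')
FORMAT 'TEXT' (DELIMITER ',');

-- confirm the segments can actually reach HDFS
SELECT count(*) FROM test_readable;

If this SELECT fails with the same "Broken pipe" error, the problem is in the HDFS client setup on the segment hosts, not in your SQL.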
This example is for the PHD distribution, but it can be adapted to any other distribution. The general idea is that each GPDB host must have the HDFS client libraries installed and the HDFS client configured, i.e. you can access HDFS from that machine as gpadmin (with "hdfs dfs -ls /", for example). See the sketch below.
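A rough sketch of that check plus the GPDB side of the configuration, assuming the Hadoop client lives under /usr/lib/hadoop and 'hdp2' matches your distribution (both values are assumptions, adjust them to your cluster):

# on every GPDB host, as gpadmin: verify the HDFS client works
hdfs dfs -ls /

# tell gphdfs where the Hadoop client is and which distribution it is
# (/usr/lib/hadoop and 'hdp2' are assumptions)
gpconfig -c gp_hadoop_home -v '/usr/lib/hadoop'
gpconfig -c gp_hadoop_target_version -v 'hdp2'
gpstop -u   # reload the configuration without a restart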
In general, the setup procedure is described in the "Greenplum Database Administrator Guide", which can be found here: http://gpdb.docs.pivotal.io/4330/index.html#admin_guide/load.html
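Once reads work, the writable side usually follows with the same LOCATION. Note that, as far as I know, the gphdfs URL does not carry a username or password: the segments connect as the gpadmin OS user (or via Kerberos on a secured cluster). So your statement would become something like this, with the NameNode host, port 8020, and path again being placeholders:

CREATE WRITABLE EXTERNAL TABLE test_writable
( LIKE awc_merged.delivery )
LOCATION ('gphdfs://namenode-host:8020/path')
FORMAT 'TEXT' (DELIMITER ',')
DISTRIBUTED RANDOMLY;

INSERT INTO test_writable SELECT * FROM awc_merged.delivery;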
Upvotes: 0