Vinny Pem

Reputation: 77

Open Source Greenplum: GPFDIST error 'Segmentation fault' when selecting from external table

I'm trying to set up an Open Source Greenplum instance and have been hitting the same GPFDIST issue for days! I do a full installation from scratch on CentOS 7.6 (I can provide further setup details if needed), installing OS GPDB version 5.18 with GPORCA disabled. The full configure command is:

./configure --prefix=/usr/local/gpdb --with-perl --with-python --with-libxml --with-gssapi --with-includes=/usr/local/gpdb/include --with-libs=/usr/local/gpdb/lib --disable-orca

This compiles successfully, and the subsequent make and make install commands also complete without issue. The initialisation of the Greenplum database itself also succeeds, and I can then go into a database, create tables, insert data and run queries as normal.

But if I try to select from an external table, such as the following:

create external table test_external_table
(testing smallint
)
location ('gpfdist://mdw:8080/test_data.csv')
format 'csv' (header delimiter '|')
;

with GPFDIST run as follows:

gpfdist -d /home/gpadmin/test/ -p 8080 -l /home/gpadmin/greenplum/logs/gpfdist_log 2>&1 &

then I get two errors: one from the external table, and one from GPFDIST. These are as follows:

External Table Returns:
ERROR:  connection with gpfdist failed for gpfdist://mdw:8080/test_data.csv. effective url: http://127.0.0.1:8080/test_data.csv. error code = 104 (Connection reset by peer)  (seg0 slice1 127.0.0.1:6000 pid=27962)

GPFDIST Returns:
[1]+  Segmentation fault      gpfdist -d /home/gpadmin/test -p 8080 -l /home/gpadmin/greenplum/logs/gpfdist_log 2>&1

I have removed everything that isn't on the OS GPDB GitHub installation guide (for a 'bare-bones' setup), so I don't think that is causing the issue. I have tried everything to do with the hostname and network firewall, and it's all perfect as far as I can see.

I have also downloaded the same version of GPDB (5.18) from Pivotal and installed that version on the same instance simultaneously, and GPFDIST works perfectly fine.

I have also tried OS GPDB 5.17, 6 beta and 7 beta, and I get the same issue for all of them.

Any ideas at all on what might be causing this are VERY much appreciated, as I'm slowly going insane trying to figure this out.

Thank you very much in advance for any help.

-- Edit --

Okay.. Having nearly chewed my own arm off in sheer frustration at trying to install debuginfo stuff on CentOS 7, I've finally generated a core dump with gdb. I then run:

gdb -c core_dump.<pid>

and get the following output:

Core was generated by `gpfdist -d /home/gpadmin/test -p 8080 -l /home/gpadmin/greenplum/logs/gpfdist_log'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f4f2c07bdff in ?? ()

But I have absolutely no idea what that means... Totally honest, I'm a little over my head with this now and really am stuck on how to proceed.

Upvotes: 1

Views: 950

Answers (2)

Vinny Pem

Reputation: 77

I've finally managed to solve this issue. Should anyone come looking with a similar problem, make sure you are installing Libevent version 1.4[.15], and nothing above that.

I had 2.2.0 installed, and whilst Greenplum sees this as fine, it actually doesn't work for it. Unfortunately, I did have to do an entire system installation from scratch to seemingly get it to work, as just installing Libevent 1.4 on the old system with Greenplum already compiled did not work for me.
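One way to check which libevent a given gpfdist binary actually picks up is to inspect it with ldd. This is only a minimal sketch: it assumes gpfdist is on the PATH and dynamically linked against libevent, and that the 1.4 series uses a soname starting with libevent-1.4. Both helper names are hypothetical.

```shell
# Hypothetical helper: list the libevent sonames a binary is linked against.
linked_libevent() {
  ldd "$1" | awk '/libevent/ {print $1}'
}

# Hypothetical helper: succeed only if a soname belongs to the 1.4 series
# (assumption: such sonames look like libevent-1.4.so.2).
is_libevent_14() {
  case "$1" in
    libevent-1.4.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (needs a real gpfdist binary, so it is left commented out):
# for lib in $(linked_libevent "$(command -v gpfdist)"); do
#   is_libevent_14 "$lib" && echo "$lib: 1.4 series" || echo "$lib: not 1.4"
# done
```

If the loop reports anything other than the 1.4 series, the binary is still resolving a newer libevent at run time.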

Upvotes: 0

Brendan Stephens

Reputation: 227

The "connection reset by peer" error only indicates that the other end of the socket has dropped (in this case gpfdist, because it crashed out).

Set up your gpfdist and try a wget to a hosted file, adding:

--header='X-GP-PROTO:0'

You will need to add the header to avoid having the request rejected.
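Put together, the request might look like the sketch below. The host, port and file name are taken from the question and are assumptions about your setup. The two helper names are hypothetical, and -O - simply writes the response to standard output.

```shell
# Hypothetical helper: build the HTTP URL for a file served by gpfdist.
# Defaults are the host, port and file used in the question.
gpfdist_url() {
  printf 'http://%s:%s/%s\n' "${1:-mdw}" "${2:-8080}" "${3:-test_data.csv}"
}

# Hypothetical helper: request the file directly, bypassing the database.
# The X-GP-PROTO header is the one mentioned above.
probe_gpfdist() {
  wget --header='X-GP-PROTO:0' -O - "$(gpfdist_url "$@")"
}

# Usage (needs a running gpfdist, so it is left commented out):
# probe_gpfdist                  # http://mdw:8080/test_data.csv
# probe_gpfdist localhost 8080 test_data.csv
```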

Are you able to retrieve a file there? Or does that also crash out?

If that crashes out, it's nothing to do with the database - and you will likely need a core dump to determine what the segfault is about (r/w permissions, memory, ...).
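When reading the core dump, note that gdb needs the executable as well as the core file. Given only the core (gdb -c), it cannot resolve symbols and shows frames as ??. Below is a minimal sketch with a hypothetical helper name. It assumes gpfdist is on the PATH and was not stripped of its symbols.

```shell
# Hypothetical helper: print a backtrace from a core file.
#   $1 = path to the executable that crashed
#   $2 = path to the core file
core_backtrace() {
  gdb -batch -ex bt "$1" "$2"
}

# Usage (needs gdb and a real core file, so it is left commented out;
# the core file name keeps the placeholder from the question):
# core_backtrace "$(command -v gpfdist)" 'core_dump.<pid>'
```

Frames that resolve to function names in a shared library, rather than ??, usually point at the component that crashed.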

Upvotes: 1
