Murali
Murali

Reputation: 381

Postgres crashes with Segmentation Fault when using LEFT JOIN

My Postgres version is 9.6.12. When I run the query below, Postgres crashes with the error below. The query works fine when I replace LEFT JOIN with just a JOIN

proddb=# SELECT 1 - count(event_date) AS result FROM (SELECT now()::date AS run_date) p 
JOIN historic.audit_event ON event_code = 5199
 AND event_param1 = 'fullscriptpendingorders' AND event_date > run_date 
AND event_date < (run_date + '1 day'::interval);
 result
--------
      0
(1 row)

-- When I change the JOIN to LEFT JOIN
proddb=# SELECT 1 - count(event_date) AS result FROM (SELECT now()::date AS run_date) p
**LEFT JOIN** historic.audit_event ON event_code = 5199 
AND event_param1 = 'fullscriptpendingorders' 
AND event_date > run_date 
AND event_date < (run_date + '1 day'::interval);
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

The postgres log shows the error below showing that Postgres crashed. Postgres automaticaly recovers after a few minutes, but this is impacting critical Production. Can you please help? Is this a Postgres bug?

2019-04-08 21:32:56 PDT [26911]: [19631-1] [] [] LOG:  00000: server process (PID 23981) was terminated by signal 11: Segmentation fault
2019-04-08 21:32:56 PDT [26911]: [19632-1] [] [] DETAIL:  Failed process was running: SELECT 1 - count(event_date) AS result FROM (SELECT now()::date AS run_date) p LEFT JOIN historic.audit_event ON event_code = 5199 AND event_param1 = 'amsa.fullscriptpendingorders' AND event_date > run_date AND event_date < (run_date + '1 day'::interval);
2019-04-08 21:32:56 PDT [26911]: [19633-1] [] [] LOCATION:  LogChildExit, postmaster.c:3574
2019-04-08 21:32:56 PDT [26911]: [19634-1] [] [] LOG:  00000: terminating any other active server processes
2019-04-08 21:32:56 PDT [26911]: [19635-1] [] [] LOCATION:  HandleChildCrash, postmaster.c:3294
2019-04-08 21:32:56 PDT [24633]: [1-1] [[unknown]] [[unknown]] LOG:  00000: connection received: host=[local]
2019-04-08 21:32:56 PDT [24633]: [2-1] [[unknown]] [[unknown]] LOCATION:  BackendInitialize, postmaster.c:4192
2019-04-08 21:32:56 PDT [24633]: [3-1] [postgres] [emr_prod] FATAL:  57P03: the database system is in recovery mode
2019-04-08 21:32:56 PDT [24633]: [4-1] [postgres] [emr_prod] LOCATION:  ProcessStartupPacket, postmaster.c:2230
2019-04-08 21:32:56 PDT [26911]: [19636-1] [] [] LOG:  00000: all server processes terminated; reinitializing
2019-04-08 21:32:56 PDT [26911]: [19637-1] [] [] LOCATION:  PostmasterStateMachine, postmaster.c:3818

I don't see this behavior when I have the exact dataset on a QA database which is version 9.6.9

postgres@emr_qa07=# select version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit
(1 row)

postgres@emr_qa07=# SELECT 1 - count(event_date) AS result FROM (SELECT now()::date AS run_date) p JOIN historic.audit_event ON event_code = 5199 AND event_param1 = 'fullscriptpendingorders' AND event_date > run_date AND event_date < (run_date + '1 day'::interval);
 result
--------
      1
(1 row)

postgres@emr_qa07=# SELECT 1 - count(event_date) AS result FROM (SELECT now()::date AS run_date) p LEFT JOIN historic.audit_event ON event_code = 5199 AND event_param1 = 'fullscriptpendingorders' AND event_date > run_date AND event_date < (run_date + '1 day'::interval);
 result
--------
      1
(1 row)

Upvotes: 2

Views: 6069

Answers (1)

Philip Kendall
Philip Kendall

Reputation: 4314

Yes, this is a PostgreSQL bug - it should never segfault no matter what query you give it. If you can reduce it to a minimal test case, the developers would probably be interested in a bug report.

The other possibility here is that the underlying data files for your database have become corrupt. It's still a bug that PostgreSQL crashes, but going to be much harder to determine what happened. Potentially dumping and restoring the database may fix it.

Upvotes: 5

Related Questions