François M.

Reputation: 4278

Fastest way to do SELECT * WHERE not null

I'm wondering what the fastest way is to get all rows where a column is not null. I've thought of these:

SELECT * FROM table WHERE column IS NOT NULL

SELECT * FROM table WHERE column = column

SELECT * FROM table WHERE column LIKE '%'

(I don't know how to measure execution time in SQL and/or Hive, and after repeatedly trying on a 4M-row table in pgAdmin, I see no noticeable difference.)

Upvotes: 4

Views: 2529

Answers (1)

leftjoin

Reputation: 38290

You will never notice any difference in performance when running those queries on Hive because these operations are quite simple and run on mappers which are running in parallel.

Initializing and starting the mappers takes far more time than any possible difference in execution time between these queries. It also makes the total execution time highly variable, because mappers may be waiting for resources and not running at all.

You can still try to measure it; see this answer on how to measure execution time: https://stackoverflow.com/a/44872319/2700344
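Separately from the linked answer, on the PostgreSQL side (where the question mentions pgAdmin), one way to compare the candidates is to run each under EXPLAIN ANALYZE, which executes the query and reports the plan together with actual timings. This is only a sketch: my_table and my_column are placeholder names, and the LIKE variant assumes the column has a text type.

```sql
-- PostgreSQL sketch: EXPLAIN ANALYZE actually runs each query and
-- prints the chosen plan plus measured timings.
-- my_table / my_column are hypothetical names; substitute your own.
EXPLAIN ANALYZE SELECT * FROM my_table WHERE my_column IS NOT NULL;
EXPLAIN ANALYZE SELECT * FROM my_table WHERE my_column = my_column;
EXPLAIN ANALYZE SELECT * FROM my_table WHERE my_column LIKE '%';  -- text columns only
```

Run each statement several times and compare the reported execution times, since the first run may be dominated by reading data from disk. In Hive, EXPLAIN shows the plan without running the query, so timings there have to come from running the query itself.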

SELECT * FROM table WHERE column IS NOT NULL is more straightforward (easier to understand and read), though all of the queries are correct.
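To see why all three predicates filter out NULLs, here is a small sketch, assuming PostgreSQL and an inline list of text values (t and v are placeholder names). A comparison or LIKE against NULL evaluates to NULL rather than true, and WHERE keeps only rows where the predicate is true.

```sql
-- PostgreSQL sketch: evaluate the three predicates side by side.
-- t and v are hypothetical names for an inline three-row table.
SELECT v,
       v IS NOT NULL AS is_not_null,  -- true, true, false
       v = v         AS equals_self,  -- true, true, NULL
       v LIKE '%'    AS like_any      -- true, true, NULL
FROM (VALUES ('a'), (''), (NULL)) AS t(v);
```

Only IS NOT NULL returns a plain false for the NULL row; the other two return NULL, which WHERE also discards. That is why the results match even though the intent is less obvious.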

Upvotes: 6
