user1949387
user1949387

Reputation: 1275

Postgres 9.4 which type of index would be ideal for a float column

I was in mysql and now have joined postgres and I have a table that is getting up to 300,000 new records a day but also has many reads. I have 2 columns that I think would be ideal for indexes: latitudes and longitudes and I Know that postgres has different types of indexes and my question is which type of index would be best for a table that has many writes and reads? This is the query for the reads

SELECT p.fullname,s.post,to_char(s.created_on, 'MON DD,YYYY'),last_reply,s.id,
r.my_id,s.comments,s.city,s.state,p.reputation,s.profile_id 
FROM profiles as p INNER JOIN streams as s ON (s.profile_id=p.id) Left JOIN 
reputation as r ON (r.stream_id=s.id and r.my_id=?) where s.latitudes >=? 
AND ?>= s.latitudes AND s.longitudes>=? AND ?>=s.longitudes order by 
s.last_reply desc limit ?"

As you can see the 2 columns in the where clause are latitudes and longitudes

Upvotes: 3

Views: 2937

Answers (2)

grill
grill

Reputation: 1170

If the latitude or longitude columns are sorted, you would probably want to use a B-tree index.

From the Postgres documentation page on indices:

B-trees can handle equality and range queries on data that can be sorted into some ordering. In particular, the PostgreSQL query planner will consider using a B-tree index whenever an indexed column is involved in a comparison using one of the [greater than / lesser than-type operators]

You can read more about indices here.

Edit: Some of the G* indices look like they might be of use if you need to index on both latitude and longitude, since they appear to allow multi-dimensional (e.g. 2d) indexing.

Edit2: In order to actually create the index, you'd want to do something along the lines of (although you may need to change the table name to suite your needs):

CREATE INDEX idx_lat ON s(latitudes);

Take note that B-tree indices are default so you don't need to specify the type.

Read more about index creation here.

Upvotes: 2

Patrick
Patrick

Reputation: 32179

PostgreSQL has the point data type with many operators that have good support from the gist index. So if at all possible change your table definition to use a point rather than 2 floats.

Inserting point data is very easy, just use point(longitudes, latitudes) for the column, instead of putting the two values in separate columns. Same with getting data out: lnglat[0] is the longitude and lnglat[1] is the latitude.

The index would be something like this:

CREATE INDEX idx_mytable_lnglat ON streams USING gist (lnglat pointops);

There is also the box data type, which would be great for grouping all your parameters and finding a point in a box is highly optimized in the gist index.

With a point in the table and a box to search on, your query reduces to this:

SELECT p.fullname, s.post, to_char(s.created_on, 'MON DD,YYYY'), last_reply, s.id,
       r.my_id, s.comments, s.city, s.state, p.reputation, s.profile_id 
FROM profiles AS p
JOIN streams AS s ON (s.profile_id = p.id)
LEFT JOIN reputation AS r ON r.stream_id = s.id AND r.my_id = ?
WHERE s.lnglat && box(?, ?, ?, ?)
ORDER BY s.last_reply DESC
LIMIT ?;

The phrase s.lnglat && box(?, ?, ?, ?) means "the value of column lnglat overlaps with (meaning: is inside) the box".

Upvotes: 3

Related Questions