Reputation: 1275
I was in mysql and now have joined postgres and I have a table that is getting up to 300,000 new records a day but also has many reads. I have 2 columns that I think would be ideal for indexes: latitudes and longitudes and I Know that postgres has different types of indexes and my question is which type of index would be best for a table that has many writes and reads? This is the query for the reads
SELECT p.fullname,s.post,to_char(s.created_on, 'MON DD,YYYY'),last_reply,s.id,
r.my_id,s.comments,s.city,s.state,p.reputation,s.profile_id
FROM profiles as p INNER JOIN streams as s ON (s.profile_id=p.id) Left JOIN
reputation as r ON (r.stream_id=s.id and r.my_id=?) where s.latitudes >=?
AND ?>= s.latitudes AND s.longitudes>=? AND ?>=s.longitudes order by
s.last_reply desc limit ?"
As you can see the 2 columns in the where clause are latitudes and longitudes
Upvotes: 3
Views: 2937
Reputation: 1170
If the latitude or longitude columns are sorted, you would probably want to use a B-tree index.
From the Postgres documentation page on indices:
B-trees can handle equality and range queries on data that can be sorted into some ordering. In particular, the PostgreSQL query planner will consider using a B-tree index whenever an indexed column is involved in a comparison using one of the [greater than / lesser than-type operators]
You can read more about indices here.
Edit: Some of the G* indices look like they might be of use if you need to index on both latitude and longitude, since they appear to allow multi-dimensional (e.g. 2d) indexing.
Edit2: In order to actually create the index, you'd want to do something along the lines of (although you may need to change the table name to suite your needs):
CREATE INDEX idx_lat ON s(latitudes);
Take note that B-tree indices are default so you don't need to specify the type.
Read more about index creation here.
Upvotes: 2
Reputation: 32179
PostgreSQL has the point
data type with many operators that have good support from the gist
index. So if at all possible change your table definition to use a point
rather than 2 float
s.
Inserting point
data is very easy, just use point(longitudes, latitudes)
for the column, instead of putting the two values in separate columns. Same with getting data out: lnglat[0]
is the longitude and lnglat[1]
is the latitude.
The index would be something like this:
CREATE INDEX idx_mytable_lnglat ON streams USING gist (lnglat pointops);
There is also the box
data type, which would be great for grouping all your parameters and finding a point in a box is highly optimized in the gist
index.
With a point
in the table and a box
to search on, your query reduces to this:
SELECT p.fullname, s.post, to_char(s.created_on, 'MON DD,YYYY'), last_reply, s.id,
r.my_id, s.comments, s.city, s.state, p.reputation, s.profile_id
FROM profiles AS p
JOIN streams AS s ON (s.profile_id = p.id)
LEFT JOIN reputation AS r ON r.stream_id = s.id AND r.my_id = ?
WHERE s.lnglat && box(?, ?, ?, ?)
ORDER BY s.last_reply DESC
LIMIT ?;
The phrase s.lnglat && box(?, ?, ?, ?)
means "the value of column lnglat overlaps with (meaning: is inside) the box".
Upvotes: 3