Eric Kim

Reputation: 2696

Postgres - Fastest way to fetch all rows in multiple tables

SELECT table_1.time, table_1.spm1, table_2.spm2, table_3.spm3 FROM table_1
INNER JOIN table_2 ON table_1.time = table_2.time
INNER JOIN table_3 ON table_1.time = table_3.time
...;

I am using the above query syntax to fetch all rows from multiple tables, joined on their time columns. However, as the number of rows per table and the number of tables grow, performance degrades badly. Is there any way to optimize the query performance? There will be about 0.1 to 1 million rows in each table.

I've heard terms like indexing, partitioning, and SSDs, but I'm really a novice in Postgres and not sure which one to look into. Can anyone suggest query syntax that's better than what I currently have, or give some detailed advice on restructuring my database?

Edit: Fetching all data happens only once, when the page loads. I'm trying to load all the data present in the DB to visualize plots. After the initial plot is generated, the page queries only the last row of each table to update the plots (see the sketch after the table structures below). The table structures are very simple.

Table 1: SPM1

        time         | spm1  |
------------------------------
 2018-09-05 22:23:52 | 43.21 |

Table 2: SPM2

        time         | spm2  |
------------------------------
 2018-09-05 22:23:52 | 43.21 |

... and there are about 30 tables of these
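For those updates, the per-table query is just the latest row, something like this (a minimal sketch, assuming the tables are named spm1, spm2, and so on as above):

-- Latest reading from one table; repeated per table on each update
SELECT time, spm1 FROM spm1 ORDER BY time DESC LIMIT 1;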

Thanks,

Upvotes: 0

Views: 1983

Answers (2)

pewpewlasers

Reputation: 3225

Have you tried indexing the time fields? You can create indexes like this:

CREATE INDEX "table_1_time" ON "table_1" ("time");
CREATE INDEX "table_2_time" ON "table_2" ("time");
CREATE INDEX "table_3_time" ON "table_3" ("time");
... and so on ...

Run the above statements once. After the indexes are created, Postgres maintains them automatically as rows are added or removed. Indexing makes queries much faster.

After indexing, try your query again. Keep in mind, though, that you are pulling the whole of table_1 anyway; the indexes will mainly help with the joins.
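To check whether the planner actually uses the new indexes, run the query through EXPLAIN ANALYZE (shown here for a two-table version; adjust to your table and column names):

EXPLAIN ANALYZE
SELECT table_1.time, table_1.spm1, table_2.spm2
FROM table_1
INNER JOIN table_2 ON table_1.time = table_2.time;

Look for Index Scan or Merge Join nodes in the output. Note that when a query fetches every row anyway, Postgres may still choose sequential scans with a hash join, which is often the fastest plan for that case.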

Upvotes: 1

Schwern

Reputation: 165606

Here are a few ideas to improve things, based on what you've mentioned.

Could all the tables be turned into one with three columns?

create table spm (
    id serial primary key,
    time timestamp not null,
    spm numeric(5,2) not null,
    number smallint not null
);

insert into spm (time, spm, number)
    values ('2018-09-05 22:23:52', 43.21, 1),
           ('2018-09-05 22:23:52', 43.21, 2);

That would greatly simplify the queries. Joins are efficient, but roughly 30 of them is a bit much.

select time, spm, number from spm;
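The "last row of each table" update then also becomes a single query over the combined table. A sketch, assuming the spm table above, using Postgres's DISTINCT ON:

-- One row per series: the most recent reading for each "number"
SELECT DISTINCT ON (number) number, time, spm
FROM spm
ORDER BY number, time DESC;

-- Supporting index so this lookup doesn't scan the whole table
CREATE INDEX spm_number_time ON spm (number, time DESC);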

Fetching all data happens only once, when the page loads. I'm trying to load all the data present in the DB to visualize plots. After the initial plot is generated, the page queries only the last row of each table to update the plots.

If these plots can be cached and periodically regenerated or updated, that would be a huge performance improvement, not just for the database query but for all the time spent generating the plots.
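If you want the cache inside the database itself, a materialized view can hold the precomputed join result and be refreshed on a schedule instead of recomputed on every page load. A sketch against the original per-table schema (the view name all_spm is made up):

CREATE MATERIALIZED VIEW all_spm AS
SELECT table_1.time, table_1.spm1, table_2.spm2
FROM table_1
INNER JOIN table_2 ON table_1.time = table_2.time
-- ...remaining joins as in the original query...
;

REFRESH MATERIALIZED VIEW all_spm;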

Upvotes: 2
