Reputation: 3
I am working on a project that involves a lot of measurement data. The goal is to store this data in a database. There are five large arrays of data (1 million floating-point values each).
We use PostgreSQL 11.3 and thought it would be a good idea to make use of Postgres arrays. Saving and retrieving the data works fine so far, but now we want to build a small web app that displays these values in a graph. Arrays this large are impractical to send as-is and would make the whole process painfully slow, so our idea was to select only every 10,000th value and send that. This would be enough to draw a simple graph with sufficient detail.
Is there a way to write a SQL query that does this? The only documented feature we found was array slicing, but that only selects a contiguous range from a start index to an end index (see the example below the table definition). Or do you have any other tips for handling such a problem? We have complete freedom over our database structure and are still in the very early stages of development, so creating a new schema would also work.
Here is our table structure so far:
CREATE TABLE public."DataPoints"
(
    "Id" integer NOT NULL DEFAULT nextval('"DataPoints_Id_seq"'::regclass),
    "TLP_Voltage" double precision NOT NULL,
    "Delay" double precision NOT NULL,
    "Time_Resolution" double precision NOT NULL,
    "Time_Values" double precision[] NOT NULL,
    "Voltage_Offset" double precision NOT NULL,
    "Voltage_Resolution" double precision NOT NULL,
    "Voltage_Values" double precision[] NOT NULL,
    "Current_Offset" double precision NOT NULL,
    "Current_Resolution" double precision NOT NULL,
    "Current_Values" double precision[] NOT NULL,
    "Aux_1_Offset" double precision,
    "Aux_1_Resolution" double precision,
    "Aux_1_Values" double precision[],
    "Aux_2_Offset" double precision,
    "Aux_2_Resolution" double precision,
    "Aux_2_Values" double precision[],
    "Measurement_Id" integer NOT NULL,
    "Sequence_Id" integer NOT NULL,
    CONSTRAINT "DataPoints_pkey" PRIMARY KEY ("Id"),
    CONSTRAINT "DataPoints_Measurement_Id_fkey" FOREIGN KEY ("Measurement_Id")
        REFERENCES public."Measurements" ("Id") MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
);
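For reference, slicing only gives us a contiguous range, something like:
-- Array slicing selects a contiguous range, not every nth element:
SELECT "Voltage_Values"[1:10] FROM public."DataPoints";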
Upvotes: 0
Views: 1944
Reputation: 6723
You could also use generate_series.
create table test_array (c1 int[]);
insert into test_array (c1) VALUES (ARRAY[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]);
select x, c1[x]
FROM test_array,
-- Get every third element. Change 3 to whatever the step should be.
generate_series(1, array_length(c1, 1), 3) as g(x);
x | c1
----+----
1 | 1
4 | 4
7 | 7
10 | 10
13 | 13
(5 rows)
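Applied to the table from the question, a sketch could look like the following (sampling "Voltage_Values"; the id 42 is just a hypothetical row):
-- Sample every 10,000th element of "Voltage_Values" for one row.
select d."Id", g.x as idx, d."Voltage_Values"[g.x] as voltage
from public."DataPoints" d,
     generate_series(1, array_length(d."Voltage_Values", 1), 10000) as g(x)
where d."Id" = 42;  -- hypothetical row id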
EDIT: After a bit of testing, it looks like Gordon's solution is much faster, which makes sense: every c1[x] subscript presumably has to re-read the large array value, while unnest scans it only once.
-- Create a 1 million element array
insert into test_array(c1) select array_agg(x) from generate_series(1,1000000) g(x);
-- My approach with generate_series:
explain analyze select x, c1[x] FROM test_array, generate_series(1, array_length(c1, 1), 1000) as g(x);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.01..27223.60 rows=1360000 width=8) (actual time=3.929..910.291 rows=1000 loops=1)
-> Seq Scan on test_array (cost=0.00..23.60 rows=1360 width=32) (actual time=0.016..0.032 rows=1 loops=1)
-> Function Scan on generate_series g (cost=0.01..10.01 rows=1000 width=4) (actual time=1.378..9.647 rows=1000 loops=1)
Planning Time: 0.063 ms
Execution Time: 919.515 ms
(5 rows)
-- Gordon's approach using unnest with ordinality:
explain analyze select x.a from test_array, unnest(c1) with ordinality x(a, n) where n % 1000 = 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2077.20 rows=1360 width=4) (actual time=109.685..246.758 rows=1000 loops=1)
-> Seq Scan on test_array (cost=0.00..23.60 rows=1360 width=32) (actual time=0.035..0.049 rows=1 loops=1)
-> Function Scan on unnest x (cost=0.00..1.50 rows=1 width=4) (actual time=109.603..233.817 rows=1000 loops=1)
Filter: ((n % '1000'::bigint) = 1)
Rows Removed by Filter: 999000
Planning Time: 0.131 ms
Execution Time: 256.515 ms
Upvotes: 1
Reputation: 1269873
One method is to unaggregate and re-aggregate:
select (select array_agg(x.a order by x.n)  -- order by n preserves the original element order
        from unnest(v.ar) with ordinality x(a, n)
        where x.n % 1000 = 1
       )
from v;
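Applied to the question's table, the same idea could be sketched like this ("Voltage_Values" and "Id" are from the schema above; the order by makes the element order explicit):
-- Build a downsampled array: every 10,000th element of "Voltage_Values".
select d."Id",
       (select array_agg(x.a order by x.n)
        from unnest(d."Voltage_Values") with ordinality x(a, n)
        where x.n % 10000 = 1
       ) as voltage_sampled
from public."DataPoints" d;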
Upvotes: 2