Reputation: 39
I wanna return table. The function get an array(the query is 'select function_name(array_agg(column_name)) from table_name')
I coded below:
create type pddesctype as(
count float,
mean float,
std float,
min float
);
create function pddesc(x numeric[])
returns pddesctype
as $$
import pandas as pd
data=pd.Series(x)
count=data.describe()[0]
mean=data.describe()[1]
std=data.describe()[2]
min=data.describe()[3]
return count, mean, std, min
$$ language plpython3u;
This code results only array on one column. (float, float, float...)
I tried
create function pddesc(x numeric[])
returns table(count float, mean float, std float, min float)
as $$
import pandas as pd
data=pd.Series(x)
count=data.describe()[0]
mean=data.describe()[1]
std=data.describe()[2]
min=data.describe()[3]
return count, mean, std, min
$$ language plpython3u;
But there is an error:
ERROR: key "count" not found in mapping
HINT: To return null in a column, add the value None to the mapping with the key named after the column.
CONTEXT: while creating return value.
I want to show the result in columns (like a table) without creating type in advance.
How to change the RETURN / RETURNS syntax?
Upvotes: 3
Views: 2652
Reputation: 9542
Here are the steps I tried to get a table of one row with four columns as the Output. The last step has the solution, the first is another way to reproduce your error.
create or replace function pddesc(x numeric[])
returns table(count float, mean float, std float, min float)
as $$
import pandas as pd
import numpy as np
data=pd.Series(x)
count=data.describe()[0]
mean=data.describe()[1]
std=data.describe()[2]
min=data.describe()[3]
## print an INFO of the output:
plpy.info(np.array([count, mean, std, min]))
return np.array([count, mean, std, min])
$$ language plpython3u;
Test fails (with the error of the question reproduced):
postgres=# SELECT * FROM pddesc(ARRAY[1,2,3]);
INFO: [3 3 Decimal('1') 1]
ERROR: key "count" not found in mapping
HINT: To return null in a column, add the value None to the mapping with the key named after the column.
CONTEXT: while creating return value
PL/Python function "pddesc"
np.array([...]).reshape(1,-1)
You need to reshape the array so that it is of the dimension that you want to get. In this case, it is of dim (1 row x 4 cols), and the .reshape(1,-1)
means 1 row and -1 (= whatever needed) cols
create or replace function pddesc(x numeric[])
returns table(count float, mean float, std float, min float)
as $$
import pandas as pd
import numpy as np
data=pd.Series(x)
count=data.describe()[0]
mean=data.describe()[1]
std=data.describe()[2]
min=data.describe()[3]
## print an INFO of the output:
plpy.info(np.array([count, mean, std, min]).reshape(1,-1))
return np.array([count, mean, std, min]).reshape(1,-1)
## or with the same result:
# return np.hstack((count, mean, std, min)).reshape(1,-1)
$$ language plpython3u;
Test:
postgres=# SELECT * FROM pddesc(ARRAY[1,2,3]);
INFO: [[3 3 Decimal('1') 1]]
count | mean | std | min
-------+------+-----+-----
3 | 3 | 1 | 1
(1 row)
Upvotes: 1