Reputation: 149
I need to create a Python UDF (user-defined function) in Redshift that will be called from another procedure. The UDF takes a start date and an end date, steps through the dates between them, and counts how many of those intermediate dates occur in a list.
This list needs to get its values from a column of another table. The issue is that Python UDFs are defined in the plpythonu language and don't recognize any SQL. How can I build this list from the table's column?
This is my function:
create or replace function test_tmp (ending_date date, starting_date date)
returns integer
stable
as $$
    from datetime import timedelta

    def get_working_days(ending_date, starting_date):
        days = 0
        if starting_date is not None and ending_date is not None:
            # Walk each day from starting_date up to (not including) ending_date
            for n in range((ending_date - starting_date).days):
                btw_date = (starting_date + timedelta(n)).strftime('%Y-%m-%d')
                # date_list is not defined yet; building it is the open question
                if btw_date in date_list:
                    days = days + 1
            return days
        return 0

    return get_working_days(ending_date, starting_date)
$$ language plpythonu;
Now, I need to create this date_list as something like:
date_list = [str(each["WORK_DATE"]) for each in (select WORK_DATE from public.date_list_table).collect()]
But using this line in the function obviously gives an error, as select WORK_DATE from public.date_list_table is SQL.
The following is the structure of the table public.date_list_table:
CREATE TABLE public.date_list_table
(
    work_date date ENCODE az64
)
DISTSTYLE EVEN;
Some sample values for this table (it stores only the working days for the entire year):
insert into date_list_table values ('2021-07-01'),('2021-06-30'),('2021-06-29');
Upvotes: 0
Views: 1384
Reputation: 270154
An Amazon Redshift scalar UDF cannot access any tables. It needs to be self-contained, with all the necessary information passed into the function. Alternatively, you could store the date information inside the function so that it doesn't need to access the table (not unreasonable, since it only needs to hold exceptions such as public holidays that fall on weekdays).
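As a rough sketch of the "pass everything in" option, the function body could take the working dates as an extra argument instead of reading the table. The `work_dates_csv` parameter and its comma-separated format are assumptions made for illustration; the caller would have to build that string itself (for example by aggregating `work_date` in the calling query). The loop keeps the half-open range of the original function: `starting_date` is included, `ending_date` is not.

```python
from datetime import date, timedelta


def get_working_days(ending_date, starting_date, work_dates_csv):
    """Count days in [starting_date, ending_date) that appear in work_dates_csv.

    work_dates_csv is a hypothetical extra argument: a comma-separated
    string of 'YYYY-MM-DD' dates supplied by the caller.
    """
    if starting_date is None or ending_date is None or not work_dates_csv:
        return 0

    # A set gives constant-time membership checks inside the loop.
    work_dates = set(d.strip() for d in work_dates_csv.split(","))

    days = 0
    for n in range((ending_date - starting_date).days):
        btw_date = (starting_date + timedelta(n)).strftime("%Y-%m-%d")
        if btw_date in work_dates:
            days += 1
    return days


# Sample values from the question's table.
sample = "2021-07-01,2021-06-30,2021-06-29"
result = get_working_days(date(2021, 7, 2), date(2021, 6, 28), sample)
```

The same body could instead hold the dates in a hard-coded set, which is the "store the date information inside the function" variant; the trade-off is that the function then has to be redefined whenever the dates change.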
It appears that your use case is to calculate the number of working days between two dates. One way this is traditionally solved is to create a calendar table with one row per day and columns describing that day, such as a work_day flag indicating whether it is a working day.
You can then JOIN or query the table to identify the desired information, such as:
SELECT COUNT(*) FROM calendar WHERE work_day AND date BETWEEN start_date AND end_date
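To see the calendar-table idea end to end, here is a small self-contained sketch that uses Python's built-in `sqlite3` module as a stand-in for Redshift. The `build_calendar` and `count_working_days` helpers, the text-typed `date` column, and the weekday-only rule for `work_day` are all assumptions made for the demo, not part of the answer above. Note that `BETWEEN` is inclusive on both ends, unlike the loop in the question.

```python
import sqlite3
from datetime import date, timedelta


def build_calendar(conn, first_day, last_day, holidays=()):
    """Create a toy calendar table with one row per day (hypothetical helper)."""
    conn.execute("CREATE TABLE calendar (date TEXT PRIMARY KEY, work_day INTEGER)")
    day = first_day
    while day <= last_day:
        # Assumption: Monday-Friday are working days unless listed as holidays.
        is_work_day = day.weekday() < 5 and day not in holidays
        conn.execute(
            "INSERT INTO calendar VALUES (?, ?)",
            (day.isoformat(), int(is_work_day)),
        )
        day += timedelta(days=1)


def count_working_days(conn, start_date, end_date):
    """Run the counting query; both endpoints are included by BETWEEN."""
    row = conn.execute(
        "SELECT COUNT(*) FROM calendar "
        "WHERE work_day = 1 AND date BETWEEN ? AND ?",
        (start_date.isoformat(), end_date.isoformat()),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
build_calendar(conn, date(2021, 6, 1), date(2021, 7, 31))
week_count = count_working_days(conn, date(2021, 6, 28), date(2021, 7, 2))
```

Because the count is a plain query against a table, it can be joined or embedded in other SQL directly, with no UDF involved.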
Upvotes: 1