Wizuriel
Wizuriel

Reputation: 3747

Working with a table in a DB2 User Defined Function

So I've made a UDF in DB2 that tokenizes a String called Tokenize and returns a table with an ID and word (so each row is a word that was in the initial string). example

tokenize('University of Toronto', ' ') returns
ID   Word
1    University
2    of
3    Toronto

What I'm trying to do is make another function that compares 2 strings to see how many words they have in common based on the length of the first string.

So for example 'University of Toronto' and 'University of Guelph' should return 0.66 I've gotten this to work with this code

CREATE OR REPLACE FUNCTION Parsed_Match(STRING1 VARCHAR(256), STRING2 VARCHAR(256))
RETURNS DECIMAL(5,2)

NO EXTERNAL ACTION
BEGIN
DECLARE SCORE DECIMAL(5,2);
DECLARE mymatches int;
DECLARE len int;

set mymatches = (
    select count(*)
    from (
        select word
        from table(tokenize(STRING1, ' '))
        intersect all
        select word
        from table(tokenize(STRING2, ' '))
    )
);

set len = (
    select count(*)
    from table(tokenize(STRING1, ' '))
);

set score = decimal(mymatches) / decimal(len);  
RETURN SCORE;
END

Having to recall the tokenize code though to get the length of string1 just strikes me as wrong. Is there a way in DB2 I can store the calculated table in a variable to reuse that later on?

like I ideally want to do

set t1 = tokenize(String1);
set t2 = tokenize(String2);
set matches = (
    select count(*)
    from (
          select word
          from t1
          intersect all
          select word
          from t2
    )
 );
 set len = ( select count(*) from t1 );

but just can't find a way to get that to work :(

Upvotes: 0

Views: 1603

Answers (1)

Clockwork-Muse
Clockwork-Muse

Reputation: 13106

It'd probably be easiest to just create the table inline; that is, with a CTE:

WITH T1 AS (SELECT word
            FROM TABLE(tokenize(STRING1, ' ')))
SELECT COUNT(*) / (SELECT COUNT(*) FROM T1) AS score
FROM (SELECT word
      FROM T1
      INTERSECT ALL
      SELECT word
      FROM TABLE(tokenize(STRING2, ' '))
)

... if you'll notice, this also allowed me to calculate the score in the same statement.

This won't help if you actually the to work with the tables outside this single statement.

Upvotes: 1

Related Questions