Reputation: 3747
So I've made a UDF in DB2 that tokenizes a String called Tokenize and returns a table with an ID and word (so each row is a word that was in the initial string). example
tokenize('University of Toronto', ' ') returns
ID Word
1 University
2 of
3 Toronto
What I'm trying to do is make another function that compares 2 strings to see how many words they have in common based on the length of the first string.
So for example 'University of Toronto' and 'University of Guelph' should return 0.66 I've gotten this to work with this code
CREATE OR REPLACE FUNCTION Parsed_Match(STRING1 VARCHAR(256), STRING2 VARCHAR(256))
RETURNS DECIMAL(5,2)
NO EXTERNAL ACTION
BEGIN
DECLARE SCORE DECIMAL(5,2);
DECLARE mymatches int;
DECLARE len int;
set mymatches = (
select count(*)
from (
select word
from table(tokenize(STRING1, ' '))
intersect all
select word
from table(tokenize(STRING2, ' '))
)
);
set len = (
select count(*)
from table(tokenize(STRING1, ' '))
);
set score = decimal(mymatches) / decimal(len);
RETURN SCORE;
END
Having to recall the tokenize code though to get the length of string1 just strikes me as wrong. Is there a way in DB2 I can store the calculated table in a variable to reuse that later on?
like I ideally want to do
set t1 = tokenize(String1);
set t2 = tokenize(String2);
set matches = (
select count(*)
from (
select word
from t1
intersect all
select word
from t2
)
);
set len = ( select count(*) from t1 );
but just can't find a way to get that to work :(
Upvotes: 0
Views: 1603
Reputation: 13106
It'd probably be easiest to just create the table inline; that is, with a CTE:
WITH T1 AS (SELECT word
FROM TABLE(tokenize(STRING1, ' ')))
SELECT COUNT(*) / (SELECT COUNT(*) FROM T1) AS score
FROM (SELECT word
FROM T1
INTERSECT ALL
SELECT word
FROM TABLE(tokenize(STRING2, ' '))
)
... if you'll notice, this also allowed me to calculate the score in the same statement.
This won't help if you actually the to work with the tables outside this single statement.
Upvotes: 1