Reputation: 868
I am currently working on a project that involves developing a UDF for IBM DB2 v10 in C. All of my C code is bundled in a single file called rcdudf.c
and it has the following contents:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sqlca.h>
#include <sqludf.h>
typedef unsigned int (*hash_function)(char*, unsigned int len);
unsigned int FNVHash (char* str, unsigned int len);
typedef struct hash_record
{
void* value;
struct hash_record* next;
}hash_record;
typedef struct hash_bucket
{
void* key;
hash_record* head_record;
}hash_bucket;
typedef struct hash_table
{
int bucket_num;
hash_bucket** hash_entry;
}hash_table;
int hash_init(hash_table** h_table, int bucket_num);
int hash_destroy(hash_table** h_table);
int hash_contains_key(hash_table** h_table, void* key, unsigned int key_len);
void hash_put_value(hash_table** h_table, void* key, unsigned int key_len, void* value, unsigned int value_len);
void hash_remove_value(hash_table** h_table, void* key, unsigned int key_len, void* value, unsigned int value_len);
void hash_print_contents(hash_table** h_table);
int hash_calculate_data_size(hash_table** h_table);
char* hash_concatenate_data_to_char_array(hash_table** h_table);
struct SCRATCHDATA
{
int num_buckets;
hash_table *h_table;
};
#ifdef __cplusplus
extern "C"
#endif
void SQL_API_FN GatherDistinctValues( SQLUDF_VARCHAR *inputAttrKey, SQLUDF_VARCHAR *inputAttrValue,
SQLUDF_INTEGER *out, SQLUDF_SMALLINT *inputAttrKeyNullInd, SQLUDF_SMALLINT *inputAttrValueNullInd,
SQLUDF_SMALLINT *outInd, SQLUDF_TRAIL_ARGS_ALL)
{
struct SCRATCHDATA *sp;
sp = (struct SCRATCHDATA *) SQLUDF_SCRAT->data;
switch (SQLUDF_CALLT)
{
case SQLUDF_FIRST_CALL:
hash_init(&(sp->h_table), 7);
break;
case SQLUDF_NORMAL_CALL:
if( *inputAttrKeyNullInd == 0 && *inputAttrValueNullInd == 0 )
{
/**
* If the provided value is not NULL and it is not contained in the
* Global hash Table, it is going to be added. Otherwise,
* nothing is done (it already exists).
*/
if( hash_contains_key(&(sp->h_table), (void*) inputAttrKey, strlen(inputAttrKey)) == 0 )
{
hash_put_value(&(sp->h_table), (void*) inputAttrKey, strlen(inputAttrKey),
(void*) inputAttrValue, strlen(inputAttrValue));
}
}
break;
case SQLUDF_FINAL_CALL:
break;
}
*out = 0;
*outInd = 0;
return;
}
In the rest of the file the bodies of the defined methods exist. In order to "install" this UDF to DB2, I execute the provided script bldrtn
as follows:
~$./bldrtn rcdudf
and the executable is stored inside the ~/sqllib/function
directory. Next, I execute the following script in DB2:
CREATE OR REPLACE FUNCTION GatherDistinctVal( VARCHAR(255), VARCHAR(255) )
RETURNS INTEGER
EXTERNAL NAME 'rcdudf!GatherDistinctValues'
NOT FENCED
CALLED ON NULL INPUT
NOT VARIANT
NO SQL
PARAMETER STYLE SQL
LANGUAGE C
NO EXTERNAL ACTION;
By issuing the command ~$db2 -tvsf create-udf.sql
. After that I try to call the function on the Sample database of DB2 as follows:
~$ db2 "select gatherdistinctval('job',job) from employee"
1
-----------
SQL0444N Routine "*TINCTVAL" (specific name "SQL140520113052600") is
implemented with code in library or path "...function/rcdudf", function
"GatherDistinctValues" which cannot be accessed. Reason code: "5".
SQLSTATE=42724
When I define the aforementioned function as FENCED
I receive the following error:
~$ db2 "select GatherDistinctVal('Job',job) from Employee"
1
-----------
SQL1646N A routine failed because the fenced user ID cannot access required
files in the sqllib directory or other instance or database directories.
What am I doing wrong? I am pretty sure that the file rcdudf
exists in the appropriate directory. Also, when I execute the create-udf.sql
script, I get a success message from DB2.
When I execute ~$ls -l ~/sqllib/function/
I get the following:
lrwxrwxrwx 1 root db2iadm1 36 May 15 17:54 db2psmds -> /opt/ibm/db2/V10.1/function/db2psmds
drwxrwsr-t 2 db2inst1 db2iadm1 4096 May 15 17:54 db2rdf
lrwxrwxrwx 1 root db2iadm1 35 May 15 17:54 db2rtsc -> /opt/ibm/db2/V10.1/function/db2rtsc
lrwxrwxrwx 1 root db2iadm1 34 May 15 17:54 fpeevm -> /opt/ibm/db2/V10.1/function/fpeevm
-rw-r--r-- 1 db2inst1 db2iadm1 3256 May 20 17:58 GeneralHashFunctions.o
-rw-r--r-- 1 db2inst1 db2iadm1 11688 May 20 17:58 hash_table.o
-rwxr-xr-x 1 db2inst1 db2iadm1 17090 May 21 08:39 hopeless
drwxrwxr-x 3 db2inst1 db2iadm1 4096 May 14 16:07 jar
lrwxrwxrwx 1 root db2iadm1 37 May 15 17:54 libdb2u.a -> /opt/ibm/db2/V10.1/function/libdb2u.a
-rwxr-xr-x 1 db2inst1 db2iadm1 17144 May 20 18:08 rcdudf
drwxrwsr-t 2 db2inst1 db2iadm1 4096 May 8 21:24 routine
lrwxrwxrwx 1 root db2iadm1 33 May 15 17:54 tblpd -> /opt/ibm/db2/V10.1/function/tblpd
-rwxr-xr-x 1 db2inst1 db2iadm1 22280 May 20 16:27 testudf
-rwxr-xr-x 1 db2inst1 db2iadm1 34510 May 20 09:49 udfcli
-rwxr-xr-x 1 db2inst1 db2iadm1 13009 May 20 10:42 udfsrv
drwxrwsr-t 2 db2inst1 db2iadm1 4096 May 15 17:54 unfenced
As can be seen, the rcdudf executable is there and it has read/execute permissions. Also, I am logged in as authorised user db2inst1
when I perform all of the operations.
As an added experiment, I create the ScalarUDF
function from the samples of DB2. The sql
creation script of ScalarUDF is exactly the same as create-udf.sql
with only the name changed. The executable file created in the ~/sqllib/function/
folder is udfsrv
.
When I try to execute scalarudf
it works perfectly. I still do not understand why the fenced user is able to execute scalarudf
but he is not able to execute gatherdistinctvalues
.
I have posted the outcome of db2 dbm get cfg
for future reference:
~$ db2 get dbm cfg
Database Manager Configuration
Node type = Enterprise Server Edition with local and remote clients
Database manager configuration release level = 0x0f00
CPU speed (millisec/instruction) (CPUSPEED) = 2.637255e-07
Communications bandwidth (MB/sec) (COMM_BANDWIDTH) = 1.000000e+02
Max number of concurrently active databases (NUMDB) = 32
Federated Database System Support (FEDERATED) = NO
Transaction processor monitor name (TP_MON_NAME) =
Default charge-back account (DFT_ACCOUNT_STR) =
Java Development Kit installation path (JDK_PATH) = /opt/ibm/java-x86_64-71/
Database monitor heap size (4KB) (MON_HEAP_SZ) = AUTOMATIC(90)
Java Virtual Machine heap size (4KB) (JAVA_HEAP_SZ) = 2048
Audit buffer size (4KB) (AUDIT_BUF_SZ) = 0
Size of instance shared memory (4KB) (INSTANCE_MEMORY) = AUTOMATIC(807642)
Instance memory for restart light (%) (RSTRT_LIGHT_MEM) = AUTOMATIC(10)
Agent stack size (AGENT_STACK_SZ) = 1024
Sort heap threshold (4KB) (SHEAPTHRES) = 0
Directory cache support (DIR_CACHE) = YES
Application support layer heap size (4KB) (ASLHEAPSZ) = 15
Max requester I/O block size (bytes) (RQRIOBLK) = 32767
Workload impact by throttled utilities(UTIL_IMPACT_LIM) = 10
Priority of agents (AGENTPRI) = SYSTEM
Agent pool size (NUM_POOLAGENTS) = AUTOMATIC(100)
Initial number of agents in pool (NUM_INITAGENTS) = 0
Max number of coordinating agents (MAX_COORDAGENTS) = AUTOMATIC(200)
Max number of client connections (MAX_CONNECTIONS) = AUTOMATIC(MAX_COORDAGENTS)
Keep fenced process (KEEPFENCED) = YES
Number of pooled fenced processes (FENCED_POOL) = AUTOMATIC(MAX_COORDAGENTS)
Initial number of fenced processes (NUM_INITFENCED) = 0
Index re-creation time and redo index build (INDEXREC) = RESTART
Transaction manager database name (TM_DATABASE) = 1ST_CONN
Transaction resync interval (sec) (RESYNC_INTERVAL) = 180
Thank you.
Upvotes: 1
Views: 1389
Reputation: 868
The problem was that the bldrtn
script of DB2 does not handle the case where code is used from more than one files. Also, it seems that adding code for defining the Hash Table and the Hash Functions in the beginning of my file did not work.
Create two different files hash_table.h
and hash_table.c
and type the code of my structures in there. Also, add the reference of hash_table.h
in the rcdudf.c
file.
Afterwards change the bldrtn
script to first create the object files of each source file, and then link them together.
In detail, the bldrtn
file would have the following changed lines of code:
$CC $EXTRA_C_FLAGS -I$DB2PATH/include -c hash_table.c -D_REENTRANT
$CC $EXTRA_C_FLAGS -I$DB2PATH/include -c GeneralHashFunctions.c -D_REENTRANT
$CC $EXTRA_C_FLAGS -I$DB2PATH/include -c rcdudf.c -D_REENTRANT
$CC $LINK_FLAGS -o rcdudf rcdudf.o GeneralHashFunctions.o hash_table.o $EXTRA_LFLAG -L$DB2PATH/$LIB -ldb2 -lpthread
rm -f $DB2PATH/function/hopeless
cp hopeless $DB2PATH/function
This way, my code is bundled in rcdudf
and the DB2 can execute it completely.
Note: It seems that every time a new UDF is registered, DB2 needs to be restarted
Upvotes: 0
Reputation: 18955
I know this is not an answer, but it's too big for a comment.
SQL0444N RC 5 can mean any of these things:
There is insufficient memory to load the library containing the function or one or more symbols could not be resolved. This reason code indicates one of the following situations:
1. One or more symbols might not have been resolved. The
routine library might be dependent on a shared library that
cannot be located (using the concatenation of directories
specified in the LIBPATH environment variable in UNIX-based
systems, the PATH environment variable in INTEL systems).
2. The routine has a 64-bit library which is not supported
within a 32-bit DB2 instance, or, the routine has a 32-bit
library or DLL within a 64-bit DB2 instance that is
incompatible with the routine definition.
3. There was insufficient memory to load the library containing
the function.
Since you don't seem to use any extra libraries, I'd check the object file bitness vs. the instance bitness.
Upvotes: 1