Johan
Johan

Reputation: 76547

Can I write a UDF for MySQL in Python?

I would like to have the option to call Python via MySQL stored procs.
For this reason I was wondering if I can use Python to write a UDF that I can call from MySQL.

If that's not possible what alternative way could I use to make this happen.

Upvotes: 4

Views: 3422

Answers (1)

samplebias
samplebias

Reputation: 37909

The MySQL 5.0 docs say that a trigger can call a UDF, so that part seems possible. Reviewing the source code of some UDF extensions would give you a good idea of the complexity involved. You can find open source packages on the MySQL UDF repository.

I wish I had an easy answer to the "create a MySQL UDF in Python" part, but I don't know of a simple off-the-shelf way of doing it.

Postgres has a PL/Python language which lets you write procedures and functions directly using Python inside the database. This code may be the best guide for figuring out how to plug Python into MySQL; however, to my knowledge nobody has accomplished this yet (but I'd love to be wrong).

A few messy details of how this might be done:

UDFs are shared objects which are loaded into the MySQL daemon, so to create one you need to be able to generate C stubs which (among other things) initialize the Python interpreter, load and compile your Python script into bytecode, and then translate MySQL UDF arguments int a Python function call, then translate the return value back.

For a simple UDF string function named myfunc the shared object would have the following functions:

// initialize state when 'myfunc' is loaded.
my_bool myfunc_init(UDF_INIT *initid, UDF_ARGS *args, ...)

// call myfunc, this would need to translate the args, invoke the
// python function, then return the string, may need to create and cache
// python sub-interpreters on the fly, etc
char *myfunc(UDF_INIT *initid, UDF_ARGS *args, ...)

// clean up the state when 'myfunc' is unloaded.
void myfunc_deinit(UDF_INIT *initid);

Since there may be several invocations of your UDF running at once on separate threads, so you'd need to find a way to either efficiently create and cache sub-interpreters on-demand within the context of a single function call, or safely reuse a single interpreter across multiple threads (with locking which may slow things down unacceptably).

Upvotes: 7

Related Questions