aaronsteers

Reputation: 2571

Programmatically submit a U-SQL job with code-behind

I'm currently submitting my U-SQL jobs via the Python library, and I want to add additional code in a C# or Python code-behind file. Are code-behind files supported, either in Python or in a CLI-based method that I could easily automate?

Ideally I'd like to use the Azure CLI or the Python library so this can run on both Linux and Windows (i.e. without relying on Visual Studio). I've checked the documentation for both PowerShell and Python, but I don't see any instructions on how to submit jobs with code-behind logic.

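Here is my Python code:

import uuid

from azure.common.client_factory import get_client_from_cli_profile
from azure.mgmt.datalake.analytics.job import DataLakeAnalyticsJobManagementClient
from azure.mgmt.datalake.analytics.job.models import JobInformation, USqlJobProperties

# ADLA_ACCOUNT_NAME (the Data Lake Analytics account name) is assumed to be defined elsewhere.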

adlaJobClient = get_client_from_cli_profile(
    DataLakeAnalyticsJobManagementClient,
    adla_job_dns_suffix='azuredatalakeanalytics.net')

def submit_usql_job(script):
    job_id = str(uuid.uuid4())
    job_result = adlaJobClient.job.create(
        ADLA_ACCOUNT_NAME,
        job_id,
        JobInformation(
            name='Sample Job',
            type='USql',
            properties=USqlJobProperties(script=script)
        )
    )
    print("Submitted job ID '{}'".format(job_id))
    return job_id

Upvotes: 0

Views: 325

Answers (2)

aaronsteers

Reputation: 2571

Once compiled, the DLL file for your code-behind can be serialized into a hexadecimal string and then imported inline via a few extra lines of U-SQL. This avoids the need to upload and register the DLL separately.

CREATE ASSEMBLY [__TMP_inline_dll] FROM 0x4D5A900003000...
WITH ADDITIONAL_FILES = (0x2A543C... AS "__TMP_inline_dll.pdb");
REFERENCE ASSEMBLY [__TMP_inline_dll];

/* Your USQL Code Here... */

DROP ASSEMBLY [__TMP_inline_dll];

The files can be serialized to hexadecimal using this Python code:

import binascii

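def get_file_hex_string(filepath: str) -> str:
    """Open file in binary mode and return its contents as an uppercase hex string."""
    with open(filepath, 'rb') as f:
        hexdata = binascii.hexlify(f.read())
    # hexlify() returns bytes; decode so the result can be concatenated into the script text.
    return hexdata.decode('ascii').upper()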

Notes:

  • The above assumes you have already compiled the DLL.
  • This boilerplate code includes a PDB file, passed as an "additional" file, which should be optional.
  • The DROP ASSEMBLY statement at the end is needed to clean up the temporary assembly afterwards, although I've been informed that in a future version of U-SQL this will no longer be necessary.
  • I received this method from the very helpful support team of the VS Code U-SQL add-in.
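
To tie this back to the question, here is a minimal sketch of how the boilerplate above could be generated in Python before the script is submitted. The helper names file_to_hex_literal and wrap_with_inline_assembly are my own and not part of any Azure library, and I'm assuming the hex literal needs the 0x prefix shown in the boilerplate and that the PDB clause can simply be left out when no PDB is given.

```python
import binascii
from typing import Optional


def file_to_hex_literal(filepath: str) -> str:
    """Return the file contents as a binary literal such as 0x4D5A..."""
    with open(filepath, "rb") as f:
        return "0x" + binascii.hexlify(f.read()).decode("ascii").upper()


def wrap_with_inline_assembly(script: str,
                              dll_path: str,
                              pdb_path: Optional[str] = None,
                              assembly_name: str = "__TMP_inline_dll") -> str:
    """Surround a U-SQL script with CREATE / REFERENCE / DROP ASSEMBLY lines."""
    create = "CREATE ASSEMBLY [{}] FROM {}".format(
        assembly_name, file_to_hex_literal(dll_path))
    if pdb_path is not None:
        # Assumption: the PDB is optional, so the clause is only added on request.
        create += '\nWITH ADDITIONAL_FILES = ({} AS "{}.pdb")'.format(
            file_to_hex_literal(pdb_path), assembly_name)
    return "\n".join([
        create + ";",
        "REFERENCE ASSEMBLY [{}];".format(assembly_name),
        "",
        script,
        "",
        "DROP ASSEMBLY [{}];".format(assembly_name),
    ])
```

The returned string can then be passed as the script argument of the submit_usql_job function from the question. Keep in mind that the whole DLL ends up inside the script text, so this is probably only practical for small assemblies.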

Upvotes: 0

Jason

Reputation: 621

You will likely have to create and register the assembly yourself as an additional step in your job, then reference the assembly as you normally would. If you need an example of what this might look like, submit a job from Visual Studio for a query that has an accompanying code-behind file and look at the script it generates. You'll see that it adds these steps for you transparently. You can then apply the same pattern in your own code.

Alternatively, move your code-behind logic to a dedicated library that you upload and register separately, one time, and then reference it to your heart's content from your Python-submitted jobs.
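
As a rough sketch of the second option, the two pieces of U-SQL involved could be generated like this. The function names, the assembly name, the store path, and the choice of the master database are all illustrative assumptions, and the DLL would still need to be uploaded to the Data Lake Store by some other means before the registration script is run.

```python
def build_registration_script(assembly_name: str,
                              dll_store_path: str,
                              database: str = "master") -> str:
    """U-SQL to run once, after the DLL has been uploaded to the store."""
    return "\n".join([
        "USE DATABASE [{}];".format(database),
        'CREATE ASSEMBLY IF NOT EXISTS [{}] FROM @"{}";'.format(
            assembly_name, dll_store_path),
    ])


def with_assembly_reference(script: str,
                            assembly_name: str,
                            database: str = "master") -> str:
    """Prefix a job script with a reference to the already registered assembly."""
    return "REFERENCE ASSEMBLY [{}].[{}];\n\n{}".format(
        database, assembly_name, script)
```

The registration script would be submitted once as its own job, and with_assembly_reference would then be applied to each script before it is submitted.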

Upvotes: 1
