John

Reputation: 21

How to determine the source of a job submission

I am searching for a way to determine how a job was initiated on the HPCC cluster. There are several ways to submit a job. For example:

1- a manual submission via the ECL IDE / ECL Watch
2- an external cron submission
3- an ECL submission of dynamically built code
4- a file landing in a directory, which triggers a submission
etc.

I can retrieve some important information by calling STD.System.Workunit.WorkunitList, but I cannot find any function that returns an attribute indicating the source of a submission.

HPCC is a data-centric platform and ECL reflects that approach, so I am attempting to build a matrix that defines the code in relation to that data. A product is technically a bunch of data (files) that results from source input -> scrub and transformation processes -> final base files. Those files are then prepped / indexed for external use:

1- Roxie queries
2- PowerBI
3- webpage
4- reports ftp'd or emailed
etc.

I want this matrix to define, by product, the initiating job(s), where they were initiated, any schedule (?), and the associated input/output files (flagging whether they are source/intermediate/base/output). I am trying to design this so that the matrix can be built dynamically, because as we all know: (1) this type of documentation exists nowhere, so someone new coming in to work on a product cannot go and see the scope and life cycle of the data, (2) nobody likes to document, (3) the second any manual documentation is actually created and saved, it is out of sync with reality.

So far, the design will be a collection of files (defined by the level of detail) which would then be JOINed together to yield the final matrix. Not sure if this would end up as a PowerBI report or a webpage...still tossing that around. Still, this might prove to be something useful for anyone using HPCC who wants a 30,000 ft view of their product.

I have attempted to programmatically scan a WUID output, looking for the necessary attributes but I have had little success.

I appreciate any assistance / comments.

Upvotes: 1

Views: 55

Answers (1)

Schmoo

Reputation: 584

No matter what component submits ECL to execute on the platform, they all ultimately end up going through the same WsWorkunits API, which is the public SOAP / REST interface.

While some client applications will leave a fingerprint so you can deduce where it came from, it is not a foolproof mechanism...

For example: in http://play.hpccsystems.com:8010/esp/files/index.html#/workunits/W20221115-075604/xml you can see that the ECL IDE appends some meta information to the Workunit (it stores the IDE version number in the "Application" section).
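To illustrate the fingerprint idea, here is a rough Python sketch that reads the "Application" section out of a workunit's XML and reports which client names it finds. Everything in it is an assumption made for illustration: the sample XML, its layout (Application > client name > key/value), the values, and the helper names application_values and guess_source are all hypothetical, so check the XML of a real workunit before relying on it. As noted above, a missing fingerprint does not tell you anything definite.

```python
import xml.etree.ElementTree as ET

# Hypothetical workunit XML, trimmed to the part of interest.
# The nesting and all values here are assumptions, not real platform output.
SAMPLE_WU_XML = """
<W00000000-000000 submitID="someuser">
  <Application>
    <ecl-ide>
      <version>0.0.0</version>
    </ecl-ide>
  </Application>
</W00000000-000000>
"""


def application_values(wu_xml):
    """Return {client_name: {key: value}} from the Application section, or {}."""
    root = ET.fromstring(wu_xml.strip())
    app = root.find("Application")  # direct child of the workunit root
    if app is None:
        return {}
    return {
        client.tag: {item.tag: (item.text or "").strip() for item in client}
        for client in app
    }


def guess_source(wu_xml):
    """Best-effort label: the client names found, or 'unknown' if none."""
    apps = application_values(wu_xml)
    return ", ".join(sorted(apps)) if apps else "unknown"


if __name__ == "__main__":
    print(guess_source(SAMPLE_WU_XML))
```

For submissions you control (cron scripts, file-landing triggers), the same approach only helps if those clients also write their own identifying value into the workunit; whether and how they can do that is something to verify against the platform documentation.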

Upvotes: 1
