Lomefin

Reputation: 1262

Databricks: run python coverage inside databricks jobs

I am using Azure Databricks, running a python script instead of using Notebooks.

Given the way Databricks is implemented, it's hard to test the code in my local machine, so I wanted to create another job which could run coverage for it.

The coverage package says you should just run coverage run my_script.py arg1 arg2, but the Databricks job creation dialog does not allow me to run shell commands. Only Python scripts (or wheels, or other "shelled" methods) are available.

Most of the recommendations about running coverage say the solution is to use the CLI, but when that's not available, what is the best way to go?

Upvotes: 1

Views: 1056

Answers (1)

Lomefin

Reputation: 1262

According to the coverage documentation, you can use the library's API directly in your code.

What I did was import my script's main method into a new script, check_coverage.py, and write:

from coverage import Coverage
from my_script import main

cov = Coverage()
cov.start()
# run the imported code under measurement
main()
cov.stop()
cov.report()

I didn't use html_report(), so I won't be able to read the generated files once the cluster is gone, but the report output was available in the job's logs.
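If you do want the report to survive cluster teardown, Coverage's API can also write to a file instead of stdout. Here is a minimal, self-contained sketch: workload() is a stand-in for your job's main(), and the output path is an assumption (on Databricks you would point it at a DBFS mount such as /dbfs/tmp/ so the file persists after the job ends).

```python
from coverage import Coverage

def workload():
    # stand-in for your job's entry point; replace with your real main()
    return sum(range(10))

cov = Coverage()
cov.start()
workload()
cov.stop()

# Write the text report to a file instead of stdout.
# /tmp is used here so the sketch runs anywhere; on Databricks,
# an assumed durable path would be something like /dbfs/tmp/coverage_report.txt
report_path = "/tmp/coverage_report.txt"
with open(report_path, "w") as f:
    cov.report(file=f)
```

Coverage.report() accepts a file parameter for redirecting the table, and there are sibling methods such as xml_report(outfile=...) if you need machine-readable output for CI tooling.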

Upvotes: 1
