Sathya

Reputation: 299

psutil library installation issue on databricks

I have been using the psutil library on my Databricks cluster, and it ran fine for the last couple of weeks. When I started the cluster today, this library failed to install. I noticed that a different version of psutil had been published on the site.

Currently my Python script fails with 'No module psutil'.

I tried installing the previous version of psutil with pip install, but my code still fails with the same error.

Is there an alternative to psutil, or is there a way to install it on Databricks?

Upvotes: 0

Views: 505

Answers (2)

CHEEKATLAPRADEEP

Reputation: 12788

In addition to @Peter's response, you can also use "Library utilities" to install Python libraries.

Library utilities allow you to install Python libraries and create an environment scoped to a notebook session. The libraries are available both on the driver and on the executors, so you can reference them in UDFs. This enables:

  • Library dependencies of a notebook to be organized within the notebook itself.
  • Notebook users with different library dependencies to share a cluster without interference.


Example: To install "psutil" library using library utilities:

dbutils.library.installPyPI("psutil")
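
After the install, it can help to confirm which version the notebook's Python actually sees. The following is a minimal sketch, not part of the Databricks API: `installed_version` is a hypothetical helper built on the standard library's `importlib.metadata`, and it assumes Python 3.8 or later.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the psutil version string, or None if the install did not take effect.
print(installed_version("psutil"))
```

A result of None suggests the library was installed into a different environment than the one running the notebook.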


**Reference:** Databricks - library utilities

Hope this helps.

Upvotes: 0

Peter Pan

Reputation: 24148

As far as I know, there are two ways to install a Python package on an Azure Databricks cluster, as follows.

  1. Go to the Libraries tab of your cluster, click the Install New button, type the name of the package you want to install, and wait for the installation to finish.

  2. Open a notebook and run the shell command below to install the Python package via pip. Note: to install into the Python environment that the Databricks cluster uses, rather than the Linux system environment, you must call /databricks/python/bin/pip, not plain pip.

    %sh
    /databricks/python/bin/pip install psutil
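
The same idea can be sketched from Python: binding pip to the interpreter that runs the notebook avoids installing into a different environment. This is a sketch under assumptions, not a Databricks-specific recipe; `pip_install_command` and `is_importable` are hypothetical helper names, and only standard-library calls are used.

```python
import importlib.util
import sys
from typing import List


def pip_install_command(package: str) -> List[str]:
    """Build a pip command bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", "install", package]


def is_importable(module: str) -> bool:
    """Return True if the current interpreter can find `module`."""
    return importlib.util.find_spec(module) is not None


# Show the command that would be run; pass the list to
# subprocess.check_call to actually perform the install.
print(pip_install_command("psutil"))
# Report whether psutil is visible to this interpreter.
print(is_importable("psutil"))
```

If `is_importable("psutil")` is False right after a plain `pip install`, the package most likely went into another environment.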
    


Finally, I ran the code below, and it works with both of the methods above.

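import os
import psutil

# Print the PID and name of every running process.
for proc in psutil.process_iter(attrs=['pid', 'name']):
  print(proc.info)

# Check any PID from the list above; the current process is used here as an example.
print(psutil.pid_exists(os.getpid()))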


Upvotes: 1
