Reputation: 23
Does anyone know how to integrate Cobrix into Azure Databricks (PySpark, Python 3) for processing a mainframe file that has COMP-3 columns?
Please see the following link for a detailed description of the issue: https://github.com/AbsaOSS/cobrix/issues/236#issue-550885564
Upvotes: 2
Views: 2461
Reputation: 12768
To make third-party or locally-built code available to notebooks and jobs running on your clusters, you can install a library. Libraries can be written in Python, Java, Scala, and R. You can upload Java, Scala, and Python libraries and point to external packages in PyPI, Maven, and CRAN repositories.
Steps to install third-party libraries:
Step 1: Create a Databricks cluster.
Step 2: Select the created cluster.
Step 3: Select Libraries => Install New => set Library Source to "Maven" => Coordinates => Search Packages => select Maven Central => search for the required packages (for example: spark-cobol, cobol-parser, scodec) => select the required version => Install. A PySpark usage sketch follows these steps.
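Once the packages are installed on the cluster, the Cobrix data source can be used from a PySpark notebook via the "cobol" format, as documented in the Cobrix README. Below is a minimal sketch; the copybook and data paths are placeholders, so substitute your own DBFS locations:

```python
# Minimal sketch: reading a fixed-length EBCDIC mainframe file with Cobrix from PySpark.
# The paths below are placeholders -- replace them with your own DBFS locations.
copybook_path = "/mnt/mainframe/copybooks/account.cpy"   # COBOL copybook describing the record layout
data_path = "/mnt/mainframe/data/account.dat"            # EBCDIC data file to be parsed

df = (spark.read
      .format("cobol")                    # data source registered by the spark-cobol package
      .option("copybook", copybook_path)  # layout is taken from the copybook
      .load(data_path))

# COMP-3 (packed decimal) fields declared in the copybook are decoded by Cobrix
# and appear as decimal columns in the resulting DataFrame.
df.printSchema()
df.show(truncate=False)
```

In a Databricks notebook the `spark` session is already available, so no extra session setup is needed before running the snippet.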
For more details, refer to "Azure Databricks - libraries" and "Cobrix: A Mainframe Data Source for Spark SQL and Streaming".
Hope this helps. Do let us know if you have any further queries.
Upvotes: 2