Ali Raza

Reputation: 1

I am using Spark-XML to read an XML file, but I am getting this error: "FileNotFoundError: [WinError 2] The system cannot find the file specified"

I am running this code in Python on Windows 11. I am trying to read an XML file that contains a dataset of gardening questions and answers:

    from pyspark.sql import SparkSession


    def main():
        # Raw string so backslashes (e.g. "\r" in "\raw") are not read as escapes
        gardening_raw_path = r"D:\dbfs\dbdemos\product\llm\gardening\raw"
        print(f"loading raw xml dataset under {gardening_raw_path}")

        # Create a SparkSession
        spark = SparkSession.builder \
            .appName("module1") \
            .getOrCreate()
        # shell=true  (commented out: as a bare statement it would raise NameError)

        # Read XML file into a DataFrame
        xml_df = spark.read \
            .format("xml") \
            .option("rowTag", "row") \
            .load(f"{gardening_raw_path}/Posts.xml")

        # Show the DataFrame schema and contents
        xml_df.printSchema()
        xml_df.show()

        # Stop the SparkSession
        spark.stop()


    if __name__ == '__main__':
        main()

It gives this error:


Traceback (most recent call last):
  File "C:\Users\hp\Documents\module1.py", line 38, in <module>
    main()
  File "C:\Users\hp\Documents\module1.py", line 20, in main
    .getOrCreate()
     ^^^^^^^^^^^^^
  File "D:\Ali\python\Lib\site-packages\pyspark\sql\session.py", line 477, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Ali\python\Lib\site-packages\pyspark\context.py", line 512, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "D:\Ali\python\Lib\site-packages\pyspark\context.py", line 198, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "D:\Ali\python\Lib\site-packages\pyspark\context.py", line 432, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
                                       ^^^^^^^^^^^^^^^^^^^^
  File "D:\Ali\python\Lib\site-packages\pyspark\java_gateway.py", line 99, in launch_gateway
    proc = Popen(command, **popen_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Ali\python\Lib\subprocess.py", line 1024, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "D:\Ali\python\Lib\subprocess.py", line 1509, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 2] The system cannot find the file specified

I am trying to read an XML file in my Python program using Spark-XML.
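The traceback ends in `Popen` inside `launch_gateway`, which is raised from `getOrCreate()`, so the failure appears to happen while PySpark starts its Java process, before `Posts.xml` is ever opened. On Windows, PySpark launches `bin\spark-submit.cmd` under the Spark home directory, which in turn needs Java. The following is a minimal diagnostic sketch, not a confirmed fix: `check_spark_launch_env` is a hypothetical helper name (not part of PySpark), and it only inspects the `JAVA_HOME`, `SPARK_HOME`, and `PATH` environment variables. If `SPARK_HOME` is unset, a pip-installed PySpark falls back to its own package directory, so a `None` result there is not necessarily a problem.

```python
import os
import shutil
from pathlib import Path


def check_spark_launch_env(env=None):
    """Report whether the executables PySpark tries to launch can be located.

    Hypothetical helper for diagnosis only; it does not start Spark.
    """
    env = os.environ if env is None else env
    report = {}

    # Record each variable's value and whether it points at an existing directory.
    for var in ("JAVA_HOME", "SPARK_HOME"):
        value = env.get(var)
        report[var] = {
            "value": value,
            "exists": bool(value) and Path(value).is_dir(),
        }

    # Full path of the java executable found via PATH, or None if not found.
    report["java_on_path"] = shutil.which("java", path=env.get("PATH", ""))

    # PySpark runs bin/spark-submit(.cmd) under SPARK_HOME when it is set.
    spark_home = env.get("SPARK_HOME")
    if spark_home:
        script = "spark-submit.cmd" if os.name == "nt" else "spark-submit"
        report["spark_submit_exists"] = (Path(spark_home) / "bin" / script).is_file()
    else:
        report["spark_submit_exists"] = None  # unset: PySpark uses its own default

    return report


if __name__ == "__main__":
    for key, value in check_spark_launch_env().items():
        print(f"{key}: {value}")
```

Running this in the same environment as `module1.py` shows whether `SPARK_HOME` points at a directory that actually contains `bin\spark-submit.cmd` and whether `java` is reachable through `PATH`.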

Upvotes: 0

Views: 83

Answers (0)