Esat Becco

Reputation: 51

I get a "Fatal Python error: Aborted" and no explanatory error message I can work with when I try to open a simple .orc file with pyarrow

I am using: Windows 10 Pro, Intel(R) Xeon(R) W-1250 CPU @ 3.30GHz, 16 GB RAM

Anaconda Navigator 2.5.0, Python 3.10.13 in a virtual environment, pyarrow 11.0.0, pandas 2.1.1. I run the scripts in Spyder IDE 5.4.3.

I want to open and process .orc files (.csv is not possible in my case). To test why my neural network using TensorFlow doesn't work properly, I wrote two simple scripts, CreateORC and OpenORC. Reading the ORC file should be a very easy task, because I created a very simple ORC file, but reading it leads to a crash.

CreateORC.py: (THIS PART WORKS FINE AND ORC FILE WITH DATA IN IT IS CREATED)

import pandas as pd
import pyarrow as pa
import pyarrow.orc as orc

# Create a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)
print(df.head())

# Convert the DataFrame to a PyArrow Table
table = pa.Table.from_pandas(df)

# Write the table to an ORC file
orc_file_path = 'sample_data.orc'

with open(orc_file_path, 'wb') as f:
    orc.write_table(table, f)

print(f"ORC file '{orc_file_path}' created successfully.")


OpenORC.py (THIS PART LEADS TO CRASH)

import pyarrow.orc as orc

def read_orc_file(file_path):
    # Open the ORC file
    with open(file_path, 'rb') as f:
        orc_file = orc.ORCFile(f)

    # Read the ORC file as a PyArrow Table
    table = orc_file.read()

    return table

# Specify the path to your ORC file
orc_file_path = r'C:\Users\WEY\Desktop\KIT_NN\Notebooks\sample_data.orc'

# Read the ORC file
try:
    table = read_orc_file(orc_file_path)
    print(table)
except Exception as e:
    print("Error reading ORC file:", e)


When I run it, I get this:

runfile('C:/Users/WEY/Desktop/KIT_NN/Notebooks/OpenORC.py', wdir='C:/Users/WEY/Desktop/KIT_NN/Notebooks')


Fatal Python error: Aborted


Main thread:
Current thread 0x000039ec (most recent call first):
  File "C:\Users\WEY\anaconda3\envs\KIT\lib\site-packages\pyarrow\orc.py", line 187 in read
  File "C:/Users/WEY/Desktop/KIT_NN/Notebooks/OpenORC.py", line 16 in read_orc_file
  File "C:/Users/WEY/Desktop/KIT_NN/Notebooks/OpenORC.py", line 25 in <module>
  File "C:\Users\WEY\anaconda3\envs\KIT\lib\site-packages\debugpy\_vendored\pydevd\_pydev_bundle\_pydev_execfile.py", line 14 in execfile
  File "C:\Users\WEY\anaconda3\envs\KIT\lib\site-packages\debugpy\_vendored\pydevd\_pydev_bundle\pydev_umd.py", line 175 in runfile
  File "C:\Users\WEY\AppData\Local\Temp\ipykernel_17556\526212489.py", line 1 in <module>


Restarting kernel...


  1. I opened the ORC file with an online ORC viewer, and it displayed without a problem.
  2. I created a new environment in Anaconda and reinstalled pyarrow there.

The crash still occurs.
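Because the abort takes the whole Spyder kernel down, one way to get at whatever is printed before the crash is to run the failing read in a separate Python process. The following is a minimal sketch using only the standard library; the helper name `run_isolated` is hypothetical and not part of pyarrow or Spyder.

```python
import subprocess
import sys


def run_isolated(code, timeout=60):
    """Run a Python snippet in a child interpreter and capture its output.

    The child is started with faulthandler enabled, so a hard abort in
    native code should leave a Python-level traceback on stderr instead
    of killing the calling process.
    """
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout, result.stderr
```

For example, `run_isolated("import pyarrow.orc as orc; print(orc.ORCFile('sample_data.orc').read())")` returns the exit code together with stdout and stderr. A non-zero exit code means the child process died, and stderr may contain a message that the restarting kernel would otherwise swallow.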

Upvotes: 1

Views: 978

Answers (1)

Esat Becco

Reputation: 51

I have found a solution!

It doesn't make any sense, but it works.

Every time I get the error "Fatal Python error: Aborted ... Restarting kernel...", I need to run one of the following two scripts:

import os
import pandas as pd

path1 = "res_32.orc"

res_file_path = os.path.join("C:\\", "Users", "WEY", "Desktop", "KIT_NN", "gr_32", path1)

df = pd.read_orc(res_file_path)

OR

import os
import pyorc

path1 = "tspectra32_0_0.orc"

res_file_path = os.path.join("C:\\", "Users", "WEY", "Desktop", "KIT_NN", "gr_32", path1)

with open(res_file_path, "rb") as orc_file:
    reader = pyorc.Reader(orc_file)

If I run either of the two scripts, my other code in different modules also works without a problem. It is too complex for me to understand what is going on, but it works!
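One difference between these scripts and OpenORC.py in the question may be worth ruling out: in OpenORC.py, `orc_file.read()` is called after the `with` block has already closed the file handle, whereas `pd.read_orc` is given only a path. Whether this is what triggers the abort is an assumption, not something confirmed here. A minimal sketch of the reader with the read kept inside the `with` block, plus a path-based variant (the name `read_orc_by_path` is hypothetical):

```python
def read_orc_file(file_path):
    """Read an ORC file into a PyArrow Table while the file handle is open."""
    # Imported here so the sketch can be loaded even where pyarrow is missing.
    import pyarrow.orc as orc

    with open(file_path, "rb") as f:
        orc_file = orc.ORCFile(f)
        # read() runs inside the with block, so f has not been closed yet.
        return orc_file.read()


def read_orc_by_path(file_path):
    """Alternative: pass the path and let pyarrow open and close the file."""
    import pyarrow.orc as orc

    return orc.read_table(file_path)
```

Both functions return a `pyarrow.Table`; calling `.to_pandas()` on the result gives a DataFrame comparable to the one returned by `pd.read_orc`.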

Upvotes: 0
