Reputation: 51
I am using: Win 10 Pro Intel(R) Xeon(R) W-1250 CPU @ 3.30GHz / 16 GB RAM
Anaconda Navigator 2.5.0, Python 3.10.13 in a venv, pyarrow 11.0.0, pandas 2.1.1. Running scripts in Spyder IDE 5.4.3.
I want to open and process .orc files (.csv is not an option in my case). To find out why my neural network using TensorFlow doesn't work properly, I wrote two simple test scripts, CreateORC and OpenORC. Reading the ORC file should be very easy, because I created a very simple one, but reading it leads to a crash.
CreateORC.py: (THIS PART WORKS FINE AND ORC FILE WITH DATA IN IT IS CREATED)
import pandas as pd
import pyarrow as pa
import pyarrow.orc as orc
# Create a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)
print(df.head())
# Convert the DataFrame to a PyArrow Table
table = pa.Table.from_pandas(df)
# Write the table to an ORC file
orc_file_path = 'sample_data.orc'
with open(orc_file_path, 'wb') as f:
    orc.write_table(table, f)
print(f"ORC file '{orc_file_path}' created successfully.")
OpenORC.py (THIS PART LEADS TO CRASH)
import pyarrow.orc as orc
def read_orc_file(file_path):
    # Open the ORC file
    with open(file_path, 'rb') as f:
        orc_file = orc.ORCFile(f)
        # Read the ORC file as a PyArrow Table while the file is still open
        table = orc_file.read()
    return table
# Specify the path to your ORC file
orc_file_path = r'C:\Users\WEY\Desktop\KIT_NN\Notebooks\sample_data.orc'
# Read the ORC file
try:
    table = read_orc_file(orc_file_path)
    print(table)
except Exception as e:
    print("Error reading ORC file:", e)
When I run it, I get this:
runfile('C:/Users/WEY/Desktop/KIT_NN/Notebooks/OpenORC.py', wdir='C:/Users/WEY/Desktop/KIT_NN/Notebooks')
Fatal Python error: Aborted
Main thread:
Current thread 0x000039ec (most recent call first):
File "C:\Users\WEY\anaconda3\envs\KIT\lib\site-packages\pyarrow\orc.py", line 187 in read
File "C:/Users/WEY/Desktop/KIT_NN/Notebooks/OpenORC.py", line 16 in read_orc_file
File "C:/Users/WEY/Desktop/KIT_NN/Notebooks/OpenORC.py", line 25 in <module>
File "C:\Users\WEY\anaconda3\envs\KIT\lib\site-packages\debugpy\_vendored\pydevd\_pydev_bundle\_pydev_execfile.py", line 14 in execfile
File "C:\Users\WEY\anaconda3\envs\KIT\lib\site-packages\debugpy\_vendored\pydevd\_pydev_bundle\pydev_umd.py", line 175 in runfile
File "C:\Users\WEY\AppData\Local\Temp\ipykernel_17556\526212489.py", line 1 in <module>
Restarting kernel...
Still didn't work
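The traceback above shows the process aborting inside native code. That is not a Python exception, so the `except Exception` block in OpenORC.py never runs and the kernel restarts instead. A minimal sketch of one way to keep the Spyder kernel alive while testing is to run the risky read in a separate Python process and inspect its exit code. The names `run_isolated` and `READ_SNIPPET` are made up for illustration, and the file name in the snippet is a placeholder.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 60.0) -> tuple[bool, str]:
    """Run `code` in a fresh Python process.

    Returns (ok, output): ok is True only if the child exited with code 0,
    output is the child's stdout followed by its stderr.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode == 0, result.stdout + result.stderr


# Hypothetical child script: the same read as in OpenORC.py, with a
# placeholder path. It is only defined here, not executed.
READ_SNIPPET = """
import pyarrow.orc as orc
print(orc.ORCFile('sample_data.orc').read())
"""
```

Calling `run_isolated(READ_SNIPPET)` returns `False` plus whatever the child printed if the read aborts, and the calling kernel keeps running. This only contains the crash. It does not explain or fix it.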
Upvotes: 1
Views: 978
Reputation: 51
I have found a solution!
It doesn't make any sense, but it works.
Every time I get the error "Fatal Python error: Aborted ... Restarting kernel...", I need to run this script:
import os
import pandas as pd
path1 = "res_32.orc"
res_file_path = os.path.join("C:\\", "Users", "WEY", "Desktop", "KIT_NN", "gr_32", f"{path1}")
df = pd.read_orc(res_file_path)
OR
import os
import pyorc
path1 = "tspectra32_0_0.orc"
res_file_path = os.path.join("C:\\", "Users", "WEY", "Desktop", "KIT_NN", "gr_32", f"{path1}")
with open(res_file_path, "rb") as orc_file:
    reader = pyorc.Reader(orc_file)
If I run either of these two scripts, my other code in different modules also works without a problem. It is too complex for me to understand what is going on, but it works!
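To avoid running the workaround by hand each time, the pandas variant could be wrapped in a small helper and called once at the top of a script, in the same process. This is only a sketch of that idea: `warm_up_orc` is a made-up name, and why reading with pandas first helps is not known, so treat it as a convenience and not a fix.

```python
def warm_up_orc(path: str) -> bool:
    """Try the pandas-based ORC read from the workaround.

    Returns True if the read succeeded, False if pandas (or its ORC
    backend) is missing or the file could not be read.
    """
    try:
        # Imported lazily so a missing install is reported, not raised.
        import pandas as pd

        pd.read_orc(path)
        return True
    except Exception as exc:
        print(f"ORC warm-up failed: {exc}")
        return False
```

A script would call `warm_up_orc(res_file_path)` before its first pyarrow ORC read and could log the result. Note that a native abort during the read would still end the process, because it cannot be caught here.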
Upvotes: 0