Reputation: 746
I'm using the saxonche Python library for XPath 3.1 operations. I have created a FastAPI app that accepts an XML filename, opens the file, processes it, and returns the response.
It worked fine during development on an Intel MacBook, but in production on an Amazon m7g.2xlarge instance (Debian 12 ARM64), it fails with the following error when processing multiple files.
Fatal error: StackOverflowError: Enabling the yellow zone of the stack did not make any stack space available. Possible reasons for that: 1) A call from native code to Java code provided the wrong JNI environment or the wrong IsolateThread; 2) Frames of native code filled the stack, and now there is not even enough stack space left to throw a regular StackOverflowError; 3) An internal VM error occurred.
XML file size: 5 to 8 MB
Production environment: m7g.2xlarge (AWS) with Debian 12 ARM64
Questions:
Does saxonche have a limitation with processing multiple files simultaneously?
Could upgrading Java on the server potentially resolve this issue?
Any suggestions for troubleshooting or resolving this error would be greatly appreciated. Thank you for your help!
Upvotes: 1
Views: 194
Reputation: 167401
In my Python Flask app I use
from saxonche import *
saxon_proc = PySaxonProcessor()
and
@app.teardown_appcontext
def teardown_saxonche_thread(exception):
    if saxon_proc is not None:
        saxon_proc.detach_current_thread
I don't know whether FastAPI offers a similar hook to make sure detach_current_thread is run, but check its documentation for whether and how you can register a clean-up routine.
And avoid creating more than one PySaxonProcessor.
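One way to enforce a single instance is to hide construction behind a cached accessor. The following is only a minimal sketch: `get_single_instance` is a hypothetical helper, not part of saxonche, and the `factory` argument stands in for `PySaxonProcessor` so the snippet runs without the library installed.

```python
import threading
from typing import Any, Callable, Optional

# Module-level state shared by every caller in the process.
_lock = threading.Lock()
_instance: Optional[Any] = None


def get_single_instance(factory: Callable[[], Any]) -> Any:
    """Return one shared object, calling `factory` on the first call only.

    Hypothetical helper: in the real app, `factory` would be PySaxonProcessor.
    """
    global _instance
    if _instance is None:
        # Double-checked locking so concurrent first calls create one object.
        with _lock:
            if _instance is None:
                _instance = factory()
    return _instance
```

In the app, something like `saxon_proc = get_single_instance(PySaxonProcessor)` would then replace direct construction wherever a processor is needed.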
Based on https://fastapi.tiangolo.com/advanced/events/ I would suggest trying something along these lines:
from contextlib import asynccontextmanager

from fastapi import FastAPI
from saxonche import PySaxonProcessor

# create single Saxon processor
saxon_proc = PySaxonProcessor()


@asynccontextmanager
async def lifespan(app: FastAPI):
    # use single Saxon processor
    #
    yield
    # clean up Saxon processor
    if saxon_proc is not None:
        saxon_proc.detach_current_thread


app = FastAPI(lifespan=lifespan)

# put your API methods here
# e.g.
@app.get("/hello/{name}")
async def say_hello(name: str):
    xdm_doc = saxon_proc.parse_xml(xml_file_name=name)
    xpath_proc = saxon_proc.new_xpath_processor()
    xpath_proc.set_context(xdm_item=xdm_doc)
    xpath_result = xpath_proc.evaluate('root/foo ! map { local-name() : string() }')
    api_result = [{key.string_value : map.get(key).head.string_value} for map in xpath_result for key in map.keys() ]
    return api_result
Try that approach in your environment and let us know whether that improves things.
Upvotes: 1