Reputation: 842
I'm trying to convert a synchronous flow in Python code which is based on callbacks to an A-syncronious flow using asyncio. Basically the code interacts a lot with TCP/UNIX sockets. It reads data from the sockets, manipulates it to make decisions and writes stuff back to the other side. This is going on over multiple sockets at once and data is shared between the contexts to make decisions sometimes.
EDIT :: The code currently is mostly based on registering a callback to a central entity for a specific socket, and having that entity run the callback when the relevant socket is readable (something like "call this function when that socket has data to be read"). Once the callback is called - a bunch of stuff happens, and eventually a new callback is registered for when new data is available. The central entity runs a select over all sockets registered to figure out which callbacks should be called.
I'm trying to do this without refactoring my entire code and making this as seamless as possible to the programmer - so I was trying to think about it like so - all code should run the same way as it does today - but whenever the current code does a socket.recv() to get new data - the process would yield execution to other tasks. When the read returns, it should go back to handling the data from the same point using the new data it got.
To do this, I wrote a new class called AsyncSocket - which interacts with the IO streams of asyncIO and placed the Async/await statements almost solely in there - thinking that I would implement the recv method in my class to make it look like a "regular IO socket" to the rest of my code. So far - this is my understanding of what A-sync programming should allow.
Now to the problem :
My code awaits for clients to connect - when it does, each client's context is allowed to read and write from it's own connection. I've simplified to flow to the following to clarify the problem:
class AsyncSocket():
def __init__(self,reader,writer):
self.reader = reader
self.writer = writer
def recv(self,numBytes):
print("called recv!")
data = self.read_mitigator(numBytes)
return data
async def read_mitigator(self,numBytes):
print("Awaiting of AsyncSocket.reader.read")
data = await self.reader.read(numBytes)
print("Done Awaiting of AsyncSocket.reader.read data is %s " % data)
return data
def mit2(aSock):
return mit3(aSock)
def mit3(aSock):
return aSock.recv(100)
async def echo_server(reader, writer):
print ("New Connection!")
aSock = AsyncSocket(reader,writer) # create a new A-sync socket class and pass it on the to regular code
while True:
data = await some_func(aSock) # this would eventually read from the socket
print ("Data read is %s" % (data))
if not data:
break
writer.write(data) # echo everything back
async def main(host, port):
server = await asyncio.start_server(echo_server, host, port)
await server.serve_forever()
asyncio.run(main('127.0.0.1', 5000))
mit2() and mit3() are synchronous functions that do stuff with the data on the way back before returning to the main client's loop - but here I'm just using them as empty functions. The problem starts when I play with the implementation of some_func().
A pass through implementation (edit: kind-of-works) - but still has issues :
def some_func(aSock):
try:
return (mit2(aSock)) # works
except:
print("Error!!!!")
While an implementation which reads the data and does something with it - like adding a suffix before returning, throws an error:
def some_func(aSock):
try:
return (mit2(aSock) + "something") # doesn't work
except:
print("Error!!!!")
The error (as far as I understand it) means it's not really doing what it should:
New Connection!
called recv!
/Users/user/scripts/asyncServer.py:36: RuntimeWarning: coroutine 'AsyncSocket.read_mitigator' was never awaited
return (mit2(aSock) + "something") # doesn't work
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Error!!!!
Data read is None
And the echo server obviously doesn't work. Obviously my code looks more like option #2 with a lot more stuff in some_func(),mit2() and mit3() - but I can't get this to work. I'm fairly new in using asyncio/async/await - so what (rather basic concept I guess) am I missing?
Upvotes: 3
Views: 13907
Reputation: 154916
This code won't work as envisioned:
def recv(self,numBytes):
print("called recv!")
data = self.read_mitigator(numBytes)
return data
async def read_mitigator(self,numBytes):
...
You cannot call an async function from a sync function and get the result, you must await it, which ensures that you return to the event loop in case the data is not yet ready. This mismatch between async and sync code is sometimes referred to as the issue of function color.
Since your code is already using non-blocking sockets and an event loop, a good approach to porting it to asyncio might be to first switch to the asyncio event loop. You can use event loop methods like sock_recv
to request data:
def start():
loop = asyncio.get_event_loop()
sock = make_socket() # make sure it's non-blocking
future_data = loop.sock_recv(sock, 1024)
future_data.add_done_callback(continue_read)
# return to the event loop - when some data is ready
# continue_read will be invoked
def continue_read(future):
data = future.result()
print('got', data)
# ... do something with data, e.g. process it
# and call sock_sendall with the response
asyncio.get_event_loop().call_soon(start())
asyncio.get_event_loop().run_forever()
Once you have the program working in that mode, you can start moving to coroutines, which allow the code to look like sync code, but work in exactly the same way:
async def start():
loop = asyncio.get_event_loop()
sock = make_socket() # make sure it's non-blocking
data = await loop.sock_recv(sock, 1024)
# data is available "immediately", meaning the coroutine gets
# automatically suspended when awaiting data that is not yet
# ready, and automatically re-scheduled when the data is ready
print('got', data)
asyncio.run(start())
The next step can be eliminating make_socket
and switching to asyncio streams.
Upvotes: 8