Reputation: 1512
I need to process a massive number of data rows, and it only makes sense to do this asynchronously.
I need to see the list processing status, e.g. Done processing 1/3, but when I increment the counter, it always stays at 1. This makes sense, since I pass the counter into the function. I did that because without it, I would get:
UnboundLocalError: local variable 'processed' referenced before assignment
Using Python 3.8
Any help would be appreciated!
Here's a link to test: https://ideone.com/gRjrf2
I've abstracted my code below:
import os, logging
import asyncio

logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)-8s [%(filename)s:%(lineno)d] %(message)s', datefmt='%d-%b-%y %H:%M:%S')
logger = logging.getLogger(__name__)

items = [{"name": "A"}, {"name": "B"}, {"name": "C"}]
processed = 0

async def increment(item):
    count = item.get('count', 0)
    count += 1
    return count

async def get_and_update(item, processed):
    item['count'] = await increment(item)
    # Show progress now, but how?
    processed += 1
    logger.info(f"You can't see me {processed}")

async def run():
    logger.info(f"Processing {len(items)} items...")
    await asyncio.gather(*[
        asyncio.create_task(
            get_and_update(item, processed)
        ) for item in items
    ])

loop = asyncio.get_event_loop()
loop.run_until_complete(run())
The output I get is:
28-Aug-20 11:19:22 INFO [prog.py:23] Processing 3 items...
28-Aug-20 11:19:22 INFO [prog.py:20] You can't see me 1
28-Aug-20 11:19:22 INFO [prog.py:20] You can't see me 1
28-Aug-20 11:19:22 INFO [prog.py:20] You can't see me 1
Upvotes: 0
Views: 2648
Reputation: 312788
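Your basic problem is that by declaring processed as a parameter to get_and_update, you're shadowing the global processed variable. Inside the function, processed += 1 rebinds only that local name (integers are immutable), so the global stays at 0 and every task logs 1. You need to drop the parameter and then declare processed as global within that function, like this: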
import os, logging
import asyncio

logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)-8s [%(filename)s:%(lineno)d] %(message)s', datefmt='%d-%b-%y %H:%M:%S')
logger = logging.getLogger(__name__)

items = [{"name": "A"}, {"name": "B"}, {"name": "C"}]
processed = 0

async def increment(item):
    count = item.get('count', 0)
    count += 1
    return count

async def get_and_update(item):
    # Rebind the module-level counter instead of creating a local name.
    global processed
    item['count'] = await increment(item)
    # Show progress now, but how?
    processed += 1
    logger.info(f"You can't see me {processed}")

async def run():
    logger.info(f"Processing {len(items)} items...")
    await asyncio.gather(*[
        asyncio.create_task(
            get_and_update(item)
        ) for item in items
    ])

loop = asyncio.get_event_loop()
loop.run_until_complete(run())
The output of the above is:
28-Aug-20 08:15:00 INFO [asynctest.py:25] Processing 3 items...
28-Aug-20 08:15:00 INFO [asynctest.py:22] You can't see me 1
28-Aug-20 08:15:00 INFO [asynctest.py:22] You can't see me 2
28-Aug-20 08:15:00 INFO [asynctest.py:22] You can't see me 3
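If you'd rather avoid a global altogether, one possible alternative (not what the code above does) is to count completions in the caller with asyncio.as_completed. This is only a minimal sketch: increment and get_and_update mirror the functions from the question, while the items parameter on run, the progress list and the print call are my own assumptions for illustration.

```python
import asyncio


async def increment(item):
    # Same stand-in for real work as in the question.
    return item.get("count", 0) + 1


async def get_and_update(item):
    item["count"] = await increment(item)
    return item


async def run(items):
    total = len(items)
    tasks = [asyncio.create_task(get_and_update(item)) for item in items]
    progress = []
    # as_completed yields awaitables in the order the tasks finish,
    # so the counter lives in this loop and no shared state is needed.
    for done, next_finished in enumerate(asyncio.as_completed(tasks), start=1):
        await next_finished
        progress.append(f"Done processing {done}/{total}")
        print(progress[-1])
    return progress


if __name__ == "__main__":
    asyncio.run(run([{"name": "A"}, {"name": "B"}, {"name": "C"}]))
```

The counter here is an ordinary local variable in run, so the worker coroutines never need to know about it. Swap print for logger.info if you want the same log format as above.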
Upvotes: 2