Reputation: 9065
I am running tasks in parallel with Python's multiprocessing.Pool, but I get TypeError: list indices must be integers, not str.
Here is my code:
import json
import requests
from multiprocessing import Pool

# DBC is the database helper used in this post; its definition is not shown.

def getData(qid):
    r = requests.get("http://api.xxx.com/api?qid=" + qid)
    if r.status_code == 200:  # requests exposes the HTTP status as status_code
        DBC.save(json.loads(r.text))

def getAnotherData(qid):
    r = requests.get("http://api.xxxx.com/anotherapi?qid=" + qid)
    if r.status_code == 200:
        DBC.save(json.loads(r.text))

def getAllData(qid):
    print qid
    getData(str(qid))
    getAnotherData(str(qid))

if __name__ == "__main__":
    pool = Pool(processes=200)
    pool.map(getAllData, range(10000, 700000))
After the code has been running for a while (not immediately), an exception is raised:
    pool.map(getAllData, range(10000, 700000))
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
TypeError: list indices must be integers, not str
What could be wrong? Is it a bug in the Pool module?
Upvotes: 2
Views: 1828
Reputation: 155674
When a worker task raises an exception, Pool catches it, sends it back to the parent process, and re-raises it there, but this doesn't preserve the original traceback (so you only see where it was re-raised in the parent process, which isn't very helpful). At a guess, something in DBC.save expects a value loaded from the JSON to be an int, and it's actually a str. In other words, a list is being indexed with a string, which can happen if one of the responses parses to a JSON array where the code expects an object.
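To illustrate that guess, here is a minimal sketch of how this kind of TypeError can come out of parsed JSON. The two payloads are made up for the example; nothing in the question shows what the API actually returns or what DBC.save does with it.

```python
import json

# Hypothetical payloads: the same record wrapped two different ways.
as_object = json.loads('{"id": 7}')    # parses to a dict
as_array = json.loads('[{"id": 7}]')   # parses to a list containing a dict

# Looking up a string key on a dict works.
print(as_object["id"])

# The same lookup on a list raises TypeError, because list indices
# must be integers (or slices), not strings.
try:
    as_array["id"]
except TypeError as exc:
    print("TypeError: %s" % exc)
```

The exact wording of the message differs slightly between Python versions (Python 3 says "integers or slices"), but the cause is the same. If only some qids produce an array, that would also fit the error showing up only after the job has been running for a while.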
If you want to see the real traceback, import traceback at top level, and change the top level of your worker function to:
def getAllData(qid):
    try:
        print qid
        getData(str(qid))
        getAnotherData(str(qid))
    except:
        traceback.print_exc()
        raise
so you can see the real traceback in the worker, not just the neutered, mostly useless traceback in the parent.
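For a self-contained way to try this out, the sketch below applies the same pattern to a toy worker. The names parse_record and worker are hypothetical stand-ins (they replace the HTTP and database calls from the question), and the code is written so that it also runs on Python 3.

```python
import traceback
from multiprocessing import Pool


def parse_record(qid):
    # Hypothetical stand-in for the failing step: for one particular input
    # the payload is a list, so the string lookup below raises TypeError.
    payload = [{"qid": qid}] if qid == 3 else {"qid": qid}
    return payload["qid"]


def worker(qid):
    try:
        return parse_record(qid)
    except Exception:
        # Printed from inside the worker process, so the traceback
        # points at the line that actually failed.
        traceback.print_exc()
        raise  # re-raise so that pool.map still fails in the parent


if __name__ == "__main__":
    pool = Pool(processes=2)
    try:
        pool.map(worker, range(5))
    except TypeError as exc:
        # The parent only receives the re-raised exception.
        print("parent saw: %s" % exc)
    finally:
        pool.close()
        pool.join()
```

Running it as a script should print the worker-side traceback, which names parse_record, on stderr before the parent reports the TypeError. This version catches Exception rather than using a bare except, which is a small design choice so that KeyboardInterrupt is not intercepted in the workers.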
Upvotes: 3