Reputation: 111
I'm trying to fetch posts from Twitter using twint, and to make it faster I'm using multiprocessing to fetch the data for all users in parallel. For some reason, I always get the error:
cannot pickle '_thread.lock' object
I can't find how to fix it. I tried using 'threading' instead, but for some reason it doesn't give me the full data, and it gives me a different result every time.
Thank you!
I have a for loop that creates the processes like this:
p1 = multiprocessing.Process(target=my_func, args=(current_user, collection, posts_list,))
proc.append(p1)
p1.start()
After that for loop, I call 'join' on each process.
Upvotes: 7
Views: 22167
Reputation: 1
With the limited description you posted, we cannot tell more about the error. It is probably related to the lock, as the others mentioned.
But I guess you're running your program on Windows, and maybe twint has a problem running on Windows with multiprocessing. Windows cannot fork, which is what Unix systems use to create a new process. If you also see error output mentioning spawn xxx, then I think it is probably a problem with running a multiprocessing program on Windows. I currently don't have a solution for this, though. Wait for others to provide one.
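Since the question does not show which argument holds the lock, one way to narrow it down is to try pickling each argument yourself before handing it to Process. With the spawn start method (the only one available on Windows), the arguments are pickled to reach the child process, so anything that fails here will fail there too. This is only a rough sketch: find_unpicklable is a made-up helper name, and the values below are stand-ins for the real current_user, collection and posts_list, which I cannot see.

```python
import pickle
import threading


def find_unpicklable(**named_args):
    """Return {argument name: error text} for every argument that fails to pickle."""
    failures = {}
    for name, value in named_args.items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # pickle can raise TypeError, PicklingError, AttributeError
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


if __name__ == "__main__":
    # Stand-in values only: a bare lock plays the role of whatever
    # object in the real arguments carries a lock internally.
    report = find_unpicklable(
        current_user="some_user",
        collection=threading.Lock(),
        posts_list=[],
    )
    print(report)
```

Whichever name shows up in the report is the argument to look at. If it is something like a database collection or client, a common workaround is to pass only plain data (strings, numbers) to the process and create that object inside the target function instead.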
Upvotes: 0
Reputation: 2724
You have to remove the lock that you probably have inside one of the arguments, because a lock is an object that can't be pickled. Multiprocessing uses (as the name suggests) multiple processes, and arguments are passed to a process using pickle.
Therefore you can either remove the lock object, or ignore it while pickling. You can refer to this post to do that.
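To illustrate the "ignore it while pickling" option, here is a minimal sketch using the standard __getstate__ and __setstate__ hooks. The Collector class and the roundtrip helper are hypothetical, made up for the example. They are not part of twint, and this only works when the lock lives in a class you control.

```python
import pickle
import threading


class Collector:
    """Hypothetical stand-in for an object that carries a lock."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()  # this attribute is what pickle rejects

    def __getstate__(self):
        # Copy the instance state and leave the lock out of what gets pickled.
        state = self.__dict__.copy()
        del state["_lock"]
        return state

    def __setstate__(self, state):
        # Restore the plain attributes and give the copy a fresh lock.
        self.__dict__.update(state)
        self._lock = threading.Lock()


def roundtrip(obj):
    """Pickle and unpickle obj, roughly what happens to process arguments."""
    return pickle.loads(pickle.dumps(obj))


if __name__ == "__main__":
    copy = roundtrip(Collector("some_user"))
    print(copy.name)
```

Note that the unpickled copy gets its own new lock, so it no longer synchronizes with the original object. Across processes that is unavoidable with a threading lock anyway.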
Upvotes: 2