Reputation: 373
I am working on text processing. I have one document that I want to compare against many other documents.
I call the first document txt and each of the others pat.
This is my main procedure:
# read the document
txt = doc_gettext()
# read the filenames of the other documents
filenames = doc.get_pat()
# iteration
d = int((len(txt) - 5 + 1) / k)
for i in range(1, len(filenames)):
    # open the patterns one by one through the loop, by name
    patname = filenames[i].replace('\n', '')
    with open(patname, 'r') as pattern:
        pattern = pattern.read().replace('\n', ' ').replace('\t', ' ')
        pattern = pattern.split()
    for j in range(k - 1):
        p = Process(target=all_position, args=(int(j * d), int((j+1) * d) + 5 - 1, pattern, txt, i, R,))
        processes.append(p)
        p.start()
    p = Process(target=all_position, args=(int(d * (k-1)), len(txt) + 5 - 1, pattern, txt, i, R,))
    processes.append(p)
    p.start()
for pr in processes:
    pr.join()
For now I just print the arguments here, because I want to run an algorithm on them later:
def all_position(x, y, pat, txt, i, R):
    #print pat
    print txt
    #print R.put(pat)

if __name__ == '__main__':
    main()
Suppose txt is a list of 20 tokens and I print it inside all_position. The output is:
['pe[[n''sppieelnn'ss, ii'llb''a, , k''abbraa'kk, aar'r'a', l, 'a'asal'la, as's'
r', a, 'm'rbrauamtmb'b, uu'ttt''a, , n''gttaaannn'gg, aa'nnm''a, , k''ammnaa'kk,
aa'nnl''e, , m''allreeimm'aa, rr'iil''a, , n''tllaaainn'tt, aa'iis''e, , n''dss
aeelnn'dd, aa'llk''a, , k''ik'ka, ak'kiki'u', k, 'u'k'ku, uk'kupu'i', n, 't'pupi
'i, nn'ttpuue''l, , a''nppgeeill'aa, nn'ggmiii''n, , u''mmm'ii, nn'uummme''j, ,
a'''mm, ee'jjbaau''k, , u'''bb, uu'kkbuua''j, , 'ub''ab, ja'ujc'ue, l'acneal'a,
n''a, p'', lc'aespltlaianksa't', i, 'k'k'pe, lr'atksaetsir'k]t
'a, s''k]e
rtas']
['pensil', 'bakar', 'alas', 'rambut['', p'etnasnigla'n, '', b'amkaakra'[n, '''p,
ae'lnlasesim'la, 'r', ir''ab, ma'bkluaatrn''t, , a''ita'al, na'gssa'en, n''d, r
a'almm'ab, ku'atkn'a', k, 'i't'la, en'mgkaaurnki'u', ', ', 'm'lapakinantnta'ui,
''', , l''epsmeealnradina'gl, i''', l, 'a'knmatikaniiu''m, , ''', ks'uemkneudj'a
a, l''', p, 'i'bnkutakuku'i', ', ', 'p'bekalujakunu'g', i, '''c, pe'ilmnaitnnuau
''m, , ''', pp'elmlaeasjntagi'ik, ''', , b''umkkieunr'ut, ma''sb, 'a']jm
ue'j, a''c, e'lbaunkau'', , ''bapjlua's, t'icke'l, a'nkae'r, t'apsl'a]s
tik', 'kertas']
Why does this happen? It is very confusing. Can somebody please help me fix it?
Upvotes: 0
Views: 365
Reputation: 31339
If you need safe printing you can use Lock objects.
Let's look at some code...
from multiprocessing import Lock, Process
import sys

# NOT SAFE
def not_safe_print(x):
    for i in range(10):
        # problem!
        print range(20)

# pool of 10 workers
processes = []
for i in range(10):
    processes.append(Process(target=not_safe_print, args=(i,)))
for p in processes:
    p.start()
for p in processes:
    p.join()
As we can see, two processes can be executing the print statement at the same time. This is not "safe".
Suppose we have two processes (numbered 1 and 2), each of which runs only a few instructions whenever the scheduler gives it some time. Each process then writes only part of its list to stdout before the other one gets its turn, so fragments from both end up interleaved in the stdout buffer. When the buffer is flushed, the mangled output shows.
Hopefully, when you run this script (you may have to run it a few times), you'll see mangled text like the output of your program.
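To make the interleaving easier to picture, here is a small deterministic simulation. It is only a sketch: `interleave` and its `chunk` parameter are made up for illustration, and a real scheduler does not alternate in fixed-size turns like this. It just shows how two complete lists turn into the kind of text in the question when each writer gets a few characters in before the other takes over.

```python
def interleave(a, b, chunk=3):
    """Simulate two writers taking turns, each emitting `chunk` characters per turn.

    Hypothetical helper: real scheduling is irregular, this is a fixed round-robin.
    """
    out = []
    i = j = 0
    while i < len(a) or j < len(b):
        out.append(a[i:i + chunk])  # writer 1 gets a short turn
        i += chunk
        out.append(b[j:j + chunk])  # writer 2 gets a short turn
        j += chunk
    return "".join(out)


# Two processes printing the same token list "at the same time".
line = "['pensil', 'bakar', 'alas']"
print(interleave(line, line))

# A tiny case that is easy to follow by hand:
# interleave("abcd", "WXYZ", 2) returns "abWXcdYZ"
```

Every character of both lists is still there, only the order is scrambled, which matches what the mangled output in the question looks like.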
To make the script safe we have to limit access to shared resources such as the stdout buffer (what you end up seeing on the terminal; it could be a file as well). This is called mutual exclusion. Lock objects provide a way to enforce it:
# used to implement a SAFE print
lock = Lock()

def safe_print(x):
    # when a process reaches this point it acquires the lock.
    # none goes in without the lock - only a single process can pass
    lock.acquire()
    for i in range(10):
        print range(20)
    # when the process is done it releases the lock for other processes to grab
    # meaning another process can now use stdout (used by print...)
    lock.release()
Don't forget to change this line:
processes.append(Process(target=safe_print, args=(i,)))
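For completeness, here is one way the pieces might fit together as a single script. It is a sketch under a few assumptions rather than the only way to do it: it passes the lock to each worker as an argument instead of relying on a module-level global (a global lock is recreated in each child when processes are spawned rather than forked, for example on Windows, so the workers would not actually share it), it uses `with lock:` so the lock is released even if the write fails, and it adds an `out` parameter (not in the code above) so the function can be tried against any file-like object.

```python
import sys
from multiprocessing import Lock, Process


def safe_print(lock, worker_id, out=None):
    """Write one whole line to `out` (stdout by default) while holding `lock`."""
    if out is None:
        out = sys.stdout
    line = "worker %d: %s\n" % (worker_id, list(range(20)))
    with lock:            # acquire; released automatically, even on error
        out.write(line)
        out.flush()       # flush before another process can take the lock


def main():
    lock = Lock()         # created once in the parent, shared with every worker
    processes = [Process(target=safe_print, args=(lock, i)) for i in range(10)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()


if __name__ == '__main__':
    main()
```

The lines may still come out in any order, since the lock only guarantees that each line is written in one piece, not which worker goes first.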
Upvotes: 2