Reputation: 309
I am trying to use Python to do a text replacement. I have a file in little-endian UTF-16 format, and I want to replace the IP address in it. First I read the file line by line, then I replace the target string, and finally I write the new strings back to the file. But when multiple threads operate on this file, it ends up garbled. Here is my code.
import re
import codecs
import time
import thread
import fcntl

ip = "10.200.0.1"
searchText = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"

def replaceFileText(fileName,searchText,replaceText,encoding):
    lines = []
    with codecs.open(fileName,"r",encoding) as file:
        fcntl.flock(file,fcntl.LOCK_EX)
        for line in file:
            lines.append(re.sub(searchText,replaceText,line))
        fcntl.flock(file,fcntl.LOCK_UN)
    with codecs.open(fileName,"w",encoding) as file:
        fcntl.flock(file,fcntl.LOCK_EX)
        for line in lines:
            file.write(line)
        fcntl.flock(file,fcntl.LOCK_UN)

def start():
    replaceFileText("rdpzhitong.rdp",searchText,ip,"utf-16-le")
    thread.exit_thread()

def test(number):
    for n in range(number):
        thread.start_new_thread(start,())
    time.sleep(1)

test(20)
I can't understand why the file is garbled. I have used fcntl.flock to keep the reads and writes in sequence. Where is the problem?
Upvotes: 2
Views: 1143
Reputation: 58681
It's garbled because an fcntl lock is owned by a process, not by a thread, so a process cannot use fcntl to serialize access among its own threads. See this answer, for example.
You'll need to use a threading construct like a Lock instead.
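As a minimal sketch of that suggestion, assuming Python 3 (where the `threading` module is the usual interface and the built-in `open` takes an `encoding` argument): one `threading.Lock` is shared by all threads and held across both the read and the write, so no thread can observe the file between truncation and rewrite. The names `file_lock`, `replace_file_text`, and `run_threads` are made up for illustration.

```python
import re
import threading

IP_PATTERN = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"

# A single lock shared by every thread in this process (hypothetical name).
file_lock = threading.Lock()


def replace_file_text(file_name, search_text, replace_text, encoding):
    # Hold the lock for the whole read-modify-write cycle, not just for
    # the read or the write on its own.
    with file_lock:
        # newline="" keeps the original line endings untouched.
        with open(file_name, "r", encoding=encoding, newline="") as f:
            lines = [re.sub(search_text, replace_text, line) for line in f]
        with open(file_name, "w", encoding=encoding, newline="") as f:
            f.write("".join(lines))


def run_threads(file_name, replace_text, count, encoding="utf-16-le"):
    threads = [
        threading.Thread(
            target=replace_file_text,
            args=(file_name, IP_PATTERN, replace_text, encoding),
        )
        for _ in range(count)
    ]
    for t in threads:
        t.start()
    # Wait for every thread instead of sleeping for a fixed time.
    for t in threads:
        t.join()
```

Calling `run_threads("rdpzhitong.rdp", "10.200.0.1", 20)` would mirror the `test(20)` call in the question. This only serializes threads inside one process; it does nothing for other processes that touch the same file.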
Upvotes: 6
Reputation: 8202
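I imagine it's garbled because you lock the file after you open it. Opening it in "w" mode truncates the file immediately, before the lock is acquired, and in this situation the seek position might also be wrong.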
By the way, threading in Python is not very useful in this context (look up the Python GIL). To maximize performance in a task like this, I suggest using the multiprocessing module and restructuring the logic around queues or pipes: worker processes analyze the data, and the main process is responsible for the I/O on the input and output files.
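A rough sketch of that layout, assuming Python 3. It swaps hand-written queues for `multiprocessing.Pool`, which manages the inter-process communication itself. The main process is the only one that opens the file; workers only receive lists of lines and return the substituted lines. The names `substitute_chunk` and `rewrite_file` and the `workers` and `chunk_size` parameters are invented for this example.

```python
import multiprocessing
import re

IP_PATTERN = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def substitute_chunk(args):
    """Worker: replace every IP-like string in one chunk of lines."""
    lines, replacement = args
    return [IP_PATTERN.sub(replacement, line) for line in lines]


def rewrite_file(path, replacement, encoding="utf-16-le",
                 workers=1, chunk_size=1000):
    # Only the main process reads the file.
    with open(path, "r", encoding=encoding, newline="") as f:
        lines = f.readlines()

    chunks = [
        (lines[i:i + chunk_size], replacement)
        for i in range(0, len(lines), chunk_size)
    ]

    if workers > 1:
        # pool.map returns results in the same order as the input chunks.
        with multiprocessing.Pool(workers) as pool:
            results = pool.map(substitute_chunk, chunks)
    else:
        # workers=1 stays in-process and avoids the pool start-up cost.
        results = [substitute_chunk(chunk) for chunk in chunks]

    # Only the main process writes the file.
    with open(path, "w", encoding=encoding, newline="") as f:
        for chunk in results:
            f.write("".join(chunk))
```

A call such as `rewrite_file("rdpzhitong.rdp", "10.200.0.1", workers=4)` should sit under an `if __name__ == "__main__":` guard on platforms that start workers by spawning a fresh interpreter. For a file as small as a typical .rdp file, the process start-up cost will likely outweigh any gain, so this layout only pays off on large inputs.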
Upvotes: 0