Reputation: 437
I have a very long data file ("text.txt") and a second file that contains exactly one line: the last line of text.txt. That single line should be overwritten every 10 minutes (by a simple cron job), because text.txt receives another line every 10 minutes.
Based on other code snippets I found on Stack Overflow, I currently run this code:
#!/usr/bin/env python
import os, sys
file = open(sys.argv[1], "r+")
#Move the pointer (similar to a cursor in a text editor) to the end of the file.
file.seek(0, os.SEEK_END)
#This code means the following code skips the very last character in the file -
#i.e. in the case the last line is null we delete the last line
#and the penultimate one
pos = file.tell() - 1
#Read each character in the file one at a time from the penultimate
#character going backwards, searching for a newline character
#If we find a new line, exit the search
while pos > 0 and file.read(1) != "\n":
    pos -= 1
    file.seek(pos, os.SEEK_SET)
#So long as we're not at the start of the file, delete all the characters ahead of this position
if pos > 0:
    file.seek(pos, os.SEEK_SET)
    w = open("new.txt",'w')
    file.writelines(pos)
    w.close()
file.close()
With this code I get the error:
TypeError: writelines() requires an iterable argument
(of course). When I use file.truncate() I can get rid of the last line in the original file, but I want to keep it there and just extract that last line to new.txt. I don't understand how to do this when working with file.seek, so I need help with the last part of the code.
Using file.readlines() with lines[:-1] does not work properly with such huge files.
Upvotes: 1
Views: 761
Reputation: 2374
Here's how to tail the last 2 lines of a file into a list:
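import os
import subprocess
# '~' is only expanded by a shell, so expand it explicitly
path = os.path.expanduser('~/path/to/my_file.txt')
output = subprocess.check_output(['tail', '-n', '2', path], universal_newlines=True)
lines = output.splitlines()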
Now you can get the info you need out of the list lines.
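If calling an external tail is not an option, a rough pure-Python equivalent is sketched below. The function name tail_lines is made up for this example. Note that it still streams through the whole file once, so it only saves memory, not read time.

```python
from collections import deque

def tail_lines(path, n=2):
    """Return the last n lines of a text file (hypothetical helper)."""
    with open(path, "r") as f:
        # A deque with maxlen=n keeps only the n most recent lines
        # while the file is iterated line by line.
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]
```

For example, tail_lines("text.txt", 1) would return a list holding only the final line.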
Upvotes: 0
Reputation: 46759
How about the following approach:
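import os, sys

max_line_length = 1000  # assumed upper bound on the length of one line, in bytes

# Binary mode: text-mode files in Python 3 do not allow end-relative seeks
with open(sys.argv[1], "rb") as f_long, open('new.txt', 'wb') as f_new:
    f_long.seek(0, os.SEEK_END)
    f_long.seek(max(f_long.tell() - max_line_length, 0))  # never seek before the start
    lines = [line for line in f_long.read().split(b"\n") if len(line)]
    f_new.write(lines[-1])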
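This seeks to max_line_length bytes before the end of the file and reads the remainder. That tail is then split into non-empty lines, and the last one is written to new.txt. max_line_length has to be at least as large as the longest possible line, otherwise the last line is cut off.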
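If no safe upper bound for the line length is known, one option is to read fixed-size blocks backwards from the end until a newline shows up. The following is only a sketch of that idea; read_last_line and block_size are names invented for this example, and the result is returned as bytes.

```python
import os

def read_last_line(path, block_size=4096):
    """Return the last non-empty line of the file as bytes (hypothetical helper)."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        end = f.tell()
        buffer = b""
        while end > 0:
            # Read one more block, moving towards the start of the file
            start = max(end - block_size, 0)
            f.seek(start)
            buffer = f.read(end - start) + buffer
            end = start
            # Ignore trailing newlines, then look for the newline that
            # precedes the last line
            stripped = buffer.rstrip(b"\r\n")
            newline_at = stripped.rfind(b"\n")
            if newline_at != -1:
                return stripped[newline_at + 1:]
        # No newline found: the whole file is a single line (or is empty)
        return buffer.rstrip(b"\r\n")
```

Only the blocks needed to cover the last line are read, so the cost does not depend on the total size of the file.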
Upvotes: 0
Reputation: 90909
According to your code, pos is an integer that denotes the position of the first \n from the end of the file.
You cannot call file.writelines(pos), because writelines requires an iterable of strings (for example a list of lines), but pos is a single integer.
Also, you want to write to new.txt, so you should write through w, not file. Example:
if pos > 0:
    # pos points at the newline itself, so start one character later
    file.seek(pos + 1, os.SEEK_SET)
    w = open("new.txt",'w')
    w.write(file.read())
    w.close()
Upvotes: 1
Reputation: 76194
Not sure why you're opening w only to close it without doing anything with it. If you want new.txt to have all the text from file starting at pos and ending at the end, how about:
if pos > 0:
    file.seek(pos, os.SEEK_SET)
    w = open("new.txt",'w')
    w.write(file.read())
    w.close()
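For reference, here is a sketch of the whole script with the backward search and the write combined. It is not the asker's exact code: copy_last_line is a name invented here, the files are opened in binary mode so byte offsets are reliable, and the file is assumed to end with at most one newline. Since pos ends up on the newline in front of the last line, the copy starts at pos + 1; seeking to pos itself would copy that newline as well.

```python
import os

def copy_last_line(src_path, dst_path):
    """Copy the last line of src_path into dst_path; src_path is not modified."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(0, os.SEEK_END)
        # Skip the final byte so a trailing newline is not mistaken for
        # the separator in front of the last line
        pos = max(src.tell() - 2, -1)
        while pos >= 0:
            src.seek(pos)
            if src.read(1) == b"\n":
                break
            pos -= 1
        # pos is the index of the newline before the last line, or -1 if
        # there is none; either way the last line starts at pos + 1
        src.seek(pos + 1)
        dst.write(src.read())
```

Called from the cron job as copy_last_line("text.txt", "new.txt"), this overwrites new.txt on every run and leaves text.txt untouched.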
Upvotes: 1