PROTOCOL
PROTOCOL

Reputation: 371

Insert content as the first line a big file in python

I am reading a big file(10M - 30M records) line by line, performs some data manipulation to each line and writes to another file. I have a scenario where first line manipulation depends on some of the data on other lines and hence I can write the first line to the output file only after traversing through all other lines.

I have tried with fileinput like below:

with fileinput.input(temp_file_name, inplace=True) as file_inp:
    
    for file_line in file_inp:
        
        sys.stdout.write(file_line.replace('Header', f"{transformed_header_line}"))

Here, temp_file_name is the transformed file with first line as Header with all other transformed lines and using fileinput I am replacing string Header with new line and again writing to the file. This process is taking time. Are there any other alternative methods like writing to stream of bytes or generator and later modify the data and write to the file.

Upvotes: 2

Views: 66

Answers (1)

Galtkaam
Galtkaam

Reputation: 11

If you are on ubuntu then you can use sed command in a subprocess as that is much faster.

subprocess.run(f"sed -i 's/Header/{transformed_header_line}/g' {temp_file_name}", shell=True)

Upvotes: 1

Related Questions