Peepo

Reputation: 3

How to extract/delete parts of a line and save to a new file in python?

I have this data here:

'**Otolemur_crassicaudatus**_/7977-8746 gi|238809369|dbj|**AB371093.1**|':0.00000000,'**Otolemur_crassicaudatus**/7977-8746 gi|238866848|ref|**NC_012762.1**|':

It is all on one line in a .txt file. I was wondering how I would go about extracting the names (i.e. the Otolemur name and the AB and NC numbers, shown in bold) and printing them to a new file without all the other columns. This is a tiny, tiny snippet of what I have, and being able to do this would be such a time saver.

Upvotes: 0

Views: 59

Answers (1)

Henry Keiter

Reputation: 17188

Assuming there's some predictability to the stuff you want to keep, you want a regex of some kind to match the good stuff. Then you can grab the list of matches and write it all to a new file however you want. I don't know what your data looks like well enough to make the regex pattern for you, but the basic conversion looks something like this:

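import re

# 'PATTERN' is a placeholder; replace it with a regex for the pieces to keep
pattern = re.compile('PATTERN')
# The with block closes both files automatically, even if an error occurs
with open('input.txt', 'r') as infile, open('output.txt', 'w') as outfile:
    for line in infile:
        # Write each matching piece to its own line in the new file
        for match in pattern.findall(line):
            outfile.write(match + '\n')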
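
For the snippet in the question, one possible pattern might look like the sketch below. It assumes that every entry has the shape 'NAME/START-END gi|NUMBER|DB|ACCESSION|', that the asterisks in the question are only bold formatting and are not in the real file, and that the trailing underscore on the first name is unwanted. The names `ENTRY`, `extract_records`, and `sample` are made up for illustration; adjust the pattern if the real data differs.

```python
import re

# Assumed layout of each entry: 'NAME/START-END gi|NUMBER|DB|ACCESSION|'
# Group 1 captures NAME, group 2 captures ACCESSION (the last pipe-delimited field).
ENTRY = re.compile(r"'([^'/]+)/[^']*\|([^'|]+)\|'")

def extract_records(text):
    """Return a list of (name, accession) pairs found in text."""
    # rstrip('_') drops the stray trailing underscore seen on the first sample name
    return [(name.rstrip('_'), acc) for name, acc in ENTRY.findall(text)]

# The sample line from the question, with the bold markers removed (an assumption)
sample = ("'Otolemur_crassicaudatus_/7977-8746 gi|238809369|dbj|AB371093.1|':0.00000000,"
          "'Otolemur_crassicaudatus/7977-8746 gi|238866848|ref|NC_012762.1|':")

# Print one tab-separated record per line
for name, acc in extract_records(sample):
    print(name, acc, sep='\t')
```

To save the result instead of printing it, call `extract_records` on each line read from the input file and write `name + '\t' + acc + '\n'` to the output file.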

Upvotes: 1
