Anekdotin

Reputation: 1591

Python arranging a list and writing it back to file

I am trying to write a script that finds everything between { and } in a text document, sorts that part alphabetically, and writes it back in place to the text document. Example text document:

bla bla bla 
bla ba bl bla ba bl {apple:banana, this: something else, airplane:hobby}
bla bla bla 
bla bla bla 

Desired output (sorted alphabetically):

bla bla bla 
bla ba bl bla ba bl {airplane:hobby, apple:banana, this: something else}
bla bla bla 
bla bla bla 

What it's still printing:

    bla bla bla 
    bla ba bl bla ba bl {apple:banana, this: something else, airplane:hobby}
    bla bla bla 
    bla bla bla 

My code:

def openFind():
    f = open(inFile, 'r')
    lines = f.read()
    match = re.findall(r'{(.*?)}', lines)
    before = str(match)
    n=1
    for i in xrange(0, len(match), n):
        mydict =  match[i:i+n]
        for x in sorted(mydict):
            c = x.split(',')
            newmatch = sorted(c)
            final =  str(newmatch)
            print final

            # NOT WORKING BELOW!!! Stuck in loop?
            with open(outFile,'w') as new_file:
                with open(inFile) as old_file:
                    for line in old_file:
                        new_file.write(line.replace(before, after))

It prints the sorted list as [airplane:hobby, apple:banana, this: something else], but how do I get it to replace the original text in the text document? The replacement has to happen in place within the text, but writing the result to a new .txt file is fine.

Upvotes: 1

Views: 79

Answers (4)

C Panda

Reputation: 3405

The entire program can be written succinctly as follows:

import re

with open("file.txt") as fr:
    content = fr.read()

# Capture the text between each pair of braces.
matches = (match.group(1) for match in re.finditer(r"{(.*?)}", content))
for match in matches:
    # Sort the comma-separated items and substitute them back into the text.
    repl = ", ".join(sorted(match.split(", ")))
    content = content.replace(match, repl)

with open("file.txt", "w") as fw:
    fw.write(content)

Upvotes: 1

Wayne Werner

Reputation: 51817

I would approach this problem in pieces. First, you want to be able to read from one file and write to a new file. You could do this in a multitude of ways. If your file is small, you can just use readlines(), truncate your original file, and then write it back out.

But I'm going to assume the possibility of huge files (i.e. larger than will easily fit in RAM/swap space, which currently means several GB in size).

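import os
import tempfile

# filename is the path of the file to rewrite.
# Open the temp file in text mode, in the same directory as the original,
# so the final swap stays on one filesystem.
target_dir = os.path.dirname(os.path.abspath(filename))
with tempfile.NamedTemporaryFile('w', dir=target_dir, delete=False) as temp:
    with open(filename) as infile:
        for line in infile:
            temp.write(line)
# Both files are closed here; replace the original with the temp file.
os.replace(temp.name, filename)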

This reads each line and writes it out to the destination. All that's left is to intercept the line and change it if necessary:

# requires: import re
for line in infile:
    match = re.search(r'\{.*?\}', line)
    if match:
        # Assumes you only have one "dictionary" per line
        first_part, rest = line.split('{', maxsplit=1)
        # allows for trailing data
        data, last_part = rest.split('}', maxsplit=1)
        # strip each item so leading spaces don't affect the sort order
        data = [_.strip().split(':') for _ in data.split(',')]
        data.sort()
        line = '{}{{{}}}{}'.format(first_part, ', '.join(':'.join(_) for _ in data), last_part)
    temp.write(line)

You might have to tweak the exact algorithm, but that's the approach that I would take when confronted with a problem like this.
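
Putting the two pieces together, a minimal sketch might look like the following. It swaps the manual split/format step for re.sub with a replacement function, so it also copes with more than one {...} group per line. The names sort_braces_in_file and _sort_items are made up for illustration, and it assumes Python 3.3+ for os.replace.

```python
import os
import re
import tempfile

BRACES = re.compile(r"\{(.*?)\}")


def _sort_items(m):
    # Hypothetical helper: sort the comma-separated items of one {...} group.
    items = sorted(item.strip() for item in m.group(1).split(","))
    return "{" + ", ".join(items) + "}"


def sort_braces_in_file(path):
    # Write to a temp file next to the original so the final swap
    # stays on the same filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as temp:
        with open(path) as infile:
            for line in infile:  # one line at a time, so huge files are fine
                temp.write(BRACES.sub(_sort_items, line))
    # Both files are closed now; replace the original with the rewritten copy.
    os.replace(temp.name, path)
```

Only one line is held in memory at a time, and the original file is left untouched until the rewritten copy is complete.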

Upvotes: 1

Bharel

Reputation: 26900

This should work:

import re

def openFind():
    with open("test.txt", "r") as in_file:
        data = in_file.read()

    def sub(m):
        l = [s.strip() for s in m.group(1).split(",")]
        l.sort()
        return "{%s}" % (", ".join(l),)

    replacement = re.sub(r'{(.*?)}', sub, data)
    with open("out.txt", "w") as out_file:
        out_file.write(replacement)

I used re.sub() with a replacement function, so each match is replaced in place by its sorted version.
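
To see the replacement function at work without touching any files, here is a small sketch run against the sample line from the question. The function name sort_braces is only illustrative.

```python
import re


def sort_braces(text):
    """Sort the comma-separated items inside each {...} group of text."""
    def _sorted_group(m):
        # m.group(1) is the text between the braces of one match.
        items = sorted(s.strip() for s in m.group(1).split(","))
        return "{%s}" % ", ".join(items)

    # re.sub calls _sorted_group once per match and splices in its result.
    return re.sub(r"\{(.*?)\}", _sorted_group, text)


sample = "bla ba bl {apple:banana, this: something else, airplane:hobby}"
result = sort_braces(sample)
print(result)
# bla ba bl {airplane:hobby, apple:banana, this: something else}
```

Text outside the braces is never touched, because re.sub only rewrites the matched spans.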

Upvotes: 2

niemmi

Reputation: 17263

The following code sorts the items between { and } and writes the result to the same file:

import re

with open('test.txt', 'r+') as f:
    s = f.read()
    r = list(s)
    for mo in re.finditer('{(.*?)}', s):
        d = sorted(mo.group(1).split(', '))
        r[mo.start(1):mo.end(1)] = list(', '.join(d))

    f.seek(0)
    f.write(''.join(r))
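
One thing to watch when writing back through an 'r+' handle: if the rewritten text ends up shorter than the original (for example, because irregular spacing after the commas gets normalized), the old tail of the file stays behind unless the file is truncated. A hedged sketch of that variant, with sort_in_place as a made-up name:

```python
import re


def sort_in_place(path):
    # Hypothetical variant: normalizes spacing, so the length may change.
    with open(path, "r+") as f:
        text = f.read()
        new_text = re.sub(
            r"\{(.*?)\}",
            lambda m: "{" + ", ".join(sorted(s.strip() for s in m.group(1).split(","))) + "}",
            text,
        )
        f.seek(0)
        f.write(new_text)
        f.truncate()  # drop leftover characters if new_text is shorter
```

This still reads the whole file into memory, so it suits small files only.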

Upvotes: 1
