kennechu

Reputation: 1422

Extract data from file with python and write new file

I'm trying to extract data from a file with this structure:

        //Side Menu
        market: 'Market',
        store: 'Store',
        stores: 'Stores',
        myNotes: 'My Notes',
        logout: 'Logout',
        //Toast
        activeUserHasChanged: 'Resetting app - the active user has changed.',
        loginHasExpired: 'Your login has expired.',
        appIsReseting: 'The app is resetting.',

What I want is to extract all the text that is between single quotation marks and put it in a new file. I think Python could be a good option, but I'm new to programming and Python. I tried something but had no luck, and from what I've read it should only take a small script.

My expected output is:

         Market,
         Store,
         Stores,
         My Notes,
         Logout,
         Resetting app - the active user has changed,
         Your login has expired,
         The app is resetting,

So any help on this will be appreciated.

Regards.

Upvotes: 2

Views: 179

Answers (2)

SuWon

Reputation: 23

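Assuming your input is a text file:

import re

# Open both files in text mode so the str regex below works on Python 3
fid = open('your input file', 'r')
output = open('output file', 'w')
for line in fid:
    # re.search scans the whole line; re.match only matches at the start,
    # which would miss quotes that come after the "key:" part
    m = re.search(r"['\"](.*?)['\"]", line)
    if m is not None:
        # group(1) is the text between the quotes
        output.write(m.group(1) + '\n')
fid.close()
output.close()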

The regex r"['\"](.*?)['\"]" captures whatever sits between a pair of quotation marks (single or double). Lines with no match are skipped. Hope this is helpful.
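For the exact output format in the question (one value per line, trailing period dropped, comma appended), a minimal sketch along the same lines might look like this. The names `extract_quoted`, `format_lines` and `convert` are made up for illustration, and it assumes the values never contain an apostrophe themselves:

```python
import re

# Assumption: values are wrapped in single quotes and contain no apostrophes.
QUOTED = re.compile(r"'([^']*)'")


def extract_quoted(text):
    """Return every single-quoted string in text, in order of appearance."""
    return QUOTED.findall(text)


def format_lines(values):
    """Put one value per line, dropping a trailing period and adding a comma."""
    return "".join(value.rstrip(".") + ",\n" for value in values)


def convert(in_path, out_path):
    """Read in_path, extract the quoted values and write them to out_path."""
    with open(in_path, "r") as src, open(out_path, "w") as dst:
        dst.write(format_lines(extract_quoted(src.read())))
```

Calling `convert('infile.txt', 'outfile.txt')` would process a whole file in one go. Comment lines such as `//Side Menu` are skipped simply because they contain no quotes.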

Upvotes: 1

Aske Doerge

Reputation: 1391

A simple solution is something like:

in_string = False  # True while we are between an opening and a closing '
with open('infile.txt', 'r') as fr, open('outfile.txt', 'w') as fw:
  for char in fr.read():
    if char == "'":
      in_string = not in_string  # toggle on every quote
    elif in_string:
      fw.write(char)

The intuition is that we read the file character-by-character and keep track of any ' we see along the way. When we encounter the first, we write the next characters to the output file until we encounter the second, etc.

It does not handle invalid input, and doesn't do buffering or anything fancy. But if you just have small, well-formed files, this should do it. It also doesn't format your output in lines with commas, but that shouldn't be too hard to do from here.
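As a rough sketch of that last step, the same character-by-character loop can emit a comma and a newline each time it sees a closing quote. The function name `extract_chars` is made up for illustration, and it works on a string rather than a file so it is easy to try out. It still assumes well-formed input and keeps any trailing period inside the quotes:

```python
def extract_chars(text):
    """Collect the text between single quotes, one value per line with a comma."""
    in_string = False
    out = []
    for char in text:
        if char == "'":
            if in_string:
                # This quote closes a value, so end the output line here
                out.append(",\n")
            in_string = not in_string
        elif in_string:
            out.append(char)
    return "".join(out)
```

To use it on files, read the input with `fr.read()`, pass the result to `extract_chars`, and write the returned string with `fw.write(...)`.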

Upvotes: 2
