valeria

Reputation: 131

Check whether a word matches a dictionary key and write its value to another file if it's not already there

I want the script to find the words inside shortkey="(here)" in a file, check whether each word exists as a key in the dictionary, and, if it does, write that key's value to a new file. For some reason, my code below does not work.

Also, how can I make sure the values aren't repeated? For example, if goodmorning is already written in newfile.txt, it should not be written a second time.

keyword = {
      "shortkey":"longer sentence",
      "gm":"goodmorning",
      "etc":"etcetera"
}

with open('file.txt', 'r') as file:
   with open('newfile.txt', 'a') as newfile:
      lines = file.readlines()
      for line in lines:
         if 'shortkey="' in line:
            x = line.split('"')[1].split()
            if x == keyword.keys():
               for x, replacement in keyword.items():
                 newfile.write(replacement)

text inside the file.txt:

shortkey="gm gm gm etc shortkey novalue"
shortkey="gm"

expected output in newfile.txt:

goodmorning etcetera longer sentence

And when I run the code for the n-th time, it shouldn't write these values again, as they are already in the file.

Upvotes: 0

Views: 174

Answers (2)

John Gordon

Reputation: 33335

The first sample line in your file will yield ['gm', 'gm', 'gm', 'etc', 'shortkey', 'novalue'] after splitting. This is not equal to keyword.keys(), for several reasons:

  1. gm is repeated several times in the word list, but only appears once in the dict. (You could work around this by wrapping both sides of the comparison in a set() to remove duplicate values.)
  2. novalue is in the word list but not in the dict.
  3. The word list is very likely not in the same order as the dict keys. (Again, you could work around this by using set(), as sets are unordered.)
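
To make these points concrete, here is a minimal sketch using the dictionary and the first sample line from the question. One extra detail worth noting: in Python 3 a list never compares equal to a dict keys view, so the original test is false even when the words match the keys exactly. The variable names below are only illustrative.

```python
keyword = {
    "shortkey": "longer sentence",
    "gm": "goodmorning",
    "etc": "etcetera",
}

line = 'shortkey="gm gm gm etc shortkey novalue"'
words = line.split('"')[1].split()

# A list is never equal to a dict keys view, whatever it contains.
list_equals_keys = (words == keyword.keys())

# Sets ignore duplicates and order, but "novalue" still breaks equality.
set_equals_keys = (set(words) == set(keyword))

# Words that have no entry in the dictionary.
unknown_words = set(words) - set(keyword)
```

Here both comparisons are false, and `unknown_words` contains only `novalue`.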

What is your intent here?

  1. Process only lines where each word is a dict key, and each dict key is in the line.
  2. Process only lines where each word is a dict key (it's okay if the dict has extra unused keys).
  3. Process only lines that contain all the dict keys (it's okay if the line has extra words that aren't dict keys).
  4. Process all lines, replacing words if a replacement is available, otherwise use the original word.
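
Each of the four options above maps onto a simple set or dictionary operation. The sketch below is one possible way to express them; the function names are hypothetical and chosen only for illustration.

```python
keyword = {
    "shortkey": "longer sentence",
    "gm": "goodmorning",
    "etc": "etcetera",
}

def words_in(line):
    """Return the words between the first pair of double quotes."""
    return line.split('"')[1].split()

def exact_match(words, mapping):
    # Option 1: every word is a key and every key appears in the line.
    return set(words) == set(mapping)

def only_known_words(words, mapping):
    # Option 2: every word is a key; unused keys are allowed.
    return set(words) <= set(mapping)

def covers_all_keys(words, mapping):
    # Option 3: every key appears in the line; extra words are allowed.
    return set(words) >= set(mapping)

def replace_where_possible(words, mapping):
    # Option 4: swap in the replacement if there is one, else keep the word.
    return [mapping.get(word, word) for word in words]
```

For the two sample lines in the question, the first line satisfies only option 3 (it contains every key plus the extra word `novalue`), and the second line satisfies only option 2. Option 4 applies to any line.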

Upvotes: 1

typedef struct James

Reputation: 116

OrderedDict.fromkeys removes repeated words while keeping their first-seen order, whereas a set does not preserve order. After we write a value to the new file, we can set that key's value to an empty string to avoid rewriting it.


from collections import OrderedDict

keyword = {
    "shortkey": "longer sentence",
    "gm": "goodmorning",
    "etc": "etcetera"
}

with open('file.txt', 'r') as file, open('newfile.txt', 'a') as newfile:
    for line in file:
        if 'shortkey="' in line:
            # Words between the first pair of double quotes.
            to_replace = line.split('"')[1].split()
            # Drop repeated words while keeping first-seen order.
            to_replace = OrderedDict.fromkeys(to_replace)
            for key in to_replace:
                # An empty value marks a key that was already written,
                # so skip it instead of writing a stray space.
                if keyword.get(key):
                    newfile.write(keyword[key] + ' ')
                    keyword[key] = ''
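
The code above only remembers what it has written during a single run, because the dictionary is rebuilt each time the script starts. To also skip values left over from earlier runs, one option is to read newfile.txt before appending. The sketch below assumes one value per line in the output file (a change from the space-separated format in the question, made so that a multi-word value such as "longer sentence" can be matched unambiguously); the function name `expand_shortkeys` is hypothetical.

```python
import os

def expand_shortkeys(source_path, target_path, mapping):
    """Append values for known shortkey words, skipping values already in the target."""
    already_written = set()
    if os.path.exists(target_path):
        # Assumption: the target file holds one value per line.
        with open(target_path, "r") as existing:
            already_written = {row.rstrip("\n") for row in existing}

    with open(source_path, "r") as source, open(target_path, "a") as target:
        for line in source:
            if 'shortkey="' not in line:
                continue
            for word in line.split('"')[1].split():
                value = mapping.get(word)
                if value is not None and value not in already_written:
                    target.write(value + "\n")
                    already_written.add(value)
```

Calling `expand_shortkeys('file.txt', 'newfile.txt', keyword)` repeatedly should leave newfile.txt unchanged after the first run, since every value is found in `already_written` on later runs.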

Upvotes: 1
