comparing strings: file and lists

Question

I'm writing an application that lets the user enter a string, generates all possible permutations of it, and removes the duplicates.

Each permutation should then be compared against the file line by line until a line equal to the permutation is found, and the process is repeated with the remaining permutations.

The file contains this information: manila ana maria marta

Alternatively, use the file espanol.dic.

Here is part of the code:

# coding=utf8
from __future__ import print_function
import os, re, itertools

new_dic_file = "espanol.dic"

def uniq(lst):
    # remove repeated
    key = dict.fromkeys(lst).keys()
    lst = list(key)
    return lst

def match(chars, num_chars):
    # Get the permutations of input string
    combs = itertools.permutations(chars, num_chars)
    result = []
    for combo in combs:
        result.append("".join(combo))

    # Iterate to Spanish dictionary and compare combinations of input string
    dic = open(new_dic_file)
    aux = dic.readlines()
    del dic
    aux = uniq(aux)

    for word in result:
        for word_dic in aux:
            print()
            print(word, word_dic, end="")
            print(type(word), type(word_dic), end="")
            if word == word_dic:
                print(word)
                print("########## Found! ##########")

I printed the types of "word" and "word_dic", and both are str, so the comparison should work, but it does not. I'm testing with this: match("aan", 3)

and the result is this:

 
ana marta
 
ana ana
 
ana manila
 
naa maria

when it should be:

ana

#### Found!!

If anything about what I'm doing is unclear, please ask.

The complete code is in test.py.

Thank you in advance.

Francis Potter · Accepted Answer

The readlines method leaves the LF (newline) characters on the strings, so each string read from the file has an extra character at the end. That's visible in the output: the type lines fall below the strings, even though the print calls use end="". The string "ana" followed by a newline is never equal to "ana".
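A minimal sketch of the effect, written for Python 3. It uses io.StringIO as a stand-in for the real file and assumes the dictionary holds one word per line; the name fake_file is only illustrative.

```python
import io

# Stand-in for the dictionary file (assumed layout: one word per line).
fake_file = io.StringIO("manila\nana\nmaria\nmarta\n")
lines = fake_file.readlines()

# repr() makes the trailing newline visible.
print(repr(lines[1]))

# The raw line does not compare equal to the bare word...
print(lines[1] == "ana")

# ...but it does once the newline is stripped.
print(lines[1].rstrip("\n") == "ana")
```

Printing with repr() is a quick way to spot invisible characters like this when two strings that look identical refuse to compare equal.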

To fix it, remove the readlines() statement and replace it with this:

aux = dic.read().splitlines()

See here for more on readlines: Best method for reading newline delimited files in Python and discarding the newlines?

Or you could leave the readlines() call in place and strip the trailing whitespace, newline included, from each line:

aux = [s.rstrip() for s in aux]
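For reference, here is one way the whole lookup could be put together with the newline fix applied. This is a sketch for Python 3, not the original test.py: the function name find_words, the dic_path parameter, and the temporary demo file are all assumptions made for illustration. It also swaps the nested loops for a set intersection, which removes duplicate permutations and duplicate dictionary lines at the same time.

```python
import itertools
import os
import tempfile


def find_words(chars, num_chars, dic_path):
    """Return, sorted, the permutations of chars found in the word list."""
    # splitlines() drops the line endings, so the words compare cleanly.
    with open(dic_path) as dic:
        words = set(dic.read().splitlines())

    # A set comprehension removes repeated permutations such as "ana".
    candidates = {"".join(p) for p in itertools.permutations(chars, num_chars)}

    # Words that are both a permutation and in the dictionary.
    return sorted(candidates & words)


# Demo with a small temporary file holding the words from the question.
with tempfile.NamedTemporaryFile("w", suffix=".dic", delete=False) as tmp:
    tmp.write("manila\nana\nmaria\nmarta\n")
    demo_path = tmp.name

found = find_words("aan", 3, demo_path)
os.remove(demo_path)
print(found)
```

Opening the file in a with block also closes it properly, which the del dic line in the question does not do explicitly.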
