efuzz

Reputation: 117

Trying to figure out why an empty string is appended to the list

My program compares two paragraphs and returns the lines they have in common as a list. I split each paragraph into a list of lines and compare them; matching lines are appended to a list. However, the result includes an empty string. Please help me figure out where it's coming from.

story1 = '''This is a story.
This has multiple lines.
All lines will be split.
This is the last line.
'''

story2 = '''This is a new story.
This has multiple lines.
All lines will be split.
This is the not last line.
This is a story.
'''

lines1 = story1.split("\n")
lines2 = story2.split("\n")
similarities = []

#print(lines1)
#print(lines2)

for line in lines1:
    if line in lines2:
        similarities.append(line)

print(similarities)



Upvotes: 0

Views: 36

Answers (3)

ncica

Reputation: 7206

Define story1 and story2 without the trailing newline so that no empty line is produced, like:

story1 = '''This is a story.
This has multiple lines.
All lines will be split.
This is the last line.'''

or you can put:

if line in lines2 and line != '':

code:

story1 = '''This is a story.
This has multiple lines.
All lines will be split.
This is the last line.'''

story2 = '''This is a new story.
This has multiple lines.
All lines will be split.
This is the not last line.
This is a story.'''

lines1 = story1.split("\n")
lines2 = story2.split("\n")
similarities = []

for line in lines1:
    # Alternative that also skips empty strings:
    # if line in lines2 and line != '':
    if line in lines2:
        similarities.append(line)

print(similarities)
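
If you would rather leave the original strings untouched (trailing newline included), the second option can be sketched as below. This is only an illustration using the stories from the question; it relies on the fact that an empty string is falsy in Python, so `if line` has the same effect as `line != ''`.

```python
story1 = '''This is a story.
This has multiple lines.
All lines will be split.
This is the last line.
'''

story2 = '''This is a new story.
This has multiple lines.
All lines will be split.
This is the not last line.
This is a story.
'''

lines1 = story1.split("\n")
lines2 = story2.split("\n")

# Keep a line only if it is non-empty and also appears in the second story.
similarities = [line for line in lines1 if line and line in lines2]

print(similarities)
```

The empty string is still present in both lines1 and lines2; it is just never added to similarities.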

Upvotes: 1

Srihari Jayaram

Reputation: 16

Good day to you, Kan.

The reason you find the empty string appended to your similarities is that both of your stories end with a newline just before the closing ''', so splitting on "\n" leaves an empty string at the end of each list.

story1 = '''This is a story.
This has multiple lines.
All lines will be split.
This is the last line.'''

story2 = '''This is a new story.
This has multiple lines.
All lines will be split.
This is the not last line.
This is a story.'''

The above won't append an empty line as the trailing '\n' has been removed.
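
If editing the string literals is not convenient, one possible alternative is to remove the trailing newline at run time before splitting. A minimal sketch, assuming you only want to drop leading and trailing newlines and keep everything else as is:

```python
story1 = '''This is a story.
This has multiple lines.
All lines will be split.
This is the last line.
'''

# strip("\n") removes newlines at both ends of the string,
# so split("\n") no longer yields a trailing empty string.
lines1 = story1.strip("\n").split("\n")

print(lines1)
```

Note that strip("\n") only touches the ends of the string, so a blank line in the middle of a story would still show up as an empty string.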

Upvotes: 0

Avishay Cohen

Reputation: 2218

the output of lines1 and lines2:

In [2]: lines1
Out[2]:
['This is a story.',
 'This has multiple lines.',
 'All lines will be split.',
 'This is the last line.',
 '']

In [3]: lines2
Out[3]:
['This is a new story.',
 'This has multiple lines.',
 'All lines will be split.',
 'This is the not last line.',
 'This is a story.',
 '']

Both lists end with an empty string, which is the result of splitting a multiline block that ends in a newline on "\n". Since the empty string appears in both lists, it is reported as one of the "similarities".
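
To see the effect in isolation, here is a small sketch comparing split("\n") with the built-in str.splitlines(), which does not produce an extra empty string for a final line break. The variable name text is just for illustration:

```python
text = "This is a story.\nThis has multiple lines.\n"

# split("\n") treats the final newline as a separator,
# so there is an empty string after it.
with_split = text.split("\n")

# splitlines() treats the final newline as the end of the last line.
with_splitlines = text.splitlines()

print(with_split)       # ['This is a story.', 'This has multiple lines.', '']
print(with_splitlines)  # ['This is a story.', 'This has multiple lines.']
```

Using splitlines() for both stories in the question should therefore avoid the empty entry without changing the strings themselves.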

Upvotes: 0
