Reputation: 89
So I'm trying to do a cosine similarity with a text file I have. https://lms.uwa.edu.au/bbcswebdav/pid-1143173-dt-content-rid-16133365_1/courses/CITS1401_SEM-2_2018/CITS1401_SEM-2_2018_ImportedContent_20180713092326/CITS1401_SEM-1_2018/Unit%20Content/Resources/Project2_2018/sample.txt
I'm wondering how to read this sentence by sentence, rather than using readline() to read it line by line. I'm trying to store each sentence in its own variable. For example:
s1 = "the mississippi is well worth reading about"
s2 = "it is not a commonplace river, but on the contrary is in all ways remarkable"
Is this the right way to start? If it is, my next step, which I know how to do, is to remove the common words from the sentences and leave only the unique words to compare.
How do I stop at each full stop and store that sentence in a variable while looping through the text?
Thanks
Upvotes: 1
Views: 103
Reputation: 143
Do you mean this:
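with open("file.txt", "r") as in_f:
    # Replace newlines with a space (not '') so words on wrapped lines don't run together
    text = in_f.read().replace("\n", " ")
# Split on full stops and drop the empty piece left after the last one
sentences = [s.strip() for s in text.split(".") if s.strip()]
for s in sentences:
    print(s)  # your code here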
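If you also want to stop at '!' and '?' and get lowercase strings like your s1 and s2, here is a rough sketch using re.split. The function name split_sentences and the sample string are just made up for illustration (the sample is built from the two sentences in your question, not read from your actual file), and I'm assuming a sentence ends at punctuation followed by whitespace or the end of the text:

```python
import re

def split_sentences(text):
    """Split text at '.', '!' or '?' followed by whitespace or end of text."""
    # Collapse newlines and repeated spaces so wrapped lines join cleanly.
    flat = " ".join(text.split())
    # Split after sentence-ending punctuation; the punctuation itself is dropped.
    parts = re.split(r"[.!?]+(?:\s+|$)", flat)
    # Lowercase each piece and skip the empty string left after the final stop.
    return [p.strip().lower() for p in parts if p.strip()]

# Hypothetical sample text standing in for the contents of the file.
sample = ("The Mississippi is well worth reading about.\n"
          "It is not a commonplace river, but on the contrary\n"
          "is in all ways remarkable.")

sentences = split_sentences(sample)
s1, s2 = sentences[0], sentences[1]
print(s1)
print(s2)
```

Keeping the sentences in a list is usually easier than separate variables, since you can loop over pairs later for the similarity step. Note that this simple rule will also split after abbreviations such as "Mr. Smith", so it may need adjusting for your text.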
Upvotes: 1