Teriyaki

Reputation: 29

How to get the longest word in a txt file in Python

article = open("article.txt", encoding="utf-8")
for i in article:
    print(max(i.split(), key=len))

The text contains line breaks, so this prints the longest word of each line. How can I get the longest word in the whole text?

Upvotes: 0

Views: 130

Answers (5)

romainguichard

Reputation: 11

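If your file is small enough to fit in memory, you can read all of its lines at once.

with open("article.txt", encoding="utf-8") as file:
    all_text = file.read()  # read the whole file into one string

# split() with no argument splits on any whitespace, including newlines
longest = max(all_text.split(), key=len)
print(longest)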

Upvotes: 0

Pieter Geelen

Reputation: 548

There are many ways by which you could do that. This would work

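with open("article.txt", encoding="utf-8") as article:
    # split() with no argument also drops the trailing newline of each line
    txt = [word for item in article.readlines() for word in item.split()]

# Longest word first; ties are broken alphabetically
biggest_word = sorted(txt, key=lambda word: (-len(word), word))[0]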

Note that the with statement closes the file once reading is done, that readlines reads the entire file and returns a list of lines, and that the nested comprehension flattens the words of every line into a single list. The last line of code sorts that list and uses -len(word) to reverse the order from ascending to descending length, so the longest word comes first.
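As a small sketch of the same idea without a full sort: min with the same key should return the same word in a single pass. The word list below is made up for illustration and stands in for the list built from the file.

```python
# Hypothetical sample words, standing in for the list read from the file.
words = ["pear", "banana", "cherry", "fig"]

# Same key as above: longer words sort first, ties are broken alphabetically.
key = lambda word: (-len(word), word)

by_sort = sorted(words, key=key)[0]  # sorts the whole list, then takes the first item
by_min = min(words, key=key)         # single pass, no sorting

print(by_sort, by_min)
```

Both calls pick "banana" here, because it ties with "cherry" on length and comes first alphabetically.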

I hope this is what you are looking for :)

Upvotes: 0

vht981230

Reputation: 4498

Instead of iterating over each line, you can read the entire text of the file and then split it using article.read().split()

article = open("test.txt", encoding="utf-8")
# read() returns the whole file; readline() would return only the first line
print(max(article.read().split(), key=len))
article.close()

Upvotes: 0

Tim Biegeleisen

Reputation: 521103

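One approach would be to read the entire text file into a Python string, replace newlines with spaces, and then find the longest word:

import re

with open('article.txt', 'r', encoding='utf-8') as file:
    # Replace newlines with a space so words on adjacent lines are not glued together
    data = re.sub(r'\r?\n', ' ', file.read())

# \w+ matches runs of letters, digits and underscores, so punctuation is ignored
longest_word = max(re.findall(r'\w+', data), key=len)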
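One practical difference between splitting on whitespace and matching \w+ is how punctuation is treated. The sample sentence below is invented for illustration; it is only a sketch of the behavior, not a recommendation of one approach over the other.

```python
import re

# Hypothetical sample text with punctuation attached to words.
text = "Well-known words: extraordinary, indeed!"

# Whitespace splitting keeps punctuation attached to each word.
by_split = max(text.split(), key=len)

# \w+ keeps only letters, digits and underscores, so "Well-known" becomes two words.
by_regex = max(re.findall(r"\w+", text), key=len)

print(by_split)
print(by_regex)
```

With split(), the trailing comma counts toward the word's length; with the regex it does not, but hyphenated words are broken apart.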

Upvotes: 1

zkscpqm

Reputation: 89


longest = 0
curr_word = ""
with open("article.txt", encoding="utf-8") as f:
    for line in f:  # iterate line by line to avoid loading a large file into memory
        for word in line.split():  # split() handles spaces, tabs and the trailing newline
            if (wl := len(word)) > longest:  # walrus operator needs Python 3.8+, otherwise use 2 lines
                longest = wl
                curr_word = word
print(curr_word)
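The same line-by-line idea can also be sketched with a generator expression and max, which still reads one line at a time. The function name longest_word and the sample lines are hypothetical; the function accepts any iterable of lines, so an open file object should work as well.

```python
def longest_word(lines):
    """Return the longest whitespace-separated word from an iterable of lines."""
    # The generator yields words lazily, so only one line is held at a time.
    words = (word for line in lines for word in line.split())
    # default="" avoids a ValueError when there are no words at all.
    return max(words, key=len, default="")


# Hypothetical sample lines, standing in for the contents of a file.
sample = ["the quick brown\n", "fox jumped over\n", "extraordinarily lazy dogs\n"]
print(longest_word(sample))
```

With a real file this could be called as longest_word(f) inside a with open("article.txt", encoding="utf-8") as f: block. On ties, max returns the first longest word it encounters.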

Upvotes: 0
