Mmt

Reputation: 11

Python: Find the longest word in a string

I'm preparing for an exam but I'm having difficulties with one past-paper question. Given a string containing a sentence, I want to find the longest word in that sentence and return that word and its length. Edit: I only needed to return the length but I appreciate your answers for the original question! It helps me learn more. Thank you.

For example: string = "Hello I like cookies". My program should then return "cookies" and the length 7.

Now the thing is that, for a full score, I am not allowed to use any function from the class String and I can only go through the string once. I am not allowed to use string.split() (otherwise there wouldn't be any problem) and the solution shouldn't have too many for and while statements. The strings contain only letters and blanks, and words are separated by a single blank.

Any suggestions? I'm lost i.e. I don't have any code.

Thanks.

EDIT: I'm sorry, I misread the exam question. You only have to return the length of the longest word it seems, not the length + the word.

EDIT2: Okay, with your help I think I'm onto something...

def longestword(x):
      alist = []
      length = 0
      for letter in x:
             if letter != " ":
                     length += 1
             else:
                     alist.append(length)
                     length = 0
      return alist

But it returns [5, 1, 4] for "Hello I like cookies" so it misses "cookies". Why? EDIT: Ok, I got it. It's because there's no more " " after the last letter in the sentence and therefore it doesn't append the length. I fixed it so now it returns [5, 1, 4, 7] and then I just take the maximum value.
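A minimal sketch of the fix described above, assuming the input contains only letters and single blanks as the exam states. The name longest_word_length is only illustrative:

```python
def longest_word_length(sentence):
    lengths = []
    length = 0
    for letter in sentence:
        if letter != " ":
            length += 1
        else:
            lengths.append(length)
            length = 0
    # The last word has no trailing blank, so record its length here.
    lengths.append(length)
    return max(lengths)


print(longest_word_length("Hello I like cookies"))  # 7
```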

I suppose using lists but not .split() is okay? The question only said that functions from "String" weren't allowed. Or are lists part of strings?

Upvotes: 0

Views: 36448

Answers (14)

walidOmz

Reputation: 1

In case you want to ignore punctuation as well, here is a more Pythonic way. It returns the longest word; call len() on the result to get its length.

from string import punctuation
def longest_word(str):
   return max("".join([s for s in str if s not in punctuation]).split(" "), key=len)

If two words have the same length, the one that appears first is returned.

Sample inputs and outputs:

print(longest_word("test1 test2")) # -> test1
print(longest_word("t.est1: test2")) # -> test1
print(longest_word("hy: hello")) # -> hello 

Upvotes: 0

rui girao

Reputation: 11

I found an error in a previously provided solution; here's the correction:

import string

def longestWord(text):
    
    current_word = ''
    current_longest = ''
    for c in text:
        if c in string.ascii_letters:
            current_word += c
        else:
            if len(current_word)>len(current_longest):
                current_longest = current_word
            current_word = ''    

    if len(current_word)>len(current_longest):
        current_longest = current_word
    return   current_longest

Upvotes: 1

Azlan Siddiqui

Reputation: 1

import re

def longest_word(sen):
  res = re.findall(r"\w+",sen)
  n = max(res,key = lambda x : len(x))
  return n

print(longest_word("Hey!! there, How is it going????"))

Output : there

Here I have used a regex for the problem. re.findall returns a list of all non-overlapping matches of the pattern in the string, and that list is stored in "res". No call to split() is involved.

The pattern \w+ matches each run of one or more word characters (letters, digits and underscore), so spaces and punctuation such as %, & and ! are left out.

Variable "n" is the longest word in "res". The lambda passed as key makes max() compare the words by len(); passing key=len directly would do the same.

>>>#import the regular expression module
>>>import re

>>>#initialize a sentence
>>>sen = "fun&!! time zone"

>>>res = re.findall(r"\w+",sen)
>>>#res holds all the words as a list.

>>>res
Out: ['fun','time','zone']

>>>max(res)
Out: zone

>>>#Here we get "zone" instead of "time" because, without a key,
>>>#max() compares the strings lexicographically and "zone" sorts after "time".

>>>n = max(res,key = lambda x:len(x))
>>>n
Out: time

Here we get "time" because the key makes max() compare the words by length. "time" and "zone" both have four characters, and max() returns the first of the tied items.

Upvotes: 0

Punith Sharma

Reputation: 1

list1 = ['Happy', 'Independence', 'Day', 'Zeal']
listLen = []
for i in list1:
          listLen.append(len(i))
print(list1[listLen.index(max(listLen))])

Output - Independence

Upvotes: -1

Aneesh Kumar

Reputation: 21

For Python 3. If several words in the sentence share the maximum length, it prints the one that appears first.

def findMaximum(word):
    li=word.split()
    li=list(li)
    op=[]
    for i in li:
        op.append(len(i))
    l=op.index(max(op))
    print (li[l])
findMaximum(input("Enter your word:"))

Upvotes: 2

Nishant Goutham kumar

Reputation: 19

It's quite simple:

def long_word(s):
    n = max(s.split(), key=len)  # key=len compares by length, not alphabetically
    return(n)

IN [48]: long_word('a bb ccc dddd')

Out[48]: 'dddd'

Upvotes: 1

Jerome Vacher

Reputation: 324

My proposal ...

import re
def longer_word(sentence):
    word_list = re.findall(r"\w+", sentence)
    # Longest first; the sort is stable, so ties keep their original order.
    word_list.sort(key=len, reverse=True)
    longest = word_list[0]
    print("The longer word is '"+longest+"' with a size of", len(longest), "characters.")
longer_word("Hello I like cookies")

Upvotes: 0

Malik Brahimi

Reputation: 16711

Just search for groups of non-whitespace characters, then find the maximum by length:

longest = len(max(re.findall(r'\S+',string), key = len))

Upvotes: 2

spectras

Reputation: 13542

Given you are not allowed to use string.split() I guess using a regexp to do the exact same thing should be ruled out as well.

I do not want to solve your exercise for you, but here are a few pointers:

  • Suppose you have a list of numbers and you want to return the highest value. How would you do that? What information do you need to track?
  • Now, given your string, how would you build a list of all word lengths? What do you need to keep track of?
  • Now, you only have to intertwine both logics so computed word lengths are compared as you go through the string.
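
For readers who want to check their own attempt afterwards, here is one way the three pointers might combine. It is only a sketch, assuming words are separated by single blanks as in the question; the function name is made up for illustration:

```python
def longest_word_length_one_pass(sentence):
    best = 0      # longest word length seen so far
    current = 0   # length of the word currently being read
    for ch in sentence:
        if ch == " ":
            # A blank ends the current word: compare, then reset.
            if current > best:
                best = current
            current = 0
        else:
            current += 1
    # The last word is not followed by a blank, so compare once more.
    if current > best:
        best = current
    return best


print(longest_word_length_one_pass("Hello I like cookies"))  # 7
```

No list is built here: the running maximum is updated each time a word ends, so the string is read exactly once.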

Upvotes: 0

Francis Colas

Reputation: 3647

Finding a max in one pass is easy:

current_max = 0
for v in values:
    if v>current_max:
        current_max = v

But in your case, you need to find the words. Remember this quote (attributed to J. Zawinski):

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

Instead of using regular expressions, you can simply check whether each character is a letter. A first approach is to go through the string and detect the start or end of words:

import string

current_word = ''
current_longest = ''
for c in mystring:
    if c in string.ascii_letters:
        current_word += c
    else:
        if len(current_word)>len(current_longest):
            current_longest = current_word
        # reset after every separator, not only after a new longest word
        current_word = ''
else:
    # for-else: runs once the loop finishes, to handle the last word
    if len(current_word)>len(current_longest):
        current_longest = current_word

A final way is to split words in a generator and find the max of what it yields (here I used the max function):

import string

def split_words(mystring):
    current = []
    for c in mystring:
        if c in string.ascii_letters:
            current.append(c)
        elif current:
            yield ''.join(current)
            current = []  # start collecting the next word
    if current:  # the last word has no trailing separator
        yield ''.join(current)

max(split_words(mystring), key=len)

Upvotes: 2

Brionius

Reputation: 14098

This method uses only one for loop, doesn't use any methods of the String class, and accesses each character only once. You may have to modify it depending on what characters count as part of a word.

s = "Hello I like cookies"
word = ''
maxLen = 0
maxWord = ''
for c in s+' ':
    if c == ' ':
        if len(word) > maxLen:
            maxWord = word
            maxLen = len(word)  # remember the best length so far
        word = ''
    else:
        word += c


print("Longest word:", maxWord)
print("Length:", len(maxWord))

Upvotes: 0

Omid

Reputation: 2667

Regular expressions seem to be your best bet. First use re to split the sentence:

>>> import re
>>> string = "Hello I like cookies"
>>> string = re.findall(r'\S+',string)

\S+ matches each run of one or more non-whitespace characters, and findall puts the matches in a list:

>>> string
['Hello', 'I', 'like', 'cookies']

Now you can find the length of the list element containing the longest word and then use list comprehension to retrieve the element itself:

>>> maxlen = max(len(word) for word in string)
>>> maxlen
7
>>> [word for word in string if len(word) == maxlen]
['cookies']

Upvotes: 0

Alexandre

Reputation: 1683

You can try to use regular expressions:

import re

string = "Hello I like cookies"
word_pattern = r"\w+"  # raw string, so the backslash reaches the regex engine unchanged

regex = re.compile(word_pattern)
words_found = regex.findall(string)

if words_found:
    longest_word = max(words_found, key=lambda word: len(word))
    print(longest_word)

Upvotes: 3

Jon Surrell

Reputation: 9637

I can imagine a few different alternatives. Regular expressions can probably do much of the word splitting you need. This could be a simple option if you understand regexes.

An alternative is to treat the string as a list, iterate over it keeping track of your index, and looking at each character to see if you're ending a word. Then you just need to keep the longest word (longest index difference) and you should find your answer.
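
A rough sketch of that index-based idea, assuming single blanks between words as in the question. The function name is hypothetical, and slicing is used only at the end to recover the word:

```python
def longest_word_by_index(sentence):
    best_start, best_len = 0, 0
    start = 0  # index where the current word begins
    for i, ch in enumerate(sentence):
        if ch == " ":
            # The word spans sentence[start:i]; its length is the index difference.
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = i + 1
    # The final word runs to the end of the string.
    if len(sentence) - start > best_len:
        best_start, best_len = start, len(sentence) - start
    return sentence[best_start:best_start + best_len], best_len


print(longest_word_by_index("Hello I like cookies"))  # ('cookies', 7)
```

If only the length is required, the second element of the returned tuple is enough and the slice can be dropped.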

Upvotes: 0
