Reputation: 319
I am very new to Julia, and I got this challenge from the web:
How can I find the longest word in a given string?
I would like to build a function that returns the longest word, even when the string contains punctuation.
I tried the following code:
function LongestWord(sen::String)
    sentence = maximum(length(split(sen, "")))
    word = [(x, length(x)) for x in split(sen, " ")]
    return word
end
LongestWord("Hello, how are you? nested, punctuation?")
But I haven't managed to find a solution.
Upvotes: 1
Views: 935
Reputation: 372
My version specifically defines what symbols are allowable (in this case letters, numbers and spaces):
ALLOWED_SYMBOLS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890 \t\n"
function get_longest_word(text::String)::String
    letters = Vector{Char}()
    for symbol in text
        if uppercase(symbol) in ALLOWED_SYMBOLS
            push!(letters, symbol)
        end
    end
    words = split(join(letters))
    return words[indmax(length.(words))]
end
@time get_longest_word("Hello, how are you? nested, punctuation?")
"punctuation"
I doubt it's the most efficient code in the world, but it pulls 'ANTIDISESTABLISHMENTARIANISM' out of a 45,000-word dictionary in about 0.1 seconds. Of course, it won't tell me if there is more than one word of the maximum length! That's a problem for another day...
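The tie case mentioned above could be handled by collecting every word of maximal length. The following is only a sketch, assuming Julia 1.0 or later; longest_words is a hypothetical name, and it uses isletter/isdigit/isspace in place of ALLOWED_SYMBOLS, so it also keeps non-ASCII letters:

```julia
# Hypothetical helper (not part of the answer above):
# return all words that share the maximum length.
function longest_words(text::AbstractString)
    # Keep letters, digits and whitespace; drop everything else (punctuation).
    cleaned = filter(c -> isletter(c) || isdigit(c) || isspace(c), text)
    words = split(cleaned)                # split on any whitespace
    isempty(words) && return String[]     # no words at all
    longest = maximum(length, words)      # length of the longest word
    return String[w for w in words if length(w) == longest]
end

longest_words("one, two; six")
```

Words are returned in the order they appear in the input, so the caller can still pick the first or last one if a single result is wanted.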
Upvotes: 0
Reputation: 5073
You can use regex too. It only needs a slight change from @Bogumil's answer:
julia> function LongestWord2(sen::AbstractString)
           words = matchall(r"\w+", sen)
           words[findmax(length.(words))[2]]
       end
LongestWord2 (generic function with 1 method)
julia> LongestWord2("Hello, how are you? nested, punctuation?")
"punctuation"
This way you get rid of the punctuation and get the raw word back.
To consolidate the comments, here's some further explanation:
matchall() takes a regex, in this case r"\w+", which matches word-like substrings (runs of letters, digits, and underscores), and returns an array of the strings that match the regex.
length.() combines the length function with ., which broadcasts the operation across all elements of the array. So we're counting the length of each array element (word).
findmax() returns a tuple of length 2 whose second element is the index of the maximum element. I use this to index into the words array and return the longest word.
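On Julia 1.0 and later, matchall is no longer available; the same idea can be written with eachmatch. This is a sketch under that assumption, and longest_word_regex is a hypothetical name rather than part of the answer:

```julia
# Assumes Julia 1.0 or later, where matchall was removed in favour of eachmatch.
function longest_word_regex(sen::AbstractString)
    # eachmatch yields RegexMatch objects; the .match field is the matched substring.
    words = [m.match for m in eachmatch(r"\w+", sen)]
    isempty(words) && return ""           # no word-like substring found
    # findmax returns (maximum value, index of the first maximum).
    _, idx = findmax(length.(words))
    return words[idx]
end

longest_word_regex("Hello, how are you? nested, punctuation?")
```

The explicit empty check is an addition: findmax throws on an empty collection, which would happen for input such as "?!".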
Upvotes: 4
Reputation: 69949
I understand that you want to retain punctuation and split only on spaces (" "). If this is the case, you can use findmax. Note that I have changed the order of length(x) and x. This way you will find the longest word, and among words of equal maximum length you will get the one that sorts last under string comparison. I also put AbstractString in the signature of the function, as it will work on any string:
julia> function LongestWord(sen::AbstractString)
           word = [(length(x), x) for x in split(sen, " ")]
           findmax(word)[1][2]
       end
LongestWord (generic function with 1 method)
julia> LongestWord("Hello, how are you? nested, punctuation?")
"punctuation?"
This is the simplest solution but not the fastest (you could loop through the original string, searching for consecutive occurrences of a space with the findnext function, without creating the word vector).
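The findnext idea could look roughly like the following. It is a sketch only, assuming Julia 1.0 or later; longest_word_scan is a hypothetical name, and unlike the findmax version it returns the first word of maximal length when there is a tie:

```julia
# Scan for spaces with findnext instead of building a vector of words.
function longest_word_scan(sen::AbstractString)
    best = SubString(sen, 1, 0)           # empty substring as the initial best
    start = firstindex(sen)
    while start <= lastindex(sen)
        stop = findnext(isequal(' '), sen, start)   # next space, or nothing
        # The word ends just before the space, or at the end of the string.
        word_end = stop === nothing ? lastindex(sen) : prevind(sen, stop)
        word = SubString(sen, start, word_end)
        if length(word) > length(best)    # strict >, so the first longest word wins
            best = word
        end
        stop === nothing && break         # no more spaces: last word handled
        start = nextind(sen, stop)        # continue after the space
    end
    return best
end

longest_word_scan("Hello, how are you? nested, punctuation?")
```

Using prevind and nextind rather than stop - 1 and stop + 1 keeps the indices valid for strings that contain multi-byte characters.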
Another approach (even shorter):
julia> function LongestWord3(sen::AbstractString)
           word = split(sen, " ")
           word[indmax(length.(word))]
       end
LongestWord3 (generic function with 1 method)
julia> LongestWord3("Hello, how are you? nested, punctuation?")
"punctuation?"
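For readers on Julia 1.0 or later, where indmax was renamed to argmax, the shorter approach might be written as follows. The name longest_word_split is hypothetical and used only for this sketch:

```julia
# Assumes Julia 1.0 or later: indmax was renamed to argmax.
function longest_word_split(sen::AbstractString)
    words = split(sen, " ")               # split on single spaces only, keeping punctuation
    # argmax returns the index of the first maximal element.
    return words[argmax(length.(words))]
end

longest_word_split("Hello, how are you? nested, punctuation?")
```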
Upvotes: 3