Stephens
Stephens

Reputation: 537

How to find word with the greatest number of repeated letters

My goal is to find the word with greatest number of repeated letters in a given string. For example, "aabcc ddeeteefef iijjfff" would return "ddeeteefef" because "e" is repeated five times in this word and that is more than all other repeating characters.

So far this is what I got, but it has many problems and is not complete:

def LetterCountI(str)
  s = str.split(" ")
  i = 0
  result = []
  t = s[i].scan(/((.)\2+)/).map(&:max) 
  u = t.max { |a, b| a.length <=> b.length }
  return u.split(//).count 
end

The code I have only finds consecutive patterns; if the pattern is interrupted (such as with "aabaaa", it counts a three times instead of five).

Upvotes: 5

Views: 2838

Answers (4)

pjs
pjs

Reputation: 19855

Similar to sawa's answer:

"aabcc ddeeteefef iijjfff".split.max_by{|w| w.length - w.chars.uniq.length}
=> "ddeeteefef"

In Ruby 2.x, this works as-is because String#chars returns an array. In earlier versions of ruby, String#chars yields an enumerator so you need to add .to_a before applying uniq. I did my testing in Ruby 2.0, and overlooked this until it was pointed out by Stephens.

I believe this is valid, since the question was "greatest number of repeated letters in a given string" rather than greatest number of repeats for a single letter in a given string.

Upvotes: 3

Phrogz
Phrogz

Reputation: 303261

str.scan(/\w+/).max_by{ |w| w.chars.group_by(&:to_s).values.map(&:size).max }
  • scan(/\w+/) — create an array of all sequences of 'word' characters
  • max_by{ … } — find the word that gives the largest value inside this block
  • chars — split the string into characters
  • group_by(&:to_s) — create a hash mapping each character to an array of all the occurrences
  • values — just get all the arrays of the occurrences
  • map(&:size) — convert each array to the number of characters in that array
  • max — find the largest characters and use this as the result for max_by to examine

Edit: Written less compactly:

str.scan(/\w+/).max_by do |word|
  word.chars
      .group_by{ |char| char }
      .map{ |char,array| array.size }
      .max
end

Written less functionally and with less Ruby-isms (to make it look more like "other" languages):

words_by_most_repeated = []
str.split(" ").each do |word|
  count_by_char = {} # hash mapping character to count of occurrences
  word.chars.each do |char|
    count_by_char[ char ] = 0 unless count_by_char[ char ]
    count_by_char[ char ] += 1
  end
  maximum_count = 0
  count_by_char.each do |char,count|
    if count > maximum_count then
      maximum_count = count
    end
  end
  words_by_most_repeated[ maximum_count ] = word
end

most_repeated = words_by_most_repeated.last

Upvotes: 6

Arup Rakshit
Arup Rakshit

Reputation: 118271

I'd do as below :

s = "aabcc ddeeteefef iijjfff" 
# intermediate calculation that's happening in the final code
s.split(" ").map { |w| w.chars.max_by { |e| w.count(e) } }
# => ["a", "e", "f"] # getting the max count character from each word
s.split(" ").map { |w| w.count(w.chars.max_by { |e| w.count(e) }) }
# => [2, 5, 3] # getting the max count character's count from each word
# final code
s.split(" ").max_by { |w| w.count(w.chars.max_by { |e| w.count(e) }) }
# => "ddeeteefef"

update

each_with_object gives better result than group_by method.

require 'benchmark'

s = "aabcc ddeeteefef iijjfff" 

def phrogz(s)
   s.scan(/\w+/).max_by{ |word| word.chars.group_by(&:to_s).values.map(&:size).max }
end

def arup_v1(s)
    max_string = s.split.max_by do |w| 
       h = w.chars.each_with_object(Hash.new(0)) do |e,hsh|
         hsh[e] += 1
       end
       h.values.max
    end
end

def arup_v2(s)
   s.split.max_by { |w| w.count(w.chars.max_by { |e| w.count(e) }) }
end

n = 100_000
Benchmark.bm do |x|
  x.report("Phrogz:") { n.times {|i| phrogz s } }
  x.report("arup_v2:"){ n.times {|i| arup_v2 s } }
  x.report("arup_v1:"){ n.times {|i| arup_v1 s } }
end

output

            user     system      total        real
Phrogz:   1.981000   0.000000   1.981000 (  1.979198)
arup_v2:  0.874000   0.000000   0.874000 (  0.878088)
arup_v1:  1.684000   0.000000   1.684000 (  1.685168)

Upvotes: 4

sawa
sawa

Reputation: 168131

"aabcc ddeeteefef iijjfff"
.split.max_by{|w| w.chars.sort.chunk{|e| e}.map{|e| e.last.length}.max}
# => "ddeeteefef"

Upvotes: 2

Related Questions