CodeDependency

Reputation: 135

Ruby Counting The Number of Unique Chars in a String

I am working with a string of chars, alphabet = "AABBBCCCCDDDDDEFGHIJKLMNOPQRSTUVWXYZZZZZZ". I want to create a def that counts the number of unique chars in a string and the percentage of the string each char makes up, without having to call alphabet.count("A"), alphabet.count("B"), alphabet.count("C"), etc., so I don't have to tediously enter each char into the .count() method.

I have succeeded in one sense: I get the output I wanted, but each result is repeated numerous times because of how I structured my for loop.

Here is my code:

alphabet ="AABBBCCCCDDDDDEFGHIJKLMNOPQRSTUVWXYZZZZZZ"

def count_num_of_uniq_chars(string)
  len = string.length
  len = len.to_f
  for i in 0..len-1

    uniq_char=string[i]
    puts "uniq_chars --> #{uniq_char}"

    count_of_uniq_char = string.count(string[i])

    puts "count_of_uniq_char--> #{count_of_uniq_char}"


    percent_of_uniq_char = ( (count_of_uniq_char / len) * 100 )
    percent_of_uniq_char=percent_of_uniq_char.to_f

    puts "there are #{count_of_uniq_char} letter '#{uniq_char}'s in the string which is #{percent_of_uniq_char}% of strings length "
    puts
  end # loop end

end #def end

count_num_of_uniq_chars(alphabet)

Outputs as:

uniq_chars --> A
count_of_uniq_char--> 2
there are 2 letter 'A's in the string which is 4.878048780487805% of strings length

uniq_chars --> A
count_of_uniq_char--> 2
there are 2 letter 'A's in the string which is 4.878048780487805% of strings length

uniq_chars --> B
count_of_uniq_char--> 3
there are 3 letter 'B's in the string which is 7.317073170731707% of strings length

uniq_chars --> B
count_of_uniq_char--> 3
there are 3 letter 'B's in the string which is 7.317073170731707% of strings length

uniq_chars --> B
count_of_uniq_char--> 3
there are 3 letter 'B's in the string which is 7.317073170731707% of strings length

uniq_chars --> C
count_of_uniq_char--> 4
there are 4 letter 'C's in the string which is 9.75609756097561% of strings length

uniq_chars --> C
count_of_uniq_char--> 4
there are 4 letter 'C's in the string which is 9.75609756097561% of strings length

uniq_chars --> C
count_of_uniq_char--> 4
there are 4 letter 'C's in the string which is 9.75609756097561% of strings length

uniq_chars --> C
count_of_uniq_char--> 4
there are 4 letter 'C's in the string which is 9.75609756097561% of strings length

uniq_chars --> D
count_of_uniq_char--> 5
there are 5 letter 'D's in the string which is 12.195121951219512% of strings length

uniq_chars --> D
count_of_uniq_char--> 5
there are 5 letter 'D's in the string which is 12.195121951219512% of strings length

uniq_chars --> D
count_of_uniq_char--> 5
there are 5 letter 'D's in the string which is 12.195121951219512% of strings length

uniq_chars --> D
count_of_uniq_char--> 5
there are 5 letter 'D's in the string which is 12.195121951219512% of strings length

uniq_chars --> D
count_of_uniq_char--> 5
there are 5 letter 'D's in the string which is 12.195121951219512% of strings length

uniq_chars --> E
count_of_uniq_char--> 1
there are 1 letter 'E's in the string which is 2.4390243902439024% of strings length

uniq_chars --> F
count_of_uniq_char--> 1
there are 1 letter 'F's in the string which is 2.4390243902439024% of strings length

uniq_chars --> G
count_of_uniq_char--> 1
there are 1 letter 'G's in the string which is 2.4390243902439024% of strings length

uniq_chars --> H
count_of_uniq_char--> 1
there are 1 letter 'H's in the string which is 2.4390243902439024% of strings length

uniq_chars --> I
count_of_uniq_char--> 1
there are 1 letter 'I's in the string which is 2.4390243902439024% of strings length

uniq_chars --> J
count_of_uniq_char--> 1
there are 1 letter 'J's in the string which is 2.4390243902439024% of strings length

uniq_chars --> K
count_of_uniq_char--> 1
there are 1 letter 'K's in the string which is 2.4390243902439024% of strings length

uniq_chars --> L
count_of_uniq_char--> 1
there are 1 letter 'L's in the string which is 2.4390243902439024% of strings length

uniq_chars --> M
count_of_uniq_char--> 1
there are 1 letter 'M's in the string which is 2.4390243902439024% of strings length

uniq_chars --> N
count_of_uniq_char--> 1
there are 1 letter 'N's in the string which is 2.4390243902439024% of strings length

uniq_chars --> O
count_of_uniq_char--> 1
there are 1 letter 'O's in the string which is 2.4390243902439024% of strings length

uniq_chars --> P
count_of_uniq_char--> 1
there are 1 letter 'P's in the string which is 2.4390243902439024% of strings length

uniq_chars --> Q
count_of_uniq_char--> 1
there are 1 letter 'Q's in the string which is 2.4390243902439024% of strings length

uniq_chars --> R
count_of_uniq_char--> 1
there are 1 letter 'R's in the string which is 2.4390243902439024% of strings length

uniq_chars --> S
count_of_uniq_char--> 1
there are 1 letter 'S's in the string which is 2.4390243902439024% of strings length

uniq_chars --> T
count_of_uniq_char--> 1
there are 1 letter 'T's in the string which is 2.4390243902439024% of strings length

uniq_chars --> U
count_of_uniq_char--> 1
there are 1 letter 'U's in the string which is 2.4390243902439024% of strings length

uniq_chars --> V
count_of_uniq_char--> 1
there are 1 letter 'V's in the string which is 2.4390243902439024% of strings length

uniq_chars --> W
count_of_uniq_char--> 1
there are 1 letter 'W's in the string which is 2.4390243902439024% of strings length

uniq_chars --> X
count_of_uniq_char--> 1
there are 1 letter 'X's in the string which is 2.4390243902439024% of strings length

uniq_chars --> Y
count_of_uniq_char--> 1
there are 1 letter 'Y's in the string which is 2.4390243902439024% of strings length

uniq_chars --> Z
count_of_uniq_char--> 6
there are 6 letter 'Z's in the string which is 14.634146341463413% of strings length

uniq_chars --> Z
count_of_uniq_char--> 6
there are 6 letter 'Z's in the string which is 14.634146341463413% of strings length

uniq_chars --> Z
count_of_uniq_char--> 6
there are 6 letter 'Z's in the string which is 14.634146341463413% of strings length

uniq_chars --> Z
count_of_uniq_char--> 6
there are 6 letter 'Z's in the string which is 14.634146341463413% of strings length

uniq_chars --> Z
count_of_uniq_char--> 6
there are 6 letter 'Z's in the string which is 14.634146341463413% of strings length

uniq_chars --> Z
count_of_uniq_char--> 6
there are 6 letter 'Z's in the string which is 14.634146341463413% of strings length




Notice the output statement per letter repeats based upon how many times that letter occurs in the string. How can I get it to output once per letter regardless of how many occurrences are in the string?

Upvotes: 1

Views: 2729

Answers (4)

Cary Swoveland

Reputation: 110675

Here are three ways to do that.

alphabet = "AABBBCCCCDDDDDEFGHIJKLMNOPQRSTUVWXYZZZZZZ"

Use the method Enumerable#tally, which made its debut in Ruby 2.7.0

h = alphabet.each_char.tally
  #=> {"A"=>2, "B"=>3, "C"=>4,..., "Z"=>6}

Use the form of the class method Hash::new that takes an argument of zero (but no block), the argument being the hash's default value

h = alphabet.each_char.with_object(Hash.new(0)) { |c,h| h[c] += 1 }
  #=> {"A"=>2, "B"=>3, "C"=>4,..., "Z"=>6}

h[c] += 1 expands to h[c] = h[c] + 1. If h does not have a key c, h[c] on the right-hand side of the assignment returns the default value of zero, yielding h[c] = 0 + 1.
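As a minimal sketch of that default-value behaviour (the method name count_chars is hypothetical, chosen only for illustration), note that reading a missing key returns the default without storing the key:

```ruby
# Hypothetical helper illustrating Hash.new(0): missing keys read as 0.
def count_chars(string)
  counts = Hash.new(0)                      # default value for absent keys
  string.each_char { |c| counts[c] += 1 }   # counts[c] = counts[c] + 1
  counts
end

counts = count_chars("AAB")
puts counts["A"]        # 2
puts counts["Q"]        # 0, the default; "Q" was never seen
puts counts.key?("Q")   # false, reading the default does not add the key
```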

Use the method Enumerable#group_by

h = alphabet.each_char.
             group_by(&:itself).
             transform_values(&:count)
  #=> <same as above>

See Hash#transform_values.

The steps are as follows:

enum = alphabet.each_char
  #=> #<Enumerator: "AABBB...ZZZ":each_char> 
a = enum.group_by(&:itself)
  #=> {"A"=>["A", "A"], "B"=>["B", "B", "B"],...,
  #          "Z"=>["Z", "Z", "Z", "Z", "Z", "Z"]} 
a.transform_values(&:count)
  #=> {"A"=>2, "B"=>3,..., "Z"=>6} 

Using the hash

Once you have the hash you can display information as you wish. For example:

n = alphabet.size
  #=> 41  
h.each { |k,v| puts "#{v} #{k}'s #{(100*v.fdiv(n)).round(2)}%" }
2 A's 4.88%
3 B's 7.32%
4 C's 9.76%
...
1 X's 2.44%
1 Y's 2.44%
6 Z's 14.63%
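
Since the question asks for a def, the pieces above could be wrapped in a method along these lines. This is only a sketch: the name char_stats and the [count, percent] return shape are assumptions made for illustration, and it relies on tally (Ruby 2.7 or later).

```ruby
# Hypothetical helper: maps each char to [count, percent of string length].
def char_stats(string)
  total = string.size
  string.each_char.tally.transform_values do |count|
    [count, (100 * count.fdiv(total)).round(2)]
  end
end

alphabet = "AABBBCCCCDDDDDEFGHIJKLMNOPQRSTUVWXYZZZZZZ"
# Prints one line per distinct char, in order of first appearance.
char_stats(alphabet).each do |char, (count, percent)|
  puts "#{count} #{char}'s #{percent}%"
end
```

Returning the data rather than printing inside the method keeps the counting separate from the formatting, so the caller can decide how to display it.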

Upvotes: 4

Yashvi Patel

Reputation: 33

It can be done like this:

def count_occ(str)
  # Hash with a default of 0, so unseen chars start at zero
  d = Hash.new(0)
  str.each_char do |ch|
    d[ch] += 1
  end
  d.each do |key, value|
    # Multiply by 100 so the printed value is a percentage, not a fraction
    percentage = 100 * value / Float(str.length)
    puts "there are #{value} letter '#{key}'s in the string which is #{percentage}% of strings length"
  end
end

Upvotes: 0

Stefan

Reputation: 114158

I want to create a def that counts the number of unique chars in a string [...]

You can get the string's chars via String#each_char and have Enumerable#tally count the occurrences (tally requires Ruby 2.7 or later):

alphabet.each_char.tally
#=> {
#     "A"=>2, "B"=>3, "C"=>4, "D"=>5, "E"=>1, "F"=>1, "G"=>1,
#     "H"=>1, "I"=>1, "J"=>1, "K"=>1, "L"=>1, "M"=>1, "N"=>1,
#     "O"=>1, "P"=>1, "Q"=>1, "R"=>1, "S"=>1, "T"=>1, "U"=>1,
#     "V"=>1, "W"=>1, "X"=>1, "Y"=>1, "Z"=>6
#   }

To get the percentages, you divide each char's occurrences by a total. This example divides by hash.size, the number of unique chars:

hash = alphabet.each_char.tally
hash.each do |char, count|
  q = count.quo(hash.size)
  puts format(" %s | %d | %4.1f%%", char, count, q * 100)
end

Output:

 A | 2 |  7.7%
 B | 3 | 11.5%
 C | 4 | 15.4%
 D | 5 | 19.2%
 E | 1 |  3.8%
 F | 1 |  3.8%
 G | 1 |  3.8%
 H | 1 |  3.8%
 I | 1 |  3.8%
 J | 1 |  3.8%
 K | 1 |  3.8%
 L | 1 |  3.8%
 M | 1 |  3.8%
 N | 1 |  3.8%
 O | 1 |  3.8%
 P | 1 |  3.8%
 Q | 1 |  3.8%
 R | 1 |  3.8%
 S | 1 |  3.8%
 T | 1 |  3.8%
 U | 1 |  3.8%
 V | 1 |  3.8%
 W | 1 |  3.8%
 X | 1 |  3.8%
 Y | 1 |  3.8%
 Z | 6 | 23.1%

Instead of hash.size (number of unique chars) you could also divide by alphabet.size (number of chars in the string), depending on what you want.
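
A sketch of that alternative, dividing by the length of the string (41 here) rather than the number of unique chars (26). The method name percentage_rows is hypothetical; the format string is the one used above.

```ruby
# Hypothetical helper: one formatted row per distinct char, with the
# percentage taken relative to the total number of chars in the string.
def percentage_rows(string)
  total = string.size
  string.each_char.tally.map do |char, count|
    format(" %s | %d | %4.1f%%", char, count, count.quo(total) * 100)
  end
end

alphabet = "AABBBCCCCDDDDDEFGHIJKLMNOPQRSTUVWXYZZZZZZ"
puts percentage_rows(alphabet)
# The first row is " A | 2 |  4.9%" (2 of 41 chars).
```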

Upvotes: 5

Sebastián Palma

Reputation: 33420

You can use string.chars.uniq and get rid of len, the for loop and the uniq_char initialization:

def count_num_of_uniq_chars(string)
  string.chars.uniq.each do |uniq_char|
    puts "uniq_chars --> #{uniq_char}"

    count_of_uniq_char = string.count(uniq_char)

    puts "count_of_uniq_char--> #{count_of_uniq_char}"


    percent_of_uniq_char = ( (count_of_uniq_char / string.length.to_f) * 100 )
    percent_of_uniq_char=percent_of_uniq_char.to_f

    puts "there are #{count_of_uniq_char} letter '#{uniq_char}'s in the string which is #{percent_of_uniq_char}% of strings length \n\n"
  end
end

See String#chars and Array#uniq.

Notice that percent_of_uniq_char is calculated as count_of_uniq_char divided by the length of the string converted to a float. That conversion is repeated on every iteration; if that's a problem in your case, you can compute it once outside the loop.
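
A minimal sketch of that variation, with the float conversion hoisted out of the loop. The shortened local names (len, count, percent) are mine, and the intermediate puts lines are dropped for brevity.

```ruby
def count_num_of_uniq_chars(string)
  len = string.length.to_f   # converted once, before the loop
  string.chars.uniq.each do |uniq_char|
    count = string.count(uniq_char)
    percent = (count / len) * 100
    puts "there are #{count} letter '#{uniq_char}'s in the string which is #{percent}% of strings length"
  end
end

# Prints one line per distinct char instead of one per occurrence.
count_num_of_uniq_chars("AABBBCCCCDDDDDEFGHIJKLMNOPQRSTUVWXYZZZZZZ")
```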

Upvotes: 1
