Karan Thakkar
Karan Thakkar

Reputation: 1027

Most common character in a string

Write a function that takes a string consisting of alphabetic characters as input argument and returns the most common character. Ignore white spaces i.e. Do not count any white space as a character. Note that capitalization does not matter here i.e. that a lower case character is equal to a upper case character. In case of a tie between certain characters return the last character that has the most count

This is the updated code

def most_common_character (input_str):
    input_str = input_str.lower()
    new_string = "".join(input_str.split())
    print(new_string)
    length = len(new_string)
    print(length)
    count = 1
    j = 0
    higher_count = 0
    return_character = ""
    for i in range(0, len(new_string)):
        character = new_string[i]
        while (length - 1):
            if (character == new_string[j + 1]):
                count += 1
            j += 1
            length -= 1    
            if (higher_count < count):
                higher_count = count
    return (character)     

#Main Program
input_str = input("Enter a string: ")
result = most_common_character(input_str)
print(result)

The above is my code. I am getting an error of string index out of bound which I can't understand why. Also the code only checks the occurrence of first character I am confused about how to proceed to the next character and take the maximum count?

The error i get when I run my code:

> Your answer is NOT CORRECT Your code was tested with different inputs.
> For example when your function is called as shown below:
> 
> most_common_character ('The cosmos is infinite')
> 
> ############# Your function returns ############# e The returned variable type is: type 'str'
> 
> ######### Correct return value should be ######## i The returned variable type is: type 'str'
> 
> ####### Output of student print statements ###### thecosmosisinfinite 19

Upvotes: 2

Views: 6728

Answers (4)

Jitendra Meena
Jitendra Meena

Reputation: 21

from collections import Counter

def most_common_character(s):
    char_count = Counter(s)
    return char_count.most_common(1)[0][0]

# Example
string21 = "programming is fun"
common_char = most_common_character(string21)
print("Most Common Character:", common_char)

#Output = r

Upvotes: 0

Alexander
Alexander

Reputation: 109546

You can use a regex patter to search for all characters. \w matches any alphanumeric character and the underscore; this is equivalent to the set [a-zA-Z0-9_]. The + after [\w] means to match one or more repetitions.

Finally, you use Counter to total them and most_common(1) to get the top value. See below for the case of a tie.

from collections import Counter
import re

s = "Write a function that takes a string consisting of alphabetic characters as input argument and returns the most common character. Ignore white spaces i.e. Do not count any white space as a character. Note that capitalization does not matter here i.e. that a lower case character is equal to a upper case character. In case of a tie between certain characters return the last character that has the most count"

>>> Counter(c.lower() for c in re.findall(r"\w", s)).most_common(1)
[('t', 46)]

In the case of a tie, it is a little more tricky.

def top_character(some_string):
    joined_characters = [c for c in re.findall(r"\w+", some_string.lower())]
    d = Counter(joined_characters)
    top_characters = [c for c, n in d.most_common() if n == max(d.values())]
    if len(top_characters) == 1:
        return top_characters[0]
    reversed_characters = joined_characters[::-1]  
    for c in reversed_characters:
        if c in top_characters:
            return c

>>> top_character(s)
't'

>>> top_character('the the')
'e'

In the case of your code above and your sentence "The cosmos is infinite", you can see that 'i' occurs more frequently that 'e' (the output of your function):

>>> Counter(c.lower() for c in "".join(re.findall(r"[\w]+", 'The cosmos is infinite'))).most_common(3)
[('i', 4), ('s', 3), ('e', 2)]

You can see the issue in your code block:

for i in range(0, len(new_string)):
    character = new_string[i]
    ...
return (character)     

You are iterating through a sentence and assign that letter to the variable character, which is never reassigned elsewhere. The variable character will thus always return the last character in your string.

Upvotes: 4

Lafexlos
Lafexlos

Reputation: 7735

Actually your code is almost correct. You need to move count, j, length inside of your for i in range(0, len(new_string)) because you need to start over on each iteration and also if count is greater than higher_count, you need to save that charater as return_character and return it instead of character which is always last char of your string because of character = new_string[i].

I don't see why have you used j+1 and while length-1. After correcting them, it now covers tie situations aswell.

def most_common_character (input_str):
    input_str = input_str.lower()
    new_string = "".join(input_str.split())
    higher_count = 0
    return_character = ""

    for i in range(0, len(new_string)):
        count = 0
        length = len(new_string)
        j = 0
        character = new_string[i]
        while length > 0:
            if (character == new_string[j]):
                count += 1
            j += 1
            length -= 1    
            if (higher_count <= count):
                higher_count = count
                return_character = character
    return (return_character) 

Upvotes: 2

jfs
jfs

Reputation: 414335

If we ignore the "tie" requirement; collections.Counter() works:

from collections import Counter
from itertools import chain

def most_common_character(input_str):
    return Counter(chain(*input_str.casefold().split())).most_common(1)[0][0]

Example:

>>> most_common_character('The cosmos is infinite')
'i'
>>> most_common_character('ab' * 3)
'a'

To return the last character that has the most count, we could use collections.OrderedDict:

from collections import Counter, OrderedDict
from itertools import chain
from operator import itemgetter

class OrderedCounter(Counter, OrderedDict):
    pass

def most_common_character(input_str):
    counter = OrderedCounter(chain(*input_str.casefold().split()))
    return max(reversed(counter.items()), key=itemgetter(1))[0]

Example:

>>> most_common_character('ab' * 3)
'b'

Note: this solution assumes that max() returns the first character that has the most count (and therefore there is a reversed() call, to get the last one) and all characters are single Unicode codepoints. In general, you might want to use \X regular expression (supported by regex module), to extract "user-perceived characters" (eXtended grapheme cluster) from the Unicode string.

Upvotes: 1

Related Questions