Viacheslav Kondratiuk
Viacheslav Kondratiuk

Reputation: 8879

What data structure is better to use to find if sentence consist of unique characters?

I'm trying to solve a task and not sure if I'm using suitable data structure for it. My task is to find if sentence consist of unique characters and as a result return boolean value.

Here is my function:

bool use_map(string sentence) {
    map<int, string> my_map;

    for (string::size_type i = 0; i <= sentence.length(); i++) {
        unsigned int index = (int)sentence[i];    
        if (my_map.find(index) != my_map.end())
            return false;       
        my_map[index] = sentence[i];
    }

    return true;    
}

I found only map structure which is suitable for me. Maybe I miss something?

Maybe it's better to use something like dynamic arrays at PHP?


I'm trying to use hash table solution.

Upvotes: 0

Views: 320

Answers (7)

Alex Chamberlain
Alex Chamberlain

Reputation: 4207

What about this? There is a case issue of course...

bool use_map(const std::string& sentence)
{
    std::vector<bool> chars(26, false);
    for(std::string::const_iterator i = sentence.begin(); i != sentence.end(); ++i) {
        if(*i == ' ' || *i - 'a' > 25 || *i - 'a' < 0) {
            continue;
        } else if(chars[*i - 'a']) {
            return false;
        } else {
            chars[*i - 'a'] = true;
        }
    }

    return true;
}

Upvotes: 2

Spatarel
Spatarel

Reputation: 242

Here is the fastest possible solution:

bool charUsed[256];
bool isUnique(string sentence) {
    int i;
    for(i = 0; i < 256; ++i) {
        charUsed[i] = false;
    }

    int n = s.size();
    for(i = 0; i < n; ++i) {
        if (charUsed[(unsigned char)sentence[i]]) {
            return false;
        }
        charUsed[(unsigned char)sentence[i]] = true;
    }
    return true;
}

Upvotes: 0

jrok
jrok

Reputation: 55395

A very simple (but rather memory expensive) way would be:

bool use_map(const std::string& sentence)
{
    std::set<char> chars(sentence.begin(), sentence.end());
    return chars.size() == sentence.size();
}

If there's no duplicate chars, the sizes of both string and set will be equal.

@Jonathan Leffler raises a good point in the comments: sentences usualy contain several whitespaces, so this will return false. You'll want to filter spaces out. Still, std::set should be your container of choice.

Edit:

Here's an idea for O(n) solution with no additional memory. Just use a look-up table where you mark if the char was seen before:

bool no_duplicates(const std::string& sentence)
{
    static bool table[256];
    std::fill(table, table+256, 0);

    for (char c : sentence) {

        // don't test spaces
        if (c == ' ') continue;
        // add more tests if needed

        const unsigned char& uc = static_cast<unsigned char>(c);
        if (table[uc]) return false;
        table[uc] = true;
    }
    return true;
}

Upvotes: 4

Grigor Gevorgyan
Grigor Gevorgyan

Reputation: 6853

If you are allowed to use additional memory, use a hash table:
Iterate through the array, check if current element has already been hashed. If yes, you found a repetition. If no, add it to hash. This will be linear, but will require additional memory.

If the range of original sequence elements is quite small, instead of hashing you can simply have an array of the range size and do like in a bucket sort. For example

bool hasDuplicate( string s )
{
   int n = s.size();
   vector<char> v( 256, 0 );
   for( int i = 0; i < n; ++i )
      if( v[ s[ i ] ] ) // v[ hash( s[i] ) ] here in case of hash usage
         return true;
      else
         v[ s[ i ] ] = 1; // and here too
   return false;
}

Finally, if you are not allowed to use additional memory, you can just sort it and check if two adjacent elements are equal in one pass. This will take O(nlogn) time. No need for sets or maps :)

Upvotes: 0

Pete Becker
Pete Becker

Reputation: 76315

Sort the characters and then look for an adjacent pair of alphabetic characters with both characters equal. Something like this:

std::string my_sentence = /* whatever */
std::sort(my_sentence.begin(), my_sentence.end());
std::string::const_iterator it =
    std::adjacent_find(my_sentence.begin(), my_sentence.end());
while (it != my_sentence.end() && isalpha((unsigned char)*it)
    it = std::adjacent_find(++it, my_sentence.end());
if (it == my_sentence.end())
    std::cout << "No duplicates.\n";
else
    std::cout << "Duplicated '" << *it << "'.\n";

Upvotes: 1

Kiril Kirov
Kiril Kirov

Reputation: 38173

The other answers suggested std::set and that's a solution. BUT, they copy all chars inside the std::set and then get the size of the set. You don't really need this and you can avoid it, using the return value of std::set::insert. Something like:

std::set< char > my_set;
for (std::string::size_type ii = 0; ii < sentence.size(); ++ii) 
{
    if( ! my_set.insert( sentence[ ii ] ).second )
    {
        return false;
    }
}

This way you'll:

  • stop on the first duplicated char and you will not copy the whole string (unnecessarily)
  • you will avoid the unnecessary cast to int in your code
  • will save memory - if you don't actually need you std::map< int, std::string >::second

Also, make sure you need to "count" all chars or you want to skip some of them (like white spaces, commas, question marks, etc)

Upvotes: 4

Andy Prowl
Andy Prowl

Reputation: 126452

I guess an easy way is to store all the characters in an associative container that does not allow duplicates, such as std::set, and check if it contains a single value:

#include <set>
#include <string>

bool has_unique_character(std::string const& str)
{
    std::set<char> s(begin(str), end(str));
    return (s.size() == str.size());
}

Upvotes: 3

Related Questions