CodingKingggg
CodingKingggg

Reputation: 73

Word Count with Sorting C++

This is a question I have for my Data Structure class. I have completely no idea in dealing with it, can anyone give some hints please?

  1. How to stop the program and ensure the output can be out correctly?
  2. Whether I have to deal with the mapping?

the question paper I had from the professor

The following is my coding example:

#include <iostream>
#include <string>

using namespace std;

int main()
{
    string s [100];

    for (int i = 0; i < 100; i++) {
        cin >> s[i];
        s[i] = Sort(s[i], s[i+1]);
    }


    //check the number of time the words repeatcout the answer
    for (int i = 0; i < 100; i++) {
        cout << s[i] << count (s[i],s[i+1]) <<endl;
    }
    return 0;
}


string Sort(string current, string next ) {
    if (current > next) {
        string temp = current;
        current = next;
        next = temp;
    }
    else {
        return current;
    }
}

int count(string word, string Nextword) {
    int count;
    if (word == Nextword) {
        count++;
    }
    else {
        return count;
    }
}

Upvotes: 0

Views: 1649

Answers (4)

Wolfgang Brehm
Wolfgang Brehm

Reputation: 1721

std::map can do sorting and counting at the same time for you:

#include <map>
#include <iostream>
using std::cin;
using std::cout;
using std::endl;
using std::string;

int main() {
  std::map<string,size_t> wordcount;
  for(string word;cin>>word;++wordcount[word]);
  for(auto it=wordcount.begin();it!=wordcount.end();++it)
    cout << it->first << " " << it->second << endl;
}
echo -ne "Computer system\ncomputer design\nalgorithm design and analysis\nquantum computer\ncomputer science department" | ./a.out
Computer 1
algorithm 1
analysis 1
and 1
computer 3
department 1
design 2
quantum 1
science 1
system 1

Upvotes: 1

Manu Gond
Manu Gond

Reputation: 174

If using STL is not an option, which means if you can't use vector etc, then this simple solution might help. Also in your code you have declared the size of string array to 100 which i think you did because MAX size of array cannot be taken as input as per your problem statement.

You need a sorting method (I have used bubble sort), and after sorting the array, you need to just iterate over the array and count the words which is easy by just keeping track of upcoming words and comparing them with current word. Remember the code below will stop taking input as soon you enter an empty line.

Also i suggest you to learn about STL in C++, it will help you in competitive coding also.

    #include <iostream>
    using namespace std;

    //Forward declaration of functions 
    void sortStringArray(int index,string* s);
    void printWordCount(int index,string array[]);

    int main()
    {
        string s [100]; //considering maximum words will be upto 100 words
        string input;  //variable to keep track of each line input
        int index=0;   //index to keep track of size of populated array since we don't need to sort whole array of size 100 if it is not filled


        while (getline(cin, input) && !input.empty()){  //while line is not empty continue to take inputs
            string temp="";                     
            char previous=' ';                  
            for(int i=0;i<input.size();i++){            //loop to seperate the words by space or tab
                if(input[i]==' ' && previous!=' '){
                    s[index++]=temp;
                    temp="";
                }
                else if(input[i]!=' '){
                    temp+=input[i];
                }
                previous=input[i];
            }
            if(temp.size()!=0){
                s[index++] =temp;   
            }
        }

        //Step 1: sort the generated array by calling function with reference, thus function directly modifies the array and don't need to return anything
        sortStringArray(index,s);


        //Step 2: print each word count in sorted order
        printWordCount(index,s);


        return 0;
    }

    /*
    Function takes the "index" which is the size upto which array is filled and we need only filled elements
    Function takes stirng array by reference and uses Bubble Sort to sort the array
    */
    void sortStringArray(int index,string* s){
        for(int i=0;i<index-1;i++){
            for(int j=0;j<index-i-1;j++){
                if(s[j]>s[j+1]){
                    string temp=s[j];
                    s[j]=s[j+1];
                    s[j+1]=temp;
                }
            }
        }
    }


    /*
    Function takes the "index" which is the size upto which array is filled and we need only filled elements
    Function takes stirng array by reference and prints count of each word
    */
    void printWordCount(int index,string array[]){
        int count=1; //to keep track of the similar word count
        for(int i=0;i<index-1;i++){
            if(array[i]==array[i+1]){ //if current and upcoming words are same then increase the count
                count++;
            }
            else{                       //if next word is not equal to current word then print the count of current word and set counter to 1
                cout<<array[i]<<" "<<count<<endl;
                count=1;
            }
        }
        if(array[index-1]==array[index-2]){  //at end of array if the last and second last words were equal then print the count+1
            cout<<array[index-1]<<" "<<count+1<<endl;
        }
        else{                               //otherwise the count equal to the last count which will be "1"
            cout<<array[index-1]<<" "<<count;
        }
    }

Output:

    Computer system
    computer design
    algorithm design and analysis
    quantum computer
    computer science department

    Computer 1
    algorithm 1
    analysis 1
    and 1
    computer 3
    department 1
    design 2
    quantum 1
    science 1
    system 1
    --------------------------------
    Process exited after 1.89 seconds with return value 0

Upvotes: 0

AravindK
AravindK

Reputation: 151

Copy-paste this code and check with Multiple strings. Solution :

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>

using namespace std;

vector<string> split(const string& i_str, const string& i_delim)
{
    vector<string> result;

    size_t found = i_str.find(i_delim);
    size_t startIndex = 0;

    while (found != string::npos)
    {
        string temp(i_str.begin() + startIndex, i_str.begin() + found);
        result.push_back(temp);
        startIndex = found + i_delim.size();
        found = i_str.find(i_delim, startIndex);
    }
    if (startIndex != i_str.size())
        result.push_back(string(i_str.begin() + startIndex, i_str.end()));
    return result;
}

void countFunc(vector<string> cal) {
    vector<pair<string, int>> result;
    for (int i = 0; i < cal.size(); i++)
    {
        string temp = cal[i];
        if (temp.empty())
        {
            continue;
        }
        int ncount = 1;
        int j = i+1;
        while(j < cal.size())
        {
            if (temp == cal[j])
            {
                ncount++;
                cal[j] = "";
            }
            j++;
        }
        result.push_back(make_pair(temp, ncount));
    }
    std::cout << "String\tCount\n";
    for (int i = 0; i < result.size(); i++)
    {
        cout << result[i].first << "\t" << result[i].second << endl;
    }
}

int main()
{
    vector<string> str;
    vector<string> res;
    printf("Enter the Number Line :");
    int size = 0;
    cin >> size;
    for (int i = 0; i < size+1; i++)
    {
        string s;
        getline(cin, s);
        if (s.empty())
        {
            continue;
        }
        else
        {
            str.push_back(s);
        }
    }
    for (int i = 0; i < size; i++)
    {
        vector<string> temp;
        temp = split(str[i], " ");
        res.insert(res.end(), temp.begin(), temp.end());
    }
    sort(res.begin(), res.end());
    countFunc(res);
}

Upvotes: 0

David C. Rankin
David C. Rankin

Reputation: 84561

Rather than attempting to use a basic array of string, you will need some way to keep track of the number of times each word is seen. You can either use a simple struct or a std::map. In either case you can associate a word and the number of times it is seen as a single object. If you then collect all struct containing the word and count in a std::vector as opposed to a basic array, you can provide a simple comparison function to use std::sort to sort the vector by the word while preserving the association of the count with each word.

Taking the approach of using a stuct, you can make a struct that contains a std::string and a counter such as:

 struct wordcount {      /* struct holding word and count */
    std::string word;
    size_t count;
};

The for a comparison function to sort the vector of wordcount by word, you can use a simple:

/* compare function to sort vector of struct by words */
bool cmp (const wordcount& a, const wordcount& b)
{
    return a.word < b.word;
}

Using a struct, you will need to iterate over the words you have seen so far to determine if you simply need to increment the count on an existing word or to add a new wordcount struct to your vector with the count = 1; To make the function useful, you can have it return the index within the vector (loosely equivalent to the index in in an array) if the word already exists, or return -1 if it doesn't.

/* interate over each struct in vector words to find word */
int findword (const std::vector<wordcount>& words, 
                const std::string& word)
{
    for (auto w = words.begin(); w != words.end(); w++)
        if (w->word == word)            /* if word found */
            return w - words.begin();   /* return index */

    return -1;  /* return word not found */
}

Based on the return, you can either increment the count at the index, or add a new wordcount to your vector. A short implementation using the above would be:

int main (int argc, char **argv) {

    if (argc < 2) { /* validate filename given as argument */
        std::cerr << "error: insufficient input.\n"
                << "usage: " << argv[0] << "<filename>\n";
        return 1;
    }

    std::string word;                   /* string to hold word */
    std::vector<wordcount> words {};    /* vector of struct wordcount */
    std::fstream f (argv[1]);           /* file stream */

    while (f >> word) {                 /* read each word from file */
        int idx = findword (words, word);   /* alread exists, get index */
        if (idx != -1) {                /* if index found */
            words[idx].count++;         /* increment count */
        }
        else {  /* otherwise new word */
            wordcount tmp = {word, 1};  /* initialize struct */
            words.push_back(tmp);       /* add to vector */
        }
    }

    std::sort (words.begin(), words.end(), cmp);    /* sort by words */

    for (auto& w : words)   /* output results */
        std::cout << w.word << " " << w.count << '\n';
}

If you put all the pieces above together you would have:

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <algorithm>

struct wordcount {      /* struct holding word and count */
    std::string word;
    size_t count;
};

/* compare function to sort vector of struct by words */
bool cmp (const wordcount& a, const wordcount& b)
{
    return a.word < b.word;
}

/* interate over each struct in vector words to find word */
int findword (const std::vector<wordcount>& words, 
                const std::string& word)
{
    for (auto w = words.begin(); w != words.end(); w++)
        if (w->word == word)            /* if word found */
            return w - words.begin();   /* return index */

    return -1;  /* return word not found */
}

int main (int argc, char **argv) {

    if (argc < 2) { /* validate filename given as argument */
        std::cerr << "error: insufficient input.\n"
                << "usage: " << argv[0] << "<filename>\n";
        return 1;
    }

    std::string word;                   /* string to hold word */
    std::vector<wordcount> words {};    /* vector of struct wordcount */
    std::fstream f (argv[1]);           /* file stream */

    while (f >> word) {                 /* read each word from file */
        int idx = findword (words, word);   /* alread exists, get index */
        if (idx != -1) {                /* if index found */
            words[idx].count++;         /* increment count */
        }
        else {  /* otherwise new word */
            wordcount tmp = {word, 1};  /* initialize struct */
            words.push_back(tmp);       /* add to vector */
        }
    }

    std::sort (words.begin(), words.end(), cmp);    /* sort by words */

    for (auto& w : words)   /* output results */
        std::cout << w.word << " " << w.count << '\n';
}

Example Use/Output

Running against your sample input, you would receive.

$ ./bin/wordcount dat/webpage.txt
Computer 1
algorithm 1
analysis 1
and 1
computer 3
department 1
design 2
quantum 1
science 1
system 1

There are many, many ways to approach this type problem. It can be done with plain-old arrays, but then you would keep track of the words and count in some separate array (or arrays) and then either write your own sort (or use C qsort on one array holding the words and then map the count back to the sorted output with a copy of the original and your array of counts). The key regardless of the approach you take is you must have a way to preserve the pre-sort association between the words and count of times they are each seen with the post-sort result of your words and then a way to map the counts back to the correct word. Using an object that associates the word and count as a single unit solves the association problem.

Look things over, take them as one way to approach it. Let me know if you have further questions.

Upvotes: 2

Related Questions