Reputation: 11
I'm trying to write a C++ program that reads file names from a text file (a web server log), counts how often each one is requested, and prints a top 10 list. A small sample of the input is shown below:
local - - [24/Oct/1994:13:41:41 -0600] "GET index.html HTTP/1.0" 200 150
local - - [24/Oct/1994:13:41:41 -0600] "GET 1.gif HTTP/1.0" 200 1210
local - - [24/Oct/1994:13:43:13 -0600] "GET index.html HTTP/1.0" 200 3185
local - - [24/Oct/1994:13:43:14 -0600] "GET 2.gif HTTP/1.0" 200 2555
local - - [24/Oct/1994:13:43:15 -0600] "GET 3.gif HTTP/1.0" 200 36403
local - - [24/Oct/1994:13:43:17 -0600] "GET 4.gif HTTP/1.0" 200 441
local - - [24/Oct/1994:13:46:45 -0600] "GET index.html HTTP/1.0" 200 31853
My current code is below:
#include <iostream>
#include <fstream>
#include <sstream>
#include <unordered_map>
#include <vector>
#include <iterator>
#include <algorithm>
#include <functional>
#include <string>
#include <cstdio>
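std::string get_file_name(const std::string& s) {
    // The request is the quoted part of the log line, e.g. "GET index.html HTTP/1.0"
    std::size_t first = s.find_first_of("\"");
    std::size_t last = s.find_last_of("\"");
    // Length is last - first (first - last would wrap around for unsigned values)
    std::string request = s.substr(first, last - first);
    // The file name sits between the first and second space of the request
    std::size_t file_begin = request.find_first_of(' ');
    std::string truncated_request = request.substr(++file_begin);
    std::size_t file_end = truncated_request.find(' ');
    std::string file_name = truncated_request.substr(0, file_end);
    return file_name;
}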
int main() {
    std::ifstream f_s("text.txt");
    std::string content;
    std::unordered_map<std::string,long int> file_access_counts;
    while (std::getline(f_s, content)) {
        auto file_name = get_file_name(content);
        auto item = file_access_counts.find(file_name);
        if (item != file_access_counts.end()) {
            ++file_access_counts.at(file_name);
        }
        else {
            file_access_counts.insert(std::make_pair(file_name, 1));
        }
    }
    f_s.close();
    std::ofstream ofs;
    ofs.open("all.txt", std::ofstream::out | std::ofstream::app);
    for (auto& n : file_access_counts)
        ofs << n.first << ", " << n.second << std::endl;
    std::ifstream file("all.txt");
    std::vector<std::string> rows;
    while (!file.eof())
    {
        std::string line;
        std::getline(file, line);
        rows.push_back(line);
    }
    std::sort(rows.begin(), rows.end());
    std::vector<std::string>::iterator iterator = rows.begin();
    for (; iterator != rows.end(); ++iterator)
        std::cout << *iterator << std::endl;
    getchar();
    return 0;
}
When I run it, it prints the file names and how many times each one occurs, but not ordered from highest to lowest count, and I doubt it will work with large inputs (around 50,000 lines). Can you help me? Thank you.
Upvotes: 1
Views: 53
Reputation: 886
The contents of all.txt are being sorted as plain strings after being read back in. The problem is that the count is at the end of each line and therefore only affects the sort order after the name.
all.txt:
3.gif, 1
index.html, 3
1.gif, 1
2.gif, 1
4.gif, 1
rows vector after sort:
1.gif, 1
2.gif, 1
3.gif, 1
4.gif, 1
index.html, 3
Either change the way the values are being written to all.txt, or parse the count before sorting.
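As a rough illustration of the second option, the sketch below parses each "name, count" row into a pair and sorts by the numeric count, highest first. The helper names parse_row and sort_by_count are hypothetical (they are not in the question's code), and the sketch assumes every non-empty row ends in ", " followed by a valid integer.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: split a "name, count" row into (name, count).
// Assumes the text after the last ", " is a valid integer; std::stol
// throws std::invalid_argument otherwise.
std::pair<std::string, long> parse_row(const std::string& row) {
    std::size_t sep = row.rfind(", ");
    if (sep == std::string::npos) {
        return std::make_pair(row, 0L);  // no count found: treat as zero
    }
    return std::make_pair(row.substr(0, sep), std::stol(row.substr(sep + 2)));
}

// Hypothetical helper: parse all rows, then sort by count (descending).
// Ties are broken by name so the order is deterministic.
std::vector<std::pair<std::string, long>> sort_by_count(
        const std::vector<std::string>& rows) {
    std::vector<std::pair<std::string, long>> parsed;
    for (const std::string& row : rows) {
        if (row.empty()) continue;  // skip blank rows, e.g. a trailing empty line
        parsed.push_back(parse_row(row));
    }
    std::sort(parsed.begin(), parsed.end(),
              [](const std::pair<std::string, long>& a,
                 const std::pair<std::string, long>& b) {
                  if (a.second != b.second) return a.second > b.second;
                  return a.first < b.first;
              });
    return parsed;
}
```

To get a top 10 list, the caller can print only the first ten elements of the returned vector. The same comparator could also be applied directly to the entries of file_access_counts, which would avoid the round trip through all.txt.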
If you put the count at the beginning of the line instead, be sure to pad it with leading zeros so that 3 sorts before 10; compared as plain strings, "3" is greater than "10".
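A minimal sketch of the zero-padding idea follows. The function name padded_row and the default width of 8 digits are assumptions chosen for illustration; the width only needs to be large enough for the biggest count you expect.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format a row as "<zero-padded count>, <name>" so
// that comparing rows as strings agrees with comparing counts as numbers.
// Assumes count is non-negative and has at most `width` digits.
std::string padded_row(const std::string& name, long count, int width = 8) {
    std::ostringstream out;
    // setw applies only to the next value written (the count);
    // setfill('0') makes the padding character a zero.
    out << std::setw(width) << std::setfill('0') << count << ", " << name;
    return out.str();
}
```

With rows written this way, std::sort(rows.begin(), rows.end(), std::greater<std::string>()) would put the highest counts first.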
Upvotes: 1