Elpezmuerto

Reputation: 5571

Sort C++ Strings with multiple criteria

I need to sort a C++ std::vector<std::string> fileNames. The file names follow this pattern:

YYDDDTTTT_Z_SITE

YY = Year (e.g. 2009 = 09, 2010 = 10)

DDD = Day of the year (e.g. 1 January = 001, 31 December = 365)

TTTT = Time of day (e.g. midnight = 0000, noon = 1200)

Z (ZONE) = Either E or W

SITE = Four-letter site name (e.g. HILL, SAMM)

I need the strings to be sorted in the following order: ZONE, SITE, YY, DDD, TTTT

Upvotes: 2

Views: 3437

Answers (7)

blue scorpion

Reputation: 387

Here's a Boost.Lambda version. It is overkill and pretty cryptic, but it's brief and flexible in how the field criteria can be combined. Obviously you need Boost, and you should expect increased compilation time. So, here it is:

#include <boost/lambda/lambda.hpp>
#include <boost/lambda/bind.hpp>
#include "boost/lambda/detail/operator_actions.hpp"
#include "boost/lambda/detail/operator_return_type_traits.hpp"
#include "boost/lambda/detail/control_structures_impl.hpp"
#include "boost/ref.hpp"
#include <iostream>
#include <vector>
#include <string>
#include <iterator>
#include <algorithm>
#include <cassert>

using namespace std;
using namespace boost::lambda;

//helpers: a better way would be to group them
//under a flyweight, or something...
string extract_year(string str_)
{
    return str_.substr(0,2);
}

string extract_dayofyear(string str_)
{
    return str_.substr(2,3);
}

string extract_timeofday(string str_)
{
    return str_.substr(5,4);
}

string extract_zone(string str_)
{
    return str_.substr(10,1);
}

string extract_site(string str_)
{
    return str_.substr(12,4);
}

//Uhm, just for brevity... ('cause otherwise we should stay away from macros ;-)
#define IF_THEN_ELSE_RET(op1,op2,exp) if_then_else_return(var(op1)<var(op2),true,if_then_else_return(var(op1)>var(op2),false,exp))

void sort_fnames(vector<string>& fnames)
{
    string z1,z2,s1,s2,y1,y2,d1,d2,t1,t2;

    //sort by zone-then-site-then-year-then-day-then-time:
    //Note the format of the sort(fnames.begin(),fnames.end(), (,...,boolean_expression) );
    //remember, in a sequence of comma-delimited statements enclosed between parens, like
    //val=(.,...,boolean_expression); only the last expression, boolean_expression, gets
    //assigned to variable "val";
    //So, in the sort() call below, the statements 
    //var(z1)=bind(extract_zone,_1),var(z2)=bind(extract_zone,_2), etc.
    //are only initializing variables that are to be used in the composition 
    //of if_then_else_return(,,) lambda expressions whose composition 
    //combines the zone-then-site-then-year-then-day-then-time criteria 
    //and amounts to a boolean that is used by sort to decide the ordering
    sort(fnames.begin(),fnames.end(),
        (var(z1)=bind(extract_zone,_1),var(z2)=bind(extract_zone,_2),
         var(s1)=bind(extract_site,_1),var(s2)=bind(extract_site,_2),
         var(y1)=bind(extract_year,_1),var(y2)=bind(extract_year,_2),
         var(d1)=bind(extract_dayofyear,_1),var(d2)=bind(extract_dayofyear,_2),
         var(t1)=bind(extract_timeofday,_1),var(t2)=bind(extract_timeofday,_2),
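         //the innermost fallback must be false: sort needs a strict "less than", so equal names must not compare as less
         IF_THEN_ELSE_RET(z1,z2,IF_THEN_ELSE_RET(s1,s2,IF_THEN_ELSE_RET(y1,y2,IF_THEN_ELSE_RET(d1,d2,IF_THEN_ELSE_RET(t1,t2,false)))))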
         ));
}

Upvotes: 0

Owen S.

Reputation: 7855

The easy part: write the sort itself:

// Return true if the first arg is strictly less than the second
bool compareFilenames(const std::string& lhs, const std::string& rhs);
...
std::sort(fileNames.begin(), fileNames.end(), &compareFilenames);

The harder part: writing the comparison itself. In pseudocode, for full generality:

bool compareFilenames(const std::string& lhs, const std::string& rhs)
{
    parse the filenames
    if (lhs zone != rhs zone)
        return lhs zone < rhs zone
    if (lhs site != rhs site)
        return lhs site < rhs site
    ...
    return false
}

where lhs site, etc. are the individual bits of data you need to sort by, picked out of the filename.
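For the naming scheme in the question, the pseudocode could be fleshed out roughly as follows. This is only a sketch under stated assumptions: the `Field` struct and the `kSortFields` table are hypothetical names introduced here, the offsets assume every name follows the YYDDDTTTT_Z_SITE layout exactly (at least 16 characters), and the range-based for loop assumes a C++11 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper type: position and width of one field inside
// a name laid out as YYDDDTTTT_Z_SITE.
struct Field {
    std::size_t pos;
    std::size_t len;
};

// Fields in sort priority order: ZONE, SITE, YY, DDD, TTTT.
static const Field kSortFields[] = {
    {10, 1},  // ZONE
    {12, 4},  // SITE
    {0, 2},   // YY
    {2, 3},   // DDD
    {5, 4}    // TTTT
};

// Return true if lhs sorts strictly before rhs.
bool compareFilenames(const std::string& lhs, const std::string& rhs)
{
    // Assumption: both names are well-formed (16 characters or more).
    assert(lhs.size() >= 16 && rhs.size() >= 16);

    for (const Field& f : kSortFields) {
        // Compare the field in place, without creating temporary strings.
        const int c = lhs.compare(f.pos, f.len, rhs, f.pos, f.len);
        if (c != 0)
            return c < 0;
    }
    return false;  // All fields equal: neither name comes first.
}
```

It can then be passed to the sort as std::sort(fileNames.begin(), fileNames.end(), compareFilenames). Changing the priority order only means reordering the table.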

Given the strict file naming structure you have, though, and your specific sorting needs, you can actually get away with just splitting the string at the first '_' character and doing a lexicographical compare of the second chunk, followed by a compare of the first chunk if the second chunks are equal. This works because every field is fixed-width and zero-padded, so plain string order matches the ZONE, SITE, YY, DDD, TTTT order. That will make the code to parse the filename much easier, at the potential cost of flexibility if the file naming format ever changes.
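A minimal sketch of that shortcut, assuming every name contains at least one '_' and that all fields are fixed-width and zero-padded as in the question. The function name `compareBySuffixThenPrefix` is made up for illustration.

```cpp
#include <cassert>
#include <string>

// Return true if lhs sorts strictly before rhs, comparing the part after
// the first '_' ("Z_SITE") first and the part before it ("YYDDDTTTT") second.
bool compareBySuffixThenPrefix(const std::string& lhs, const std::string& rhs)
{
    const std::string::size_type l = lhs.find('_');
    const std::string::size_type r = rhs.find('_');
    // Assumption: both names contain the separator.
    assert(l != std::string::npos && r != std::string::npos);

    // Second chunk: everything after the first '_'.
    const int tail = lhs.compare(l + 1, std::string::npos,
                                 rhs, r + 1, std::string::npos);
    if (tail != 0)
        return tail < 0;

    // Second chunks are equal: fall back to the first chunk.
    return lhs.compare(0, l, rhs, 0, r) < 0;
}
```

The trade-off is the one already mentioned: there is no per-field parsing, so the comparison silently depends on the fixed-width layout.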

Upvotes: 2

Larry Wang

Reputation: 1006

You could use qsort with your own string comparison function that implements your sorting rules, passing the address of the vector's first element where it asks for an array.
http://www.cplusplus.com/reference/clibrary/cstdlib/qsort/

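But you shouldn't: qsort moves elements around as raw bytes, which is not safe for non-trivial types such as std::string. Just use std::sort.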

Upvotes: 0

Gustavo V

Reputation: 152

Use std::sort and implement a comparison class (a function object whose operator() takes two strings).

See http://www.cplusplus.com/reference/stl/list/sort/ for further details.
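One possible shape for such a comparison class is sketched below. `FileNameLess` and its `key` helper are hypothetical names, the offsets assume well-formed YYDDDTTTT_Z_SITE names, and `std::tuple` assumes C++11. Because YY, DDD and TTTT are adjacent, fixed-width and zero-padded, the sketch compares them as a single nine-character chunk.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical comparison class: orders names by ZONE, then SITE,
// then the date/time prefix (YY, DDD, TTTT taken together).
struct FileNameLess {
    bool operator()(const std::string& lhs, const std::string& rhs) const
    {
        // Tuples compare element by element, left to right.
        return key(lhs) < key(rhs);
    }

private:
    static std::tuple<std::string, std::string, std::string>
    key(const std::string& name)
    {
        // Assumption: the name is well-formed (16 characters or more).
        assert(name.size() >= 16);
        return std::make_tuple(name.substr(10, 1),  // ZONE
                               name.substr(12, 4),  // SITE
                               name.substr(0, 9));  // YYDDDTTTT
    }
};
```

An instance is passed as the third argument, for example std::sort(fileNames.begin(), fileNames.end(), FileNameLess()). The same class should also work with std::list::sort or as the ordering of a std::set.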

Upvotes: 1

codymanix

Reputation: 29490

The sort predicate that you pass to std::sort() can build reordered temporary copies of the two strings (zone and site first, then year, day and time) and compare those copies.
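As a rough illustration of that idea, the sketch below moves the "Z_SITE" tail in front of the "YYDDDTTTT" head and compares the results. `makeSortKey` and `compareByKey` are invented names, and the offsets assume well-formed YYDDDTTTT_Z_SITE names.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: "100011200_E_HILL" becomes "E_HILL100011200",
// so that plain string order equals ZONE, SITE, YY, DDD, TTTT order.
std::string makeSortKey(const std::string& name)
{
    // Assumption: the name is well-formed (16 characters or more).
    assert(name.size() >= 16);
    return name.substr(10) + name.substr(0, 9);
}

// Return true if lhs sorts strictly before rhs.
bool compareByKey(const std::string& lhs, const std::string& rhs)
{
    // Two temporary keys are built on every call.
    return makeSortKey(lhs) < makeSortKey(rhs);
}
```

The predicate is short, but it allocates temporaries for every comparison, so for large vectors it may be worth computing each key once and sorting on the stored keys instead.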

Upvotes: 0

KLee1

Reputation: 6178

Just write a function that compares two filenames according to your criteria to determine which one comes first, then use any standard sorting routine.

Upvotes: 1

Cogwheel

Reputation: 23217

Use std::sort with a comparison function.

(The link has a nice example)

Upvotes: 7
