PentiumPro200
PentiumPro200

Reputation: 631

C++ efficient parse

I am programming some automated test equipment (ATE) and I'm trying to extract the following values out of an example response from the ATE:

DCRE? 1, 
DCRE P, 10.3, (pin1)
DCRE F, 200.1, (pin2)
DCRE P, 20.4, (pin3)

From each line, I only care about the pin and the measured result value. So for the case above, I want to store the following pieces of information in a map<std::string, double> results;

results["pin1"] = 50.3;
results["pin2"] = 30.8;
results["pin3"] = 70.3;

I made the following code to parse the response:

void parseResultData(map<Pin*, double> &pinnametoresult, string &datatoparse) {
    char *p = strtok((char*) datatoparse.c_str(), " \n");
    string lastread;
    string current;
    while (p) {
        current = p;
        if(current.find('(') != string::npos) {
            string substring = lastread.substr(1); 
            const char* last = substring.c_str();
            double value = strtod(last, NULL);
            unsigned short number = atoi(current.substr(4, current.size()-2).c_str());
            pinnametoresult[&pinlookupmap[number]] = value;
        }
        lastread = p;
        p = strtok(NULL, " \n");
    }
}

It works, but it's not very efficient. Is there a way to make the function more efficient for this specific case? I don't care about the DCRE or P/F value on each line. I thought about using Boost regex library, but not sure if that would be more efficient.

Upvotes: 0

Views: 132

Answers (3)

PentiumPro200
PentiumPro200

Reputation: 631

Here's my final result. Does not involve any copying, but it will destroy the string.

void parseResultData3(map<std::string, double> &pinnametoresult, std::string &datatoparse) {
    char* str = (char*) datatoparse.c_str();
    int length = datatoparse.size();
    double lastdouble = 0.0;
    char* startmarker = NULL; //beginning of next pin to parse

    for(int pos = 0; pos < length; pos++, str++) {
        if(str[0] == '(') {
            startmarker = str + 1;
            //get previous value
            bool triggered = false;
            for(char* lookback = str - 1; ; lookback--) {
                if(!triggered && (isdigit(lookback[0]) || lookback[0] == '.')) {
                    triggered = true;
                    *(lookback + 1) = '\0';
                }
                else if(triggered && (!isdigit(lookback[0]) && lookback[0] != '.')) {
                    lastdouble = strtod(lookback, NULL);
                    break;
                }
            }
        }
        else if(startmarker != NULL) {
            if(str[0] == ')') {
                str[0] = '\0';
                pinnametoresult[startmarker] = lastdouble;
                startmarker = NULL;
            }
            if(str[0] == ',') {
                str[0] = '\0';
                pinnametoresult[startmarker] = lastdouble;
                startmarker = str + 1;
            }
        }
    }
}

Upvotes: 0

Mustafa Ozturk
Mustafa Ozturk

Reputation: 811

This is what I did:

#include <cstdlib>
#include <string>
#include <vector>
#include <unordered_map>
#include <sstream>
#include <iostream>

using namespace std;

struct Pin {
    string something; 
    Pin() {}
};

vector<Pin*> pins = { new Pin(), new Pin(), new Pin() };
typedef unordered_map<Pin*, double> CONT_T;

inline bool OfInterest(const string& line) {
    return line.find("(") != string::npos;
}

void parseResultData(CONT_T& pinnametoresult, const string& datatoparse)
{
    istringstream is(datatoparse);
    string line;
    while (getline(is, line)) {
        if (OfInterest(line)) {
            double d = 0.0;
            unsigned int pinid;
            size_t firstComma = line.find(",")+2; // skip space
            size_t secondComma = line.find(",", firstComma);                        
            istringstream is2(line.substr(firstComma, secondComma-firstComma));            
            is2 >> d;
            size_t paren = line.find("(")+4; // skip pin            
            istringstream is3(line.substr(paren, (line.length()-paren)-1));
            is3 >> pinid;
            --pinid;
            Pin* pin = pins[pinid];
            pinnametoresult[pin] = d;            
        }
    }
}



/*
 * 
 */
int main(int argc, char** argv) {
    string datatoparse = "DCRE? 1, \n"
            "DCRE P, 10.3, (pin1)\n"
            "DCRE F, 200.1, (pin2)\n"
            "DCRE P, 20.4, (pin3)\n";


    CONT_T results;
    parseResultData(results, datatoparse);

    return 0;
}

Upvotes: 1

Ulrich Eckhardt
Ulrich Eckhardt

Reputation: 17415

In order to make this a bit more efficient, try to avoid copying. In particular, calls to substring, assignments etc can cause havoc on the performance. If you look at your code, you will see that the content of datatoparse are repeatedly assigned to lastread and current, each time with one line less at the beginning. So, on average you copy half of the original string times the number of lines, making just that part an O(n^2) algorithm. This isn't relevant if you have three or four line (not even on 100 lines!) but if you have a few more, performance degrades rapidly.

Try this approach instead:

string::size_type p0 = 0;
string::size_type p1 = input.find('\n', p0);
while (p1 != string::npos) {
    // extract the line
    string line = input.substr(p0, p1 - p0);

    // move to the next line
    p0 = p1 + 1;
    p1 = input.find('\n', p0);
}

Notes:

  • Note that the algorithm still copies all input once, but each line only once, making it O(n).
  • Since you have a copy of the line, you can insert '\0' as artificial separator in order to give a substring to e.g. atoi() or strtod().
  • I'm not 100% sure of the order of parameters for string::find() and too lazy to look it up, but the idea is to start searching at a certain position. Look at the various overloads of find-like functions.
  • When handling a line, search the indices of the parts you need and then extract and parse them.
  • If you have line fragments (i.e. a partial line without a newline) at the end, you will have to modify the loop slightly. Create tests!

Upvotes: 2

Related Questions