Maroš Beťko

Reputation: 2329

Fixing wrong JSON data

I'm working with JSON data that I download from the web. The problem with this JSON is that its contents are incorrect. Here is a simplified preview that shows the problem:

[
  {
    "id": 0,
    "name": "adsad"
  },
  {
    "id": "123",
    "name": "aawew"
  }
]

So there is an array of these items, where the value of "id" is sometimes a string and sometimes an integer. This is the data I get, and I can't make the source fix it.

The solution I came up with was to fix this data before deserializing it. Here is my naive algorithm, where Defaults::intTypes() is a vector of all keys that should be integers but are sometimes strings:

void fixJSONData(QString& data) {
    qDebug() << "Fixing JSON data ( thread: " << QThread::currentThreadId() << ")";
    QElapsedTimer timer;
    timer.start();

    for (int i = 0; i < data.size(); ++i) {
        for (const auto& key : Defaults::intTypes()) {
            if (data.mid(i, key.size() + 3) == "\"" + key + "\":") {
                int newLine = i + key.size() + 3;

                while (data[newLine] != ',' && data[newLine] != '}') {
                    if (data[newLine] == '"') {
                        data.remove(newLine, 1);
                    } else {
                        ++newLine;
                    }
                }

                i = newLine;
                break;
            }
        }
    }
    qDebug() << "Fixing done in " << timer.elapsed() << " ms.";
}

It does fix the problem, but the algorithm is far too slow: it went through 4.5 million characters in 390 seconds. How could this be done faster?

P.S.: For JSON serialization I use the nlohmann::json library.

Edit: After reading up on the JSON rules, it looks like the example above is perfectly valid JSON. Is the issue rather that C++ is strongly typed, so an array whose elements differ in type can't be deserialized directly into C++ classes?

Edit2: What I would like to create from that JSON string is a QVector<Model>, where:

class Model {
    unsigned id;
    QString name;
};

Upvotes: 2

Views: 989

Answers (1)

Selindek

Reputation: 3423

Although there must be several ways to improve this conversion, there may be a much more effective solution.

Most JSON libraries allow the end user to define a custom serializer/deserializer for an object. If you create a custom deserializer, it can parse the original data directly, and you don't have to modify the stream or files.

It's not only faster but also more elegant.

(If the given JSON library doesn't support custom deserialization, I would consider choosing another one.)
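
If the text has to be rewritten before parsing anyway, the loop in the question can probably be sped up a lot. A likely cost (an assumption, not something measured here) is that data.mid() and the string concatenation create temporaries for every position and key, and that every data.remove() shifts the rest of the string. The sketch below does the same quote stripping in a single pass into a new buffer. It uses std::string rather than QString, and unquoteIntValues is a hypothetical name.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Copies `data`, dropping the quotes around the values of the given keys.
// Same assumptions as the original loop: the key is written as "key": with
// no space before the colon, and the value contains no ',' or '}'.
std::string unquoteIntValues(std::string_view data,
                             const std::vector<std::string>& keys) {
    // Build the `"key":` patterns once instead of at every position.
    std::vector<std::string> patterns;
    for (const auto& k : keys) {
        patterns.push_back('"' + k + "\":");
    }

    std::string out;
    out.reserve(data.size());

    std::size_t i = 0;
    while (i < data.size()) {
        bool matched = false;
        if (data[i] == '"') {  // a key can only start at a quote
            for (const auto& p : patterns) {
                if (data.compare(i, p.size(), p) == 0) {
                    out.append(p);
                    i += p.size();
                    // Copy the value up to ',' or '}', skipping quotes.
                    while (i < data.size() && data[i] != ',' && data[i] != '}') {
                        if (data[i] != '"') out.push_back(data[i]);
                        ++i;
                    }
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) {
            out.push_back(data[i]);
            ++i;
        }
    }
    return out;
}
```

Each character is copied at most once, so the work grows with the input size times the number of keys, and keys are only compared at positions that start with a quote. It shares the original's weakness that a matching pattern inside a string value would also be rewritten, which is one more reason to prefer the custom deserializer.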

Upvotes: 3
