Domooo93

Reputation: 65

Qt: reading selected information from a text file

I have a .txt file that I need to read. The file contains data about cities: their longitude, latitude, and some other fields.

This is the data format:

DE  01945   **Tettau**  Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   **51.4333   13.7333**

DE  01968   **Schipkau Hörlitz**    Brandenburg BB      00  Landkreis Oberspreewald-Lausitz 12066   **51.5299   13.9508**

...

Every line of the file describes one city, but only the bold information matters to me (name, latitude, longitude). All in all there are 16k lines in the file. Can you please explain how I can get this information?

QFile file("path");
if (!file.open(QIODevice::ReadOnly | QIODevice::Text))
    return; // the file could not be opened
QTextStream in(&file);
while (!in.atEnd()) {
    QString line = in.readLine();
    std::cout << line.toStdString() << std::endl;
}
file.close();

So far I can only read the whole line, but I have no idea how to get these 3 pieces of information from each line. I created a class "City" with three members: _name, _longitude, _latitude. I then wanted to create a vector to store every city in. Is this method efficient? More importantly, please tell me how I can read these 3 bold fields from every line, because I have no idea how to do it. (I thought of iterating through every character of the string and searching for tabs, but that took very long.) I would be really happy if you could show me a fast way to do it. The program is developed in Qt with C++.

PS: I also noticed the problem that some city names consist of 2 words, separated by a space.

Upvotes: 2

Views: 69

Answers (2)

eyllanesc

Reputation: 243927

The file you have is a tab-separated values (TSV) file, so the logic is to read each line, split it on the tab character, and then pick the elements you need, as shown below:

#include <QFile>
#include <QStringList>
#include <QTextStream>

#include <iostream>
#include <string>
#include <vector>

struct CityData
{
    std::string city;
    float latitude;
    float longitude;
};

int main()
{
    QFile file("/path/of/DE.txt");
    if(!file.open(QFile::ReadOnly | QFile::Text))
        return -1;

    QTextStream stream(&file);
    QString line;

    std::vector<CityData> datas;

    while (stream.readLineInto(&line)) {
        // Tab-separated fields: 2 = place name, 9 = latitude, 10 = longitude
        QStringList elements = line.split("\t");
        if (elements.size() < 11)
            continue; // skip lines that do not have enough fields

        CityData data{elements[2].toStdString(),
                      elements[9].toFloat(),
                      elements[10].toFloat()};
        datas.push_back(data);
    }
    for(const CityData & data: datas){
        std::cout<< "city: "<< data.city <<"\t" << "latitude: "<< data.latitude <<"\t" << "longitude: "<<data.longitude<<"\n";
    }
    return 0;
}

Output:

city: Tettau    latitude: 51.4333   longitude: 13.7333
city: Guteborn  latitude: 51.4167   longitude: 13.9333
city: Hermsdorf latitude: 51.4055   longitude: 13.8937
city: Grünewald latitude: 51.4  longitude: 14
city: Hohenbocka    latitude: 51.431    longitude: 14.0098
city: Lindenau  latitude: 51.4  longitude: 13.7333
city: Ruhland   latitude: 51.4576   longitude: 13.8664
city: Schwarzbach   latitude: 51.45 longitude: 13.9333
city: Kroppen   latitude: 51.3833   longitude: 13.8
city: Schipkau Hörlitz  latitude: 51.5299   longitude: 13.9508
city: Senftenberg   latitude: 51.5252   longitude: 14.0016
city: Schipkau  latitude: 51.5456   longitude: 13.9121
...

With this kind of data set you should read the accompanying readme.txt:

...

The data format is tab-delimited text in utf8 encoding, with the following fields :

country code      : iso country code, 2 characters
postal code       : varchar(20)
place name        : varchar(180)
admin name1       : 1. order subdivision (state) varchar(100)
admin code1       : 1. order subdivision (state) varchar(20)
admin name2       : 2. order subdivision (county/province) varchar(100)
admin code2       : 2. order subdivision (county/province) varchar(20)
admin name3       : 3. order subdivision (community) varchar(100)
admin code3       : 3. order subdivision (community) varchar(20)
latitude          : estimated latitude (wgs84)
longitude         : estimated longitude (wgs84)
accuracy          : accuracy of lat/lng from 1=estimated to 6=centroid
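The field order in the readme fixes the column positions: the place name is the third field (index 2), the latitude is index 9 and the longitude is index 10. As a rough sketch of the same idea in plain standard C++ (so it compiles without Qt), the following splits on tabs, keeps empty fields so the positions stay stable, and rejects lines that are too short. The names `Field`, `City`, `splitTabs` and `parseCity` are hypothetical and not part of the data set or of Qt. `std::strtod` follows the current C locale, so this assumes a locale that uses "." as the decimal point.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Zero-based column positions, taken from the field order in the readme.
enum Field : std::size_t { PlaceName = 2, Latitude = 9, Longitude = 10 };

struct City {
    std::string name;
    double latitude;
    double longitude;
};

// Split on tabs and keep empty fields, so column positions stay stable.
std::vector<std::string> splitTabs(const std::string& line)
{
    std::vector<std::string> fields;
    std::size_t start = 0;
    for (;;) {
        const std::size_t tab = line.find('\t', start);
        if (tab == std::string::npos) {
            fields.push_back(line.substr(start));
            return fields;
        }
        fields.push_back(line.substr(start, tab - start));
        start = tab + 1;
    }
}

// Returns false for lines that are too short or whose coordinates
// do not start with a number.
bool parseCity(const std::string& line, City& out)
{
    const std::vector<std::string> f = splitTabs(line);
    if (f.size() <= Longitude)
        return false;

    char* end = nullptr;
    const double lat = std::strtod(f[Latitude].c_str(), &end);
    if (end == f[Latitude].c_str())
        return false;
    const double lon = std::strtod(f[Longitude].c_str(), &end);
    if (end == f[Longitude].c_str())
        return false;

    out = City{f[PlaceName], lat, lon};
    return true;
}
```

Because the split is on tabs only, a two-word place name such as "Schipkau Hörlitz" stays in a single field.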

Upvotes: 1

Aziuth

Reputation: 3902

Essentially, you only need to split your line at the delimiter:

QStringList delimited = line.split(" ");
QString town = delimited[2];

This yields Tettau or Schipkau in your example; the other items work the same way.

That said, I'm not sure about the "Schipkau Hörlitz" entry in your example. I assume it is the name of a single town, or of a quarter of a town, with a composite name. How to handle it depends on your format. One option is to start at index 2 and append whatever follows as long as it is not the name of a German state. Of course, this will then only work for Germany. You could also look for the next index that contains only digits, "00" in your example, and work back from that one. Again, it depends on your format, and I hope I gave you enough to work with.

Might look like:

QStringList delimited = line.split(" ");
QString town = delimited[2];
size_t pos = 3;
while(not is_german_state(delimited[pos]))
{
    town += " "  + delimited[pos];
    pos++;
}
// In the sample lines the latitude comes before the longitude
QString latitude = delimited[pos+6];
QString longitude = delimited[pos+7];

(Note that I did not handle lines that are not properly formatted. For such a line, delimited[pos] or the indices used for longitude and latitude may be out of range, which can crash the program.)
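The other option mentioned above, working back from the first all-digit token, could be sketched roughly as follows, with the range checks included. This is plain standard C++ rather than Qt, and `splitWhitespace`, `isAllDigits` and `townName` are hypothetical helper names. It assumes the layout of the sample lines: country code, postal code, the words of the town name, a state name that is a single token, the state code, and then the first all-digit token. It would misfire if a town name itself contained an all-digit word.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split on any run of whitespace (spaces or tabs); empty tokens are dropped.
std::vector<std::string> splitWhitespace(const std::string& line)
{
    std::istringstream in(line);
    std::vector<std::string> tokens;
    std::string token;
    while (in >> token)
        tokens.push_back(token);
    return tokens;
}

bool isAllDigits(const std::string& s)
{
    if (s.empty())
        return false;
    for (unsigned char c : s)
        if (!std::isdigit(c))
            return false;
    return true;
}

// Finds the first all-digit token after the postal code and works back:
// the two tokens before it are taken to be the state name and state code,
// everything from index 2 up to those is the town name.
// Returns false if the line does not fit that assumed layout.
bool townName(const std::vector<std::string>& tokens, std::string& town)
{
    const std::size_t first = 2; // first word of the town name
    for (std::size_t i = first; i < tokens.size(); ++i) {
        if (!isAllDigits(tokens[i]))
            continue;
        if (i < first + 3) // need a town word, a state name and a state code
            return false;
        town = tokens[first];
        for (std::size_t k = first + 1; k + 2 < i; ++k)
            town += " " + tokens[k];
        return true;
    }
    return false; // no all-digit token found
}
```

Unlike the state-name check, this does not need a list of German states, but it still relies on the state name being a single token.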

After that you store it in some way, like having a vector<TownData> with a structure TownData that stores the data you need, and in each iteration, you append to the vector. I assume that how to do that is clear, but ask if it isn't.

In Qt, in general, it pays to look at the classes you are currently using. In this case, QString, which has a lot of functionality.

Since a vector moves or copies its elements to new storage whenever it outgrows its capacity, and you asked about efficiency in particular, it would be a good idea to reserve enough space for the vector before you enter the loop. I'm not aware of any way to get the number of lines in a file without actually iterating through them, so you either need to count them once before you process the data, or you need some estimate, such as deriving the line count from the file size or simply assuming 16k. Then call vector::reserve(size_type n) on your vector. That said, 16k lines does not sound like much, so this may be premature optimization. I'd probably first go without the reservation and simply see whether it runs smoothly as it is.
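If you do want to reserve up front, estimating the line count from the file size might look roughly like this. It is a sketch in plain standard C++. The helpers `estimateLines`, `fileSize` and `makeTownStorage` are hypothetical names, and the figure of 90 bytes per line is only a guess based on the sample lines.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

struct TownData {
    std::string name;
    double latitude;
    double longitude;
};

// Rough line-count estimate: file size divided by an assumed average line length.
std::size_t estimateLines(std::size_t fileSizeBytes, std::size_t avgLineBytes)
{
    if (avgLineBytes == 0)
        return 0;
    return fileSizeBytes / avgLineBytes + 1;
}

// Returns the file size in bytes, or 0 if the file cannot be opened.
std::size_t fileSize(const std::string& path)
{
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in)
        return 0;
    const std::streamoff size = in.tellg();
    return size < 0 ? 0 : static_cast<std::size_t>(size);
}

// Creates an empty vector with capacity reserved for the estimated line count.
std::vector<TownData> makeTownStorage(const std::string& path)
{
    std::vector<TownData> towns;
    // 90 bytes per line is a guess; an estimate that is too low only costs
    // a few reallocations later, it does not affect correctness.
    towns.reserve(estimateLines(fileSize(path), 90));
    return towns;
}
```

The vector still grows on its own if the estimate turns out too small, so the estimate only needs to be in the right range.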

Upvotes: 1
