Reputation: 3628
I have a large XML file and want to save each id, source, and target in a string list. Once the import into the string lists has succeeded, I want to build a MySQL query from them.
Here's a snippet of my XML:
<xliff version="1.1">
<file original="Xliff Demo" source-language="EN" target-language="DE" datatype="html">
<external-file uid="017dbcf0-c82c-11e2-ba2b-005056c00008" href="skl\simple.htm.skl"/>
<trans-unit id="00ffmnpB5wBV5KFqBxuHLi4fwJvvuB">
<source xml:lang="EN">1lnRUfBBeHtbS96uULSht42VNMN7XE4qt9JrOcWhtoTuhnbAQ9</source>
<target xml:lang="DE">zZvOLJfLCy9oP5GQYfEqw5LAeC2ESAxRmVe1JyQdmJ1eG2jz1N</target>
<trans-unit id="00kjUwy1rJ54bEGYp7XZvtBiY32pmj">
<source xml:lang="EN">HXOQLUWkfJg206vRw8lyWhCWChOacVxbMukfQ0HUdNHSI18GG4</source>
<target xml:lang="DE">8dsX38mezeZ0w0w37LI66CDRuI8gBD23zT5KR4iqYNv3IGUgH0</target>
<trans-unit id="00kk3Af8SFpHyelAaYrgK58b9GbIDj">
<source xml:lang="EN">wQFxZiCiRsSNWs20G4WXAmDBRdRL6fcrrJnCgtbiXGSfHzpYrT</source>
<target xml:lang="DE">oFVTUdPkExOhISYofIImLsnVKd3NSZg32tyeP5iRxRZdmuYQDy</target>
<trans-unit id="00Ky2dmDU9wGTWBnJxeL9b9gkts5UQ">
<source xml:lang="EN">nHQcjAW02lWe0SyOhqGtyqUhpwQ8qgWX3rUynMRf4BDHfVdHOC</source>
<target xml:lang="DE">0CURp1dcZydB1V2rEZ1lnOhmYufOYbrLbh84e1ZnALlzZPVq4F</target>
<trans-unit id="00pMSFlBfA3bJ8Xy9I78wz6XisPYcV">
<source xml:lang="EN">IuhtaVnZtF67nxKz5dbmuy8BEMTs2X1120FzDtIplKF2Me5AsQ</source>
<target xml:lang="DE">1BGSJQDZBm4UW974pucnX3XHuYOQYpC7nTcIH01rbKlOkVi9bo</target>
<trans-unit id="012w2kb2d1Lo6NbJLE0BawThzsSuCJ">
<source xml:lang="EN">0RoniOGZ7V7WTF1YQg59B8jBhRxnLVXscC1LOGPzKPYRs76oIz</source>
<target xml:lang="DE">gyw15fkHTni2aUGWI5qiPHEz8vsJJJsW4OOqKwGYL1qzfUVfLO</target>
So I am trying to save each trans-unit id, source xml:lang="EN", and target xml:lang="DE" in a separate string list, but only the values.
This is my code:
{ ----------- Import Procedure ------------ }
procedure TForm2.Button2Click(Sender: TObject);
var
  xmlFile, idList, sourceList, targetList: TStringList; // xmlFile is the string list the XML file is read into
  i: Integer;
  id, source, target: String;
  idTmp, idTmp2, sourceTmp, sourceTmp2, targetTmp, targetTmp2: Integer;
begin
  xmlFile := TStringList.Create;
  idList := TStringList.Create;
  sourceList := TStringList.Create;
  targetList := TStringList.Create;
  if OpenDialog1.Execute then
    xmlFile.LoadFromFile(OpenDialog1.FileName);
  //ShowMessage(IntToStr(XmlFile.Count)); // show the number of lines
  //ShowMessage(XmlFile[8]); // show line 8
  for i := 0 to xmlFile.Count-1 do // go over all lines of the string list and do the following:
  begin // code per line
    idTmp := Pos('<trans-unit id="', xmlFile.Strings[i])+16; // searches for trans-unit id (16 is the length of the search string)
    if idTmp > 5 then // check whether something was found (not equal to 0)
    begin
      idTmp2 := Pos('"', xmlFile.Strings[i], idTmp); // finds the position of the end of the value (")
      idList.Add(Copy(xmlFile.Strings[i], idTmp, idTmp2-idTmp));
    end;
    sourceTmp := Pos('<source xml:lang="EN">', xmlFile.Strings[i])+22;
    if sourceTmp > 5 then // check whether something was found (not equal to 0)
    begin
      sourceTmp2 := Pos('<', xmlFile.Strings[i], sourceTmp); // finds the position of the end of the value (<)
      sourceList.Add(Copy(xmlFile.Strings[i], sourceTmp, sourceTmp2-sourceTmp));
    end;
    targetTmp := Pos('<target xml:lang="DE">', xmlFile.Strings[i])+22;
    if targetTmp > 5 then // check whether something was found (not equal to 0)
    begin
      targetTmp2 := Pos('<', xmlFile.Strings[i], targetTmp); // finds the position of the end of the value (<)
      targetList.Add(Copy(xmlFile.Strings[i], targetTmp, targetTmp2-targetTmp));
    end;
  end;
  ShowMessage('Import into string lists completed.');
end;
But it's not working the way I want. My problem is that it also saves empty lines and other garbage in the string lists. I can't find my error, and this is the first time I'm using the Copy/Pos functions.
What should I change to fix my problem and save only the correct strings in my three string lists?
Upvotes: 0
Views: 554
Reputation: 31433
Here:
idTmp := Pos('<trans-unit id="', xmlFile.Strings[i])+16;
if idTmp > 5 then
idTmp will always be greater than 5: you add 16 to the result no matter what, and Pos never returns a negative value (it returns zero if there is no match). On a line without a match, idTmp is therefore 16 and the check still passes.
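To make the point above concrete, here is a small console sketch. The program name and the sample line are made up for illustration; only Pos itself comes from the RTL.

```pascal
program PosCheckDemo;

{$APPTYPE CONSOLE}

const
  Marker = '<trans-unit id="';   // 16 characters long

var
  Line: string;
  idTmp: Integer;
begin
  // A sample line that does not contain the marker at all.
  Line := '<source xml:lang="EN">some text</source>';

  // Original approach: Pos returns 0 here, so idTmp becomes 16.
  idTmp := Pos(Marker, Line) + 16;
  Writeln(idTmp > 5);   // the check passes although nothing was found

  // Testing the raw result of Pos distinguishes "found" from "not found".
  idTmp := Pos(Marker, Line);
  Writeln(idTmp > 0);   // the check fails, as it should for this line
end.
```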
The simplest change here would be:
idTmp := Pos('<trans-unit id="', xmlFile.Strings[i]);
if idTmp > 0 then begin // Pos returns 0 if no match is found
  idTmp := idTmp + 16;
  idTmp2 := PosEx('"', xmlFile.Strings[i], idTmp);
  idList.Add(Copy(xmlFile.Strings[i], idTmp, idTmp2-idTmp));
end;
The change for the other two blocks would follow in a similar way.
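Since all three blocks follow the same pattern, one way to avoid repeating it is a small helper. This is only a sketch: TryExtractBetween is a hypothetical name, not an RTL function, and the sample lines are copied from the question's snippet. It assumes each value starts and ends on the same line, and that the unit names Classes and StrUtils resolve in your project (newer Delphi versions may need the System. unit scope prefix).

```pascal
program ExtractDemo;

{$APPTYPE CONSOLE}

uses
  Classes, StrUtils; // StrUtils provides PosEx

// Hypothetical helper: returns True and the text between Prefix and the next
// Terminator if both occur in Line (in that order); returns False otherwise.
function TryExtractBetween(const Line, Prefix, Terminator: string;
  out Value: string): Boolean;
var
  StartPos, EndPos: Integer;
begin
  Value := '';
  Result := False;
  StartPos := Pos(Prefix, Line);
  if StartPos = 0 then
    Exit;                                   // prefix not on this line
  StartPos := StartPos + Length(Prefix);    // first character of the value
  EndPos := PosEx(Terminator, Line, StartPos);
  if EndPos = 0 then
    Exit;                                   // value is not closed on this line
  Value := Copy(Line, StartPos, EndPos - StartPos);
  Result := True;
end;

var
  Lines, idList, sourceList, targetList: TStringList;
  i: Integer;
  Value: string;
begin
  Lines := nil;
  idList := nil;
  sourceList := nil;
  targetList := nil;
  try
    Lines := TStringList.Create;
    idList := TStringList.Create;
    sourceList := TStringList.Create;
    targetList := TStringList.Create;

    // Sample data taken from the question's snippet.
    Lines.Add('<trans-unit id="00ffmnpB5wBV5KFqBxuHLi4fwJvvuB">');
    Lines.Add('<source xml:lang="EN">1lnRUfBBeHtbS96uULSht42VNMN7XE4qt9JrOcWhtoTuhnbAQ9</source>');
    Lines.Add('<target xml:lang="DE">zZvOLJfLCy9oP5GQYfEqw5LAeC2ESAxRmVe1JyQdmJ1eG2jz1N</target>');

    for i := 0 to Lines.Count - 1 do
    begin
      // Each list only receives a value when its prefix was actually found.
      if TryExtractBetween(Lines[i], '<trans-unit id="', '"', Value) then
        idList.Add(Value);
      if TryExtractBetween(Lines[i], '<source xml:lang="EN">', '<', Value) then
        sourceList.Add(Value);
      if TryExtractBetween(Lines[i], '<target xml:lang="DE">', '<', Value) then
        targetList.Add(Value);
    end;

    // Print how many entries ended up in each list.
    Writeln(idList.Count, ' ', sourceList.Count, ' ', targetList.Count);
  finally
    // Free is safe on nil, so this also works if a Create call failed.
    targetList.Free;
    sourceList.Free;
    idList.Free;
    Lines.Free;
  end;
end.
```

Using Length(Prefix) instead of the hard-coded 16 and 22 also removes the need to count characters by hand.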
You'll notice that I used StrUtils.PosEx here for idTmp2. I don't know how your code compiled using Pos for the second call...
OK, it looks like Pos was changed in XE3 to include overloads that take an offset. If performance is your objective here (as it seems from the comments), you should probably have a read of this:
Additionally, and I think this is probably quite important, this really is a terrible way to parse XML. I highly suggest you read through some source code from projects that do this already to get a better understanding of how you should approach the problem. Some examples might be:
Upvotes: 3
Reputation: 101
Maybe you should think about using the IXMLDocument interface to load the XML File into a data structure and fill your stringlists afterwards.
An example has been posted here:
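Independent of that link, a minimal sketch of the idea could look like the unit below. The unit name XliffImport and the procedures ImportXliff and CollectTransUnits are hypothetical names chosen for illustration; LoadXMLDocument, IXMLNode, and IXMLDocument come from XMLDoc and XMLIntf. The sketch assumes that the file is well-formed XML (the abbreviated snippet in the question is not), that the elements carry no namespace prefix, that source and target contain plain text only, and that it is called from a VCL application where COM is already initialized for the default MSXML vendor.

```pascal
unit XliffImport;

interface

uses
  Classes;

// Hypothetical entry point: fills the three lists from a well-formed XLIFF file.
procedure ImportXliff(const FileName: string;
  idList, sourceList, targetList: TStrings);

implementation

uses
  XMLIntf, XMLDoc;

// Walks the node tree recursively and collects id, source, and target
// from every trans-unit element that has all three.
procedure CollectTransUnits(const Node: IXMLNode;
  idList, sourceList, targetList: TStrings);
var
  i: Integer;
  Child, SourceNode, TargetNode: IXMLNode;
begin
  for i := 0 to Node.ChildNodes.Count - 1 do
  begin
    Child := Node.ChildNodes[i];
    if Child.NodeType <> ntElement then
      Continue;                         // skip text, comments, and so on
    if Child.NodeName = 'trans-unit' then
    begin
      SourceNode := Child.ChildNodes.FindNode('source');
      TargetNode := Child.ChildNodes.FindNode('target');
      // Add to all three lists together so the indexes stay aligned.
      if Child.HasAttribute('id') and (SourceNode <> nil) and
         (TargetNode <> nil) then
      begin
        idList.Add(Child.Attributes['id']);
        sourceList.Add(SourceNode.Text);
        targetList.Add(TargetNode.Text);
      end;
    end
    else
      CollectTransUnits(Child, idList, sourceList, targetList);
  end;
end;

procedure ImportXliff(const FileName: string;
  idList, sourceList, targetList: TStrings);
var
  Doc: IXMLDocument;
begin
  // The interface reference is released automatically when Doc goes out of scope.
  Doc := LoadXMLDocument(FileName);
  CollectTransUnits(Doc.DocumentElement, idList, sourceList, targetList);
end;

end.
```

From the button handler in the question, this could be called as ImportXliff(OpenDialog1.FileName, idList, sourceList, targetList) after the three lists have been created.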
Upvotes: 6