user3802199

Reputation: 41

How to decode/encode XML?

I'm requesting an XML document and then need to parse some data from it. Sometimes the document contains an unusual symbol (an example is given below).

If this symbol is in the document, the data is not stored in my string variable...

Sorry for my bad English...

var
  response: string;
begin
  response := IdHTTP1.GET('http://site.com/document.xml');
  // parsing data...
end;

How can I decode/encode this symbol? As an example, an online decoder/encoder shows the symbol as 𞉀 ('&#123456') (the numbers change all the time in the XML document), but how do I encode/decode it in Delphi?

Upvotes: 0

Views: 1797

Answers (1)

Remy Lebeau

Reputation: 595369

XML is charset-sensitive, and thus transferred as charset-encoded bytes. You are downloading the XML as an AnsiString (since you are using an Ansi version of Delphi), so TIdHTTP.Get() will decode the raw bytes to Unicode and then convert that to Ansi when returning to you. That can alter/corrupt the XML content, or at least make the XML content incompatible with the XML's prolog (which Indy does not alter during these conversions).

When dealing with XML, an XML parser should be given the raw XML data exactly as the server sent it. Let the parser, not Indy, deal with the XML's original bytes. To do that, use the overloaded version of TIdHTTP.Get() that downloads to a TStream instead of returning a String. Download to a TMemoryStream and then pass it to the XMLDocument.LoadFromStream() method, e.g.:

var
  response: TMemoryStream;
begin
  response := TMemoryStream.Create;
  try
    IdHTTP1.GET('http://example.com/document.xml', response);
    response.Position := 0;
    // parsing data...
  finally
    response.Free;
  end;
end;
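
As a rough sketch of the parsing step, the stream can be handed to a TXMLDocument created in code. The helper name LoadXmlFromUrl and its parameters are made up for illustration, and the unit names assume an older Delphi (newer versions use scoped names such as Xml.XMLDoc):

uses
  Classes, IdHTTP, XMLIntf, XMLDoc;

// Hypothetical helper: downloads the raw bytes and lets the XML parser,
// not Indy, decode them according to the XML prolog.
function LoadXmlFromUrl(Http: TIdHTTP; const Url: string): IXMLDocument;
var
  response: TMemoryStream;
begin
  response := TMemoryStream.Create;
  try
    Http.Get(Url, response);              // raw bytes, no charset conversion
    response.Position := 0;               // rewind before the parser reads
    Result := TXMLDocument.Create(nil);   // no owner: lifetime follows the interface reference
    Result.LoadFromStream(response);
  finally
    response.Free;
  end;
end;

It could then be called as LoadXmlFromUrl(IdHTTP1, 'http://example.com/document.xml'). The parser resolves numeric character references such as the one in the question, and in an Ansi version of Delphi node values come back as WideString, so the character does not pass through an Ansi conversion. If the default MSXML-based DOM vendor is used outside the main VCL thread, COM has to be initialized in that thread first.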

Upvotes: 2
