Reputation: 10117
I am consuming an XML feed which contains a great deal of whitespace. When I echo out the raw feed, it looks as though the columns of the tabled data are properly formatted with just the white space.
I have tried many regex patterns to remove it, to only allow visible characters, trim, chop, utf-8 encode/decode, nothing is touching it. It's like it is laughing in my face when I echo out a value and see this:
string(17) "72"
Opened the data in Notepad++ with show all characters on, and it simply shows it as spaces. I am at a loss of where to go with this.
I did recieve the following error:
simplexml_load_string(): Entity: line 265: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xB0 0x43 0x20 0x74
Upvotes: 0
Views: 1583
Reputation: 10117
Solution
My very hacky workaround that works:
$raw = file_get_contents('http://stupidwebservice.com/xmldata.asmx/Feed');
$raw = urlencode(utf8_encode($raw));
$raw = str_replace('++','',$raw);
$raw = urldecode($raw);
urlencoding after the utf-8 encoding turned the space into +'s. I simply removed all instances of double ++'s and took it back. Works great.
Upvotes: 0
Reputation: 3189
Try running the data through utf8_encode()
- it might seem like a hack, but it seems like the originating data isn't properly setup.
My theory is that you're grabbing it with the wrong encoding, and the proper solution would be to load it differently.
Upvotes: 1
Reputation: 61
I just found this regex (untested)
$xml_data = preg_replace("/>\s+</", "><", $xml_data);
If you are using the xml parser, I think you can use the 'XML_OPTION_SKIP_WHITE' option referenced here: http://php.net/manual/en/function.xml-parser-set-option.php
Upvotes: 1