MarathonStudios

Reputation: 4321

PHP's cURL function returning bad characters

I'm attempting to retrieve a remote HTML page with cURL. However, when I analyze the text that gets returned, I'm noticing a lot of odd characters like ▀Ã, which makes me think that something went wrong with the text encoding somewhere along the line.

How can I ensure that the text I get back from cURL is properly encoded, and how can I normalize it so I can safely store results in a database without any encoding issues?

Upvotes: 1

Views: 2309

Answers (2)

big_hands

Reputation: 9

You need to include the following at the top of your page:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Upvotes: -1

Kumar

Reputation: 5147

I hope you have set CURLOPT_ENCODING to "" and that the page itself isn't full of the gibberish you're seeing. The second thing I can suggest is running the string through something like htmlentities() to sanitise it. cURL simply gets/posts the data and, IMHO, doesn't change the encoding.
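To make that concrete, here is a minimal sketch rather than a definitive fix. It assumes the garbage comes either from a compressed response that was never decoded or from a page served in a non-UTF-8 charset. The helper names fetch_page() and to_utf8() are made up for illustration, the ISO-8859-1 fallback is just a guess for pages that declare no charset, and it relies on the curl and mbstring extensions being enabled.

```php
<?php
// Hypothetical helper: fetch a URL and return [body, Content-Type header or null].
function fetch_page(string $url): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_ENCODING       => '',   // accept and decode every compression cURL supports
    ]);
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    $contentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
    return [$body, $contentType ?: null];
}

// Hypothetical helper: convert a body to UTF-8 using the charset declared in the
// Content-Type header. The fallback charset is an assumption, not something cURL reports.
function to_utf8(string $body, ?string $contentType, string $fallback = 'ISO-8859-1'): string
{
    $charset = null;
    if ($contentType !== null && preg_match('/charset=["\']?([\w.:-]+)/i', $contentType, $m)) {
        $charset = $m[1];
    }
    if ($charset === null) {
        // No declared charset: keep valid UTF-8 as-is, otherwise assume the fallback.
        $charset = mb_check_encoding($body, 'UTF-8') ? 'UTF-8' : $fallback;
    }
    // Note: a charset name that mbstring does not know raises a ValueError on PHP 8.
    return mb_convert_encoding($body, 'UTF-8', $charset);
}
```

Usage would be something like `[$body, $type] = fetch_page($url); $html = to_utf8($body, $type);`. Pages that only declare their charset in a `<meta>` tag are not handled by this sketch. For the database side, the connection and column charset also have to be UTF-8, otherwise the text can get mangled again on insert.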

Upvotes: 5
