JohnDotOwl

Reputation: 3755

Detecting Chinese Characters in HTML

Useful link for understanding encodings: http://kunststube.net/encoding/ (shared by @deceze)

I'm trying to detect Chinese characters on an HTML page but can't. When I echo the fetched HTML, I get this: "´Ë±¦±´ÒÑϼÜ". I don't need to display the text, I just need to detect whether the characters are present on the page.

    // Set the request URL
    curl_setopt($ch, CURLOPT_URL, 'http://bit.ly/1y');
    // Execute the request
    $htmlcode = curl_exec($ch);
    curl_close($ch);

    if (stripos($htmlcode, "已下架") !== false) {
        echo "True";
    } else {
        echo "Fail";
    }

Any suggestions would be greatly appreciated.

Upvotes: 0

Views: 275

Answers (1)

deceze

Reputation: 522085

The page is encoded as GBK. You probably saved your source file as UTF-8, so the literal "已下架" is UTF-8 encoded. stripos therefore will not match, since it just compares bytes and is not encoding-aware.

Either convert $htmlcode to the encoding of your file or convert "已下架" to the encoding of $htmlcode to perform string matching. Use mb_convert_encoding or iconv.
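As a rough sketch of the first option (converting $htmlcode to the encoding of the source file), something like the following should work, assuming the mbstring extension is available and the script itself is saved as UTF-8. The helper name containsUtf8Needle and the sample page text are made up for illustration; the GBK "page" is simulated locally rather than fetched with cURL.

```php
<?php
// Hypothetical helper (not from the original post): checks whether $needle,
// a UTF-8 string literal, occurs in $html, which is encoded as $htmlEncoding.
function containsUtf8Needle(string $html, string $needle, string $htmlEncoding = 'GBK'): bool
{
    // Convert the page to UTF-8 so both strings use the same encoding.
    $utf8Html = mb_convert_encoding($html, 'UTF-8', $htmlEncoding);

    // mb_stripos is encoding-aware; both arguments are now UTF-8.
    return mb_stripos($utf8Html, $needle, 0, 'UTF-8') !== false;
}

// Simulate a fetched GBK page (assumed sample content, stands in for curl_exec output).
$htmlcode = mb_convert_encoding('<p>此宝贝已下架</p>', 'GBK', 'UTF-8');

// Comparing UTF-8 bytes against GBK bytes finds nothing.
var_dump(strpos($htmlcode, '已下架') !== false);   // bool(false)

// After converting the page to UTF-8, the match succeeds.
var_dump(containsUtf8Needle($htmlcode, '已下架')); // bool(true)
```

The other option works the same way in reverse: convert the needle once with mb_convert_encoding('已下架', 'GBK', 'UTF-8') and search the raw page with strpos. In real use, the encoding should be taken from the page's Content-Type header or meta charset rather than hard-coded.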

Upvotes: 2
