Reputation: 183
My script works fine, but today while checking the logs I found garbled text. After some analysis I understood that it has something to do with UTF-8: the files are parsed and the title is extracted, but instead of Russian words the result contains unknown symbols (Сериалы ТУТ! СериÐ).
I use:
$cont = "dasdas<title>Сериалы ТУТ! Сериалы онлайн sda</title>";
preg_match("'<title[^>]*?>(.*)</title>'siU", $cont, $match);
//$match[1] = Сериалы ТУТ! СериРsda
When I add the pattern modifier /u, nothing changes: the same garbled text appears.
Maybe it is something with PHP?
Upvotes: 0
Views: 154
Reputation: 89557
It is not a PHP or regex problem, but an HTML problem. To display the text correctly, you must add <meta charset="UTF-8"/>
in the head of your HTML code.
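A minimal sketch of what that could look like, assuming the extracted title is echoed into a page you generate yourself. The helper names extractTitle and renderPage are hypothetical and only for illustration; the script file itself is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: returns the contents of the first <title> element,
// or an empty string if none is found (or the input is not valid UTF-8).
function extractTitle(string $html): string
{
    if (preg_match('~<title[^>]*>(.*?)</title>~siu', $html, $m) === 1) {
        return $m[1];
    }
    return '';
}

// Hypothetical helper: wraps the title in a page that declares its encoding,
// so the browser decodes the bytes as UTF-8.
function renderPage(string $title): string
{
    $safe = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
    return "<!DOCTYPE html>\n"
        . "<html>\n<head>\n"
        . "<meta charset=\"UTF-8\"/>\n"
        . "<title>" . $safe . "</title>\n"
        . "</head>\n<body>\n"
        . "<p>" . $safe . "</p>\n"
        . "</body>\n</html>\n";
}

// Sample input taken from the question.
$cont = "dasdas<title>Сериалы ТУТ! Сериалы онлайн sda</title>";
echo renderPage(extractTitle($cont));
```

If the page is served by PHP, sending header('Content-Type: text/html; charset=UTF-8') before any output is another way to declare the same thing.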
As an aside: the U modifier is unnecessary here. It only inverts the greediness of the quantifiers, so you can write a lazy quantifier directly:
preg_match('~<title[^>]*>(.*?)</title>~si', $cont, $match);
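A quick way to check both points, as a sketch using the sample string from the question: the original pattern with U and the rewritten pattern with a lazy quantifier should capture the same bytes, and those bytes should still be valid UTF-8. The variable names $withU and $lazy are just for illustration.

```php
<?php
// Sample input taken from the question (file assumed to be saved as UTF-8).
$cont = "dasdas<title>Сериалы ТУТ! Сериалы онлайн sda</title>";

// Original pattern: U flips greediness, so (.*) behaves like (.*?).
preg_match("'<title[^>]*?>(.*)</title>'siU", $cont, $withU);

// Rewritten pattern: the lazy quantifier is written explicitly, no U needed.
preg_match('~<title[^>]*>(.*?)</title>~si', $cont, $lazy);

// Both patterns capture the same text.
var_dump($withU[1] === $lazy[1]);

// An empty pattern with /u only matches if the subject is valid UTF-8,
// so this checks that the captured bytes were not corrupted.
var_dump(preg_match('//u', $lazy[1]) === 1);
```

If both checks pass, the string leaves preg_match intact and the garbling happens where the text is displayed.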
Upvotes: 2