Reputation: 5024
I have sample text with alternating English and Cyrillic lines:
“No,” the old man said.” But we have .Haven’t we?”
Бале , -гуфт -Аммо мо бовар дорем . Дуруст”?
“Yes ,”the boy said . Can I offer you a beer on the Terrace and then we’ll take the stuff home .
Албатта . Мехоҳӣ, ки дар каҳвахона бароят оби ҷав бигирам? Баъд чизҳоро ба хона мебарем .
“Why not ?” the old man said . “ Between fishermen.”
Чаро не ?! гуфт пирамард .- Моҳигир моҳигириро метавонад даъват кунад.
How can I turn this text into an array like this:
$englishCyrillic = [
"No, the old man said. But we have .Haven’t we?" => "Бале , -гуфт -Аммо мо бовар дорем . Дуруст?",
"Yes ,the boy said . Can I offer you a beer on the Terrace and then we’ll take the stuff home." => "Албатта . Мехоҳӣ, ки дар каҳвахона бароят оби ҷав бигирам? Баъд чизҳоро ба хона мебарем.",
"Why not ? the old man said . Between fishermen." => "Чаро не ?! гуфт пирамард .- Моҳигир моҳигириро метавонад даъват кунад.",
];
I also have text where each line holds a Cyrillic sentence followed by its English translation:
Куҷо дард мекунад? Show me where it hurts?
Нафас гиред / Нафас нагиред. Breath / Do not breath
Чуқуртар нафас гиред Breathe deeply
How can I get this result from that text:
$cyrillicEnglish = [
"Куҷо дард мекунад?" => "Show me where it hurts?",
"Нафас гиред / Нафас нагиред." => "Breath / Do not breath",
"Чуқуртар нафас гиред" => "Breathe deeply",
];
I tried regex, but my code cannot split the text by sentence and return the result I need:
Search for English words:
preg_match_all('/[\p{Latin}]+/u', $text, $matches);
Search for Cyrillic words:
preg_match_all('/[\p{Cyrillic}]+/u', $text, $matches);
Upvotes: 1
Views: 75
Reputation: 627082
The strings in the first format can be read line by line: use the odd-numbered lines as the English keys and the even-numbered lines as the Cyrillic values. No regex is required.
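A minimal sketch of that line-by-line pairing, assuming the text really does alternate strictly between one English line and one Cyrillic line. The function name pairAlternatingLines is hypothetical, and the sketch skips blank lines and ignores a trailing line that has no partner:

```php
<?php
// Hypothetical helper: pairs each odd-numbered line (English) with the
// even-numbered line that follows it (Cyrillic).
function pairAlternatingLines(string $text): array
{
    // Split on any line ending, trim each line, and drop blank lines.
    $lines = preg_split('/\R/u', $text);
    $lines = array_values(array_filter(array_map('trim', $lines), 'strlen'));

    $pairs = [];
    // Step two lines at a time; a trailing unpaired line is ignored.
    for ($i = 0; $i + 1 < count($lines); $i += 2) {
        $pairs[$lines[$i]] = $lines[$i + 1];
    }
    return $pairs;
}

$text = "Why not? the old man said.\nЧаро не? гуфт пирамард.\nYes, the boy said.\nАлбатта.";
print_r(pairAlternatingLines($text));
```

This keeps the lines as they are; stripping the quotation marks shown in the expected array would be a separate step.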
For the second format, you can use preg_match_all, so that each capture group yields an array with one entry per line:
preg_match_all('~(.*\p{Cyrillic}\S*)\h+(.+)~u', $s, $matches)
and then create the array:
array_combine($matches[1], $matches[2])
See the second regex demo
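Put together, the second approach might look like the sketch below. It assumes preg_match_all is used so that $matches[1] and $matches[2] are arrays that array_combine can accept; the function name pairCyrillicEnglish is hypothetical, and the sample text is taken from the question:

```php
<?php
// Hypothetical helper: splits each line after the last word that contains
// a Cyrillic letter and maps the Cyrillic part to the English part.
function pairCyrillicEnglish(string $text): array
{
    // Group 1: everything up to the last Cyrillic letter plus any
    // non-space characters attached to it (e.g. "?" or ".").
    // Group 2: the rest of the line after the horizontal whitespace.
    if (!preg_match_all('~(.*\p{Cyrillic}\S*)\h+(.+)~u', $text, $matches)) {
        return [];
    }
    return array_combine($matches[1], $matches[2]);
}

$text = <<<TXT
Куҷо дард мекунад? Show me where it hurts?
Нафас гиред / Нафас нагиред. Breath / Do not breath
Чуқуртар нафас гиред Breathe deeply
TXT;

print_r(pairCyrillicEnglish($text));
```

Because . does not match a newline by default, each match stays within one line, so the whole text can be passed in a single call.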
Upvotes: 1