sam

Reputation: 1087

preg_match all paragraphs in a string

The following string contains multiple <p> tags. I want to match the contents of each <p> against a pattern and, if it matches, add a CSS class to that specific paragraph.

For example, in the following string only the second paragraph's content matches, so I want to add a class to that paragraph only.

$string = '<p>para 1</p><p>نص عربي أو فارسي</p><p>para3</p>';

With the following code I can test the string as a whole, but I can't figure out how to find the specific paragraph that matched.

$rtl_chars_pattern = '/[\x{0590}-\x{05ff}\x{0600}-\x{06ff}]/u';
$return = preg_match($rtl_chars_pattern, $string);

Upvotes: 1

Views: 1276

Answers (2)

Jan

Reputation: 43169

Use a combination of SimpleXML, XPath and regular expressions. The regex has to run in PHP rather than inside the query, because regex matching on text() and the like is only supported as of XPath 2.0.
The steps:

  1. Load the DOM first
  2. Get all p tags via an XPath query
  3. If the text / node value matches your regex, apply a CSS class

This is the actual code:

<?php

$html = "<html><p>para 1</p><p>نص عربي أو فارسي</p><p>para3</p></html>";
$xml = simplexml_load_string($html);

# query the dom for all p tags
$ptags = $xml->xpath("//p");

# your regex
$regex = '~[\x{0590}-\x{05ff}\x{0600}-\x{06ff}]~u';

# alternatively:
# $regex = '~\p{Arabic}~u';

# loop over the tags; if the regex matches, add a class attribute
# (no reference needed: each SimpleXMLElement already points into the document)
foreach ($ptags as $p) {
    if (preg_match($regex, (string) $p)) {
        $p->addAttribute('class', 'some cool css class');
    }
}

# just to be sure the tags have been altered
echo $xml->asXML();

?>

See a demo on ideone.com. The advantage of this approach is that the regex only runs against the content of each p tag, not against the DOM structure in general.
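One caveat: simplexml_load_string() needs well-formed XML with a single root element, and (string) $p only returns the text sitting directly inside the p tag. For input that is a plain HTML fragment, the same three steps can be sketched with DOMDocument and DOMXPath instead. This is a rough sketch, not a drop-in replacement: the function name addClassToMatchingParagraphs and the class name rtl are made up for illustration, and the meta tag prefix is a common workaround so that loadHTML() reads the input as UTF-8.

```php
<?php

// Hypothetical helper: adds $class to every <p> whose text matches $regex.
function addClassToMatchingParagraphs(string $html, string $regex, string $class): string
{
    $dom = new DOMDocument();
    // Assumption: the input is UTF-8. The meta tag tells the HTML parser so.
    $prefix = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';
    $dom->loadHTML($prefix . $html, LIBXML_NOERROR);

    $xpath = new DOMXPath($dom);

    // Query all p tags and test the text of each one separately.
    foreach ($xpath->query('//p') as $p) {
        // textContent also includes text inside nested tags such as <b>.
        if (preg_match($regex, $p->textContent) === 1) {
            $existing = $p->getAttribute('class');
            $p->setAttribute('class', trim($existing . ' ' . $class));
        }
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $out = '';
    foreach ($xpath->query('//body/node()') as $node) {
        $out .= $dom->saveHTML($node);
    }
    return $out;
}

$html  = '<p>para 1</p><p>نص عربي أو فارسي</p><p>para3</p>';
$regex = '~[\x{0590}-\x{05ff}\x{0600}-\x{06ff}]~u';

echo addClassToMatchingParagraphs($html, $regex, 'rtl'), PHP_EOL;
```

Depending on the libxml version, saveHTML() may emit the non-ASCII text as character entities, so check the output encoding before relying on it.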

Upvotes: 2

ʰᵈˑ

Reputation: 11375

https://regex101.com/r/nE5pT1/1

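$str = "<p>para 1</p><p>نص عربي أو فارسي</p><p>para3</p>";
// The lookahead checks for an RTL character anywhere in the paragraph text without consuming it,
// so no character is lost; with no limit argument, every matching <p> gets the class.
$result = preg_replace('/<p>(?=[^<]*[\x{0590}-\x{05ff}\x{0600}-\x{06ff}])/u', '<p class="foo">', $str);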
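Since the question is about testing each paragraph on its own, a callback-based variant may be easier to adapt. The following is only a sketch under some assumptions: it handles <p> tags without attributes and without nested <p> elements, and the class name foo is just the placeholder used above. preg_replace_callback() hands each paragraph to a function, which runs the RTL pattern against that paragraph's contents alone.

```php
<?php

$rtl = '/[\x{0590}-\x{05ff}\x{0600}-\x{06ff}]/u';
$str = '<p>para 1</p><p>نص عربي أو فارسي</p><p>para3</p>';

// Capture the contents of each attribute-less <p>...</p> and test them separately.
$result = preg_replace_callback(
    '~<p>(.*?)</p>~su',
    function (array $m) use ($rtl): string {
        // $m[0] is the whole paragraph, $m[1] is its contents.
        return preg_match($rtl, $m[1]) === 1
            ? '<p class="foo">' . $m[1] . '</p>'
            : $m[0];
    },
    $str
);

echo $result, PHP_EOL;
// <p>para 1</p><p class="foo">نص عربي أو فارسي</p><p>para3</p>
```

For markup that is more complex than this, a DOM parser is usually the safer choice than a regex over the tags.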

Upvotes: 3
