Reputation: 13
I got this function from php.net to convert mixed-case text to sentence case.
function sentence_case($string) {
    $sentences = preg_split('/([.?!]+)/', $string, -1, PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
    $new_string = '';
    foreach ($sentences as $key => $sentence) {
        $new_string .= ($key & 1) == 0
            ? ucfirst(strtolower(trim($sentence)))
            : $sentence . ' ';
    }
    return trim($new_string);
}
It works well when the sentence is not inside a paragraph. But when the sentence is inside a paragraph, the first letter after an opening paragraph (<p>) or break (<br>) HTML tag becomes lowercase.
This is the sample:
Before:
<p>Lorem IPSUM is simply dummy text. LOREM ipsum is simply dummy text! wHAt is LOREM IPSUM? Hello lorem ipSUM!</p>
Output:
<p>lorem ipsum is simply dummy text. Lorem ipsum is simply dummy text! What is lorem ipsum? Hello lorem ipsum!</p>
Can someone help me make the first letter in the paragraph a capital letter?
Upvotes: 1
Views: 1247
Reputation: 47900
When parsing valid HTML, it is best practice to use a legitimate DOM parser. Regex is not reliable here because it cannot tell the difference between a tag and a substring that merely resembles a tag.
Code:
$html = <<<HTML
<p>Lorem IPSUM is simply dummy text.<br>Here is dummy text. LOREM ipsum is simply dummy text! wHAt is LOREM IPSUM? Hello lorem ipSUM!</p>
HTML;
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//text()') as $textNode) {
    $textNode->nodeValue = preg_replace_callback(
        '/(?:^|[.!?]) *\K[a-z]+/',
        function($m) {
            return ucfirst($m[0]);
        },
        strtolower($textNode->nodeValue)
    );
}
echo $dom->saveHTML();
Output:
<p>Lorem ipsum is simply dummy text.<br>Here is dummy text. Lorem ipsum is simply dummy text! What is lorem ipsum? Hello lorem ipsum!</p>
The above snippet does not:
Upvotes: 0
Reputation: 605
Try this:
function html_ucfirst($s) {
    return preg_replace_callback('#^((<(.+?)>)*)(.*?)$#', function ($c) {
        return $c[1].ucfirst(array_pop($c));
    }, $s);
}
Then call the function:
$string= "<p>Lorem IPSUM is simply dummy text. LOREM ipsum is simply dummy text! wHAt is LOREM IPSUM? Hello lorem ipSUM!</p>";
echo html_ucfirst($string);
Here is a working demo: https://ideone.com/fNq3Vo
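One thing worth checking before relying on html_ucfirst: as far as I can tell, it capitalizes only the first character after the leading tags and leaves the rest of the text as it was. The sketch below repeats the function from this answer and runs it on a made-up input ($sample and $result are just illustrative names):

```php
<?php
// Same function as in the answer above, repeated so the snippet runs on its own.
function html_ucfirst($s) {
    return preg_replace_callback('#^((<(.+?)>)*)(.*?)$#', function ($c) {
        // $c[1] holds the leading tags; the last group holds the remaining text.
        return $c[1] . ucfirst(array_pop($c));
    }, $s);
}

// Hypothetical sample: lowercase first letter, mixed case elsewhere.
$sample = '<p>lorem IPSUM is simply dummy text. LOREM ipsum!</p>';
$result = html_ucfirst($sample);

// Only the "l" right after <p> changes; nothing is lowercased.
echo $result, "\n";
```

So it fixes the lowercase letter after the opening tag, but it would still need to be combined with something like sentence_case to normalize the other sentences.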
Upvotes: 0
Reputation: 57408
Your problem is that you're treating the HTML as part of the sentence, so the first "word" of the sentence is <p>lorem, not Lorem. ucfirst() then acts on the "<" character and the letter after the tag stays lowercase.
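A quick way to see this, as a minimal sketch (the variable names are made up, and strip_tags is used here only to illustrate the point, not as a proposed fix):

```php
<?php
// A sentence fragment as sentence_case() would see it after strtolower().
$fragment = '<p>lorem ipsum is simply dummy text';

// The first character is "<", so ucfirst() has nothing to capitalize.
$with_tag = ucfirst($fragment);

// With the tag removed, "l" is the first character and gets capitalized.
$without_tag = ucfirst(strip_tags($fragment));

echo $with_tag, "\n";
echo $without_tag, "\n";
```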
You can change the regexp to read /([>.?!]+)/, but this way you'll see extra spaces before "Lorem", as the system now sees two sentences and not one.
Also, Hello <em>there</em> will now be split into four pieces instead of being treated as one sentence.
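To see how the modified pattern fragments inline markup, here is a small check using the same flags as the question's function ($flags and $pieces are illustrative names):

```php
<?php
// Same flags as in sentence_case(), but with ">" added to the delimiter class.
$flags = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE;
$pieces = preg_split('/([>.?!]+)/', 'Hello <em>there</em>', -1, $flags);

// Each ">" is now a delimiter, so the tags get cut in half.
print_r($pieces);
```

The result has four elements: "Hello <em", ">", "there</em" and ">". The loop in sentence_case would then run ucfirst(strtolower(trim(...))) on the first and third, which is not what you want.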
This looks disturbingly like a case of "How can I use regexp to interpret (X)HTML"?
Upvotes: 0
Reputation: 6573
You can do it easily with CSS:
p::first-letter {
    text-transform: uppercase;
}
Upvotes: -1