Tunku Salim

Reputation: 167

Remove double space and space after line break from String

First, I have this input:

$string = "Lorem ipsum 
dolor sit amet, consectetur adipiscing 
elit https://www.youtube.com/watch?v=example sed do eiusmod tempor incididunt https://www.youtube.com/watch?v=example2 https://www.youtube.com/watch?v=example3";

Then I want to remove the URLs from $string using a regex:

$string = preg_replace('/[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)/', '', $string);

After all the URLs are removed from the string, the output is:

Lorem ipsum 
dolor sit amet, consectetur adipiscing 
 elit  sed do eiusmod tempor incididunt  

The problem is that there are double spaces, and I want to make the text neater.

I've tried this, which replaces every run of spaces with a single space:

$string = preg_replace('/\x20+/', ' ', $string);

But that leads to another problem: there is still a space after the line break:

Lorem ipsum 
dolor sit amet, consectetur adipiscing 
 elit sed do eiusmod tempor incididunt

and that bothers me.

I need a solution that gets rid of the URLs and also keeps the text neat. The final result I want looks like this:

Lorem ipsum 
dolor sit amet, consectetur adipiscing
elit sed do eiusmod tempor incididunt

Sorry if it looks weird, thanks.

Upvotes: 5

Views: 212

Answers (2)

magrigry

Reputation: 427

I would use these regexes:

$string = "Lorem ipsum 
dolor sit amet, consectetur adipiscing 
elit https://www.youtube.com/watch?v=example sed do eiusmod tempor incididunt https://www.youtube.com/watch?v=example2 https://www.youtube.com/watch?v=example3";

$string = preg_replace('/[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)([ ]*)?/', '', $string);
$string = preg_replace('/(([ ]*)?(\r\n|\n)([ ]*)?)/', "\r\n", $string); # Remove any potential spaces before and after each line break

echo $string;

Output

Lorem ipsum
dolor sit amet, consectetur adipiscing
elit sed do eiusmod tempor incididunt 

Note: I just added ([ ]*)? to the regex that matches URLs, so that it also matches the spaces after each URL.
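A possible variation, offered as a sketch rather than a drop-in replacement: the second pattern above rewrites every line break as "\r\n", even when the input used "\n". Capturing the line break and putting it back keeps the original line endings, and a final trim() removes the space left at the very end of the string. The URL pattern below is a simplified stand-in (an assumption: it only matches http/https links), and the function name stripUrlsKeepNewlines is made up for illustration.

```php
<?php

// Hypothetical helper, not part of the answer above.
function stripUrlsKeepNewlines(string $text): string
{
    // Simplified URL pattern (assumption): http(s):// followed by non-whitespace,
    // plus any spaces directly after the URL.
    $text = preg_replace('~https?://\S+[ ]*~', '', $text);

    // Drop spaces around each line break, but keep the break itself ($1).
    $text = preg_replace('/[ ]*(\r?\n)[ ]*/', '$1', $text);

    // Remove whitespace left at the start or end of the whole string.
    return trim($text);
}

$string = "Lorem ipsum \ndolor sit amet, consectetur adipiscing \nelit https://www.youtube.com/watch?v=example sed do eiusmod tempor incididunt https://www.youtube.com/watch?v=example2 https://www.youtube.com/watch?v=example3";

echo stripUrlsKeepNewlines($string);
```

With the sample input this should print the three lines without leading or trailing spaces, and an input that uses "\r\n" keeps its "\r\n" line breaks.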

Upvotes: 1

0stone0

Reputation: 43983

Use preg_replace() to remove all the URLs.

Use trim() to remove any leftover leading or trailing whitespace.

Use preg_replace() again to collapse any double spaces into a single space.

Then remove any space left at the beginning of a line by replacing it with nothing.

<?php

    $r = '/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i';
    $string = "Lorem ipsum
    dolor sit amet, consectetur adipiscing
    elit https://www.youtube.com/watch?v=example sed do eiusmod tempor incididunt https://www.youtube.com/watch?v=example2 https://www.youtube.com/watch?v=example3";

    // Remove URLs (replace each one with a space)
    $clean = preg_replace($r, ' ', $string);

    // Trim leading and trailing whitespace
    $clean = trim($clean);

    // Collapse runs of horizontal whitespace into a single space
    $clean = preg_replace( '/\h+/', ' ', $clean);

    // Remove a space left at the start of any line
    $clean = preg_replace('/^ /m', '', $clean);

    // Show result
    echo $clean;

Output:

Lorem ipsum
dolor sit amet, consectetur adipiscing
elit sed do eiusmod tempor incididunt


Note: This could be simplified a lot by combining some calls; I chose not to, so that the steps are clearer.
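As a rough sketch of how the calls might be combined (this is not code from the answer itself): preg_replace() accepts arrays of patterns and replacements and applies them in order, so the steps can be folded into a single call. The URL pattern is the same one used above; the helper name cleanText is hypothetical. The last pattern also strips a space before a line break, which the step-by-step version does not do, and it assumes "\n" line endings.

```php
<?php

// Hypothetical helper that folds the steps above into one preg_replace() call.
function cleanText(string $text): string
{
    $patterns = [
        // 1. URLs (same pattern as in the answer)
        '/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i',
        // 2. Runs of horizontal whitespace
        '/\h+/',
        // 3. A single space left at the start or end of a line
        '/^ | $/m',
    ];
    $replacements = [' ', ' ', ''];

    // Patterns are applied one after another, each on the result of the previous one.
    return trim(preg_replace($patterns, $replacements, $text));
}

echo cleanText("Lorem ipsum \n elit https://www.youtube.com/watch?v=example sed do");
```

The order of the array matters here: the URLs have to be replaced first so that the spaces they leave behind are collapsed by the second pattern.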

Upvotes: 2
