aherlambang

Reputation: 14418

Removing all hashtags and the words following them from a string

I have the following regex to remove hashtags:

preg_replace('/#([\w-]+)/i', '$1', $string);

and say I have the following string:

Top idr 160\nDisc 5rbu\/pcs pembelian di atas 4pcs \n#onlineshop#lalashop88#jualanku#jualansis#olshop#baju#dress#import#bkk#bkkfashion#bangkok#celana#hotpants#goodquality#jumpsuit#bustier#pants#clothes#indoshop#indonesiashop#jualansis#medan#medanshop#trusted#trustedolshop#trustedshop#goorder#gofollow

How do I remove the hashtags so that I end up with this string:

Top idr 160\nDisc 5rbu\/pcs pembelian di atas 4pcs \n

Notice that the hashtags aren't separated by spaces, but the regex should also work if they were.

Here's another test case:

Top idr 160\nDisc #testing 5rbu\/pcs pembelian di atas 4pcs

It should be transformed into:

Top idr 160\nDisc 5rbu\/pcs pembelian di atas 4pcs

Upvotes: 0

Views: 921

Answers (4)

rvalvik

Reputation: 1559

You can try /#.+?\b/, which matches # followed by one or more characters (as few as possible) and stops at the first word boundary. Depending on which characters are allowed in the hashtags you are stripping, that might be enough.

preg_replace('/#.+?\b/', '', $string);

If the hashtags contain things like periods or dashes, you might need something like /#[\w\-.]+/, where \w\-. is the set of allowed hashtag characters: \w covers A-Za-z0-9 and _, and . and - are added explicitly. As pointed out in the comment below, escaping - inside a character class ensures it is treated as a literal character rather than as a range operator.

preg_replace('/#[\w\-.]+/', '', $string);

That said, if you don't understand regex, solving it with string manipulation might be a better option, so that you fully understand your code.
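To check the character-class pattern against both test strings from the question, here is a minimal sketch. The helper name stripHashtags and the shortened tag list are made up for illustration, and the optional trailing space in the pattern is my own assumption (it is not part of the pattern above); it is there so that a tag in the middle of a sentence does not leave a double space behind.

```php
<?php
// Hypothetical helper; the pattern is /#[\w\-.]+/ from above plus an
// optional trailing space (an assumption added for this sketch).
function stripHashtags(string $text): string
{
    // preg_replace returns null on a regex error; fall back to the input then.
    return preg_replace('/#[\w\-.]+ ?/', '', $text) ?? $text;
}

// Single-quoted strings keep \n as a literal backslash followed by n,
// which mirrors how the strings appear in the question.
$tagsAtEnd = 'Top idr 160\nDisc 5rbu\/pcs pembelian di atas 4pcs \n#onlineshop#lalashop88#gofollow';
$tagInside = 'Top idr 160\nDisc #testing 5rbu\/pcs pembelian di atas 4pcs';

echo stripHashtags($tagsAtEnd), PHP_EOL;
echo stripHashtags($tagInside), PHP_EOL;
```

Both calls should print the expected strings from the question. If a tag sits at the very end after a space, that space is left in place; wrap the result in rtrim() if that matters.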

Upvotes: 2

Woodham

Reputation: 4273

If you just want to remove everything from the first '#' to the end of the line, then

/#.*/i

should work.
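A quick sketch of what this pattern does to the two kinds of input from the question (the strings are shortened and the variable names are just for illustration). It handles tags at the end, but for a tag in the middle of the text it also removes whatever follows the tag:

```php
<?php
$pattern = '/#.*/i';

// Single-quoted, so \n stays a literal backslash followed by n.
$tagsAtEnd = 'Disc 5rbu\/pcs \n#onlineshop#lalashop88';
$tagInside = 'Disc #testing 5rbu\/pcs';

// Everything from the first # onward is removed.
$cleanedEnd = preg_replace($pattern, '', $tagsAtEnd);
// The text after the tag is removed as well, not just the tag.
$cleanedInside = preg_replace($pattern, '', $tagInside);

var_dump($cleanedEnd, $cleanedInside);
```

So this is only a fit when the hashtags are always the last thing in the string.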

Upvotes: 0

Latheesan

Reputation: 24116

You can do it without regex using a simple function like this:

function getCleanString($sourceStr, $delimiter = '#') {
    // Split at the first delimiter only and keep the part before it.
    // If the delimiter is absent, the whole string comes back unchanged.
    $sourceStrArr = explode($delimiter, $sourceStr, 2);
    return $sourceStrArr[0];
}

Usage:

$sourceStr = 'Top idr 160\nDisc 5rbu\/pcs pembelian di atas 4pcs \n#onlineshop#lalashop88#jualanku#jualansis#olshop#baju#dress#import#bkk#bkkfashion#bangkok#celana#hotpants#goodquality#jumpsuit#bustier#pants#clothes#indoshop#indonesiashop#jualansis#medan#medanshop#trusted#trustedolshop#trustedshop#goorder#gofollow';

var_dump(getCleanString($sourceStr));

This outputs the part of the string before the first #, that is, the text with all the trailing hashtags removed.

Upvotes: 2

fntneves

Reputation: 414

Try this:

preg_replace("/#(.*)$/i", "", $input_lines);

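This replaces everything from the first # to the end of the string with an empty string. Because . does not match a line break, it only works when the tags are separated by spaces, not by carriage returns or newlines.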
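To illustrate the line-break limitation, here is a small sketch with made-up sample strings. The last call uses the s modifier, which makes . match newlines as well; that variant is my own addition, not part of the pattern above.

```php
<?php
$sameLine  = "price 10 #shop #dress";
$multiLine = "price 10\n#shop\n#dress"; // double quotes: real newlines

// Tags on one line: everything from the first # is removed.
$a = preg_replace('/#(.*)$/i', '', $sameLine);

// Tags on separate lines: only the tag on the last line is removed,
// because . stops at a newline and $ anchors at the end of the string.
$b = preg_replace('/#(.*)$/i', '', $multiLine);

// Assumed variant with the s modifier: . also matches newlines,
// so everything from the first # to the end is removed.
$c = preg_replace('/#.*$/s', '', $multiLine);

var_dump($a, $b, $c);
```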

Upvotes: 1
