user321627

Reputation: 2572

How to match a string in PHP which contains an HTTPS URL with a fixed number of alphanumeric characters?

I have a string that looks like:

"res":"https://my.site.com/image/I/fj23l6j2lgk_AM1200_.jpg"

My bash regex is (with the string above stored in $str):

echo $str | grep -oE "\"res\":\"https://my.site.com/image/I/[[:alnum:]]{11}._[a-zA-Z0-9_]*_.jpg\"" \
| grep -oE "my.site.com/image/I/[[:alnum:]]{11}._[a-zA-Z0-9_]*_.jpg" | head -1

which cleanly extracts out https://my.site.com/image/I/fj23l6j2lgk_AM1200_.jpg.

In PHP, I am unsure whether there is an equivalent to what I have above. Does anyone have any suggestions?

Upvotes: 2

Views: 56

Answers (3)

mickmackusa

Reputation: 47992

It seems to me that you wish to validate that the qualifying URL is wrapped in double quotes and preceded by "res":, and then extract only the URL.

A lookbehind at the start and a lookahead at the end will validate the exact full string.

Dots must be escaped to be treated as literal characters; an unescaped dot matches any character.

You had an extra dot before your underscore that I don't believe you want to keep.

You don't need to escape forward slashes if you use non-slash characters as pattern delimiters (I'll use ~).

[a-zA-Z0-9_] is more concisely written as \w.

Code: (Demo)

$string = '"res":"https://my.site.com/image/I/fj23l6j2lgk_AM1200_.jpg"';

echo preg_match('~(?<=^"res":")https://my\.site\.com/image/I/[a-zA-Z\d]{11}_\w*_\.jpg(?="$)~', $string, $out) ? $out[0] : 'no match';

Output:

https://my.site.com/image/I/fj23l6j2lgk_AM1200_.jpg
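
To illustrate the point about escaping dots, here is a minimal sketch. The look-alike string in $bad and the variable names are made up for the demonstration; the patterns are simplified, anchored versions of the one above without the lookarounds.

```php
<?php
// Hypothetical inputs: one genuine URL and one look-alike where every
// dot has been replaced by another character.
$good = 'https://my.site.com/image/I/fj23l6j2lgk_AM1200_.jpg';
$bad  = 'https://myXsiteYcom/image/I/fj23l6j2lgk_AM1200_Zjpg';

// Same pattern twice: once with bare dots, once with escaped dots.
$unescaped = '~^https://my.site.com/image/I/[a-zA-Z\d]{11}_\w*_.jpg$~';
$escaped   = '~^https://my\.site\.com/image/I/[a-zA-Z\d]{11}_\w*_\.jpg$~';

// preg_match() returns 1 on a match and 0 on no match.
$results = [
    'unescaped/good' => preg_match($unescaped, $good),
    'unescaped/bad'  => preg_match($unescaped, $bad),
    'escaped/good'   => preg_match($escaped, $good),
    'escaped/bad'    => preg_match($escaped, $bad),
];
var_dump($results);
```

Only the escaped pattern rejects the look-alike string; the unescaped one accepts both, because each bare dot matches any single character.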

Upvotes: 2

Yassine CHABLI

Reputation: 3734

Using PHP, you can extract it with:

$subject = '"res":"https://my.site.com/image/I/fj23l6j2lgk_AM1200_.jpg"';
$regex = '/https:\/\/my\.site\.com\/image\/I\/[[:alnum:]]{11}_[a-zA-Z0-9_]*_\.jpg/';
preg_match($regex , $subject , $matches);

var_dump($matches);

The output:

array(1) {
  [0]=>
  string(51) "https://my.site.com/image/I/fj23l6j2lgk_AM1200_.jpg"
}
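
If the string sits inside a larger blob with several matches, the same approach still works. The sketch below assumes a made-up input ($blob, with a second invented image ID) and a pattern with the final dot escaped; preg_match() returns only the first match, which is roughly what piping grep -o into head -1 does, while preg_match_all() collects all of them.

```php
<?php
// Hypothetical input: a longer blob containing two "res" entries.
$blob = '{"res":"https://my.site.com/image/I/fj23l6j2lgk_AM1200_.jpg",'
      . '"res":"https://my.site.com/image/I/ab12cd34ef5_AM800_.jpg"}';

$regex = '/https:\/\/my\.site\.com\/image\/I\/[[:alnum:]]{11}_[a-zA-Z0-9_]*_\.jpg/';

// preg_match() stops at the first match.
$first = preg_match($regex, $blob, $m) === 1 ? $m[0] : null;

// preg_match_all() returns the number of matches and fills $all[0] with them.
$count = preg_match_all($regex, $blob, $all);

var_dump($first, $count, $all[0]);
```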

Upvotes: 1

totok

Reputation: 1500

You just have to escape all the / and . in your regex, and it's fine. I also removed a dot near the end.

my\.site\.com\/image\/I\/[[:alnum:]]{11}_[a-zA-Z0-9_]*_\.jpg

Try it here.
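
As a rough sketch of how that pattern could be used from PHP (the variable names are only for illustration): preg_match() needs delimiters around the pattern, and with / as the delimiter the escaped slashes are required.

```php
<?php
$str = '"res":"https://my.site.com/image/I/fj23l6j2lgk_AM1200_.jpg"';

// The pattern from above, wrapped in / delimiters (hence the escaped slashes).
$pattern = '/my\.site\.com\/image\/I\/[[:alnum:]]{11}_[a-zA-Z0-9_]*_\.jpg/';

// $m[0] holds the full match when preg_match() returns 1.
$found = preg_match($pattern, $str, $m) === 1 ? $m[0] : null;
echo $found, PHP_EOL;
```

Note that this pattern starts at my.site.com, so the match does not include the https:// prefix. Prepend https:\/\/ to the pattern if you want the full URL.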

Upvotes: 1
