SexualPotatoes

Reputation: 78

Finding and replacing strings with certain character patterns in array

I have a fairly large database in which some data is stored in the format below, mixed in with other words in the keywords column.

BA 093, RJ 342, ES 324, etc.

The characters vary, but the structure is always the same. I would like to change every string that follows this structure (2 characters A-Z, a space, 3 characters 0-9) to the following:

BA-093, RJ-342, ES-324, etc.

Be mindful that these strings are mixed up with a bunch of other strings, so I need to isolate them before replacing the empty space. Here is a sample string:

Km 111 aracoiaba Araçoiaba sp 270 spvias vias sao paulo Araçoiaba Bidirecional

sp 270 is the bit we want to change.

EDIT: There is also an exception: strings whose first two characters are KM should be left unchanged. One of the answers handles this.

I have written the beginning of a script that pulls all the data and prints it in the browser while I look for a solution, but I'm unsure what to do in my if statement to isolate the strings and replace them. And since I'm using explode, each of the strings above is probably being split into two separate array elements, which complicates things further.

<?php

require 'includes/connect.php';

$pullkeywords = $db->query("SELECT keywords FROM main");

while ($result = $pullkeywords->fetch_object()) {

    $separatekeywords = explode(" ", $result->keywords);
    print_r ($separatekeywords);
    echo "<br />";
}

Any help is appreciated. Thank you in advance.

Upvotes: 1

Views: 55

Answers (1)

chris85

Reputation: 23892

This regex should do it.

([A-Z]{2})\h(\d{3})

That matches two characters in the range A-Z ({2}), then one horizontal whitespace character (\h), then three digits (\d{3}). The ( and ) capture the parts you want to keep, so $1 holds the two letters and $2 holds the three digits.

Regex101 Demo: https://regex101.com/r/nU2yN0/1

PHP Usage:

$string = 'BA 093, RJ 342, ES 324';
echo preg_replace('~([A-Z]{2})\h(\d{3})~', '$1-$2', $string);

Output:

BA-093, RJ-342, ES-324
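
One thing to watch: [A-Z] is case sensitive, and the sample string in the question uses lowercase (sp 270). A minimal sketch of the difference, assuming you want lowercase codes rewritten too; the i modifier is my addition and is not part of the regex above:

```php
<?php
// Sample string taken from the question.
$string = 'Km 111 aracoiaba Araçoiaba sp 270 spvias vias sao paulo Araçoiaba Bidirecional';

// Case sensitive: neither "Km" nor "sp" is two capital letters, so nothing changes.
$caseSensitive = preg_replace('~([A-Z]{2})\h(\d{3})~', '$1-$2', $string);

// Case insensitive (i modifier): both "Km 111" and "sp 270" are rewritten.
$caseInsensitive = preg_replace('~([A-Z]{2})\h(\d{3})~i', '$1-$2', $string);

echo $caseSensitive, "\n";
// Km 111 aracoiaba Araçoiaba sp 270 spvias vias sao paulo Araçoiaba Bidirecional
echo $caseInsensitive, "\n";
// Km-111 aracoiaba Araçoiaba sp-270 spvias vias sao paulo Araçoiaba Bidirecional
```

Because the regex runs on the whole string, there is no need to explode the keywords first.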

You may want (?:^|\h)([A-Z]{2})\h(\d{3}) instead, which requires that no other text runs directly into the capital letters. Take AB 345, cattleBE 123, BE 678 as an example: with this regex cattleBE 123 wouldn't be matched. I'm not sure what you intend for a case like that, so I'll leave the choice to you.

The ?: makes the () non-capturing there. The ^ lets the capital letters sit at the start of the string. The | means "or", and the \h is another horizontal whitespace character. You could use \s in place of \h if you wanted to allow new lines as well.
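
A caveat with this variant: the leading \h is still part of the match even though it isn't captured, so replacing with '$1-$2' removes the space in front of the code. A small sketch of one way around that, assuming you want the spacing preserved (capturing the prefix and renumbering the groups is my adjustment, not part of the regex above):

```php
<?php
$string = 'AB 345, cattleBE 123, BE 678';

// The non-capturing prefix is consumed by the match, so the space before "BE 678" is lost.
$dropped = preg_replace('~(?:^|\h)([A-Z]{2})\h(\d{3})~', '$1-$2', $string);

// Capturing the prefix as $1 and writing it back keeps the original spacing.
$kept = preg_replace('~(^|\h)([A-Z]{2})\h(\d{3})~', '$1$2-$3', $string);

echo $dropped, "\n"; // AB-345, cattleBE 123,BE-678
echo $kept, "\n";    // AB-345, cattleBE 123, BE-678
```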

Update:

(?!KM)([A-Z]{2})\h(\d{3})

The negative lookahead (?!KM) skips any match whose two letters are KM, so those strings are left unchanged. https://regex101.com/r/nU2yN0/2
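
Here is a rough sketch of applying the KM version to each keywords value as a whole string. The function name hyphenateCodes and the $rows array are hypothetical stand-ins for your own loop, and the i modifier is an assumption so that the lowercase sp 270 is changed while Km 111 is still skipped:

```php
<?php
// Hypothetical helper: the pattern from the update, plus the i modifier (an assumption).
function hyphenateCodes(string $keywords): string
{
    // preg_replace returns null on error; fall back to the untouched input in that case.
    return preg_replace('~(?!KM)([A-Z]{2})\h(\d{3})~i', '$1-$2', $keywords) ?? $keywords;
}

// Stand-ins for values fetched from the keywords column.
$rows = [
    'Km 111 aracoiaba Araçoiaba sp 270 spvias vias sao paulo Araçoiaba Bidirecional',
    'BA 093, RJ 342, ES 324',
];

foreach ($rows as $keywords) {
    echo hyphenateCodes($keywords), "\n";
}
// Km 111 aracoiaba Araçoiaba sp-270 spvias vias sao paulo Araçoiaba Bidirecional
// BA-093, RJ-342, ES-324
```

With the i modifier the lookahead is case insensitive too, which is why Km is excluded as well as KM.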

Upvotes: 3
