Sven Jens

Reputation: 25

Correct way of handling multiple housenumber options possible regex

I'm looking for help refactoring this messy piece of code. It should handle the following house number formats

    ?.toUpperCase()
    ?.match(/[^a-z]+|[a-z]|[a-z]/gi)
    ?.map((part) => `${part.trim()} `) // Append a space to each part
    .toString() // Revert to one string
    .replace(/,/g, "") // Remove all commas
    .replace(/\s+/g, " ") // Collapse duplicate spaces
    .trim(); // Remove leading and trailing spaces

Upvotes: 1

Views: 76

Answers (2)

Wiktor Stribiżew

Reputation: 626754

You can use

function foo(input) {
  return !/[a-z]/i.test(input) ? input : input
    ?.toUpperCase()
    ?.match(/[^a-z\s]+|[a-z]/gi)
    .join("") // Revert to one string
    .replace(/(?<=\D)|(?=\D)/g, ' ') // Add spaces before/after a non-digit
    .trim(); // Drop the space added at either end when the string starts or ends with a letter
}

console.log(foo("54 6 K 1"));
console.log(foo("54 6k 1"));
console.log(foo("20 H L"));
console.log(foo("1 10"));
console.log(foo("546 k1"));
console.log(foo("1K"));

First of all, test whether your string contains a letter with !/[a-z]/i.test(input). If it does not, return it as is; there is no need to process it (the 1 10 case). If there is a letter, perform the following:

  • ?.toUpperCase() - turn the string into upper case
  • ?.match(/[^a-z\s]+|[a-z]/gi) - tokenize the string into single letters and runs of characters that are neither letters nor whitespace
  • .join("") - convert an array to string
  • .replace(/(?<=\D)|(?=\D)/g, ' ') - add spaces between digit/non-digit chars.
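To make the steps above easier to follow, here is a small sketch that keeps every intermediate value. The helper name trace is hypothetical and only used for illustration, and it assumes the input is a non-empty string that contains at least one letter.

```javascript
// Hypothetical helper: runs the same steps as the list above,
// but returns each intermediate value instead of only the final string.
// Assumes `input` is a string containing at least one letter.
function trace(input) {
  const upper = input.toUpperCase();                      // "54 6K 1"
  const tokens = upper.match(/[^a-z\s]+|[a-z]/gi);        // ["54", "6", "K", "1"]
  const joined = tokens.join("");                         // "546K1"
  const spaced = joined.replace(/(?<=\D)|(?=\D)/g, " ");  // "546 K 1"
  return { upper, tokens, joined, spaced };
}

console.log(trace("54 6k 1"));
```

The whitespace is dropped during tokenizing, so the digits on either side of a space end up in one number once the tokens are joined.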

See the last regex demo.

Upvotes: 1

VLAZ

Reputation: 28982

Unnecessary regex pattern

  /[^a-z]+|[a-z]|[a-z]/gi
           ^^^^^ ^^^^^

This alternative is repeated. Since both parts match the same thing, the repetition serves no purpose and one of them can be removed:

  /[^a-z]+|[a-z]/gi
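A quick way to convince yourself that nothing changes is to compare the matches of both patterns on the sample inputs. This is only a sanity check; the variable names are made up for the example.

```javascript
// Compare the original pattern with the de-duplicated one on the sample inputs.
const originalPattern = /[^a-z]+|[a-z]|[a-z]/gi;
const simplifiedPattern = /[^a-z]+|[a-z]/gi;

const sampleInputs = ["54 6 K 1", "54 6k 1", "20 H L", "1 10", "546 k1", "1K"];

// String.prototype.match with a global regex returns all matches as an array.
const sameMatches = sampleInputs.every(
  (s) =>
    JSON.stringify(s.match(originalPattern)) ===
    JSON.stringify(s.match(simplifiedPattern))
);

console.log(sameMatches); // true
```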

Overly broad regex pattern

  /[^a-z]+|[a-z]/gi
   ^^^^^^

The intention here is to catch "non-letters". However, the only characters in the inputs are letters, digits, and whitespace, so you should narrow it down. This will make dealing with the results much easier later on:

  /[0-9\s]+|[a-z]/gi
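The difference only shows up when the input contains something other than letters, digits, and whitespace. The input "12-a" below is a hypothetical example, not one of the formats from the question.

```javascript
// Broad pattern: anything that is not a letter ends up in a chunk.
const broadPattern = /[^a-z]+|[a-z]/gi;
// Narrow pattern: only digits and whitespace are allowed in a chunk.
const narrowPattern = /[0-9\s]+|[a-z]/gi;

const withPunctuation = "12-a"; // hypothetical input containing punctuation

console.log(withPunctuation.match(broadPattern));  // ["12-", "a"]
console.log(withPunctuation.match(narrowPattern)); // ["12", "a"]

// For the inputs from the question both patterns produce the same chunks.
console.log("54 6k 1".match(narrowPattern));       // ["54 6", "k", " 1"]
```

With the narrow pattern, every non-letter chunk is known to contain only digits and whitespace, which is what the later cleanup relies on.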

Unneeded conversion

?.map((part) => `${part.trim()} `) // Append a space to each part
.toString() // Revert to one string
.replace(/,/g, "") // Remove all commas
.replace(/\s+/g, " ") // Collapse duplicate spaces
.trim(); // Remove leading and trailing spaces

These lines will work correctly, but the problem is that there are too many of them. Calling toString() on an array implicitly calls .join(","), which is why .replace(/,/g, "") is needed. Then there are still two more lines to deal with whitespace.
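To illustrate where the commas come from, here is a short sketch using a hand-written array of parts (the values are chosen for the example):

```javascript
// Each part gets a trailing space, as in the original .map() step.
const spacedParts = ["546", "K", "1"].map((part) => `${part.trim()} `);

// Array.prototype.toString() joins the elements with commas.
console.log(spacedParts.toString() === spacedParts.join(",")); // true
console.log(JSON.stringify(spacedParts.toString()));           // "546 ,K ,1 "

// Hence the three cleanup steps that follow in the original code.
const cleaned = spacedParts
  .toString()
  .replace(/,/g, "")
  .replace(/\s+/g, " ")
  .trim();
console.log(JSON.stringify(cleaned)); // "546 K 1"
```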

Instead this can be simplified to just these two lines:

.map((part) => part.replace(/\s+/g, "")) // Remove unneeded spaces
.join(" ") // Revert to one string

The mapping comes first and removes any spaces. This works because part will either contain digits and spaces, like "54 6 ", or a single letter. We want the numbers combined into one, so stripping the spaces ("54 6 " -> "546") is always correct.

Then join(" ") combines the parts, placing a single space between them. Since each part is guaranteed not to contain a space, there are no extra spaces, nor is there a need for .trim().
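As a sketch of just these two steps, applied to the chunks that /[0-9\s]+|[a-z]/gi yields for "54 6 K 1" (the variable names are made up for the example):

```javascript
// Chunks produced by "54 6 K 1".match(/[0-9\s]+|[a-z]/gi)
const chunks = ["54 6 ", "K", " 1"];

const combined = chunks
  .map((part) => part.replace(/\s+/g, "")) // "54 6 " -> "546", " 1" -> "1"
  .join(" ");                              // single space between parts

console.log(JSON.stringify(combined)); // "546 K 1"
```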

Result

function foo(input) {
  return input
    ?.toUpperCase()
    ?.match(/[0-9\s]+|[a-z]/gi)
    ?.map((part) => part.replace(/\s+/g, "")) // Remove unneeded spaces
    .join(" ") // Revert to one string
}

/*
    54 6 K 1 -> 546 K 1
    54 6k 1 -> 546 K 1
    20 H L -> 20 H L
    1 10 -> 1 10
    546 k1 -> 546 K 1
    1K -> 1 K
*/

console.log("54 6 K 1 ->", foo("54 6 K 1"));
console.log("54 6k 1  ->", foo("54 6k 1"));
console.log("20 H L   ->", foo("20 H L"));
console.log("1 10     ->", foo("1 10"));
console.log("546 k1   ->", foo("546 k1"));
console.log("1K       ->", foo("1K"));

console.log("null           ->", foo(null));
console.log("<empty string> ->", foo(""));
console.log(".,?!:;'        ->", foo(".,?!:;'"));
console.log("👍             ->", foo("👍"));

Note that 1 10 is converted to 110 rather than the 1 10 listed in the comment above. The removal of spaces leaves it as a single number.

Upvotes: 0
