Reputation: 13
I want to get these results (from -> to):
# use string length limit = 3
1 {2 3} -> 1 # the string between the {} must be kept whole
1 2 3 -> 1 2
1 23 -> 1
{1} -> {1}
{1 2} -> empty
123456 -> 123 # if there are no spaces, cut the string by characters (except inside {*} expressions). Not required, but it would be nice
# one more example. Use string length limit = 5
{1} 2 -> {1} 2
123 45 -> 123
123 4 -> 123 4
Is there a way to do this in PHP with a single regular expression?
The length limit may be dynamic.
Similar question - Get first 100 characters from string, respecting full words (but my question also requires {*} expressions to be kept whole)
I tried: ^(.{1,3})({.*}|\s|$)
Upvotes: 0
Views: 688
Reputation: 26405
The idea here is to define your atomic bits, match as many of them as will fit, and use a negative lookbehind to cap the length of the match. The same lookbehind also ditches trailing whitespace (not sure if this is needed or not, but figured I'd throw it in).
The only other thing is a conditional expression that checks whether the string is a single uninterrupted series of chars (no whitespace or braces) and, if so, splits it naively at the limit (for your 123456 -> 123 example).
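function truncate($string, $length)
{
    // Cast so that only an integer is ever interpolated into the pattern.
    $length = (int) $length;

    // chars: a run of plain characters; possessive (++) so backtracking can
    //        never split it (newer PCRE2 versions can backtrack into subroutine calls)
    // group: a {...} expression, matched as a whole
    // atom:  chars, a group, or a single whitespace character
    $regex = <<<REGEX
/
    (?(DEFINE)
        (?<chars> [^\s{}]++ )
        (?<group> { (?&atom)* } )
        (?<atom>  (?&chars) | (?&group) | \s )
    )
    \A
    (?(?=.*[\s{}])
        (?&atom)*(?<! \s | .{{$length}}. ) |
        .{0,$length}
    )
/x
REGEX;
    preg_match($regex, $string, $matches);
    // Fall back to an empty string if matching fails with an error.
    return $matches[0] ?? '';
}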
$samples = <<<'DATA'
1 {2 3}
1 2 3
1 23
{1}
{1 2}
123456
DATA;
foreach (explode("\n", $samples) as $sample) {
var_dump(truncate($sample, 3));
}
Output:
string(1) "1"
string(3) "1 2"
string(1) "1"
string(3) "{1}"
string(0) ""
string(3) "123"
And:
$samples = <<<'DATA'
{1} 2
123 45
123 4
DATA;
foreach (explode("\n", $samples) as $sample) {
var_dump(truncate($sample, 5));
}
Outputs:
string(5) "{1} 2"
string(3) "123"
string(5) "123 4"
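As a cross-check of the rules the regex encodes, here is a rough procedural sketch of the same idea: split the string into atoms, then keep adding whole atoms while they fit. It is not the single-regex solution asked for. The name `truncate_by_tokens` is made up for illustration, and the sketch assumes braces are balanced and not nested and that lengths are counted in bytes.

```php
<?php
// Hypothetical helper (not part of the regex solution): applies the same
// truncation rules atom by atom. Assumes balanced, non-nested braces.
function truncate_by_tokens(string $string, int $length): string
{
    // No whitespace or braces at all: cut naively at the limit.
    if (!preg_match('/[\s{}]/', $string)) {
        return substr($string, 0, $length);
    }

    // Atoms: a flat {...} group, a run of plain characters,
    // or a single whitespace character.
    preg_match_all('/\{[^{}]*\}|[^\s{}]+|\s/', $string, $m);

    $result = '';
    foreach ($m[0] as $token) {
        if (strlen($result) + strlen($token) > $length) {
            break; // the next atom would not fit as a whole
        }
        $result .= $token;
    }

    // Drop trailing whitespace, as the lookbehind does in the regex.
    return rtrim($result);
}

var_dump(truncate_by_tokens('1 {2 3}', 3)); // string(1) "1"
var_dump(truncate_by_tokens('123 45', 5));  // string(3) "123"
```

On the samples from the question this gives the same results as the outputs listed above, which makes it handy for testing a regex variant against more inputs.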
Upvotes: 1
Reputation: 2855
Try this one:
/^([\w ]{1,3}(?= )|\w{1,3}|\{\w\})/gm
It works with the given samples: https://regex101.com/r/iF2tSp/3
1 {2 3}
1 2 3
1 23
{1}
{1 2}
123456
Match 1
Full match 0-1 `1`
Group 1. n/a `1`
Match 2
Full match 8-11 `1 2`
Group 1. n/a `1 2`
Match 3
Full match 14-15 `1`
Group 1. n/a `1`
Match 4
Full match 19-22 `{1}`
Group 1. n/a `{1}`
Match 5
Full match 29-32 `123`
Group 1. n/a `123`
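To run this from PHP rather than regex101, something like the following sketch should do. PHP has no `g` modifier, so the pattern keeps only `m` and `preg_match_all` collects every match. The wrapper name `first_chunks` is made up for illustration, and the limit of 3 stays hardcoded in the pattern.

```php
<?php
// Hypothetical wrapper around the pattern above, with the regex101-only
// "g" flag removed; preg_match_all plays the role of "global".
function first_chunks(string $text): array
{
    $pattern = '/^([\w ]{1,3}(?= )|\w{1,3}|\{\w\})/m';
    preg_match_all($pattern, $text, $matches);

    // $matches[0] holds the full match for each line that matched.
    return $matches[0];
}

$text = "1 {2 3}\n1 2 3\n1 23\n{1}\n{1 2}\n123456";
print_r(first_chunks($text));
```

The `{1 2}` line produces no match at all, so the returned array has five entries for the six input lines.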
Upvotes: 0
Reputation: 92884
A solution using the preg_match_all function with a specific regex pattern:
$str = '1 {2 3}
1 2 3
1 23
{1}
{1 2}
123456 ';
$re = '/^(\S \S{1}(?=\s)|\S(?= \S{2})|\{\S\}|\w{3}(?=\w))/m';
preg_match_all($re, $str, $matches);
// the array of truncated items (you can `implode` it to get a single string)
print_r($matches[0]);
The output:
Array
(
[0] => 1
[1] => 1 2
[2] => 1
[3] => {1}
[4] => 123
)
Regex demo (check the "Explanation" section on the right side)
Upvotes: 1