Uno Mein Ame

Reputation: 1090

regexp to match a pattern

I don't know enough regexp to write the code, but I know that regexp is the right way to do it.

Input: "2. Some text"
Output: "%%2.|Some text"

Input: "[12] some text"
Output: "%%[12]|some text"

Input: "(a)       this is great"
Output: "%%(a)|this is great"

The matching should be done only if the string starts with one of the following:
A number ("1", "25", "234");
A number followed by a dot ("1.", "25.", "234.");
A number in brackets ("[1]", "[25]", "[234]");
A number in parentheses ("(1)", "(25)", "(234)");
A Roman numeral in parentheses ("(i)", "(iv)", "(viii)");
A Roman numeral in brackets ("[i]", "[iv]", "[viii]");
A single letter followed by a dot ("a.", "B.");
A single letter in parentheses ("(a)", "(B)");
A single letter in brackets ("[a]", "[B]").

Then followed by at least one space

Then followed by text

Output:
%%
followed by the number/letter in the same formatting as it was found
followed by a pipe "|"
followed by the rest of the string (stripping all spaces between the number/letter and the rest of the text).

Upvotes: 0

Views: 61

Answers (3)

anubhava

Reputation: 785156

You need to use this regex for search:

^(?:<([ib])>)?([\[(]?\w+[\])]|\w+\.)(?:<\/\1>)?\s+

And use this for replacement (group 1 is the optional tag name, so the marker itself is in group 2):

%%\2|

Online Demo: http://regex101.com/r/vD6oX7
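
For reference, here is a minimal sketch of applying this search pattern from PHP with preg_replace. PHP is an assumption here (the answer gives only the pattern), and the `~` delimiters and the `$samples` name are mine. In this pattern the first capture group holds the optional tag name and the second holds the marker, so the sketch references the second group.

```php
<?php
// Search pattern from the answer, wrapped in ~ delimiters.
$pattern = '~^(?:<([ib])>)?([\[(]?\w+[\])]|\w+\.)(?:<\/\1>)?\s+~';

// Group 1 is the optional tag name; group 2 is the list marker.
$replacement = '%%$2|';

// Sample inputs taken from the question.
$samples = array('2. Some text', '[12] some text', '(a)       this is great');

foreach ($samples as $s) {
    echo preg_replace($pattern, $replacement, $s), "\n";
}
// Prints:
// %%2.|Some text
// %%[12]|some text
// %%(a)|this is great
```

Note that this pattern only matches a bare marker when it ends in a dot or a closing bracket, so an input such as "25 some text" is left unchanged.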

Upvotes: 2

Casimir et Hippolyte

Reputation: 89557

You can try this:

$pattern =<<<'LOD'
~
(?(DEFINE)
    (?<roman> m{0,4}(?:cm|cd|d?c{0,3})(?:xc|xl|l?x{0,3})(?:ix|iv|v?i{0,3}) )
)
^
(   \( (?: [0-9]+ | \g<roman> | [a-z] ) \)
  | \[ (?: [0-9]+ | \g<roman> | [a-z] )  ]
  | [0-9]+ \.? | [a-z] \.
) \h+
~xi
LOD;

$result = preg_replace($pattern, '%%$2|', $txt);
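
A quick way to check the pattern is to run it over a few inputs. The sketch below is only an illustration: the `$samples` name and some of the sample strings are assumptions, and it uses `\h+` so that at least one horizontal space is required after the marker, as the question asks. The named group `roman` inside the `DEFINE` block still counts as capture group 1, which is why the replacement refers to `$2`.

```php
<?php
$pattern = <<<'LOD'
~
(?(DEFINE)
    (?<roman> m{0,4}(?:cm|cd|d?c{0,3})(?:xc|xl|l?x{0,3})(?:ix|iv|v?i{0,3}) )
)
^
(   \( (?: [0-9]+ | \g<roman> | [a-z] ) \)
  | \[ (?: [0-9]+ | \g<roman> | [a-z] )  ]
  | [0-9]+ \.? | [a-z] \.
) \h+
~xi
LOD;

// Hypothetical inputs; the first is from the question, the rest are made up.
$samples = array(
    '2. Some text',
    '(iv) fourth item',
    '[B]   item',
    '25 items',
    'No marker here',
);

foreach ($samples as $txt) {
    // $2 is the marker: group 1 is the named group "roman" from the DEFINE block.
    echo preg_replace($pattern, '%%$2|', $txt), "\n";
}
// Prints:
// %%2.|Some text
// %%(iv)|fourth item
// %%[B]|item
// %%25|items
// No marker here
```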

Upvotes: 1

Kamehameha

Reputation: 5473

This should work:

// Group 1 is the marker, group 2 is the rest of the line.
// \s+ requires at least one space after the marker, as the question asks.
$regex = '/^((?:[0-9]+\.)|(?:[a-zA-Z]\.)|(?:\[[0-9]+\])|(?:\[M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})\])|(?:\[[a-zA-Z]+\])|(?:\([a-zA-Z]\))|(?:\([0-9]+\)))\s+(.*)$/';
$replacement = '%%$1|$2';
$arr = array(
    "2. Some text",
    "a. Some text",
    "[12] some text",
    "[iii] some text",
    "[a] some text",
    "(a)       this is great",
    "(1)       this is great"
);
foreach ($arr as $str) {
    var_dump(preg_replace($regex, $replacement, $str));
}

The output:
string '%%2.|Some text' (length=14)
string '%%a.|Some text' (length=14)
string '%%[12]|some text' (length=16)
string '%%[iii]|some text' (length=17)
string '%%[a]|some text' (length=15)
string '%%(a)|this is great' (length=19)
string '%%(1)|this is great' (length=19)
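
preg_replace returns the subject unchanged when the pattern does not match, so a missed case is easy to overlook. As a rough check, the sketch below runs the same alternatives (here with `\s+`, so at least one space is required) over a few markers from the question that are not in the sample list, and uses the optional fifth argument of preg_replace to count replacements. The `$extra` inputs are assumptions.

```php
<?php
// Same alternatives as in the answer, in a single-quoted string.
$regex = '/^((?:[0-9]+\.)|(?:[a-zA-Z]\.)|(?:\[[0-9]+\])|(?:\[M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})\])|(?:\[[a-zA-Z]+\])|(?:\([a-zA-Z]\))|(?:\([0-9]+\)))\s+(.*)$/';
$replacement = '%%$1|$2';

// Hypothetical inputs covering marker forms not in the sample list.
$extra = array('25 some text', '(iv) some text', '[B] some text');

foreach ($extra as $str) {
    // $count receives the number of replacements performed (0 means no match).
    $out = preg_replace($regex, $replacement, $str, -1, $count);
    echo $out, ' (replacements: ', $count, ")\n";
}
// Prints:
// 25 some text (replacements: 0)
// (iv) some text (replacements: 0)
// %%[B]|some text (replacements: 1)
```

With these inputs, the bare number and the Roman numeral in parentheses come back unchanged, so covering them would need additional alternatives in the pattern.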

Upvotes: 0
