I use this regular expression [\p{Greek}]
to match any Greek character. It works as expected and matches the first Greek character on the line. However, I want to match all the Greek characters that follow that first character, but the *-quantifier doesn't seem to work for Greek characters.
This is my input data. Each line starts with three spaces and a double quote, followed by a Greek or Latin string containing one or more spaces, and ends with ", followed by a newline.
"ξηλξλκξ λκλξλξ",
"lkjlkj kjljl",
"δδσασα ασδ ασδφ",
"xxaax asdsd dsds",
"δερεφε αδσφδσ",
a. ^.*?[\p{Greek}|\s]
- just matches the first space on all lines.
b. ^.*?[\p{Greek}|\s]+
- matches all three initial spaces, on all lines.
c. ^.*?"[\p{Greek}|\s]+
- matches the whole line when it is written with Greek characters
d. ^.*?"[\p{Greek}|\s]*
- matches the initial spaces and the " on the Latin lines, and the whole line excluding the ", at the end on the Greek lines.
e. [\p{Greek}]*
- matches all characters on the Latin lines, but just one at a time (in spite of the *). On the Greek lines it matches the initial spaces, one at a time, but not the first ". Then it matches the first word, not the space between the words,
(e) is super confusing. If I do a search-and-replace using that regular expression on the string "XYZ NOP", inserting A for each match one at a time ("replace and find next"), the result looks like this: A A"XAYZA NAOPA",A
However, a "replace all" gives this instead: A A A A"AXAYAZA ANAOAPA"A,
All the original characters remain, in spite of performing a search-and-replace, with As inserted more or less at random.
I have no idea what is going on here.
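One possible reading of (e), sketched in Python rather than BBEdit: since * also permits zero repetitions, the pattern can succeed with an empty match at every position where there is no Greek character, which would account for the inserted As. This is an assumption about how BBEdit's engine behaves, not something verified in it. Python's built-in re module has no \p{Greek}, so the range U+0370 to U+03FF stands in for it here, and the variable names are only illustrative.

```python
import re

# Assumption: Python's built-in `re` has no \p{Greek}, so the Greek and
# Coptic block (U+0370 to U+03FF) is used as a rough stand-in.
GREEK = "\u0370-\u03FF"

latin_line = '   "XYZ NOP",'
greek_line = '   "δερεφε αδσφδσ",'

# `*` allows zero repetitions, so the pattern also succeeds with an
# empty match at every position that has no Greek character.
star = re.compile(f"[{GREEK}]*")
print(star.sub("A", latin_line))   # A A A A"AXAYAZA ANAOAPA"A,A

# `+` requires at least one character, so only real Greek runs match.
plus = re.compile(f"[{GREEK}]+")
print(plus.findall(latin_line))    # []
print(plus.findall(greek_line))    # ['δερεφε', 'αδσφδσ']
```

If BBEdit behaves the same way, replacing * with + in (e) should limit the matches to actual runs of Greek characters.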
A couple of questions here:
I am using BBEdit for this. I have used BBEdit with regular expressions since the 90s and have never encountered any issues with its regexp implementation. But OTOH, I have never tried working with Greek characters before.