Reputation: 1949
I am having difficulty using a regular expression (Grep) in TextWrangler to find occurrences of a lowercase letter immediately followed by an uppercase letter. For example:
This announcement meansStudents are welcome.
I want to split each occurrence by inserting a colon and a space, so that it becomes means: Students
I have tried:
[a-z][A-Z]
But this expression does not work in TextWrangler.
*EDIT: here are the exact contexts in which the occurrences appear (I mean only with these font colors).*
<font color =#48B700> - Stột jlăm wẻ baOne hundred and three<br></font>
<font color =#C0C0C0> »» Qzống pguộc lyời ba yghìm fảy dyổiTo live a life full of vicissitudes, to live a life marked by ups and downs<br></font>
"baOne" and "dyổiTo" must be "ba: One" and "dyổi: To"
Could anyone help? Many thanks.
Upvotes: 4
Views: 17081
Reputation: 539
This question is ages old, but I stumbled upon it, so someone else might as well. The OP's comment on Igor's response clarified how the task was meant to be described (and could have been added to the description).
To match only those font-specific lines of the HTML, replace
(?<=<font color =#(?:48B700|C0C0C0)>)(.*?[a-z])([A-Z])
with \1: \2
Explanation:
(?<=[fixed-length regex])
is a positive lookbehind and means "if my match has this just before it".
(?:48B700|C0C0C0)
is an unnamed (non-capturing) group that matches only the 2 colours. Since they are of the same length, they work in a lookbehind (which needs to be of fixed length).
(.*?[a-z])([A-Z])
will match everything after the >
of those opening font tags up to your capital letter.
\1: \2
The replacement is the same as in Igor's response, only that \1
will now hold the entire first string that needs separating.
Addition:
Your input strings contain accented characters, and the part you want to split may very well end in one. In that case it won't be caught by [a-z]
alone. You will need to add a character range that covers all the letters you care about, something like
(?<=<font color =#(?:48B700|C0C0C0)>)(.*?[a-zḁ-ῼ])([A-Z])
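As a rough check outside the editor, the same search and replace can be sketched in Python, whose `re` module also requires fixed-width lookbehinds. The first sample line is taken from the question; the `#FF0000` line, the `wẻOne` string, and all variable names are made up for illustration, and TextWrangler's own engine may differ in details.

```python
import re

# Patterns from this answer; the second one adds the extra character range
# (written with \u escapes so the code points are unambiguous).
ASCII_ONLY = r"(?<=<font color =#(?:48B700|C0C0C0)>)(.*?[a-z])([A-Z])"
EXTENDED = r"(?<=<font color =#(?:48B700|C0C0C0)>)(.*?[a-z\u1e01-\u1ffc])([A-Z])"
REPLACEMENT = r"\1: \2"

green = "<font color =#48B700> - Stột jlăm wẻ baOne hundred and three<br></font>"
other = "<font color =#FF0000>fooBar<br></font>"         # hypothetical: a colour we do not target
accented = "<font color =#48B700>w\u1ebbOne<br></font>"  # hypothetical: word ends in a non-ASCII letter

fixed_green = re.sub(ASCII_ONLY, REPLACEMENT, green)   # split happens after "ba"
fixed_other = re.sub(ASCII_ONLY, REPLACEMENT, other)   # lookbehind fails, line untouched
missed = re.sub(ASCII_ONLY, REPLACEMENT, accented)     # [a-z] does not cover the accented letter
caught = re.sub(EXTENDED, REPLACEMENT, accented)       # extended range does

for line in (fixed_green, fixed_other, missed, caught):
    print(line)
```

Note that the added range is a broad net: it also spans uppercase code points and does not include every accented Latin letter, so it may need adjusting to the letters that actually occur in your file.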
Upvotes: 2
Reputation: 13453
That is the correct pattern for identifying a lowercase letter followed by an uppercase letter; however, you need to enable the Case Sensitive option in the Find/Replace dialog, otherwise both classes match letters of either case.
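To see why case sensitivity matters, here is a small sketch in Python rather than TextWrangler. It is only an analogy: Python's `re` is case sensitive by default, and the `re.IGNORECASE` flag stands in for an unchecked Case Sensitive box. The variable names are illustrative.

```python
import re

text = "This announcement meansStudents are welcome."
pattern = r"[a-z][A-Z]"

# Case-sensitive search: finds the real lowercase/uppercase boundary.
sensitive = re.search(pattern, text).group()

# Case-insensitive search: both classes match any ASCII letter,
# so the very first pair of letters is reported instead.
insensitive = re.search(pattern, text, flags=re.IGNORECASE).group()

print(sensitive)    # sS
print(insensitive)  # Th
```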
Upvotes: 0
Reputation: 59471
Replace ([a-z])([A-Z])
with \1: \2
- I don't have TextWrangler, but it works in Notepad++
The parentheses capture the matched text, which is referred to using the \1
syntax in the replacement string
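For readers who want to try the capture-group idea outside an editor, a minimal sketch in Python follows. It uses the sentence from the question and a replacement with a space after the colon, as the question asks for; the variable names are illustrative, and the editor's replace dialog may behave slightly differently.

```python
import re

text = "This announcement meansStudents are welcome."

# Group 1 captures the lowercase letter, group 2 the uppercase letter;
# the replacement puts ": " between them.
result = re.sub(r"([a-z])([A-Z])", r"\1: \2", text)

print(result)  # This announcement means: Students are welcome.
```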
Upvotes: 2
Reputation: 8558
I do believe (don't have TextWrangler at hand though) that you need to search for ([a-z])([A-Z])
and replace it with: \1: \2
Hope this helps.
Upvotes: 4