Noollab

Reputation: 35

How do I capitalize an entire text except for certain patterns?

I have a text (string) that I want in all upper case, except the following:

  1. Words starting with : (colon)
  2. Words or strings surrounded by double quotation marks, ""
  3. Words or strings surrounded by single quotation marks, ''

Everything else should be converted to upper case, and formatting (whitespace, line breaks, etc.) should be preserved.

How would I go about doing this using Regex (C# style/syntax)?

Upvotes: 0

Views: 183

Answers (2)

Kobi

Reputation: 138007

I think you are looking for something like this:

text = Regex.Replace(text, @":\w+|""[^""]*""|'[^']*'|(.)",
                     match => match.Groups[1].Success ?
                              match.Groups[1].Value.ToUpper() : match.Value);
  • :\w+ - match words that start with a colon.
  • "[^"]*"|'[^']*' - match quoted text. For escaped quotes, you may use:

    "[^"\\]*(?:\\.[^"\\]*)*"|'[^'\\]*(?:\\.[^'\\]*)*'
    
  • (.) - capture anything else, one character at a time (you can also try ([^"':]*|.), which might be faster because it consumes a run of ordinary characters in a single match).
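
To see what the escaped-quote variant picks up, here is a minimal sketch that only lists the quoted spans it matches. The sample input is made up for illustration and is not from the question.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Hypothetical sample input: say "a \"quoted\" word" and 'it\'s' now
        string text = @"say ""a \""quoted\"" word"" and 'it\'s' now";

        // Quoted-text alternatives that tolerate backslash-escaped quotes.
        // In a verbatim string, "" stands for one double quote.
        string quoted = @"""[^""\\]*(?:\\.[^""\\]*)*""|'[^'\\]*(?:\\.[^'\\]*)*'";

        // Print each quoted span on its own line.
        foreach (Match m in Regex.Matches(text, quoted))
        {
            Console.WriteLine(m.Value);
        }
    }
}
```

For this input it should print "a \"quoted\" word" and then 'it\'s', each on its own line. The simpler "[^"]*" alternative would stop at the first escaped quote instead.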

Next, we use a callback for Regex.Replace to do two things:

  • Determine if we need to keep the text as-is, or
  • Return the upper-case version of the text.
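
Put together, the callback approach might look like the following self-contained sketch. The input string is a made-up example, and ToUpperInvariant is my own choice here (instead of ToUpper) to keep the result independent of the current culture.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Hypothetical sample input, not taken from the question.
        string text = "select :name from \"my table\" where x = 'abc'\nand y = 1";

        string result = Regex.Replace(
            text,
            @":\w+|""[^""]*""|'[^']*'|(.)",
            // Group 1 only succeeds when none of the "keep as-is"
            // alternatives matched at this position.
            m => m.Groups[1].Success
                ? m.Groups[1].Value.ToUpperInvariant() // assumption: culture-invariant upper-casing
                : m.Value);

        Console.WriteLine(result);
    }
}
```

With this input the output should be SELECT :name FROM "my table" WHERE X = 'abc' on the first line and AND Y = 1 on the second. The line break survives because (.) does not match \n by default, so Regex.Replace leaves it untouched.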

Working example: http://ideone.com/ORFU8

Upvotes: 4

tuxtimo

Reputation: 2790

You can start with this RegEx:

\b(?<![:"'])(\w+?)(?!["'])\b

Of course, you will have to refine it yourself if it is not enough. For example, it also skips words with mismatched quotes, such as "dfgdfg'. The matched word is in the first capture group ($1).
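
As a rough sketch of how this pattern could be applied: .NET replacement strings cannot change case, so a MatchEvaluator is still needed to upper-case $1. The sample input is hypothetical, and ToUpperInvariant is an assumption on my part.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Hypothetical sample input, not taken from the question.
        string text = "keep :name and \"one two three\" here";

        string result = Regex.Replace(
            text,
            @"\b(?<![:""'])(\w+?)(?![""'])\b",
            // Upper-case the captured word (group 1).
            m => m.Groups[1].Value.ToUpperInvariant());

        Console.WriteLine(result);
    }
}
```

This should print KEEP :name AND "one TWO three" HERE. Note that the lookarounds only inspect the characters directly next to each word, so a word in the middle of a multi-word quoted string (two above) is still upper-cased, which is one of the cases that would need refining.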

Upvotes: 1
