Reputation: 1774
I have a question about regular expressions in C#.
I want to find the text between " characters. Example:
Enum resultado = SPDialogBox.Instance.show<ACTION_ENUMs.TORNEO_SORTEAR>("Esto es una prueba");
Matches: Esto es una prueba
But, in this example
Enum resultado = SPDialogBox.Instance.show<ACTION_ENUMs.TORNEO_SORTEAR>("Esto es una prueba");
pKR_MESAPUESTOASIGNACION.CONFIGTORNEO_ID = Valid.GetInt(dr.Cells["CONFIGTORNEO_ID"].Value);
Matches: Esto es una prueba
but must not match CONFIGTORNEO_ID, because it is written between square brackets ([]).
In brief, I want to match strings between double-quote (") characters, but only when the string is not written between square brackets ([]).
Here is my code:
var pattern = "\"(.*?)\"";
var matches = Regex.Matches(fullCode, pattern, RegexOptions.Multiline);
foreach (Match m in matches)
{
    Console.WriteLine(m.Groups[1]);
}
That pattern matches every string between " characters. How can I modify it to exclude the strings that are written between square brackets?
-- edit ---
Here is another example:
List<String> IdSorteados = new List<String>();
int TablesToSort = 0;
foreach (UltraGridRow dr in fg.hfg_Rows)
{
if (dr.Cells["MESA_ID"].Value == DBNull.Value && dr.Cells["Puesto"].Value == DBNull.Value && !Valid.GetBoolean(dr.Cells["BELIMINADO"].Value) && (Valid.GetBoolean(dr.Cells["Seleccionado"].Value) || SortearTodo))
TablesToSort++;
}
The expression must not match MESA_ID (found within Cells["MESA_ID"].Value) nor Puesto (found within Cells["Puesto"].Value). It also must not match ].Value == DBNull.Value && dr.Cells[ (found within ["MESA_ID"].Value == DBNull.Value && dr.Cells["Puesto"]).
I hope I have made my intent clear.
Upvotes: 3
Views: 1500
Reputation: 3711
Many times I have to parse source code files (php|cpp|java|js|css|etc) and do some regexp replacements. To avoid replacing some strings/messages I mask all strings before doing my replacements, so I have to capture all possible strings and mask them.
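This is how I capture all strings: /(['"])(\\\1|.)*?\1/gm
which means:
an opening quote, single or double: ['"]
followed by any number of either an escaped copy of that quote (\\\1) or any other character (the | operator): (\\\1|.)*
matched lazily, so the match stops at the first possible closing quote: ?
and closed by the same kind of quote that opened the string: \1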
I want this search to be global (the g flag, to capture every match). The m flag only changes how ^ and $ behave, so it makes no difference to this pattern; because . does not match a newline by default, a string is never matched across a line break anyway.
If you want to capture the string contents and not just find them, wrap (\\\1|.)*? in a group of its own,
which gives the final pattern:
([\'"])((\\\1|.)*?)\1
Examples of strings captured:
defined ( 'WP_DEBUG' ) || define( '\WP_DEBUG', true );
echo 'class="input-text card-number" type="text" maxlength="20"';
echo 'How are you? I\'m fine, thank you';
Check my pattern in an online regex tester.
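For readers following along in C#, here is a minimal sketch of the same quote-aware pattern with the .NET regex engine. The class and method names (QuotedStrings, Bodies) are made up for illustration, and I swapped the inner group for a non-capturing (?:...) so the string body is always group 2. Treat it as a sketch under those assumptions, not as the masking code described above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class QuotedStrings
{
    // Group 1: the opening quote (' or ").
    // Group 2: the string body; \\\1 lets an escaped copy of the opening quote through.
    // Inside a verbatim string (@"..."), a double quote is written as "".
    private static readonly Regex Pattern =
        new Regex(@"(['""])((?:\\\1|.)*?)\1");

    // Returns the body of every quoted string found in the input.
    public static string[] Bodies(string code) =>
        Pattern.Matches(code)
               .Cast<Match>()
               .Select(m => m.Groups[2].Value)
               .ToArray();

    public static void Main()
    {
        string line = @"echo 'How are you? I\'m fine, thank you';";
        foreach (string body in Bodies(line))
            Console.WriteLine(body); // prints: How are you? I\'m fine, thank you
    }
}
```

The escaped quote stays in the captured body as written in the source; unescaping it would be a separate step.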
Upvotes: 0
Reputation: 726579
To avoid matching quoted strings nested inside square brackets, you need to check the text on both sides of the quotes: the opening quote must not be preceded by [, and the closing quote must not be followed by ].
This can be done using this regexp:
(?<!\[\s*)\"[^"]*\"(?!\s*\])
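It uses the lookaround feature of the .NET regex engine, which, unlike many other engines, allows a variable-length pattern such as \s* inside a look-behind.
Note how this expression avoids the reluctant quantifier ? inside the quoted string by using [^"]* instead of .*?.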
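A minimal sketch of how this pattern might be used from C#, run against the first sample from the question. The class and method names (UnbracketedStrings, Find) are hypothetical, and I added a capture group around [^"]* so the text comes back without the quotes; the pattern is otherwise the one above, minus the backslashes before the quotes, which the regex does not need.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class UnbracketedStrings
{
    // Inside a verbatim string (@"..."), a double quote is written as "".
    // Group 1 (added for this sketch) holds the text between the quotes.
    private static readonly Regex Pattern =
        new Regex(@"(?<!\[\s*)""([^""]*)""(?!\s*\])");

    // Returns the quoted strings that pass both lookaround checks.
    public static string[] Find(string code) =>
        Pattern.Matches(code)
               .Cast<Match>()
               .Select(m => m.Groups[1].Value)
               .ToArray();

    public static void Main()
    {
        string code =
            "Enum resultado = SPDialogBox.Instance.show<ACTION_ENUMs.TORNEO_SORTEAR>(\"Esto es una prueba\");\r\n" +
            "pKR_MESAPUESTOASIGNACION.CONFIGTORNEO_ID = Valid.GetInt(dr.Cells[\"CONFIGTORNEO_ID\"].Value);";

        foreach (string s in Find(code))
            Console.WriteLine(s); // prints: Esto es una prueba
    }
}
```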
Upvotes: 1
Reputation: 101604
Simply use a negative look-behind:
(?<!\[)
Basically, only match a string when it is not preceded by a [. Example here, and code as follows:
String fullCode = "Enum resultado = SPDialogBox.Instance.show<ACTION_ENUMs.TORNEO_SORTEAR>(\"Esto es una prueba\");\r\n"
+ "pKR_MESAPUESTOASIGNACION.CONFIGTORNEO_ID = Valid.GetInt(dr.Cells[\"CONFIGTORNEO_ID\"].Value);";
String pattern = @"(?<!\[)\x22(.*?)\x22";
var matches = Regex.Matches(fullCode, pattern, RegexOptions.Multiline);
foreach (Match m in matches)
{
    Console.WriteLine(m.Groups[1]);
}
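One caveat, going by the second sample in the question: when a line holds two bracketed strings, the quote that closes "MESA_ID" is not preceded by [, so the look-behind lets it pair up with the quote that opens "Puesto", which is exactly the ].Value == DBNull.Value && dr.Cells[ match the asker wants to avoid. A different technique that may help is to let the regex consume each bracketed string whole and keep only the matches where the capture group took part. The sketch below assumes strings contain no escaped quotes, and the names (SkipBracketed, Find) are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating a "match and discard" alternation.
public static class SkipBracketed
{
    // Alternative 1 consumes ["..."] as a whole and captures nothing.
    // Alternative 2 captures the body of any other string in group 1.
    private static readonly Regex Pattern =
        new Regex(@"\[\s*""[^""]*""\s*\]|""([^""]*)""");

    // Keeps only the matches produced by alternative 2.
    public static string[] Find(string code) =>
        Pattern.Matches(code)
               .Cast<Match>()
               .Where(m => m.Groups[1].Success)
               .Select(m => m.Groups[1].Value)
               .ToArray();

    public static void Main()
    {
        string code =
            "if (dr.Cells[\"MESA_ID\"].Value == DBNull.Value && dr.Cells[\"Puesto\"].Value == DBNull.Value)\r\n" +
            "    show(\"Esto es una prueba\");";

        foreach (string s in Find(code))
            Console.WriteLine(s); // prints: Esto es una prueba
    }
}
```

Because the bracketed strings are consumed before the scan moves on, their closing quotes are never available to start a new match.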
Upvotes: 2