justSteve

Reputation: 5524

Regex to remove line break when next line does not start with a given string

Working in .NET, I'm parsing a log file in which some lines do not begin with "2018 (note that this includes the double quote). I need a pattern for .Match that finds lines beginning with anything except the string "2018. When such a line is found (and this is the tricky bit), the line break must be removed from the line before the offending line. In other words, offending lines should be appended to the line above them.

"2018-02-22 10:06:10,857","[7]"," ERROR","MyApp.Web.Infrastructure.ErrorResponseCommand","ErrorResponseCMD logs Controller: webinar | Action: Index",""
"2018-02-22 10:06:37,742","[11]"," INFO ","MyApp.Web.MvcApplication","Anon Session Starts with: {""FirstPage"": ""https://www.bankwebinars.com/wp-login.php"", ""QueryString"": """", ""SessionId"": ""uhnev2dnds33dastwrdgftvm"", ""FirstCookies"": {""CookieName"": ""ASP.NET_SessionId"", ""Value"": ""uhnev2dnds33dastwrdgftvm""}}",""
"2018-02-22 10:06:48,053","[11]"," INFO ","MyApp.Web.Controllers.CartController","SessionInfo{
  ""FirstPage"": null,
  ""RemoteAddress"": ""207.46.13.159"",
  ""RemoteHost"": ""207.46.13.159"",
  ""RemoteUser"": """",
RelativeConfirmPasswordResetUrl:Account/PasswordResetConfirm
//and other non-predictable BOL patterns.
},""
"2018-02-22 10:06:10,857","[7]"," ERROR","MyApp.Web.Infrastructure.ErrorResponseCommand","ErrorResponseCMD logs Controller: webinar | Action: Index",""

ADDENDUM: I tried the suggested pattern, and it works correctly in regex101's sandbox, so there must be something else wrong. Here's my current code.

string str = File.ReadAllText("myLog.log");            
Regex rx = new Regex("(?m)\r?\n^(?!\"2018)", RegexOptions.Singleline);
str = rx.Replace(str, "\"2018");            
File.WriteAllText("test1.txt", str);

I've tried a number of variations on the pattern. For example, I think the RegexOptions argument is equivalent to the (?m) prefix, so I've tried omitting it. Singleline should be what I want, since it views the whole file as a single line, but I've tried Multiline mode as well. It's a Windows file, so the ? quantifier between \r and \n should not be required. None of the variations have changed the output.
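
For reference, in the .NET regex engine RegexOptions.Multiline corresponds to the inline (?m) flag, while RegexOptions.Singleline corresponds to (?s) and only changes what the dot matches. The sketch below is a minimal check of that difference; the class name, method names, and sample strings are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexOptionCheck
{
    // Counts how many positions ^ matches under the given options.
    public static int CountLineStarts(string text, RegexOptions options)
    {
        return Regex.Matches(text, "^", options).Count;
    }

    // Reports whether '.' can match the line feed between "a" and "b".
    public static bool DotMatchesLineFeed(RegexOptions options)
    {
        return Regex.IsMatch("a\nb", "a.b", options);
    }

    public static void Main()
    {
        string text = "one\ntwo\nthree";

        // Multiline: ^ matches at the start of every line (3 here).
        Console.WriteLine(CountLineStarts(text, RegexOptions.Multiline));

        // Singleline: ^ still matches only at the start of the string (1 here).
        Console.WriteLine(CountLineStarts(text, RegexOptions.Singleline));

        // Singleline only lets '.' match \n; without it the dot stops at line feeds.
        Console.WriteLine(DotMatchesLineFeed(RegexOptions.Singleline));
        Console.WriteLine(DotMatchesLineFeed(RegexOptions.None));
    }
}
```

So passing RegexOptions.Singleline alongside an inline (?m) turns on both behaviours; it does not replace one with the other.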

Upvotes: 3

Views: 670

Answers (3)

revo

Reputation: 48741

Possible problems with your code, in top-down order:

1- The documentation page for File.ReadAllText() states:

The resulting string does not contain the terminating carriage return and/or line feed.

If that's the problem, take a look at this thread; I'm not a .NET guru.
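
One way to find out whether point 1 applies is to write a file with a known line break and read it back. The following is only a sketch; the class name, method name, and temporary file are assumptions made for the test, not part of the question's code.

```csharp
using System;
using System.IO;

public static class ReadAllTextCheck
{
    // Writes a two-line file with an explicit CRLF and reports whether
    // File.ReadAllText returns that line break as part of the string.
    public static bool PreservesLineBreaks()
    {
        string path = Path.GetTempFileName(); // scratch file used only for this check
        try
        {
            File.WriteAllText(path, "first\r\nsecond");
            string text = File.ReadAllText(path);
            return text.Contains("\r\n");
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        Console.WriteLine(PreservesLineBreaks());
    }
}
```

If this prints True, the line breaks inside the file survive ReadAllText and point 1 can be ruled out as the cause.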

2- You should also @-quote the regex string, taking care of the inner double quotation mark ("" denotes a " in an @-quoted string), and remove the s flag (RegexOptions.Singleline), which is superfluous here.

Regex rx = new Regex(@"(?m)\r?\n^(?!""2018)");

3- The next issue is the replacement string you provided. You should replace with nothing, because a zero-width negative lookahead assertion only asserts and doesn't consume any text:

str = rx.Replace(str, ""); 

Live demo

Upvotes: 2

CodeFuller

Reputation: 31312

Here is a regex replace that does the job:

str = Regex.Replace(str, @"\r?\n(?!""2018)", String.Empty);

The following code from the question is incorrect:

Regex rx = new Regex("(?m)\r?\n^(?!\"2018)", RegexOptions.Singleline);
str = rx.Replace(str, "\"2018");

(?!\"2018) is a negative lookahead. Like other lookarounds, it does not consume the text it checks, so the match consists of the line break only. That's why rx.Replace(str, "\"2018") inserts "2018 in place of every removed line break. For example, for the input:

"2018" Line 1
"2018" Line 2
  Sub-line 1
  Sub-line 2
"2018" Line 3

you'll get the following result:

"2018" Line 1
"2018" Line 2"2018  Sub-line 1"2018  Sub-line 2
"2018" Line 3"2018

That's why you should replace the matched parts with just an empty string. In that case you will get the correct result:

"2018" Line 1
"2018" Line 2  Sub-line 1  Sub-line 2
"2018" Line 3

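The example above can be reproduced with a small console program. This is a sketch under the assumption that the input uses Windows line endings and ends with a trailing line break; the class and method names are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class JoinContinuationLines
{
    // Removes every line break that is NOT followed by the literal text "2018
    // (quote included), so continuation lines are appended to the record above.
    public static string Join(string log)
    {
        return Regex.Replace(log, @"\r?\n(?!""2018)", string.Empty);
    }

    public static void Main()
    {
        // Same sample as above, with CRLF line endings assumed.
        string input =
            "\"2018\" Line 1\r\n" +
            "\"2018\" Line 2\r\n" +
            "  Sub-line 1\r\n" +
            "  Sub-line 2\r\n" +
            "\"2018\" Line 3\r\n";

        Console.WriteLine(Join(input));
    }
}
```

Note that the final line break of the input is removed as well, because nothing that starts with "2018 follows it.
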
Upvotes: 2

Devin Goble

Reputation: 2867

I was able to get what I think is the desired result by doing the following:

Regex.Replace(logString, @"\r\n\s\s", "", RegexOptions.Multiline)
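
A caveat worth checking: this pattern only joins lines that start with two whitespace characters, and it removes that indentation too. The sketch below runs it on a trimmed, hypothetical excerpt modelled on the question's log, where some continuation lines are not indented; the class and method names are made up.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespacePatternCheck
{
    // The pattern from this answer: a CRLF followed by two whitespace characters.
    public static string JoinIndented(string log)
    {
        return Regex.Replace(log, @"\r\n\s\s", "", RegexOptions.Multiline);
    }

    public static void Main()
    {
        // Shortened sample: one indented continuation line, two unindented ones.
        string input =
            "\"2018-02-22 10:06:48,053\",\"SessionInfo{\r\n" +
            "  \"\"FirstPage\"\": null,\r\n" +
            "RelativeConfirmPasswordResetUrl:Account/PasswordResetConfirm\r\n" +
            "},\"\"\r\n";

        Console.WriteLine(JoinIndented(input));
    }
}
```

On this input the indented line is joined to the record, but the two unindented continuation lines keep their own line breaks, so a lookahead on "2018 is the more general fit for the log in the question.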

Upvotes: 0
