Stephen Patten

Reputation: 6363

How to parse a message from a text file with c# that spans multiple lines?

Given this log file, how can I read a message that contains multiple newlines (\n) with a StreamReader? The ReadLine method returns each physical line separately, but a message may span more than one line.

Here is what I have so far

using (var sr = new StreamReader(filePath))
using (var store = new DocumentStore {ConnectionStringName = "RavenDB"}.Initialize())
{
    IndexCreation.CreateIndexes(typeof(Logs_Search).Assembly, store);

    using (var bulkInsert = store.BulkInsert())
    {
        const char columnDelimeter = '|';
        const string quote = @"~";
        string line;

        while ((line = sr.ReadLine()) != null)
        {
            batch++;
            List<string> columns = null;
            try
            {
                columns = line.Split(columnDelimeter)
                                .Select(item => item.Replace(quote, string.Empty))
                                .ToList();

                if (columns.Count != 5)
                {
                    batch--;
                    Log.Error(string.Join(",", columns.ToArray()));
                    continue;
                }

                bulkInsert.Store(LogParser.Log.FromStringList(columns));

                /* Give some feedback */
                if (batch % 100000 == 0)
                {
                    Log.Debug("batch: {0}", batch);
                }

                /* Use sparingly */
                if (ThrottleEnabled && batch % ThrottleBatchSize == 0)
                {
                    Thread.Sleep(ThrottleThreadWait);
                }
            }
            catch (FormatException)
            {
                if (columns != null) Log.Error(string.Join(",", columns.ToArray()));
            }
            catch (Exception exception)
            {
                Log.Error(exception);
            }
        }
    }
}

And the Model

public class Log
{
    public string Component { get; set; }
    public string DateTime { get; set; }
    public string Logger { get; set; }
    public string Level { get; set; }
    public string ThreadId { get; set; }
    public string Message { get; set; }
    public string Terms { get; set; }

    public static Log FromStringList(List<string> row)
    {
        Log log = new Log();

        /*log.Component = row[0] == string.Empty ? null : row[0];*/
        log.DateTime = row[0] == string.Empty ? null : row[0].ToLower();
        log.Logger = row[1] == string.Empty ? null : row[1].ToLower();
        log.Level = row[2] == string.Empty ? null : row[2].ToLower();
        log.ThreadId = row[3] == string.Empty ? null : row[3].ToLower();
        log.Message = row[4] == string.Empty ? null : row[4].ToLower();

        return log;
    }
}

Upvotes: 2

Views: 870

Answers (3)

Jim Mischel

Reputation: 134085

If you can read the entire file into memory (e.g. with File.ReadAllText), you can treat it as a single string and use a regular expression to split it wherever a date marks the start of a new message.

A more general solution that takes less memory would be to read the file line-by-line. Append lines to a buffer until you get the next line that starts with the desired value (in your case, a date/time stamp). Then process that buffer. For example:

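StringBuilder sb = new StringBuilder();
foreach (var line in File.ReadLines(logfileName))
{
    // A line that starts with the date marks the beginning of a new message,
    // so flush whatever has been buffered for the previous one.
    if (line.StartsWith("2013-06-19") && sb.Length > 0)
    {
        ProcessMessage(sb.ToString());
        sb.Clear();
    }
    // Every line, including continuation lines, belongs to the current message.
    sb.AppendLine(line);
}
// be sure to process the last message
if (sb.Length > 0)
{
    ProcessMessage(sb.ToString());
}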
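
The hard-coded "2013-06-19" prefix only recognizes messages from a single day. A minimal sketch of the same buffering idea follows, written as an iterator over a TextReader so it can sit on top of the StreamReader from the question. It assumes every message starts with a yyyy-MM-dd date at the beginning of a line; the names LogMessageReader and ReadMessages are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public static class LogMessageReader
{
    // Assumption: a new message always begins with a yyyy-MM-dd date.
    private static readonly Regex MessageStart = new Regex(@"^\d{4}-\d{2}-\d{2}");

    // Yields one complete message at a time; continuation lines are
    // joined to the first line with '\n'.
    public static IEnumerable<string> ReadMessages(TextReader reader)
    {
        var sb = new StringBuilder();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // A dated line closes the message buffered so far.
            if (MessageStart.IsMatch(line) && sb.Length > 0)
            {
                yield return sb.ToString();
                sb.Clear();
            }
            if (sb.Length > 0) sb.Append('\n');
            sb.Append(line);
        }
        // Do not forget the last message in the file.
        if (sb.Length > 0) yield return sb.ToString();
    }
}
```

With this in place, the loop in the question could become `foreach (var message in LogMessageReader.ReadMessages(sr))`, and the existing Split on '|' would then operate on a whole message instead of a single physical line.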

Upvotes: 2

gmail user

Reputation: 2783

It is hard to see your file, but I would read it line by line and append each line to a variable. Check for the end of the message; when you see it, do whatever you need with the message held in that variable (insert it into the DB, etc.), reset the variable, and then keep reading the next message.

Pseudo code

for each line in the file
    a = a + line + newline
    if end of message
        insert a into DB
        reset a
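
The pseudocode leaves open how to recognize the end of a message. One possible check, sketched below, relies on the quoting visible in the question's code: if every field is wrapped in a pair of `~` characters, a record is complete once the number of quote characters seen so far is even. This is an assumption about the file format, not something the question confirms, and it breaks if a message itself contains a `~`. QuotedRecordReader and ReadRecords are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

public static class QuotedRecordReader
{
    // Assumption: each field is enclosed in a pair of quote characters,
    // so an even, non-zero quote count means the record is closed.
    public static IEnumerable<string> ReadRecords(TextReader reader, char quote = '~')
    {
        var sb = new StringBuilder();
        int quoteCount = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Append the line to the record being built.
            if (sb.Length > 0) sb.Append('\n');
            sb.Append(line);
            quoteCount += line.Count(c => c == quote);

            // End of message: all opened quotes have been closed.
            if (quoteCount > 0 && quoteCount % 2 == 0)
            {
                yield return sb.ToString();
                sb.Clear();      // reset the variable
                quoteCount = 0;
            }
        }
        // An unterminated trailing record is still returned so it can be logged.
        if (sb.Length > 0) yield return sb.ToString();
    }
}
```

Each returned record can then be split on '|' and stripped of `~` exactly as the question's code already does for single lines.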

Upvotes: 0

Abe Miessler

Reputation: 85116

I would use Regex.Split and break the file up on anything that matches the date pattern (e.g. 2013-06-19) at the beginning of each message.
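
A minimal sketch of that idea, assuming the whole file fits in memory and every message starts with a yyyy-MM-dd date at the beginning of a line. The pattern is a zero-width lookahead, so the date stays attached to the message it introduces; LogSplitter and SplitMessages are illustrative names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LogSplitter
{
    // Split before each line that starts with a date, without consuming it.
    // RegexOptions.Multiline makes ^ match at the start of every line.
    private static readonly Regex MessageBoundary =
        new Regex(@"^(?=\d{4}-\d{2}-\d{2})", RegexOptions.Multiline);

    public static string[] SplitMessages(string text)
    {
        // A boundary at position 0 produces an empty leading element; drop it.
        return MessageBoundary.Split(text)
                              .Where(part => part.Length > 0)
                              .ToArray();
    }
}
```

It could be called as `LogSplitter.SplitMessages(File.ReadAllText(filePath))`. Each element keeps its internal and trailing newlines, so trim it before splitting on '|' if that matters. For very large logs the line-by-line approach uses far less memory.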

Upvotes: 3
