Valryon

Reputation: 368

Replacing text in a huge string without memory leak

I am currently working on a batch job that must generate about 16,000 emails in a row (a newsletter).

Whether or not it is spam, my question is about how I generate those emails.

Some fields in the message must be replaced by custom values (the current date, the user's name, etc.).

For deadline and code-reuse reasons, my template is an HTML file with "_FIELDNAME" placeholders that can easily be spotted by a regex:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
...
<body>
<p>Hi _NAME, _DATE newsletter.</p>
</body>
...

The file is about 1000 lines long, so it is quite a big string when loaded.

First, I load the HTML template file once into a string:

string template = File.ReadAllText(@"Template/newsletter.html");

The replacing function looks like this:

return new StringBuilder(template)
.Replace("_DATE", profileConfig.SelectedMonth.ToString("MMMM yyyy"))
.Replace("_NAME", profileConfig.Name)
.ToString();

The problem is that memory consumption increases slightly with each iteration. It grows by about 50 MB per 1000 iterations, and it is caused by my replacing function (when I commented it out, the memory growth disappeared).

How can I replace many fields (~50) in my template without exhausting memory over my 16,000 iterations? I tried a couple of things, such as using Regex (but it works on strings) or temporary files, but neither satisfied me.

Thanks in advance for your help.

Upvotes: 2

Views: 1576

Answers (3)

Valryon

Reputation: 368

After trying many solutions, I finally decided to rewrite my batch from scratch.

I now use a proper XSLT file to generate the HTML from an XML configuration.

Memory consumption still increases over time, but more slowly now. I guess the garbage collector does not bother to collect, as my computer has 6 GB of RAM and no other large processes running.
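
For readers who want to try the XSLT route, here is a minimal sketch of what it could look like. It is not the actual batch code: the class name NewsletterXslt, the method names, and the shape of the XML input are all assumptions made for illustration. The one point worth keeping is that the XslCompiledTransform is compiled once and reused for every email, since compiling a stylesheet is the expensive step.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

static class NewsletterXslt
{
    // Compile the stylesheet once, before the loop over recipients.
    public static XslCompiledTransform Compile(string stylesheet)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader reader = XmlReader.Create(new StringReader(stylesheet)))
        {
            transform.Load(reader);
        }
        return transform;
    }

    // Render one email from a per-recipient XML document (hypothetical format).
    public static string Render(XslCompiledTransform transform, string profileXml)
    {
        using (XmlReader input = XmlReader.Create(new StringReader(profileXml)))
        using (var output = new StringWriter())
        {
            // No XSLT parameters are used in this sketch, so the argument list is null.
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

In the batch, Compile would be called a single time and Render once per recipient, with an input such as `<profile><name>...</name><date>...</date></profile>` matched by the stylesheet.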

Upvotes: 0

Alex

Reputation: 23300

If you can replace your _DATE, _NAME, etc. with {0}, {1}, etc., you can try string.Format().

Template would become:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
...
<body>
<p>Hi {0}, {1} newsletter.</p>
</body>
...

And code would look like this:

return string.Format(template, 
        profileConfig.SelectedMonth.ToString("MMMM yyyy"), 
        profileConfig.Name
    );

You actually don't need to go through a StringBuilder at all. You would gain a lot in speed (and probably in resource usage) if you used File.ReadAllLines() and only swapped values in the lines that contain tokens.
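
The line-by-line idea could be sketched roughly as follows. LineTemplate and Render are hypothetical names, and the check for the '_' character is an assumption based on the "_FIELDNAME" convention from the question; lines without it are passed through untouched, so no new string is allocated for them.

```csharp
using System;
using System.Collections.Generic;

static class LineTemplate
{
    // lines: the template as loaded once by File.ReadAllLines().
    // values: token -> replacement, e.g. "_NAME" -> "Ann".
    public static string Render(string[] lines, IDictionary<string, string> values)
    {
        var result = new string[lines.Length];
        for (int i = 0; i < lines.Length; i++)
        {
            string current = lines[i];
            // Only lines that can contain a token are rewritten.
            if (current.IndexOf('_') >= 0)
            {
                foreach (KeyValuePair<string, string> pair in values)
                {
                    current = current.Replace(pair.Key, pair.Value);
                }
            }
            result[i] = current;
        }
        return string.Join(Environment.NewLine, result);
    }
}
```

One thing to watch for with plain Replace: if one token is a prefix of another (say _NAME and a hypothetical _NAME_FULL), the longer one has to be replaced first.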

UPDATE: To force the use of the string.Format(string format, params object[] args) overload, you may have to put all your arguments into a collection.

The following should make this solution work for you (I tested it with up to 1000 arguments, and it is both working and quite fast).

List<string> tokenValues = new List<string> 
{ 
    profileConfig.SelectedMonth.ToString("MMMM yyyy"), 
    profileConfig.Name, 
    // ... followed by your other values
};
return string.Format(template, tokenValues.ToArray()); //.ToArray() is mandatory
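
One caveat with string.Format on an HTML template: any literal { or } in the file (inline CSS or JavaScript, for example) has to be doubled to {{ and }}, otherwise string.Format throws a FormatException. Below is a sketch of a one-time conversion from the "_FIELDNAME" template to a format string. FormatTemplate and FromTokens are hypothetical names, and it assumes that no token is a prefix of another.

```csharp
using System.Collections.Generic;
using System.Text;

static class FormatTemplate
{
    // Run once at startup: escapes literal braces, then turns each token
    // into a positional placeholder ({0}, {1}, ...) in the order given.
    public static string FromTokens(string template, IList<string> tokens)
    {
        var sb = new StringBuilder(template);
        sb.Replace("{", "{{").Replace("}", "}}");
        for (int i = 0; i < tokens.Count; i++)
        {
            sb.Replace(tokens[i], "{" + i + "}");
        }
        return sb.ToString();
    }
}
```

The converted string is then reused for every email, with the values passed to string.Format in the same order as the token list.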

Upvotes: 3

Serj-Tm

Reputation: 16981

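    // Token -> replacement value. Keys must match the template's casing exactly.
    var patterns = new Dictionary<string, string>();
    patterns["_DATE"] = profileConfig.SelectedMonth.ToString("MMMM yyyy");
    patterns["_NAME"] = profileConfig.Name;

    // Single pass over the template: copy characters, substituting tokens as they are found.
    var builder = new StringBuilder(template.Length);
    for (var i = 0; i < template.Length;)
    {
        var pattern = CompareAndFindPattern(template, i, patterns);
        if (pattern != null)
        {
            // A token starts at position i: write its value and skip past the token.
            builder.Append(pattern.Value.Value);
            i += pattern.Value.Key.Length;
        }
        else
        {
            builder.Append(template[i]);
            i++;
        }
    }
    return builder.ToString();

    // Returns the token that starts at the given index, or null if none does.
    static KeyValuePair<string, string>? CompareAndFindPattern(string template, int index, Dictionary<string, string> patterns)
    {
        foreach (var pattern in patterns)
        {
            // Ordinal comparison: exact, case-sensitive, culture-independent.
            if (string.CompareOrdinal(template, index, pattern.Key, 0, pattern.Key.Length) == 0)
                return pattern;
        }
        return null;
    }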

Upvotes: 1
