jayant rohra
jayant rohra

Reputation: 13

Best Possible String replacement In Memory

I need to replace a huge string say ~3MB in size. The replacement has to be done in-memory. I know the start and the end tag but not their indices (obviously) The replacement string is almost of the equal or lesser size! For example:
< hello > jaskhdjksehrf87897g8df7g8d7g8dfy9g6d89sg78dfgy9df69g87s97090dgs8d7f6srsd564f5sd45f46sd5f76d5g68df6g785sd67g58576sd5g875sdfg578df6g6sd87g6f89g69s6d8g6 AND MUCH MORE < /hello >

The replacement string would have another start and end tag which has to be replaced with the existing one. Can someone help me out with the best possible method with the least time complexity in doing it. I need to do it in C#.

P.S. Remember the string is not in a file!

Upvotes: 1

Views: 330

Answers (1)

Ivan Stoev
Ivan Stoev

Reputation: 205629

Here is the best you can do in my opinion.

First you locate the start/end tag positions within the input string. I see no better way than using linear search with string.IndexOf method. The only optimization would be to search for end tag starting from the end position of the start tag.

Then you can calculate the resulting string length, allocate char[] buffer, copy the unchanged parts of the input string and the replacement strings into appropriate places using the string.CopyTo method, and finally create and return a string from the buffer using the corresponding string constructor.

Something like this (no validations):

static string Replace(string input, string startTag, string endTag, string newStartTag, string newEndTag, string newContent)
{
    // Determine tags start/end positions
    int startTagStart = input.IndexOf(startTag);
    if (startTagStart < 0) return input; // or throw exception
    int startTagEnd = startTagStart + startTag.Length;
    int endTagStart = input.IndexOf(endTag, startTagEnd);
    if (endTagStart < 0) return input; // or throw exception
    int endTagEnd = endTagStart + endTag.Length;

    // Determine the resulting string length and allocate a buffer
    int resultLength = startTagStart + newStartTag.Length + newContent.Length + newEndTag.Length + (input.Length - endTagEnd);
    var buffer = new char[resultLength];
    int pos = 0;

    // Copy the substring before the start tag
    input.CopyTo(0, buffer, pos, startTagStart);
    pos += startTagStart;
    // Copy the new start tag
    newStartTag.CopyTo(0, buffer, pos, newStartTag.Length);
    pos += newStartTag.Length;
    // Copy the new content
    newContent.CopyTo(0, buffer, pos, newContent.Length);
    pos += newContent.Length;
    // Copy the new end tag
    newEndTag.CopyTo(0, buffer, pos, newEndTag.Length);
    pos += newEndTag.Length;
    // Copy the substring after the end tag
    input.CopyTo(endTagEnd, buffer, pos, input.Length - endTagEnd);

    // Create and return a string from the buffer
    return new string(buffer);
}

Test using the example from your comments:

var input = "<Tag1>string 1</Tag1>";
var output = Replace(input, "<Tag1>", "</Tag1", "<Tag2>", "</Tag2>", "string 2");
// output == "<Tag2>string 2</Tag2>"

Upvotes: 1

Related Questions