Reputation: 13
I need to replace a huge string say ~3MB in size. The replacement has to be done in-memory. I know the start and the end tag but not their indices (obviously) The replacement string is almost of the equal or lesser size!
For example:
< hello >
jaskhdjksehrf87897g8df7g8d7g8dfy9g6d89sg78dfgy9df69g87s97090dgs8d7f6srsd564f5sd45f46sd5f76d5g68df6g785sd67g58576sd5g875sdfg578df6g6sd87g6f89g69s6d8g6 AND MUCH MORE < /hello >
The replacement string would have another start and end tag which has to be replaced with the existing one. Can someone help me out with the best possible method with the least time complexity in doing it. I need to do it in C#.
P.S. Remember the string is not in a file!
Upvotes: 1
Views: 330
Reputation: 205629
Here is the best you can do in my opinion.
First you locate the start/end tag positions within the input string. I see no better way than using linear search with string.IndexOf
method. The only optimization would be to search for end tag starting from the end position of the start tag.
Then you can calculate the resulting string length, allocate char[]
buffer, copy the unchanged parts of the input string and the replacement strings into appropriate places using the string.CopyTo
method, and finally create and return a string from the buffer using the corresponding string
constructor.
Something like this (no validations):
static string Replace(string input, string startTag, string endTag, string newStartTag, string newEndTag, string newContent)
{
// Determine tags start/end positions
int startTagStart = input.IndexOf(startTag);
if (startTagStart < 0) return input; // or throw exception
int startTagEnd = startTagStart + startTag.Length;
int endTagStart = input.IndexOf(endTag, startTagEnd);
if (endTagStart < 0) return input; // or throw exception
int endTagEnd = endTagStart + endTag.Length;
// Determine the resulting string length and allocate a buffer
int resultLength = startTagStart + newStartTag.Length + newContent.Length + newEndTag.Length + (input.Length - endTagEnd);
var buffer = new char[resultLength];
int pos = 0;
// Copy the substring before the start tag
input.CopyTo(0, buffer, pos, startTagStart);
pos += startTagStart;
// Copy the new start tag
newStartTag.CopyTo(0, buffer, pos, newStartTag.Length);
pos += newStartTag.Length;
// Copy the new content
newContent.CopyTo(0, buffer, pos, newContent.Length);
pos += newContent.Length;
// Copy the new end tag
newEndTag.CopyTo(0, buffer, pos, newEndTag.Length);
pos += newEndTag.Length;
// Copy the substring after the end tag
input.CopyTo(endTagEnd, buffer, pos, input.Length - endTagEnd);
// Create and return a string from the buffer
return new string(buffer);
}
Test using the example from your comments:
var input = "<Tag1>string 1</Tag1>";
var output = Replace(input, "<Tag1>", "</Tag1", "<Tag2>", "</Tag2>", "string 2");
// output == "<Tag2>string 2</Tag2>"
Upvotes: 1