Howard

Reputation: 3758

C#: How can I compare two word strings and indicate which parts are different?

For example if I have...

string a = "personil";
string b = "personal";

I would like to get...

string c = "person[i]l";

However, it is not necessarily a single character. It could be like this too...

string a = "disfuncshunal";
string b = "dysfunctional";

For this case I would want to get...

string c = "d[isfuncshu]nal";

Another example would be... (Notice that the lengths of the two words are different.)

string a = "parralele";
string b = "parallel";

string c = "par[ralele]";

Another example would be...

string a = "ato";
string b = "auto";

string c = "a[]to";

How would I go about doing this?

Edit: The length of the two strings can be different.

Edit: Added additional examples. Credit goes to user Nenad for asking.

Upvotes: 2

Views: 6763

Answers (5)

Nenad

Reputation: 26637

I must be very bored today, but I actually wrote a unit test that passes all 4 cases (assuming you have not added more in the meantime).

Edit: Added 2 edge cases and fix for them.

Edit 2: Handled letters that repeat multiple times (when the error is on one of those letters).

[Test]
[TestCase("parralele", "parallel", "par[ralele]")]
[TestCase("personil", "personal", "person[i]l")]
[TestCase("disfuncshunal", "dysfunctional", "d[isfuncshu]nal")]
[TestCase("ato", "auto", "a[]to")]
[TestCase("inactioned", "inaction", "inaction[ed]")]
[TestCase("refraction", "fraction", "[re]fraction")]
[TestCase("adiction", "addiction", "ad[]iction")]
public void CompareStringsTest(string attempted, string correct, string expectedResult)
{
    int first = -1, last = -1;

    string result = null;
    int shorterLength = (attempted.Length < correct.Length ? attempted.Length : correct.Length);

    // First - [
    for (int i = 0; i < shorterLength; i++)
    {
        if (correct[i] != attempted[i])
        {
            first = i;
            break;
        }
    }

    // Last - ]
    var a = correct.Reverse().ToArray();
    var b = attempted.Reverse().ToArray();
    for (int i = 0; i < shorterLength; i++)
    {
        if (a[i] != b[i])
        {
            last = i;
            break;
        }
    }

    if (first == -1 && last == -1)
        result = attempted;
    else
    {
        var sb = new StringBuilder();
        if (first == -1)
            first = shorterLength;
        if (last == -1)
            last = shorterLength;
        // If same letter repeats multiple times (ex: addition)
        // and error is on that letter, we have to trim trail.
        if (first + last > shorterLength)
            last = shorterLength - first;

        if (first > 0)
            sb.Append(attempted.Substring(0, first));

        sb.Append("[");

        if (last > -1 && last + first < attempted.Length)
            sb.Append(attempted.Substring(first, attempted.Length - last - first));

        sb.Append("]");

        if (last > 0)
            sb.Append(attempted.Substring(attempted.Length - last, last));

        result = sb.ToString();
    }
    Assert.AreEqual(expectedResult, result);
}

Upvotes: 5

Alexei Levenkov

Reputation: 100527

This is not really a good approach, but as an exercise in using LINQ: the task seems to be to find the matching prefix and suffix of the two strings, then return prefix + "[" + middle of the first string + "]" + suffix.

So you can match the prefix (Zip + TakeWhile(a == b)), then repeat the same for the suffix by reversing both strings and reversing the result.

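// Requires: using System; using System.Linq;
var first = "disfuncshunal";
var second = "dysfunctional";

// Prefix: pair up characters from the start and keep them while they match.
var zipped = first.ToCharArray().Zip(second.ToCharArray(), (f, s) => new { f, s });
var prefix = string.Join("",
    zipped.TakeWhile(c => c.f == c.s).Select(c => c.f));

// Suffix: same idea on the reversed strings. Take(maxSuffix) keeps the suffix
// from overlapping the prefix (e.g. "adiction" vs "addiction"), which would
// otherwise make the Substring length below negative.
var maxSuffix = Math.Min(first.Length, second.Length) - prefix.Length;
var zippedReverse = first.ToCharArray().Reverse()
   .Zip(second.ToCharArray().Reverse(), (f, s) => new { f, s });
var suffix = string.Join("",
    zippedReverse.Take(maxSuffix).TakeWhile(c => c.f == c.s).Reverse().Select(c => c.f));

// Cut and combine.
var middle = first.Substring(prefix.Length,
      first.Length - prefix.Length - suffix.Length);
var result = prefix + "[" + middle + "]" + suffix;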

A much simpler and faster approach is to use two for loops (one from the start forward, and one from the end backward).
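As a rough illustration of that two-loop idea, here is a minimal sketch. The class name DiffMarker and the method name MarkDifference are made up for this example and do not come from any answer in this thread. It assumes the desired output is the one from the question: the common prefix, then the differing middle of the attempted word in brackets, then the common suffix.

```csharp
using System;

static class DiffMarker
{
    // Hypothetical helper (name not from the thread): brackets the part of
    // `attempted` that lies between the prefix and suffix it shares with `correct`.
    public static string MarkDifference(string attempted, string correct)
    {
        if (attempted == correct)
            return attempted; // identical words: nothing to mark

        int shorter = Math.Min(attempted.Length, correct.Length);

        // Loop 1: walk forward while the characters agree.
        int prefix = 0;
        while (prefix < shorter && attempted[prefix] == correct[prefix])
            prefix++;

        // Loop 2: walk backward while the characters agree, but never reach
        // into the prefix (this matters for repeated letters, e.g. "addiction").
        int suffix = 0;
        while (suffix < shorter - prefix
               && attempted[attempted.Length - 1 - suffix] == correct[correct.Length - 1 - suffix])
            suffix++;

        string head = attempted.Substring(0, prefix);
        string middle = attempted.Substring(prefix, attempted.Length - prefix - suffix);
        string tail = attempted.Substring(attempted.Length - suffix);
        return head + "[" + middle + "]" + tail;
    }
}
```

Calling DiffMarker.MarkDifference("disfuncshunal", "dysfunctional") should give "d[isfuncshu]nal", and ("ato", "auto") should give "a[]to". Capping the backward loop at shorter - prefix is one way to avoid the negative Substring length that appears when the prefix and suffix would otherwise overlap.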

Upvotes: 0

Javascript Elf

Reputation: 1

You did not specify what to do if the strings were of different lengths, but here is a solution to the problem when the strings are of equal length:

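private string Compare(string string1, string string2) {
    // This only works if the two strings are the same length.
    string output = "";
    bool mismatch = false;
    for (int i = 0; i < string1.Length; i++) {
        char c1 = string1[i];
        char c2 = string2[i];
        if (c1 == c2) {
            if (mismatch) {
                // Close the bracket opened by the preceding mismatched run.
                output += "]" + c1;
                mismatch = false;
            } else {
                output += c1;
            }
        } else {
            if (mismatch) {
                output += c1;
            } else {
                // Open a bracket at the start of a mismatched run.
                output += "[" + c1;
                mismatch = true;
            }
        }
    }
    // Close a bracket that is still open when the strings differ at the very end.
    if (mismatch) {
        output += "]";
    }
    return output;
}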

Upvotes: 0

Lasse V. Karlsen

Reputation: 391306

Have you tried my DiffLib?

With that library, and the following code (running in LINQPad):

void Main()
{
    string a = "disfuncshunal";
    string b = "dysfunctional";

    var diff = new Diff<char>(a, b);

    var result = new StringBuilder();
    int index1 = 0;
    int index2 = 0;
    foreach (var part in diff)
    {
        if (part.Equal)
            result.Append(a.Substring(index1, part.Length1));
        else
            result.Append("[" + a.Substring(index1, part.Length1) + "]");
        index1 += part.Length1;
        index2 += part.Length2;
    }
    result.ToString().Dump();
}

You get this output:

d[i]sfunc[shu]nal

To be honest I don't understand what this gives you, as you seem to completely ignore the changed parts in the b string, only dumping the relevant portions of the a string.

Upvotes: 1

Mike Perrenoud

Reputation: 67898

Here is a complete and working console application that will work for both examples you gave:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication2
{
    class Program
    {
        static void Main(string[] args)
        {
            string a = "disfuncshunal";
            string b = "dysfunctional";

            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < a.Length; i++)
            {
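                // Treat positions past the end of b as mismatches (avoids an out-of-range index).
                if (i >= b.Length || a[i] != b[i])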
                {
                    sb.Append("[");
                    sb.Append(a[i]);
                    sb.Append("]");

                    continue;
                }

                sb.Append(a[i]);
            }

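            var str = sb.ToString();
            var startIndex = str.IndexOf("[");
            var endIndex = str.LastIndexOf("]");

            if (startIndex < 0)
            {
                // No mismatches: nothing to bracket.
                Console.WriteLine(str);
                return;
            }

            var start = str.Substring(0, startIndex + 1);
            // Substring takes a length, not an end index.
            var mid = str.Substring(startIndex + 1, endIndex - startIndex - 1);
            var end = str.Substring(endIndex);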

            Console.WriteLine(start + mid.Replace("[", "").Replace("]", "") + end);
        }
    }
}

Note that it will not work if you want to mark more than one separate section of the mismatched word.

Upvotes: 0
