PassionateDeveloper

Reputation: 15158

Remove from String Array what is in List

I have a string array X and a List Y, and I want to remove from X every entry that contains an item from Y. What is the fastest way to do that?

e.g.: X: 1) "aaa.bbb.ccc" 2) "ddd.eee.fff" 3) "ggg.hhh.jjj"

Y: 1) "bbb" 2) "fff"

The result should be a new List in which only 3) exists, because X.1 gets deleted by Y.1 and X.2 gets deleted by Y.2.

How to do that?

I know I could do a foreach over X and check each entry against everything in Y, but is that the fastest way?

Upvotes: 5

Views: 279

Answers (5)

Esteban Elverdin

Reputation: 3582

Try this, using the Aggregate function:

    var xArr = new string[] { "aaa.bbb.ccc", "ddd.eee.fff", "ggg.hhh.jjj" };
    var yList = new List<string> { "bbb", "fff" };

    var result = xArr.Aggregate(new List<string> { }, (acc, next) =>
    {
        var elems = next.Split('.');
        foreach (var y in yList)
            if (elems.Contains(y))
                return acc;
        acc.Add(next);
        return acc;
    });

Upvotes: 1

Marc Gravell

Reputation: 1064244

The most convenient would be

var Z = X.Where(x => !x.Split('.').Intersect(Y).Any()).ToList();

That is not the same as "fastest". Probably the fastest (runtime) way to do that is to use a token search, like:

public static bool ContainsToken(string value, string token, char delimiter = '.')
{
    if (string.IsNullOrEmpty(token)) return false;
    if (string.IsNullOrEmpty(value)) return false;

    int lastIndex = -1, idx, endIndex = value.Length - token.Length, tokenLength = token.Length;
    while ((idx = value.IndexOf(token, lastIndex + 1)) > lastIndex)
    {
        lastIndex = idx;
        if ((idx == 0 || (value[idx - 1] == delimiter))
            && (idx == endIndex || (value[idx + tokenLength] == delimiter)))
        {
            return true;
        }
    }
    return false;
}

then something like:

var list = new List<string>(X.Length);
foreach(var x in X)
{
    bool found = false;
    foreach(var y in Y)
    {
        if(ContainsToken(x, y, '.'))
        {
            found = true;
            break;
        }
    }
    if (!found) list.Add(x);
}

This:

  • doesn't allocate arrays (for the output of Split, or for the params char[] of Split)
  • doesn't create any new string instances (for the output of Split)
  • doesn't use delegate abstraction
  • doesn't have captured scopes
  • uses the struct custom iterator of List<T> rather than the class iterator of IEnumerable<T>
  • starts the new List<T> with the appropriate worst-case size to avoid reallocations

Upvotes: 9

tnw

Reputation: 13887

If you've got a relatively small list the performance ramifications wouldn't really be a big deal. This is the simplest foreach solution I could come up with.

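List<string> ListZ = ListX.ToList();

foreach (string x in ListX)
{
    foreach (string y in ListY)
    {
        // Substring match: "bbb" is found inside "aaa.bbb.ccc"
        if (x.Contains(y))
        {
            ListZ.Remove(x);
            break; // x is already removed; skip the remaining y values
        }
    }
}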

Upvotes: 0

Matthew Watson

Reputation: 109862

I think that a fairly fast approach would be to use List's built-in RemoveAll() method:

List<string> x = new List<string>
{
    "aaa.bbb.ccc",
    "ddd.eee.fff",
    "ggg.hhh.jjj"
};

List<string> y = new List<string>
{
    "bbb",
    "fff"
};

x.RemoveAll(s => y.Any(s.Contains));

(Note that I am assuming that you have two lists, x and y. Your OP mentions a string array but then goes on to talk about "List X" and "List Y", so I'm ignoring the string array bit.)
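One caveat worth spelling out: s.Contains does a substring match, so a y entry of "bb" would also remove "aaa.bbb.ccc". If only whole dot-separated segments should count, a sketch along these lines might work. The names TokenFilter and RemoveByToken are made up for illustration, and it assumes '.' is the only delimiter:

```csharp
using System;
using System.Collections.Generic;

public static class TokenFilter
{
    // Hypothetical helper (not part of the answer above): removes, in place,
    // every entry of x that has at least one '.'-separated segment listed in y.
    // Returns the number of entries removed.
    public static int RemoveByToken(List<string> x, IEnumerable<string> y)
    {
        // Ordinal hash lookups replace the inner scan over y.
        var tokens = new HashSet<string>(y, StringComparer.Ordinal);

        return x.RemoveAll(s =>
        {
            foreach (var part in s.Split('.'))
            {
                if (tokens.Contains(part)) return true;
            }
            return false;
        });
    }
}
```

With the two lists above, TokenFilter.RemoveByToken(x, y) leaves only "ggg.hhh.jjj" in x. It allocates an array per Split call, so it trades some speed for exact-segment matching.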

Upvotes: 1

Roy Dictus

Reputation: 33149

Iterating over X and Y would indeed be the fastest option because you have this Contains constraint. I really don't see any other way.

It should not be a foreach over X though, because you cannot modify the collection you iterate over with foreach.

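So an option, assuming X and Y are both List<string> (an array has no RemoveAt), would be:

for (int counterX = 0; counterX < X.Count; counterX++)
{
    for (int counterY = 0; counterY < Y.Count; counterY++)
    {
        if (X[counterX].Contains(Y[counterY]))
        {
            // Step the index back so the element that slides into this slot is not skipped
            X.RemoveAt(counterX--);
            break;
        }
    }
}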

This should do it (mind you, this code is not tested).
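If X really is a List<string>, another common way to avoid the index juggling is to walk the list from the end, so a removal never shifts an element that is still to be visited. A rough sketch, with made-up names (InPlaceFilter, RemoveContaining) and the same substring Contains check as above:

```csharp
using System;
using System.Collections.Generic;

public static class InPlaceFilter
{
    // Hypothetical helper: removes from x every string that contains
    // any entry of y as a substring (string.Contains compares ordinally).
    public static void RemoveContaining(List<string> x, List<string> y)
    {
        // Walk backwards: RemoveAt(i) only shifts elements at indexes > i,
        // and those have already been checked.
        for (int i = x.Count - 1; i >= 0; i--)
        {
            for (int j = 0; j < y.Count; j++)
            {
                if (x[i].Contains(y[j]))
                {
                    x.RemoveAt(i);
                    break; // no need to test the remaining y entries
                }
            }
        }
    }
}
```

Each RemoveAt still shifts the tail of the list, so when many entries are expected to go, a single RemoveAll pass is likely the cheaper choice.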

Upvotes: 1
