Palps

Reputation: 568

Replace empty string with null in array efficiently

I want to know the most efficient way of replacing empty strings in an array with null values.

I have the following array:

string[] _array = new string [10];
_array[0] = "A";
_array[1] = "B";
_array[2] = "";
_array[3] = "D";
_array[4] = "E";
_array[5] = "F";
_array[6] = "G";
_array[7] = "";
_array[8] = "";
_array[9] = "J";

and I am currently replacing empty strings by the following:

for (int i = 0; i < _array.Length; i++)
{
    if (_array[i].Trim() == "")
    {
        _array[i] = null;
    }
}

This works fine on small arrays, but I'm looking for the most efficient way to do it, because the arrays I work with could be much larger and I would repeat this process over and over.

Is there a LINQ query or something else that is more efficient?

Upvotes: 3

Views: 7709

Answers (6)

Maddy

Reputation: 774

Use the code below (string.IsNullOrEmpty also keeps it from throwing on elements that are already null):

_array = _array.Select(str => string.IsNullOrEmpty(str) ? null : str).ToArray();

Upvotes: -1

Paul Bartlett

Reputation: 831

It's ugly, but you can eliminate the CALL instruction to the RTL, as I mentioned earlier, with this code:

if (_array[i] != null) {
  Boolean blank = true;
  for (int j = 0; j < _array[i].Length; j++) {
    if (!Char.IsWhiteSpace(_array[i][j])) {
        blank = false;
        break;
    }
  }

  if (blank) {
    _array[i] = null;
  }
}

It does add an extra assignment and an extra condition, though, and it is too ugly for my taste. But if you want to shave nanoseconds off processing a massive list, perhaps it could be used. I like the idea of parallel processing, and you could wrap this in a Parallel.For loop.

Upvotes: 0

MarcinJuraszek

Reputation: 125620

You might consider replacing _array[i].Trim() == "" with string.IsNullOrWhiteSpace(_array[i]) to avoid allocating a new string. But that's pretty much all you can do to make it faster while keeping it sequential. LINQ will not be faster than a for loop.

You could try making your processing parallel, but that seems like a bigger change, so you should evaluate if that's ok in your scenario.

Parallel.For(0, _array.Length, i => {
    if (string.IsNullOrWhiteSpace(_array[i]))
    {
        _array[i] = null;
    }
});
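
As a possible variation on the loop above, the work can be handed out in index ranges rather than one element at a time, which reduces the number of delegate calls. This is only a sketch: the class and method names are made up for illustration, and whether it beats the plain for loop depends on array size and hardware, so measure before adopting it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical helper, not part of the question's code.
public static class ParallelBlankReplacer
{
    public static void ReplaceBlanksWithNull(string[] items)
    {
        // Partitioner.Create rejects an empty range, so return early.
        if (items.Length == 0) return;

        // Each worker receives a (fromInclusive, toExclusive) index range
        // and runs an ordinary sequential loop over it.
        Parallel.ForEach(Partitioner.Create(0, items.Length), range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                if (string.IsNullOrWhiteSpace(items[i]))
                {
                    items[i] = null;
                }
            }
        });
    }
}
```

Each index is written by exactly one worker, so no locking is needed.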

Upvotes: 6

ohiodoug

Reputation: 1513

A LINQ query will do essentially the same thing behind the scenes, so you aren't going to gain any real efficiency simply by using LINQ.

When determining something more efficient, look at a few things:

  1. How big will your array grow?
  2. How often will the data in your array change?
  3. Does the order of your array matter?

You've already answered that your array might grow to large sizes and performance is a concern.

So looking at questions 2 and 3 together: if the order of your data doesn't matter, you could keep your array sorted so that the empty strings come first, and break out of the loop as soon as you reach a non-empty string.
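
A minimal sketch of that early-exit idea, assuming the array is kept sorted with StringComparer.Ordinal (which orders null entries first, then empty strings, then everything else). The class and method names are hypothetical, and it handles only truly empty strings, not whitespace-only ones.

```csharp
using System;

// Hypothetical helper, not part of the question's code.
public static class SortedBlankReplacer
{
    // Assumes `sorted` is already ordered with StringComparer.Ordinal.
    // Returns the number of empty strings that were replaced with null.
    public static int ReplaceLeadingEmpties(string[] sorted)
    {
        int replaced = 0;
        for (int i = 0; i < sorted.Length; i++)
        {
            if (sorted[i] == null) continue;    // already null, keep scanning
            if (sorted[i].Length != 0) break;   // first non-empty string: stop
            sorted[i] = null;
            replaced++;
        }
        return replaced;
    }
}
```

Keeping the array sorted has its own cost, so this only pays off if the data changes rarely compared with how often the cleanup runs.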

Ideally, you would be able to check the data on the way in so you don't have to constantly loop over your entire array. Is that not a possibility?
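
One way to check the data on the way in is to wrap the array so that every write is normalized. This is a rough sketch, and the NormalizedStringArray type is invented for illustration; it assumes you control the code that fills the array.

```csharp
using System;

// Hypothetical wrapper: blank values are converted to null when stored,
// so no separate cleanup pass over the array is needed afterwards.
public sealed class NormalizedStringArray
{
    private readonly string[] _items;

    public NormalizedStringArray(int length)
    {
        _items = new string[length];
    }

    public int Length => _items.Length;

    public string this[int index]
    {
        get => _items[index];
        // Null, empty, and whitespace-only input is stored as null.
        set => _items[index] = string.IsNullOrWhiteSpace(value) ? null : value;
    }
}
```

With this in place, assigning "" or "   " to an element leaves that element null, and other values are stored unchanged.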

Hope this at least gets some thoughts going.

Upvotes: 0

Igor

Reputation: 62213

As far as efficiency goes, your loop is fine, although it also depends on how large the arrays are and how often you iterate over them. The main problem I see is that calling Trim on a null element throws a NullReferenceException. A better approach is to use string.IsNullOrEmpty or string.IsNullOrWhiteSpace; the latter is closer to what you want, but it is not available in all versions of .NET (it was added in .NET Framework 4).

for (int i = 0; i < _array.Length; i++)
{
    if (string.IsNullOrWhiteSpace(_array[i]))
    {
        _array[i] = null;
    }
}

Upvotes: 3

Ian

Reputation: 30813

LINQ is mainly used for querying, not for assignment. To perform an action on a collection, you could use a List. If you use a List instead of an array, you can do it in one line:

// Note: List<T>.ForEach cannot do this, because assigning to the lambda parameter does not change the list.
_list = _list.ConvertAll(x => string.IsNullOrWhiteSpace(x) ? null : x);

Upvotes: 2
