Reputation: 1678
I have code that is adding ~100,000 items to a List.
If I add an array of strings or objects the code runs almost instantly (under 100 ms), but if I try to add an array of structs, it takes almost 1.5 seconds just for the .Add calls.
Why is there such a performance impact when using a struct[]?
Here is my struct:
public struct LiteRowInfo
{
public long Position;
public int Length;
public int Field;
public int Row;
public LiteRowInfo(long position, int length, int field, int row)
{
this.Position = position;
this.Length = length;
this.Field = field;
this.Row = row;
}
}
EDIT 2: Performances of the string method is faster than that of the struct: I appreciate the comments, it does seem like there is additional overhead in creating the struct its self. I think I will just create 2 seperate list to store the position and length to improve performance.
private void Test()
{
Stopwatch watch = new Stopwatch();
watch.Start();
List<LiteRowInfo[]> structList = new List<LiteRowInfo[]>();
for (int i = 0; i < 100000; i++)
{
LiteRowInfo[] info = new LiteRowInfo[20];
for (int x = 0; x < 20; x++)
{
LiteRowInfo row;
row.Length = x;
row.Position = (long)i;
info[x] = row;
}
structList.Add(info);
}
Debug.Print(watch.ElapsedMilliseconds.ToString());
watch.Reset();
watch.Start();
List<string[]> stringList = new List<string[]>();
for (int i = 0; i < 100000; i++)
{
string[] info = new string[20];
for (int x = 0; x < 20; x++)
{
info[x] = "String";
}
stringList.Add(info);
}
Debug.Print(watch.ElapsedMilliseconds.ToString());
}
EDIT: Here is all relevant code: Note: If I comment out only the pos.Add(rowInfo); line, the performance is similar to that of a string[] or int[].
private void executeSqlStream()
{
List<LiteRowInfo[]> pos = new List<LiteRowInfo[]>();
long currentPos = 0;
_stream = new MemoryStream();
StreamWriter writer = new StreamWriter(_stream);
using (SqlConnection cnn = new SqlConnection(_cnnString))
{
cnn.Open();
SqlCommand cmd = new SqlCommand(_sqlString, cnn);
SqlDataReader reader = cmd.ExecuteReader();
int fieldCount = reader.FieldCount;
int rowNum = 0;
UnicodeEncoding encode = new UnicodeEncoding();
List<string> fields = new List<string>();
for (int i = 0; i < fieldCount; i++)
{
fields.Add(reader.GetFieldType(i).Name);
}
while (reader.Read())
{
LiteRowInfo[] rowData = new LiteRowInfo[fieldCount];
for (int i = 0; i < fieldCount; i++)
{
LiteRowInfo info;
if (reader[i] != DBNull.Value)
{
byte[] b;
switch (fields[i])
{
case "Int32":
b = BitConverter.GetBytes(reader.GetInt32(i));
break;
case "Int64":
b = BitConverter.GetBytes(reader.GetInt64(i));
break;
case "DateTime":
DateTime dt = reader.GetDateTime(i);
b = BitConverter.GetBytes(dt.ToBinary());
break;
case "Double":
b = BitConverter.GetBytes(reader.GetDouble(i));
break;
case "Boolean":
b = BitConverter.GetBytes(reader.GetBoolean(i));
break;
case "Decimal":
b = BitConverter.GetBytes((float)reader.GetDecimal(i));
break;
default:
b = encode.GetBytes(reader.GetString(i));
break;
}
int len = b.Length;
info.Position = currentPos += len;
info.Length = len;
info.Field = i;
info.Row = rowNum;
currentPos += len;
_stream.Write(b, 0, len);
}
else
{
info.Position = currentPos;
info.Length = 0;
info.Field = i;
info.Row = rowNum;
}
rowData[i] = info;
}
rowNum++;
pos.Add(rowData);
}
}
}
Upvotes: 0
Views: 2585
Reputation: 1501043
Given that the array itself is a reference type, I very much doubt that you're actually seeing what you think you're seeing.
I suspect the difference isn't in adding an array reference to a list - I suspect it's creating the array in the first place. Each array element will take more space than a reference, so you're having to allocate more memory. That may well mean you're also triggering garbage collection.
To benchmark just List<T>.Add
, I suggest that you repeatedly add a reference to the same array several times.
As an aside, having an array as the list element type feels like a bit of a smell to me. There are times when that's valid, but personally I would consider whether it's actually something which could be encapsulated in another type.
EDIT: You say you've posted all the relevant code, but that really isn't benchmark code for List<T>.Add
- it contains database access for one thing, which is almost certainly taking way longer than any of the in-memory manipulation!
Upvotes: 8
Reputation: 81660
There could be some boxing happening in the code which is not related to the List<>
since generic Lists handle value types without boxing. Unless sharing the code, cannot help.
Upvotes: 0