edoedoedo

Reputation: 1641

Use a SqlDataReader to read multiple rows at the same time (block read)

I'm using a SqlDataReader to retrieve the results of a "SELECT" query from a DBMS. So far, I read the result set one row at a time with SqlDataReader.Read() and process each row as I go. When the result set is huge (meaning millions of rows times hundreds of columns), iterating with .Read() is very slow. My question: is there a way to do a "block" read from SqlDataReader, i.e. something like SqlDataReader.Read(100) that returns an array of the next 100 rows in the result set?

I thought about using DataTable.Load() to load the whole result set into memory, but since the table is several gigabytes in size, it wouldn't fit in memory.

What would you recommend? Thanks a lot.

Example code:

TdConnection conn;
TdCommand cmd;
TdDataReader reader;
IAsyncResult res;

conn = new TdConnection(@"xxxx;");
conn.Open();

cmd = new TdCommand(q, conn);  // q holds the SELECT statement text
res = cmd.BeginExecuteReader();

// Busy-wait until the asynchronous execute completes
// (EndExecuteReader would block until completion anyway).
while (!res.IsCompleted);

reader = cmd.EndExecuteReader(res);
if (reader.HasRows)
{
    string sb;
    string currentout = @"C:\file";  // verbatim string: "\f" would otherwise be a form-feed escape
    string[] row = new string[reader.FieldCount];
    sb = "";
    for (int i = 0; i < reader.FieldCount; i++)
        row[i] = reader.GetName(i);

    sb = String.Concat(sb,String.Join("\t",row),"\r\n");

    File.WriteAllText(currentout,sb);

    sb = "";

    /* With a test query, the following commented "while" takes 5 minutes
       to iterate over a dataset with 639967 rows x 63 columns (about 300MB):
       while (reader.Read());
    */

    /* With the same test query, the "while" block below takes 6 minutes
       to iterate over the same dataset AND write it to a text file.
       I conclude that I/O writing to the text file is fast, and .Read() is very slow.
    */
    while(reader.Read())
    {
        for (int i = 0; i < reader.FieldCount; i++)
            row[i] = reader.GetValue(i).ToString();

        sb = String.Concat(sb,String.Join("\t",row),"\r\n");

        if (sb.Length > 100000)
        {
            File.AppendAllText(currentout,sb);
            sb = "";
        }
    }
    File.AppendAllText(currentout,sb);
}

reader.Close();
reader.Dispose();
cmd.Dispose();
conn.Close();

The "Td" components are the Teradata DBMS interface for .NET (but they behave just like "SQL" components).

Upvotes: 2

Views: 1033

Answers (1)

usr

Reputation: 171178

What's slow here is the quadratic cost of the string concatenation in the loop: each iteration copies the entire accumulated string just to append one more row, so the total work grows with the square of the output size:

sb = String.Concat(sb,String.Join("\t",row),"\r\n");

This is such a glaring perf problem that I'm submitting it as an answer, since it probably solves your problem.
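For illustration, here is a minimal sketch of the fix using a StringBuilder, reusing the reader, row, and currentout variables from your code (the builder is named buf here to avoid clashing with your sb string; it needs "using System.Text;" and .NET 4+ for StringBuilder.Clear()). Appending to a StringBuilder mutates a buffer in place instead of copying the whole string on every row, so the loop becomes linear in the output size:

var buf = new StringBuilder();
while (reader.Read())
{
    for (int i = 0; i < reader.FieldCount; i++)
        row[i] = reader.GetValue(i).ToString();

    // Append writes into the builder's buffer; no full-string copy per row.
    buf.Append(String.Join("\t", row)).Append("\r\n");

    if (buf.Length > 100000)
    {
        File.AppendAllText(currentout, buf.ToString());
        buf.Clear();  // reuse the buffer for the next chunk
    }
}
File.AppendAllText(currentout, buf.ToString());

The chunked flush every ~100KB is kept from your version; with a StringBuilder it is no longer needed to avoid the quadratic cost, but it keeps peak memory bounded.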

If your app is slow, profile it to see what is slow.

Unfortunately, ADO.NET is indeed quite CPU-heavy when reading data. There is nothing you can do about that.

Upvotes: 1
