Ankit Kumar

Reputation: 514

CsvHelper Error "No header record found" on reading a csv stream

Below is the code I am using to read CSV files from a stream source, but I get the error "No header record found". I am using CsvHelper 15.0 and already call .ToList() as suggested in some solutions, but the error persists. The method is shown below, along with the TableField class and the ReadStream method.

Note that I get the desired result if I pass the source as a MemoryStream, but it fails if I pass it as a Stream, which I need to do to avoid copying the data into memory each time.

public async Task<Stream> DownloadBlob(string containerName, string fileName, string connectionString)
{
    //  MemoryStream memoryStream = new MemoryStream();

    // Fall back to the local storage emulator when no connection string is given.
    if (string.IsNullOrEmpty(connectionString))
    {
        connectionString = @"UseDevelopmentStorage=true";
        containerName = "testblobs";
    }

    Microsoft.Azure.Storage.CloudStorageAccount storageAccount = Microsoft.Azure.Storage.CloudStorageAccount.Parse(connectionString);
    CloudBlobClient serviceClient = storageAccount.CreateCloudBlobClient();
    CloudBlobContainer container = serviceClient.GetContainerReference(containerName);
    CloudBlockBlob blob = container.GetBlockBlobReference(fileName);
    if (!blob.Exists())
    {
        throw new Exception("Blob Not found");
    }

    // Returns a stream that reads directly from the blob.
    return await blob.OpenReadAsync();
}



public class TableField
    {
        public string Name { get; set; }
        public string Type { get; set; }

        public Type DataType
        {
            get
            {
                switch( Type.ToUpper() )
                {
                    case "STRING":
                        return typeof(string);

                    case "INT":
                        return typeof( int );

                    case "BOOL":
                    case "BOOLEAN":
                        return typeof( bool );

                    case "FLOAT":
                    case "SINGLE":
                    case "DOUBLE":
                        return typeof( double );

                    case "DATETIME":
                        return typeof( DateTime );

                    default:
                        throw new NotSupportedException( $"CSVColumn data type '{Type}' not supported" );
                }
            }
        }
    }

private IEnumerable<Dictionary<string, EntityProperty>> ReadCSV(Stream source, IEnumerable<TableField> cols)
{
    using (TextReader reader = new StreamReader(source, Encoding.UTF8))
    {
        var cache = new TypeConverterCache();
        cache.AddConverter<float>(new CSVSingleConverter());
        cache.AddConverter<double>(new CSVDoubleConverter());
        var csv = new CsvReader(reader,
            new CsvHelper.Configuration.CsvConfiguration(global::System.Globalization.CultureInfo.InvariantCulture)
            {
                Delimiter = ";",
                HasHeaderRecord = true,
                CultureInfo = global::System.Globalization.CultureInfo.InvariantCulture,
                TypeConverterCache = cache
            });
        csv.Read();
        csv.ReadHeader();

        var map = (
                from col in cols
                from src in col.Sources()
                let index = csv.GetFieldIndex(src, isTryGet: true)
                where index != -1
                select new { col.Name, Index = index, Type = col.DataType }).ToList();

        while (csv.Read())
        {
            yield return map.ToDictionary(
                col => col.Name,
                col => EntityProperty.CreateEntityPropertyFromObject(csv.GetField(col.Type, col.Index)));
        }
    }
}

StreamReading code:

public async Task<Stream> ReadStream(string containerName, string digestFileName, string fileName, string connectionString)
        {
            string data = string.Empty;
            string fileExtension = Path.GetExtension(fileName);
            var contents = await DownloadBlob(containerName, digestFileName, connectionString);               
                
            
            return contents;
        }

Sample CSV to be read:

PartitionKey;Time;RowKey;State;RPM;Distance;RespirationConfidence;HeartBPM
te123;2020-11-06T13:33:37.593Z;10;1;8;20946;26;815
te123;2020-11-06T13:33:37.593Z;4;2;79944;8;36635;6
te123;2020-11-06T13:33:37.593Z;3;3;80042;9;8774;5
te123;2020-11-06T13:33:37.593Z;1;4;0;06642;6925;37
te123;2020-11-06T13:33:37.593Z;6;5;04740;74753;94628;21
te123;2020-11-06T13:33:37.593Z;7;6;6;2;14;629
te123;2020-11-06T13:33:37.593Z;9;7;126;86296;9157;05
te123;2020-11-06T13:33:37.593Z;5;8;5;3;7775;08
te123;2020-11-06T13:33:37.593Z;2;9;44363;65;70;229
te123;2020-11-06T13:33:37.593Z;8;10;02;24666;2;2

Upvotes: 0

Views: 3976

Answers (4)

creeser

Reputation: 421

I was getting the same error, 'No header found...', and this was after several hundred successful reads of the same file. I added delimiter=",":

reader = csv.reader(filename, delimiter=",")

That solved the problem. I think the csv_reader attempts to determine the delimiter when none is specified and fails after a while, maybe because of a memory leak? The comma is the default, but if the reader has to determine it programmatically, it is more likely to fail.

Upvotes: 0

David Specht

Reputation: 9074

Try setting the source stream back to the start.

private IEnumerable<Dictionary<string, EntityProperty>> ReadCSV(Stream source, IEnumerable<TableField> cols)           
{    
   source.Position = 0;
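
To illustrate why the stream position matters, here is a minimal sketch that uses only the .NET base class library, with no CsvHelper or blob storage involved. The names StreamRewindDemo and FirstLine are hypothetical and exist only for this illustration. A stream that has just been written to is positioned at its end, so a reader attached to it finds no data, which is the kind of situation in which a CSV parser would plausibly report a missing header. Note that Position can only be set on seekable streams, hence the CanSeek check.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamRewindDemo
{
    // Writes a small CSV into a MemoryStream and reads the first line back.
    // After the write, the stream position is at the end, so a reader sees
    // no data unless the position is reset first.
    public static string FirstLine(bool rewind)
    {
        var source = new MemoryStream();
        byte[] bytes = Encoding.UTF8.GetBytes("PartitionKey;Time\nte123;2020-11-06T13:33:37.593Z\n");
        source.Write(bytes, 0, bytes.Length);

        // Only seekable streams support setting Position.
        if (rewind && source.CanSeek)
        {
            source.Position = 0;
        }

        using (var reader = new StreamReader(source, Encoding.UTF8))
        {
            // ReadLine returns null when there is nothing left to read.
            return reader.ReadLine();
        }
    }

    public static void Main()
    {
        Console.WriteLine(FirstLine(rewind: false) ?? "<no data>"); // prints "<no data>"
        Console.WriteLine(FirstLine(rewind: true));                 // prints "PartitionKey;Time"
    }
}
```

Whether this applies to a given blob stream depends on whether something has already read from it and on whether it supports seeking.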

You also can't use yield return there. It delays execution of the code until you access the IEnumerable<Dictionary<string, EntityProperty>> returned from the ReadCSV method. The problem is at that point you have already closed the using statement with the TextReader that CsvHelper needs to read your data, so you get a NullReferenceException.

You either need to remove the yield return

var result = new List<Dictionary<string, EntityProperty>>();
while (csv.Read()){
   // Add to result
}
return result;

Or pass the TextReader to your method. Any enumeration of the IEnumerable<Dictionary<string, EntityProperty>> must occur before leaving the using statement, which will dispose of the TextReader needed by the CsvReader.

IEnumerable<Dictionary<string, EntityProperty>> result;

using (TextReader reader = new StreamReader(source, Encoding.UTF8)){
   // Calling ToList() will enumerate your yield statement
   result = ReadCSV(reader, cols).ToList(); 
}
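
The deferred execution of yield return described above can be seen in isolation with a small sketch that uses only the base class library. The names DeferredReadDemo, ReadLines, ReadEagerly and ReadLazily are hypothetical, and a StringReader stands in for the StreamReader over the blob stream. With StringReader the failure surfaces as an ObjectDisposedException; the exact exception in other setups depends on the reader and the code on top of it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class DeferredReadDemo
{
    // Iterator method: nothing in the body runs until the result is enumerated.
    public static IEnumerable<string> ReadLines(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }

    // Enumerates inside the using block, so the reader is still open.
    public static List<string> ReadEagerly(string text)
    {
        using (TextReader reader = new StringReader(text))
        {
            return ReadLines(reader).ToList();
        }
    }

    // Returns the lazy sequence; the reader is disposed before anyone enumerates it.
    public static IEnumerable<string> ReadLazily(string text)
    {
        using (TextReader reader = new StringReader(text))
        {
            return ReadLines(reader);
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadEagerly("a;b\n1;2").Count); // prints 2

        try
        {
            // Enumeration starts here, after the reader has been disposed.
            ReadLazily("a;b\n1;2").ToList();
        }
        catch (ObjectDisposedException)
        {
            Console.WriteLine("reader was already disposed");
        }
    }
}
```

Calling ToList() while the reader is still open, as in ReadEagerly, is the same pattern as the snippet above it.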

Upvotes: 1

Caius Jard

Reputation: 74605

Related to my answer to your other question (it has more detail; you can read it there): I didn't encounter any problem connecting CsvHelper to a stream sourced from blob storage.

This was the code used (I took the CSV data you posted, added it to a file, and uploaded it to blob storage):

public partial class Form1 : Form
{
    public Form1()
    {
        InitializeComponent();
    }

    private async void button1_Click(object sender, EventArgs e)
    {
        var cstr = "YOUR CONNSTR HERE";

        var bbc = new BlockBlobClient(cstr, "temp", "ankit.csv");

        var s = await bbc.OpenReadAsync(new BlobOpenReadOptions(true) { BufferSize = 16384 });

        var sr = new StreamReader(s);

        var csv = new CsvHelper.CsvReader(sr, new CsvConfiguration(CultureInfo.CurrentCulture) { HasHeaderRecord = true, Delimiter = ";" });

        
        //try by read/getrecord
        while(await csv.ReadAsync())
        {
            var rec = csv.GetRecord<X>();
            Console.WriteLine(rec.PartitionKey);
        }

        var x = new X();
        //try by await foreach
        await foreach (var r in csv.EnumerateRecordsAsync(x))
        {
            Console.WriteLine(r.PartitionKey);
        }
    }
}

class X {
    public string PartitionKey { get; set; }
}

Upvotes: 1

Saber

Reputation: 501

I tried to reproduce the problem with version 15.0 of the library, but could not do so with the classes CSVSingleConverter and CSVDoubleConverter. With CsvHelper's standard converter classes, however, reading the header works:

using System;
using System.IO;
using System.Text;
using CsvHelper;
using CsvHelper.TypeConversion;

namespace ConsoleApp2
{
    class Program
    {
        static void Main(string[] args)
        {
            using (Stream stream = new FileStream(@"e:\demo.csv", FileMode.Open, FileAccess.Read))
            {
                ReadCSV(stream);
            }
        }

        private static void ReadCSV(Stream source)

        {
            using (TextReader reader = new StreamReader(source, Encoding.UTF8))
            {

                var cache = new TypeConverterCache();
                cache.AddConverter<float>(new SingleConverter());
                cache.AddConverter<double>(new DoubleConverter());
                var csv = new CsvReader(reader,
                    new CsvHelper.Configuration.CsvConfiguration(global::System.Globalization.CultureInfo.InvariantCulture)
                    {
                        Delimiter = ";",
                        HasHeaderRecord = true,
                        CultureInfo = global::System.Globalization.CultureInfo.InvariantCulture,
                        TypeConverterCache = cache
                    });
                csv.Read();
                csv.ReadHeader();

                foreach (string headerRow in csv.Context.HeaderRecord)
                {
                    Console.WriteLine(headerRow);
                }
            }
        }
    }
}

I've changed the lines ...

cache.AddConverter<float>(new CSVSingleConverter());
cache.AddConverter<double>(new CSVDoubleConverter());

... to ...

cache.AddConverter<float>(new SingleConverter());
cache.AddConverter<double>(new DoubleConverter());

I put the CSV data into a UTF-8 text file. Output at the console is:

PartitionKey
Time
RowKey
State
RPM
Distance
RespirationConfidence
HeartBPM

EDIT 2020-12-24: Put the whole source text online, not just part of it.

Upvotes: 2
