Mayou
Mayou

Reputation: 8818

Read from files and parse SQL table

All the following must be done in C#. Parsing the SQL table (SQL Server) will be done using methods in System.Data.Odbc.

Let's assume I have two .csv files, fi1 and fi2. The first csv file has two columns id and val1, and the second csv has two columns as well, id and val2.

I would like to read the two files, and parse the output to one SQL table with the following columns: id, val1, val2.

The problem is that the two files may have different entries in the id columns: in other words, some id's may have a val1 value but no val2 value, and vice versa, or they might have both values.

The table should contain the union of the id columns in the two files.

Example:

File 1

enter image description here

File2

enter image description here

The way I would want the final SQL table to look like is this:

enter image description here

Note that each file might contain duplicates, and we would want to exclude the duplicates when parsing the SQL table.

The thought I had is to create two dictionaries, dict1 and dict2, where the key would be the id, and the value would be val1 and val2. Dictionaries will be used to make sure that duplicates are not included:

 Dictionary<string, string> dict1 = new Dictionary<string, string>();
 string[] header1 = new string[]{};

 using (StreamReader rdr = new StreamReader(fi1))
 {
     header1 = rdr.ReadLine().Split(',');
     while (!rdr.EndOfStream)
     {
          string ln = rdr.ReadLine();
          string[] split_ln = ln.Split(',');
          dict1.Add(split_ln[0], split_ln[1]);
     }
 }

 Dictionary<string, string> dict2 = new Dictionary<string, string>();
 string[] header2 = new string[]{};

 using (StreamReader rdr = new StreamReader(fi2))
 {
     header2 = rdr.ReadLine().Split(',');
     while (!rdr.EndOfStream)
     {
          string ln = rdr.ReadLine();
          string[] split_ln = ln.Split(',');
          dict2.Add(split_ln[0], split_ln[1]);
     }
 }

However, after adding each file to a dictionary, I am not sure how to match the id's of both dictionaries.

Would anyone have a good hint as to how to deal with this problem?

Upvotes: 1

Views: 251

Answers (1)

theDarse
theDarse

Reputation: 737

I would do atually do a list of tuples to hold the values here instead of a dictionary so that all the information is in one place rather than matching keys, each tuple corresponds to a table record

var dict = new List<Tuple<string, string, string>>();
        using (StreamReader rdr = new StreamReader(fi1))
        {
            while (!rdr.EndOfStream)
            {
                string ln = rdr.ReadLine();
                string[] split_ln = ln.Split(',');
                dict.Add(new Tuple<string, string, string>(split_ln[0], split_ln[1],null));
            }
        }
        using (StreamReader rdr = new StreamReader(fi2))
        {
            while (!rdr.EndOfStream)
            {
                string ln = rdr.ReadLine();
                string[] split_ln = ln.Split(',');
                if (dict.Any(item => item.Item1 == split_ln[0]))
                {
                    var item = dict.Find(i => i.Item1 == split_ln[0]);
                    var newtuple = new Tuple<string, string, string>(item.Item1, item.Item2, split_ln[1]);
                    dict.Remove(item);
                    dict.Add(newtuple);
                }
                else
                {
                    dict.Add(new Tuple<string, string, string>(split_ln[0],null,split_ln[1]));
                }
            }
        }

Upvotes: 1

Related Questions