Reputation: 1266
Please consider the following records
I'm trying to flatten and group the data by Robot Name
then by date
+ Left Factory
time then group the address(es) for that date and time. Notice that some of the Left Factory
times are identical.
I wrote the code below and it works. It gives me the output that I want. I was a Perl developer so what you see below is from that mentality. I'm sure there is a better way of doing it in LINQ. A little help please.
static void Main(string[] args)
{
if (args.Length < 0){
Console.WriteLine("Input file name is required");
return;
}
List<string> rawlst = File.ReadAllLines(args[0]).ToList<string>();
Dictionary<string, Dictionary<DateTime, List<string>>> dicDriver = new Dictionary<string, Dictionary<DateTime, List<string>>>();
foreach (string ln in rawlst)
{
try
{
List<string> parts = new List<string>();
parts = ln.Split(',').ToList<string>();
string[] dtparts = parts[1].Split('/');
string[] dttime = parts[15].Split(':');
DateTime dtrow = new DateTime(
int.Parse(dtparts[2]), int.Parse(dtparts[0]), int.Parse(dtparts[1]),
int.Parse(dttime[0]), int.Parse(dttime[1]), int.Parse(dttime[2]));
string rowAddress = parts[7] + " " + parts[9] + " " + parts[10] + " " + parts[11];
if (!dicDriver.Keys.Contains(parts[3]))
{
Dictionary<DateTime, List<string>> thisRec = new Dictionary<DateTime, List<string>>();
thisRec.Add(dtrow, new List<string>() { rowAddress });
dicDriver.Add(parts[3], thisRec);
}
else
{
Dictionary<DateTime, List<string>> thisDriver = new Dictionary<DateTime, List<string>>();
thisDriver = dicDriver[parts[3]];
if (!thisDriver.Keys.Contains(dtrow))
{
dicDriver[parts[3]].Add(dtrow, new List<string>() { rowAddress });
}
else
{
dicDriver[parts[3]][dtrow].Add(rowAddress);
}
}
}
catch (Exception e)
{
Console.WriteLine("ERROR:" + ln);
}
}
//output
string filename = DateTime.Now.Ticks.ToString() + ".out";
foreach (var name in dicDriver.Keys)
{
foreach (var dd in dicDriver[name])
{
Console.Write(name + "," + dd.Key + ",");
File.AppendAllText(filename, name + "," + dd.Key + Environment.NewLine);
foreach (var addr in dd.Value)
{
Console.Write("\t\t" + addr + Environment.NewLine);
File.AppendAllText(filename, "\t" + addr + Environment.NewLine);
}
}
Console.Write(Environment.NewLine);
File.AppendAllText(filename, Environment.NewLine);
}
Console.ReadLine();
}
Upvotes: 1
Views: 61
Reputation: 30512
You should separate your concerns: separate your input from the processing and from the output.
For example: suppose you would have to read your input from a database instead from a CSV file? Would that seriously change the way your process your fetched data? In your design, fetching the data is mixed with processing: although you know that the data that you want to process contains something like FactoryProcesses
, you decide to present eache FactoryProcess
as a string. A FactoryProcess
is not a string. It describes how and when and who processed something in your factory. That is not a string, is it? However, it might be represented internally as a string, but the outside world should not know this. This way, if you change your FactoryProcess
from being read by a CSV-file, to something provided by a database, the users of your FactoryProcess
won't see any difference.
Separation of concerns makes your code easier to understand, easier to test, easier to change, and better to re-use.
So let's separate!
IEnumerable<FactoryProcess> ReadFactoryProcesses(string fileName)
{
// TODO: check fileName not null, file exists
using (var fileReader = new StreamReader(fileName))
{
// read the file Line by Line and split each line into one FactoryProcess object
string line = fileReader.ReadLine();
while (line != null)
{
// one line read, convert to FactoryProcess and yield return:
FactoryProcess factoryProcess = this.ToFactoryProcess(line);
yield return factoryProcess;
// read next line:
line = fileReader.ReadLine();
}
}
}
I'll leave the conversion of a read line into a FactoryProcess
up to you. Tip: if the items in your lines are separated by a comma, or something similar, consider using Nuget Package CSVHelper. It makes if easier to convert a file into a sequence of FactoryProcesses
.
I want to group the data by Robot Name then by date + Left Factory time then group the address(es) for that date and time.
First of all: make sure that the FactoryProcess class has the properties you actually need. Separate this representation from what it is in a file. Apparently you want to tread date + left factory as one item that represents the Date and Time that it left the factory. So Let's create a DateTime property for this.
class FactoryProcess
{
public int Id {get; set}
public int PartNo {get; set;}
public string RobotName {get; set;} // or if desired: use a unique RobotId
...
// DateTimes: ArrivalTime, OutOfFactoryTime, LeftFactoryTime
public DateTime ArrivalTime {get; set;}
public DateTime OutOfFactoryTime {get; set;}
public DateTime LeftFactoryTime {get; set;}
}
The reason that I put Date and Time into one DateTime, is because it will solve problems if an item arrives on 23:55 and leaves on 00:05 next day.
A procedure that converts a read CSV-line to a FactoryProcess should interpret your dates and times as strings and convert to FactoryProcess. You can create a constrcuctor for this, or a special Factory class
public FactoryProcess InterpretReadLine(string line)
{
// TODO: separate the parts, such that you've got the strings dateTxt, arrivalTimeTxt, ...
DateTime date = DateTime.Parse(dateTxt);
TimeSpan arrivalTime = TimeSpan.Parse(arrivalTimeTxt);
TimeSpan outOfFactoryTime = TimeSpan.Parse(outOfFactoryTimeTxt);
TimeSpan leftFactoryTime = TimeSpan.Parse(leftFactoryTimeTxt);
return new FactoryProces
{
Id = ...
PartNo = ..
RobotName = ...
// The DateTimes:
ArrivalTime = date + arrivalTime,
OutOfFactoryTime = date + outOfFactoryTime,
LeftFactoryTime = date + leftFactoryTime,
};
}
Now that you've created a proper method to convert your CSV-file into a sequence of FactoryProcesses
, let's process them
I want to group the data by Robot Name then by date + Left Factory time then group the address(es) for that date and time.
var result = fetchedFactoryProcesses.GroupBy(
// parameter KeySelector: make groups of FactoryProcesses with same RobotName:
factoryProcess => factoryProcess.RobotName,
// parameter ResultSelector: from every group of FactoryProcesses with this RobotName
// make one new Object:
(robotName, processesWithThisRobotName) => new
{
RobotName = robotName,
// Group all processes with this RobotName into groups with same LeftFactoryTime:
LeftFactory = processesWithThisRobotName.GroupBy(
// KeySelector: make groups with same LeftFactoryTime
process => process.LeftFactoryTime,
// ResultSelector: from each group of factory processes with the same LeftFactoryTime
(leftFactoryTime, processesWithThisLeftFactoryTime) => new
{
LeftFactoryTime = leftFactoryTime,
FactoryProcesses = processesWithThisLeftFactoryTime,
// or even better: select only the properties you actually plan to use
FactoryProcesses = processesWithThisLeftFactoryTime.Select(process => new
{
Id = process.Id,
PartNo = process.PartNo,
...
// not needed: you know the value, because it is in this group
// RobotName = process.RobotName,
// LeftFactoryTime = process.LeftFactoryTime,
}),
})
});
For completeness: grouping your code together:
void ProcessData(string fileName)
{
var fetchedFactoryProcesses = ReadFactoryProcess(fileName); // fetch the data
var groups = fetchFactoryProcesses.ToGroups(); // put into groups
this.Display(groups); // output result;
}
Because I separated the input from the conversion of strings to FactoryProcesses, and separated this conversion from the grouping, it will be easy to test the classes separately:
List<FactoryProcess>
, which is very convenient, because in your tests it is way easier to create a test list, than a test CSV-file.If in future you decide to change the source of your sequence of FactoryProcesses, for instance it comes from a database instead of a CSV-file, your grouping won't change. Or if you decide to support entering and leaving factories on different dates (multiple date values) only the conversion changes. If you decide to display the results in a tree-like fashion, or decide to write the groups in a database, your reading, conversion, grouping, etc won't change: what a high degree or re-usability!
Separating your concerns made it much easier to understand how to solve your grouping problem, without the hassle of splitting your read lines and converting Date + LeftFactory into one value.
Upvotes: 1