Reputation: 35
I have an input text file that comes from a third party, and I wrote a C# program to process it and produce results. I now need to update the same file with those results, because the third party updates their DB based on this output file. To do that, I need to get the position of a string in the file.
Example: the input file looks like this:
Company Name: <some name> ID: <some ID>
----------------------------------------------------
Transaction_ID:0000001233 Name:John Amount:40:00 Output_Code:
-----------------------------------------------------------------------
Transaction_ID:0000001234 Name:Doe Amount:40:00 Output_Code:
------------------------------------------------------------------------
Please note: Transaction_ID is unique for each row.
The Output file should be:
Company Name: <some name> ID: <some ID>
----------------------------------------------------
Transaction_ID:0000001233 Name:John Amount:40:00 Output_Code:01
-----------------------------------------------------------------------
Transaction_ID:0000001234 Name:Doe Amount:40:00 Output_Code:02
---------------------------------------------------------------------------
The codes 01 and 02 are the results of the C# program and have to be written into the response file.
I have code that finds the position of "Transaction_ID:0000001233" and "Output_Code:". I am able to update the first row, but I am not able to get the position of "Output_Code:" for the second row. How do I locate the string based on the line number? I cannot rewrite the whole response file because it has other unwanted columns, so the best option here would be to update the existing file.
long positionreturnCode1 = FileOps.Seek(filePath, "Output_Code:");
//gets the position of Output_Code in the first row.
byte[] bytesToInsert = System.Text.Encoding.ASCII.GetBytes("01");
FileOps.InsertBytes(bytesToInsert, newPath, positionreturnCode1);
// the above code inserts "01" in the correct position. ie:first row
long positiontransId2 = FileOps.Seek(filePath, "Transaction_ID:0000001234");
long positionreturnCode2 = FileOps.Seek(filePath, "Output_Code:");
// still gets the first row's value
long pos = positionreturnCode2 - positiontransId2;
byte[] bytesToInsert = System.Text.Encoding.ASCII.GetBytes("02");
FileOps.InsertBytes(bytesToInsert, newPath, pos);
// this inserts in a completely different position.
I know the logic is wrong. But I am trying to get the position of output code value in the second row.
Upvotes: 0
Views: 1399
Reputation: 744
The addition here is to pass in a starting position based on where your main program has already updated the file, and to keep moving that position forward by the length of what you added.
If I am reading that code and your example correctly, this should let you step along through the file.
This function is within the utils that you linked in your comment.
public static long Seek(string file, long position, string searchString)
{
    // Open a FileStream and start the search from the given position.
    using (System.IO.FileStream fs = System.IO.File.OpenRead(file))
    {
        fs.Position = position;
        return Seek(fs, searchString);
    }
}
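To make the "search from a position" idea concrete without depending on the linked utilities, here is a minimal, self-contained sketch that works on the file's bytes (for example from File.ReadAllBytes). The names OffsetFinder, IndexOf and OutputCodeOffset are hypothetical and not part of FileOps. It first finds the row by its transaction ID and only then looks for "Output_Code:" after that point. Keep in mind that inserting "01" into the first row shifts everything after it by two bytes, so compute each offset against the current contents of the file.

```csharp
using System;
using System.Text;

public static class OffsetFinder
{
    // Index of the first occurrence of pattern in data at or after start, or -1 if absent.
    public static long IndexOf(byte[] data, byte[] pattern, long start)
    {
        for (long i = Math.Max(start, 0); i <= data.Length - pattern.Length; i++)
        {
            int j = 0;
            while (j < pattern.Length && data[i + j] == pattern[j]) j++;
            if (j == pattern.Length) return i;
        }
        return -1;
    }

    // Offset just past "Output_Code:" on the row holding the given transaction ID, or -1.
    public static long OutputCodeOffset(byte[] data, string transactionId)
    {
        byte[] idBytes = Encoding.ASCII.GetBytes("Transaction_ID:" + transactionId);
        byte[] marker = Encoding.ASCII.GetBytes("Output_Code:");

        long idPos = IndexOf(data, idBytes, 0);
        if (idPos < 0) return -1;

        // Search for the marker only after the transaction ID, so earlier rows are skipped.
        long markerPos = IndexOf(data, marker, idPos + idBytes.Length);
        return markerPos < 0 ? -1 : markerPos + marker.Length;
    }
}
```

The returned offset is where the code bytes would be inserted for that row.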
Upvotes: 0
Reputation: 29207
To start with, I'll isolate the part that takes a transaction and returns a code, since I don't know what that is, and it's not relevant. (I'd do the same thing even if I did know.)
public class Transaction
{
    public Transaction(string transactionId, string name, decimal amount)
    {
        TransactionId = transactionId;
        Name = name;
        Amount = amount;
    }

    public string TransactionId { get; }
    public string Name { get; }
    public decimal Amount { get; }
}
public interface ITransactionProcessor
{
    // Returns an output code for the given transaction.
    string ProcessTransaction(Transaction transaction);
}
Now we can write something that processes a set of strings, which could be lines from a file. That's something to think about. You get the strings from a file, but would this work any differently if they didn't come from a file? Probably not. Besides, manipulating the contents of a file is harder. Manipulating strings is easier. So instead of "solving" the harder problem we're just converting it into an easier problem.
For each string it's going to do the following: parse a transaction from the string, process the transaction to get an output code, and append the output code to the string.
Again, I'm leaving out the part that I don't know. For now it's in a private method, but it could be described as a separate interface.
using System;
using System.Collections.Generic;

public class StringCollectionTransactionProcessor // Horrible name, sorry.
{
    private readonly ITransactionProcessor _transactionProcessor;

    public StringCollectionTransactionProcessor(ITransactionProcessor transactionProcessor)
    {
        _transactionProcessor = transactionProcessor;
    }

    public IEnumerable<string> ProcessTransactions(IEnumerable<string> inputs)
    {
        foreach (var input in inputs)
        {
            var transaction = ParseTransaction(input);
            var outputCode = _transactionProcessor.ProcessTransaction(transaction);
            // No separator: the sample output shows the code directly after "Output_Code:".
            var outputLine = $"{input}{outputCode}";
            yield return outputLine;
        }
    }

    private Transaction ParseTransaction(string input)
    {
        // Get the transaction ID and whatever values you need from the string.
        throw new NotImplementedException("Parsing is left out on purpose.");
    }
}
The result is an IEnumerable<string> where each string is the original input, unmodified except for the output code appended at the end. If there were any extra columns in there that weren't related to your processing, that's okay. They're still there.
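As an illustration of the parsing step that was left out, here is one possible sketch. It assumes fields are separated by single spaces and that each field is a key followed by a colon, as in the sample rows; a name containing a space would break it. TransactionLineParser and ParseFields are hypothetical names. It splits each token at the first colon only, because values such as 40:00 contain a colon themselves.

```csharp
using System;
using System.Collections.Generic;

public static class TransactionLineParser
{
    // Splits "Key:Value Key:Value ..." into a dictionary of key -> value.
    // Tokens without a colon (for example separator lines of dashes) are ignored.
    public static Dictionary<string, string> ParseFields(string line)
    {
        var fields = new Dictionary<string, string>();
        foreach (var token in line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            int colon = token.IndexOf(':');
            if (colon < 0) continue;

            // Split at the first colon only, so "Amount:40:00" keeps "40:00" as its value.
            fields[token.Substring(0, colon)] = token.Substring(colon + 1);
        }
        return fields;
    }
}
```

A Transaction could then be built from fields["Transaction_ID"], fields["Name"] and fields["Amount"], once you know how the amount format should be converted to a decimal.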
There are likely other factors to consider, like exception handling, but this is a starting point. It gets simpler if we completely isolate different steps from each other so that we only have to think about one thing at a time.
As you can see, I've still left things out. For example, where do the strings come from? Do they come from a file? Where do the results go? Another file? Now it's much easier to see how to add those details. They seemed like they were the most important, but now we've rearranged this so that they're the least important.
It's easy to write code that reads a file into a collection of strings.
var inputs = File.ReadLines(path);
When you're done and you have a collection of strings, it's easy to write them to a file.
File.WriteAllLines(path, linesToWrite);
We wouldn't add those details into the above classes. If we do, we've restricted those classes to only working with files, which is unnecessary. Instead we just write a new class which reads the lines, gets a collection of strings, passes it to the other class to get processed, gets back a result, and writes it to a file.
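A rough sketch of that outer class might look like the following. FileTransactionRunner and Run are hypothetical names, and the processing step is passed in as a delegate so the example stays self-contained. One detail matters here: File.ReadLines reads lazily, so the results are materialized with ToList before anything is written. That way the input has been fully read before writing begins, which matters if the input and output paths are the same file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FileTransactionRunner
{
    // Reads all lines, runs them through the supplied processing step, and writes the results.
    public static void Run(
        string inputPath,
        string outputPath,
        Func<IEnumerable<string>, IEnumerable<string>> process)
    {
        // ToList forces reading and processing to complete before the write starts.
        List<string> results = process(File.ReadLines(inputPath)).ToList();
        File.WriteAllLines(outputPath, results);
    }
}
```

With the class from earlier in this answer, the delegate could simply be its ProcessTransactions method.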
This is an iterative process that allows us to write the parts we understand and leave the parts we haven't figured out for later. That keeps us moving forward solving one problem at a time instead of getting stuck trying to solve a few at once.
A side effect is that the code is easier to understand. It lends itself to writing methods with just a few lines. Each is easy to read. It's also much easier to write unit tests.
In response to some comments:
If the output code doesn't go at the end of the line but somewhere in the middle, you can still update it:
var updatedLine = line.Replace("Output_Code:", "Output_Code:" + outputCode);
That's messy. If the line is delimited, you could split it, find the element that contains Output_Code, and completely replace it. That way you don't get weird results if for some reason there's already an output code.
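The split-and-replace approach could be sketched like this, assuming a single space is the delimiter as in the sample rows. OutputCodeUpdater and SetOutputCode are hypothetical names. Because the whole element is replaced, an existing code is overwritten rather than doubled up, and lines without an Output_Code element come back unchanged.

```csharp
using System;

public static class OutputCodeUpdater
{
    // Replaces the whole "Output_Code:..." element of a space-delimited line.
    public static string SetOutputCode(string line, string outputCode)
    {
        const string key = "Output_Code:";

        // Split(' ') keeps empty entries, so runs of spaces survive the round trip.
        string[] parts = line.Split(' ');
        for (int i = 0; i < parts.Length; i++)
        {
            if (parts[i].StartsWith(key, StringComparison.Ordinal))
            {
                parts[i] = key + outputCode;
            }
        }
        return string.Join(" ", parts);
    }
}
```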
If the step of processing a transaction includes updating a database record, that's fine. That can all be within ITransactionProcessor.ProcessTransaction.
If you want an even safer system you could break the whole thing down into two steps. First process all of the transactions, including your database updates, but don't update the file at all.
After you're done processing all of the transactions, go back through the file and update it. You could do that by looking up the output code for each transaction in the database. Or, processing transactions could return a Dictionary<string, string> containing the transaction IDs and output codes. When you're done with all the processing, go through the file a second time. For each transaction ID, see if there's an output code. If there is, update that line.
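The second pass could be sketched as follows, again assuming space-delimited Key:Value fields. SecondPassUpdater and ApplyOutputCodes are hypothetical names. Lines with no transaction ID (the header and the dashed separators) and transactions with no code in the dictionary pass through untouched.

```csharp
using System;
using System.Collections.Generic;

public static class SecondPassUpdater
{
    // Yields each line, with its Output_Code element filled in when a code exists for its ID.
    public static IEnumerable<string> ApplyOutputCodes(
        IEnumerable<string> lines,
        IReadOnlyDictionary<string, string> codesById)
    {
        const string idKey = "Transaction_ID:";
        const string codeKey = "Output_Code:";

        foreach (var line in lines)
        {
            string[] parts = line.Split(' ');

            // Find the transaction ID on this line, if there is one.
            string id = null;
            foreach (var part in parts)
            {
                if (part.StartsWith(idKey, StringComparison.Ordinal))
                    id = part.Substring(idKey.Length);
            }

            if (id == null || !codesById.TryGetValue(id, out var code))
            {
                yield return line; // Nothing to update on this line.
                continue;
            }

            for (int i = 0; i < parts.Length; i++)
            {
                if (parts[i].StartsWith(codeKey, StringComparison.Ordinal))
                    parts[i] = codeKey + code;
            }
            yield return string.Join(" ", parts);
        }
    }
}
```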
Upvotes: 0
Reputation: 2300
Don't try to "edit" the existing file. There is too much room for error.
Rather, assuming that the file format will not change, parse the file into data, then rewrite the file completely. An example, in pseudo-code below:
// A class rather than a struct, so that entries stored in a list can be modified in place.
public class Entry
{
    public string TransactionID;
    public string Name;
    public string Amount;
    public string Output_Code;
}
Iterate through the file and create a list of Entry instances, one for each transaction line, and populate each Entry instance with the contents of the line. It looks like you can split the text line using white space as a delimiter and then further split each field at its first ':' (values such as 40:00 contain a colon themselves).
Then, for each entry, you set the Output_Code during your processing phase.
foreach (Entry entry in entrylist)
    entry.Output_Code = MyProcessingOfTheEntryFunction(entry);
Finally, iterate through your list of entries and rewrite the entire file using the data in your Entry list, making sure to correctly write the header and any line spacers.
OpenFile();
WriteFileHeader();
foreach (Entry entry in entrylist)
{
    WriteLineSpacer();
    WriteEntryData(entry);
}
CloseFile();
Upvotes: 1