Reading text Files - single line vs. multiple lines

Question

I am working on a scenario where I have to read a text file, parse it, extract meaningful information from it, run SQL queries with that information, and then produce a response output file.

I have about 3000 lines of code, and everything is working as expected. However, I have been thinking about a conundrum that could disrupt my project.

The text file being read (let's call it Text.txt) may consist of a single line or multiple lines.

In my case, a 'line' is identified by its segment name, say ISA, BHT, HB, NM1, etc. The end of each segment is marked by the special character '~'.

Now, if the file consists of multiple lines (such that each line corresponds to a single segment), say:

ISA....... ~

NM1....... ~

DMG....... ~

SE........ ~

and so on, then my code reads each 'line' (i.e. each segment) one at a time and stores it in a temporary buffer using the following command:

         ReadLn(myFile,buffer);

and then performs evaluations based on each line and produces the desired output. No problems.

However, the issue is: what if the file consists of only a single line (containing multiple segments), represented as:

ISA....... ~NM1....... ~DMG....... ~SE........ ~

Then my ReadLn call reads the entire line instead of one segment at a time. This doesn't work for my code.

I was thinking about creating an if/else pair based on how many lines Text.txt consists of, such as:

if lines = 1 then
    extract one segment at a time, separated by the special character '~'
    perform necessary tasks (3000 lines of code)
else if lines > 1 then
    extract one line at a time (each corresponding to a segment)
    perform necessary tasks (3000 lines of code)

Now the 3000 lines of code are repeated twice, and I don't find it elegant to copy and paste all of that code.

I would appreciate some feedback on how to solve this, so that regardless of whether the file has one line or many, I only evaluate one segment at a time.

David Dubois · Accepted Answer

There are many possible ways of doing this. Which is best for you might depend on how long these files are and how important performance is.

A simple solution is to just read characters one at a time until you hit your tilde delimiter. The routine ReadOneItem below shows how this can be done.

procedure TForm1.Button1Click(Sender: TObject);
const
  FileName = 'c:\kuiper\test2.txt';
var
  MyFile : textfile;
  Buffer : string;

  // Read one item from text file MyFile.
  // Load characters one at a time.
  // Ignore CR and LF characters
  // Stop reading at end-of-file, or when a '~' is read

  function ReadOneItem : string;
  var
    C : char;
  begin
    Result := '';

    // loop continues until break
    while true do
      begin

        // are we at the end-of-file? If so we're done
        if eof(MyFile) then
          break;

        // read in the next character
        read ( MyFile, C );

        // ignore CR and LF
        if ( C = #13 ) or ( C = #10 ) then
          {do nothing}
        else
          begin

            // add the character to the end
            Result := Result + C;

            // if this is the delimiter then stop reading
            if C = '~' then
              break;
          end;
      end;
  end;


begin
  assignfile ( MyFile, FileName );
  reset ( MyFile );
  try

    while not EOF(MyFile) do
      begin
        Buffer := ReadOneItem;
        // line breaks after the last '~' yield an empty item; skip it
        if Buffer <> '' then
          Memo1 . Lines . Add ( Buffer );
      end;

  finally
    closefile ( MyFile );
  end;
end;
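
As one of the other possible ways, here is a minimal sketch that assumes the file is small enough to hold in memory: load the whole text, ignore CR and LF, and cut on '~'. The names SplitSegments, LoadSegments and ProcessSegment are hypothetical and not part of the code above; ProcessSegment stands in for the existing processing code, which is then written only once. It is meant to compile as a console program in Delphi or Free Pascal, though unit names may need adjusting for your compiler version.

```pascal
program SegmentDemo;

{$IFDEF FPC}{$mode objfpc}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  Classes;

// Split Text into segments. '~' ends a segment (and is kept, as in
// ReadOneItem); CR and LF are ignored, so one-line and multi-line
// input produce the same list.
procedure SplitSegments(const Text: string; Segments: TStrings);
var
  I: Integer;
  Current: string;
begin
  Segments.Clear;
  Current := '';
  for I := 1 to Length(Text) do
    case Text[I] of
      #13, #10: ; // line breaks carry no meaning here
      '~':
        begin
          Segments.Add(Current + '~');
          Current := '';
        end;
    else
      Current := Current + Text[I];
    end;
  // keep a final segment that is missing its terminator
  if Current <> '' then
    Segments.Add(Current);
end;

// Assumption: the whole file fits comfortably in memory.
procedure LoadSegments(const FileName: string; Segments: TStrings);
var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    Lines.LoadFromFile(FileName);
    SplitSegments(Lines.Text, Segments);
  finally
    Lines.Free;
  end;
end;

// Hypothetical stand-in for the existing processing code: it always
// receives exactly one segment.
procedure ProcessSegment(const Segment: string);
begin
  WriteLn(Segment);
end;

var
  Segments: TStringList;
  I: Integer;
begin
  Segments := TStringList.Create;
  try
    // single-line form
    SplitSegments('ISA....... ~NM1....... ~', Segments);
    for I := 0 to Segments.Count - 1 do
      ProcessSegment(Segments[I]);

    // multi-line form of the same content
    SplitSegments('ISA....... ~'#13#10'NM1....... ~'#13#10, Segments);
    for I := 0 to Segments.Count - 1 do
      ProcessSegment(Segments[I]);
  finally
    Segments.Free;
  end;
end.
```

Both calls hand the same two segments to ProcessSegment. For a real file, LoadSegments(FileName, Segments) would replace the literal strings. For very large files, the character-by-character approach above is probably the better fit, since it never holds the whole file in memory.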
