LeonardBlunderbuss

Reputation: 1274

Is there a way to only read part of a line of an input file?

I have a routine that opens a lookup table file to see if a certain entry already exists before writing to the file. Each line contains about 2,500 columns of data. I need to check the first 2 columns of each line to make sure the entry doesn't exist.

I don't want to have to read in 2,500 columns for every line just to check 2 entries. I was attempting to use the fscanf function, but it gives me an Invalid size error when I attempt to only read 2 columns. Is there a way to only read part of each line of an input file?

        if(exist(strcat(fileDirectory,fileName),'file'))
            fileID = fopen(strcat(fileDirectory,fileName),'r');
            if(fileID == -1)
                disp('ERROR: Could not open file.\n')
            end  
            % Read file to see if line already exists
            dataCheck = fscanf(fileID, '%f %f', [inf 2]);
            for i=1:length(dataCheck(:,1))
                if(dataCheck(i,1) == sawAnglesDeg(sawCount))
                    if(dataCheck(i,2) == sarjAnglesDeg(floor((sawCount-1)/4)+1))
                        % This line has already been written in lookup table
                        lineExists = true;
                        disp('Duplicate lookup table line found. Skipping...\n')
                        break;
                    end
                end
            end
            fclose(fileID);
        end

Upvotes: 1

Views: 336

Answers (3)

nkjt

Reputation: 7817

With textscan you can skip fields, parts of fields, or even "rest of line", so I would do this (based on MATLAB help example slightly modified):

fileID = fopen('data.dat');
data = textscan(fileID,'%f %f %*[^\n]');
fclose(fileID);

Then check data, a 1-by-2 cell array whose cells data{1} and data{2} hold the two columns you want, to see whether any row matches the requirements.
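The check itself could look something like the following sketch. The sample file, its contents, and the `key` variable are made up for illustration; in the question, the key would come from `sawAnglesDeg` and `sarjAnglesDeg`. Note that `==` compares doubles exactly, so this assumes the values in the file round-trip without loss.

```matlab
% Hypothetical sample file with 4 columns standing in for the 2,500
fname = [tempname '.dat'];
fid = fopen(fname, 'w');
fprintf(fid, '%g %g %g %g\n', [1 10 0.5 0.6; 2 20 0.7 0.8].');
fclose(fid);

% Parse only the first two columns; %*[^\n] discards the rest of each line
fid = fopen(fname, 'r');
data = textscan(fid, '%f %f %*[^\n]');
fclose(fid);

% key is an assumed name for the pair being looked up
key = [2 20];
lineExists = any(data{1} == key(1) & data{2} == key(2));

delete(fname);
```

The vectorized comparison replaces the explicit loop over rows from the question.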

Upvotes: 2

marsei

Reputation: 7751

As @Jesper Grooss wrote, there is no way to skip the remainder of a line without reading it. If you stick with a single text file, the fastest solution would probably consist of:

  • reading the entire file with textscan (one line of text per cell element)
  • appending the new line to the cell array even if it is a duplicate entry
  • uniquing the cell array with unique(cellmatrix) (the 'rows' option is not supported for cell arrays)
  • appending the new line to the text file if it corresponds to a new entry

The uniquing step replaces the putatively costly for loop.
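A minimal sketch of those steps, assuming each line is kept as one character vector and compared as text. The file name, sample contents, and `newLine` are hypothetical. Because the comparison is textual, the new line must be formatted exactly like the existing ones for a duplicate to be detected.

```matlab
% Hypothetical lookup file with two short lines
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1 10 0.5\n2 20 0.7\n');
fclose(fid);

% Read every line into one cell element
fid = fopen(fname, 'r');
c = textscan(fid, '%s', 'Delimiter', '\n');
fclose(fid);
lines = c{1};

% Append the candidate, then see whether it added a distinct entry
newLine = '3 30 0.9';
candidate = [lines; {newLine}];
isNew = numel(unique(candidate)) > numel(unique(lines));

% Only write to the file when the entry is new
if isNew
    fid = fopen(fname, 'a');
    fprintf(fid, '%s\n', newLine);
    fclose(fid);
end
```

Comparing the two unique counts, rather than the raw lengths, keeps the test valid even if the file already contains duplicate lines.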

Upvotes: 1

Jesper Grooss

Reputation: 146

Well, not really.

In a loop, you should be able to fscanf the first two doubles and then call fgetl to read the rest of the line, i.e. something of the form:

while ~feof(fileID)
   dataCheck = fscanf(fileID, '%f', 2);   % First two values of the line
   fgetl(fileID); % Read remainder of line, discarding it
   if numel(dataCheck) < 2, continue; end % Skip blank or malformed lines
   % Do check here for each line
end

Since it is a text file, you cannot really skip reading characters from the file. For binary files you can use fseek, which jumps around in the file based on a byte count; it can be used if you know exactly where the next line starts (as a byte offset). For a text file you do not know that, since each line varies in length. If you saved the data in a binary file instead, something like that would be possible.
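As a rough sketch of the binary variant, assume every record is stored as a fixed number of doubles (`nCols`, 5 here instead of 2,500), so the byte offset of the next record is known. The file name and sample values are made up for illustration.

```matlab
% Hypothetical binary file: each record is nCols doubles (8 bytes each)
nCols = 5;
fname = [tempname '.bin'];
fid = fopen(fname, 'w');
fwrite(fid, [1 10 0 0 0; 2 20 0 0 0].', 'double');  % written record by record
fclose(fid);

fid = fopen(fname, 'r');
keys = zeros(0, 2);
while true
    k = fread(fid, 2, 'double');          % first two values of the record
    if numel(k) < 2, break; end           % end of file reached
    keys(end+1, :) = k.';
    fseek(fid, (nCols - 2) * 8, 'cof');   % jump over the rest of the record
end
fclose(fid);
delete(fname);
```

Here `keys` ends up holding only the first two values of each record; the remaining columns are skipped by offset rather than parsed.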

What I would probably do: create two files, the first containing only the two "check-values", which can be read in quickly, and the second containing the 2500 columns of data, with or without the two "check-values". They should be updated synchronously: whenever a line is added to the first file, a line is also added to the second.

And I would definitely make a checkData matrix variable and keep it in memory as long as possible; when adding a new line to the file, also update the checkData matrix, so that you only need to read the file once initially and can use the checkData matrix for the rest of the life of your program.
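The in-memory check could be sketched like this. The initial contents of `checkData` and the `newKey` pair are made-up values; in practice `checkData` would be filled once from the file at startup. `ismember` with `'rows'` compares exactly, so this assumes the keys are not affected by floating-point round-off.

```matlab
% checkData holds the two check-values of every line already in the file
checkData = [1 10; 2 20];

% newKey is an assumed name for the pair about to be written
newKey = [3 30];

if ~ismember(newKey, checkData, 'rows')
    checkData(end+1, :) = newKey;   % keep the in-memory copy in sync
    % ... append the full line to the lookup file here ...
end
```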

Upvotes: 3
