Sibbs Gambling

Reputation: 20355

MATLAB: read lines starting with a certain string from a file?

I have a messy log file, from which I want to extract some information useful to me. By messy, I mean the file can contain arbitrary lines of characters/numbers. However, the numbers I need to extract are always preceded by a certain string -- new beta value =.

For example, if my input log file is

3456789FGHJKLcvbnm,.ghjkl
Error!
Warning. GHJKL:6&*()_
new beta value = 1557.01
$%^&*()VGBNM<
GBHNM<
Warning!!! 
This is a random line
new beta value = 1101.6
TL:vbNM<>%^UIOP
FGHJKL]\[;/
new beta value = 100
...

I hope to read

1557.01
1101.6
100
...

into MATLAB.

It seems that MATLAB doesn't have built-in functions for this. How may I achieve this?

Upvotes: 1

Views: 970

Answers (3)

Luis Mendo

Reputation: 112679

Here's another implementation:

fid = fopen('file.txt', 'r');
str = reshape(fread(fid,inf,'*char'),1,[]);
fclose(fid);
numbers = str2double(regexp(str, '(?<=new beta value =\s+)\d+(\.\d*)?','match')).';

This works as follows:

  • Lines 1--3: the file contents are read as a string.
  • Line 4: a regular expression is applied to extract the numbers. Lookbehind is used to detect (but not match) the string. The result is a cell array of strings, to which str2double is applied to convert into a vector of numbers.

Assumed format:

  • The number format \d+(\.\d*)? detects numbers of the form 100.34 or 100. It doesn't detect -100, -100.34, .34, -.34. If you want those cases too you need to modify the regular expression accordingly.
  • The string that marks the desired numbers must be followed by at least one whitespace character before the number (as in your example). Otherwise remove \s+ in the regular expression.
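
One possible way to extend the pattern to signed numbers and numbers with no integer part is sketched below. This is only an illustration, not part of the original answer: it uses a capture token instead of lookbehind, the sample text in str is made up, and exponent notation such as 1e5 is deliberately not handled.

```matlab
% Hypothetical sample text standing in for the contents of the log file
str = sprintf('junk line\nnew beta value = -100.34\nnew beta value = .5\nnew beta value = 42\n');

% Optional sign, then either digits with an optional fractional part,
% or a leading dot followed by digits (e.g. .34)
pattern = 'new beta value =\s*([+-]?(?:\d+\.?\d*|\.\d+))';

tokens = regexp(str, pattern, 'tokens');               % cell array of 1x1 token cells
numbers = cellfun(@(t) str2double(t{1}), tokens).';    % numeric column vector
```

Using \s* here also makes the whitespace after the equals sign optional, which is an assumption about how loosely the log lines are formatted.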

Upvotes: 3

GameOfThrows

Reputation: 4510

As @excaza suggested, there are multiple ways to do this. I find reading the whole file and using regexp much easier and faster.

indata = fileread('test.txt');
% The pattern is the string "new beta value =", whitespace, then a number:
% an integer part optionally followed by a dot and a fractional part.
pattern = 'new beta value =\s+(\d+(?:\.\d+)?)';
lines = regexp(indata, pattern, 'tokens'); % cell array of 1x1 token cells

result = cellfun(@(x) str2double(x{1}), lines); % numeric row vector

result =

    1557.01        1101.6         100

Upvotes: 4

sco1

Reputation: 12214

One implementation utilizing fgetl:

queryline = 'new beta value';
formatspec = sprintf('%s = %%f', queryline); % Build the format once: 'new beta value = %f'

fID = fopen('test.txt');
mydata = []; % Initialize data
while ~feof(fID) % Loop until we get to the end of the file
    tline = fgetl(fID);
    if ~isempty(strfind(tline, queryline))
        % If we find a match for our query string in the line of the file
        mydata = [mydata sscanf(tline, formatspec)];
    end
end

fclose(fID);

Upvotes: 4
