DaveC

Reputation: 167

What's confusing both grep and ack?

Try this: download https://www.mathworks.com/matlabcentral/fileexchange/19-delta-sigma-toolbox

In the unzipped folder, I get the following results:

ack --no-heading --no-break --matlab dsexample

Contents.m:56:%   dsexample1      - Discrete-time lowpass/bandpass/quadrature modulator.
Contents.m:57:%   dsexample2      - Continuous-time lowpass modulator.
dsexample1(dsm, LiveDemo); 
fprintf(1,'Done.\n');
adc.sys_cs = sys_cs;

grep -nH -R --include="*.m" dsexample

Contents.m:56:%   dsexample1      - Discrete-time lowpass/bandpass/quadrature modulator.
Contents.m:57:%   dsexample2      - Continuous-time lowpass modulator.
dsexample1(dsm, LiveDemo); d center frequency larger Hinfation Script
fprintf(1,'Done.\n');c = c;formed.s of finite op-amp gain and capacitorased;;n for the input.
adc.sys_cs = sys_cs;snr;seed with CT simulations tora states used in the d-t model_amp); Response');

What's going on?

[Edit for clarification]: Why is there no file name and no line number on the 3rd result line? Why do the 4th and 5th result lines not even contain dsexample?

NB: using ack 3.40 and grep 2.16

Upvotes: 1

Views: 150

Answers (3)

ilkkachu

Reputation: 6517

Let's see which files contain dsexample. grep -l prints just the file names, not the matching contents:

$ grep -l dsexample *
Contents.m
demoLPandBP.m
dsexample1.m
dsexample2.m

OK. Next, file shows that three of them have CR line terminators. (It would say "CRLF line terminators" for Windows files.)

$ file Contents.m demoLPandBP.m dsexample*
Contents.m:    ASCII text
demoLPandBP.m: ASCII text, with CR line terminators
dsexample1.m:  ASCII text, with CR line terminators
dsexample2.m:  ASCII text, with CR line terminators

Unlike what I commented about before, Contents.m is fine. Let's look at another one, how it prints:

$ grep dsexample demoLPandBP.m 
dsexample1(dsm, LiveDemo); d center frequency larger Hinf

The output from grep is actually the whole file, since grep doesn't consider the plain CR as breaking a line -- the whole file is just one line. If we change CRs to LFs, we see it better, or can just count the lines:

$ grep dsexample demoLPandBP.m | tr '\r' '\n' | wc -l
51
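To reproduce the effect in isolation, here is a minimal sketch using a made-up three-line file (the cr_demo.m name and its contents are my own, not from the toolbox). It assumes a grep that counts a final unterminated line, as GNU grep does:

```shell
# Build a small CR-only sample in a temp directory (hypothetical content).
demo_dir=$(mktemp -d)
printf 'line one\rdsexample1(dsm);\rline three\r' > "$demo_dir/cr_demo.m"

# grep splits input on LF only, so it sees the whole file as a single line.
lines_raw=$(grep -c '' "$demo_dir/cr_demo.m")

# After translating CR to LF, the three real lines appear.
lines_fixed=$(tr '\r' '\n' < "$demo_dir/cr_demo.m" | grep -c '')

echo "lines seen by grep: $lines_raw raw, $lines_fixed after tr"
rm -r "$demo_dir"
```

Since the pattern '' matches every line, grep -c '' is just a line count from grep's point of view.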

These are the longest lines there, in order:

%% 5th-order lowpass with optimized zeros and larger Hinf
dsm.f0 = 1/6;   % Normalized center frequency
dsexample1(dsm, LiveDemo); 

With a CR at the end of each, the cursor moves back to the start of the line, partially overwriting the previous output, so you get:

dsexample1(dsm, LiveDemo); d center frequency larger Hinf

(There's a space after the semicolon on that line, so the e gets overwritten too. I checked.)
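The overwriting can be modelled without a terminal. This is only a sketch: render_cr is a hypothetical helper name, and it assumes the simplest terminal behaviour, where CR returns to column 0 and later text replaces earlier text character by character.

```shell
# Hypothetical helper: emulate how a terminal renders CR-separated text.
# Each CR-terminated segment is laid over the start of what is already shown.
render_cr() {
    awk 'BEGIN { RS = "\r" }
         { shown = $0 substr(shown, length($0) + 1) }
         END { print shown }'
}

# Feed it the three longest lines quoted above, each followed by a CR.
printf '%s\r' \
    '%% 5th-order lowpass with optimized zeros and larger Hinf' \
    'dsm.f0 = 1/6;   % Normalized center frequency' \
    'dsexample1(dsm, LiveDemo); ' | render_cr
```

The tail of each longer, earlier line survives to the right of the shorter lines printed after it, which is where the stray fragments in the grep output come from.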

Someone said dos2unix can't deal with that, and well, they're not DOS or Windows files anyway so why should it. You could do something like this, though, in Bash:

for f in *.m; do
    if [[ $(file "$f") = *"ASCII text, with CR line terminators" ]]; then
        tr '\r' '\n' < "$f" > tmptmptmp &&
        mv tmptmptmp "$f"
    fi
done

I think it was just the .m files that had the issue, hence the *.m in the loop. There was at least one PDF file there, and we don't want to break that. Though with the check on file there, it should be safe even if you just run the loop on *.
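If you'd rather not modify the files, the same tr translation can be done on the fly at search time. A rough sketch, where grep_cr is a hypothetical helper name and --label is assumed to be available (it is in GNU grep):

```shell
# Hypothetical helper: search CR-terminated files without changing them.
# Output looks like grep -nH: file:line:text.
grep_cr() {
    pattern=$1
    shift
    for f in "$@"; do
        # Translate CR to LF on the fly; --label names the stdin stream.
        tr '\r' '\n' < "$f" | grep -nH --label="$f" -- "$pattern"
    done
}

# Usage (file name taken from the listing above):
#   grep_cr dsexample demoLPandBP.m
```

This is only meant for CR-only files. On a CRLF file the translation would turn every line ending into two LFs, so the reported line numbers would be doubled.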

Upvotes: 1

DaveC

Reputation: 167

I do not deserve any credit for this answer. It is all about line endings.

I have known for years about Windows line endings (CR-LF) and Linux line endings (LF only), but I had never heard of legacy Mac line endings (CR only). The latter really upset ack, grep, and I'm sure lots of other tools.

dos2unix and unix2dos had no effect on the files in legacy Mac format. But after using this nifty little endlines tool, I could eventually bring some consistency to the source files:

endlines : 129 files converted from :
              - 23 Legacy Mac (CR)
              - 105 Unix (LF)
              - 1 Windows (CR-LF)

Now, ack and grep are much happier.

Upvotes: 2

Andy Lester

Reputation: 93636

It looks like both ack and grep are getting confused by the line endings in the files. Run file *.m on your files. You'll see that some files have proper linefeeds, and some have CR line terminators.

If you clean up your line endings, things should be OK.
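If file isn't available, a rough check can be done by counting CR and LF bytes. This is a sketch only: line_ending_kind is a hypothetical helper name, and it cannot tell a pure CRLF file from one that mixes conventions.

```shell
# Hypothetical helper: rough line-ending check by counting CR and LF bytes.
line_ending_kind() {
    cr=$(( $(tr -cd '\r' < "$1" | wc -c) ))
    lf=$(( $(tr -cd '\n' < "$1" | wc -c) ))
    if [ "$cr" -eq 0 ]; then
        echo "no-CR"            # Unix (LF), or no line endings at all
    elif [ "$lf" -eq 0 ]; then
        echo "CR-only"          # legacy Mac
    else
        echo "CRLF-or-mixed"    # Windows, or a mix of conventions
    fi
}

for f in *.m; do
    [ -e "$f" ] || continue     # skip if the glob matched nothing
    printf '%s: %s\n' "$f" "$(line_ending_kind "$f")"
done
```

Any file reported as CR-only is one that grep and ack will treat as a single long line.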

Upvotes: 0
