LanceBaynes

Reputation: 1463

How do I find files that do not end with a newline/linefeed?

How can I list the names of plain text (.txt) files that don't end with a newline?

e.g.: list (output) this filename:

$ cat a.txt
asdfasdlsad4randomcharsf
asdfasdfaasdf43randomcharssdf
$ 

and don't list (output) this filename:

$ cat b.txt
asdfasdlsad4randomcharsf
asdfasdfaasdf43randomcharssdf

$

Upvotes: 48

Views: 21223

Answers (15)

ysth

Reputation: 98398

This is kludgy; someone surely can do better:

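# Read names line by line and quote every expansion, so paths with spaces
# and files whose last byte is a space do not break the test.
find . -name '*.txt' -type f | while IFS= read -r f; do
    if test "$(tail -c 1 "$f" | od -c | head -n 1 | tail -c 3)" != '\n'; then
        echo "$f"
    fi
done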

N.B. this answers the question in the title, which is different from the question in the body (which is looking for files that end with \n\n I think).
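
The difference between the title and the body comes down to the last two bytes of the file. Here is a minimal sketch that classifies a file by those bytes, assuming a POSIX shell with tail, od and tr. The function name classify_ending is made up for illustration and is not part of any answer here:

```shell
# classify_ending FILE  (hypothetical helper)
# Prints "none" if FILE does not end with a newline, "newline" if it ends
# with a newline but not a blank line, and "blank" if it ends with \n\n.
classify_ending() {
    # Hex dump of the last two bytes with all whitespace removed,
    # e.g. "660a" for "f\n" or "0a0a" for "\n\n".
    last2=$(tail -c 2 -- "$1" | od -An -tx1 | tr -d ' \n')
    case $last2 in
        0a0a) echo blank ;;
        *0a)  echo newline ;;
        *)    echo none ;;
    esac
}
```

With the files from the question, classify_ending a.txt should fall into the "newline" case and classify_ending b.txt into the "blank" case, as far as can be judged from the cat output shown there.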

Upvotes: 2

Tom Anderson

Reputation: 47193

If you have ripgrep installed:

rg -Ul '[^\n]\z'

That regular expression matches any character which is not a newline, and then the end of the file. Multi-line mode (-U) must be enabled to match on line terminators.

Upvotes: 20

user3719454

Reputation: 1024

I think this is the most understandable script:

for FN in `find . -type f` ; do if [[ `cat "$FN" | tail -c 1 | xxd -p` != '0a' ]] ; then echo "$FN" ; fi ; done

Upvotes: 0

Anthony Bush

Reputation: 521

Use pcregrep, a Perl Compatible Regular Expressions version of grep that supports a multiline mode via the -M flag, which can be used to check whether the last line ends with a newline:

pcregrep -LMr '\n\Z' .

In the above example we search recursively (-r) in the current directory (.), listing files that don't match (-L) our multiline (-M) regex, which looks for a newline at the end of a file ('\n\Z').

Changing -L to -l would list the files that do end with a newline.

pcregrep can be installed on macOS with the Homebrew pcre package: brew install pcre

Upvotes: 42

Gabriel Petrovay

Reputation: 21884

Here is another example, using a few bash built-in commands, which:

  • allows you to filter by extension (e.g. | grep '\.md$' keeps only the md files)
  • lets you pipe more grep commands to extend the filter (e.g. | grep -v '\.git' to exclude the files under .git)
  • lets you use the full power of grep's parameters for further filters or inclusions

The code basically iterates (for) over all the files (matching your chosen grep criteria) and, if the last character of a file is not a newline (-n "$(tail -c -1 "$file")"), prints the file name (echo "$file").

The verbose code:

for file in $(find . | grep '\.md$')
do
    if [ -n "$(tail -c -1 "$file")" ]
    then
        echo "$file"
    fi
done

A bit more compact:

for file in $(find . | grep '\.md$')
do
    [ -n "$(tail -c -1 "$file")" ] && echo "$file"
done

and, of course, the 1-liner for it:

for file in $(find . | grep '\.md$'); do [ -n "$(tail -c -1 "$file")" ] && echo "$file"; done
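
One caveat: for file in $(find ...) splits its input on whitespace, so a path containing a space is treated as several files. Here is a sketch of the same check that lets find hand the names to the shell directly, assuming a POSIX find and sh. The function name list_unterminated and its two arguments are invented for this example:

```shell
# list_unterminated DIR PATTERN  (hypothetical helper)
# Prints every regular file under DIR whose name matches the glob PATTERN
# and whose last byte is not a newline. Empty files are not reported.
list_unterminated() {
    find "$1" -type f -name "$2" -exec sh -c '
        for file do
            # Non-empty only when the last byte is not a newline.
            if [ -n "$(tail -c 1 -- "$file")" ]; then
                printf "%s\n" "$file"
            fi
        done
    ' sh {} +
}
```

For the md filter used above this would be called as list_unterminated . '*.md'. Further grep filters can still be piped after it, as long as the file names contain no newlines.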

Upvotes: 0

ppar

Reputation: 600

This example

  • Works on macOS (BSD) and GNU/Linux
  • Uses standard tools: find, grep, sh, file, tail, od, tr
  • Supports paths with spaces

Oneliner:

find . -type f -exec sh -c 'file -b "{}" | grep -q text' \; -exec sh -c '[ "$(tail -c 1 "{}" | od -An -a | tr -d "[:space:]")" != "nl" ]' \; -print

More readable version

  • Find under current directory
    • Regular files
    • That 'file' (brief mode) considers text
    • Whose last byte (tail -c 1) is not represented by od's named character "nl"
    • And print their paths

#!/bin/sh
find . \
    -type f \
    -exec sh -c 'file -b "{}" | grep -q text' \; \
    -exec sh -c '[ "$(tail -c 1 "{}" | od -An -a | tr -d "[:space:]")" != "nl" ]' \; \
    -print

Finally, a version with a -f flag to fix the offending files (requires bash).

#!/bin/bash
# Finds files without final newlines
# Pass "-f" to also fix those files
fix_flag="$([ "$1" == "-f" ] && echo -true || echo -false)"
find . \
    -type f \
    -exec sh -c 'file -b "{}" | grep -q text' \; \
    -exec sh -c '[ "$(tail -c 1 "{}" | od -An -a | tr -d "[:space:]")" != "nl" ]' \; \
    -print \
    $fix_flag \
    -exec sh -c 'echo >> "{}"' \;

Upvotes: 3

Julien Palard

Reputation: 11566

OK, it's my turn; I'll give it a try:

find . -type f -print0 | xargs -0 -L1 bash -c 'test "$(tail -c 1 "$0")" && echo "No new line at end of $0"'
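
Why the test works: command substitution strips trailing newlines, so "$(tail -c 1 file)" is the empty string when the last byte is a newline (or the file is empty) and non-empty otherwise. The predicate can be isolated in a small function as a sketch, assuming a POSIX shell. The name lacks_final_newline is made up here:

```shell
# lacks_final_newline FILE  (hypothetical helper)
# Succeeds when FILE is non-empty and its last byte is not a newline.
lacks_final_newline() {
    # tail -c 1 emits the last byte; $(...) then drops it if it is a
    # newline, leaving an empty string for properly terminated files.
    [ -n "$(tail -c 1 -- "$1")" ]
}
```

It would be used as lacks_final_newline a.txt && echo "No new line at end of a.txt".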

Upvotes: 44

Dennis Williamson

Reputation: 360143

Give this a try:

find . -type f -exec sh -c '[ -z "$(sed -n "\$p" "$1")" ]' _ {} \; -print

It will print filenames of files that end with a blank line. To print files that don't end in a blank line change the -z to -n.

Upvotes: 12

udondan

Reputation: 59989

The best oneliner I could come up with is this:

git grep --cached -Il '' | xargs -L1 bash -c 'if test "$(tail -c 1 "$0")"; then echo "No new line at end of $0"; exit 1; fi'

This uses git grep because, in my use case, I want to ensure that files committed to a git branch end with a newline.

If this is required outside of a git repo, you can of course just use grep instead.

grep -RIl '' . | xargs -L1 bash -c 'if test "$(tail -c 1 "$0")"; then echo "No new line at end of $0"; exit 1; fi'

Why do I use grep? Because you can easily filter out binary files with -I.

Then the usual xargs/tail thingy found in other answers, with the addition to exit with 1 if a file has no newline. So this can be used in a pre-commit githook or CI.
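
The same idea can also be written as a shell function that reports every offending file and returns a single exit status, which may be easier to call from a hook script. This is a minimal sketch, assuming a POSIX shell and that the names of the text files are passed as arguments. check_final_newlines is a name invented here:

```shell
# check_final_newlines FILE...  (hypothetical helper)
# Reports each FILE that lacks a final newline on stderr and returns 1 if
# any such file was found, 0 otherwise.
check_final_newlines() {
    status=0
    for file do
        if [ -n "$(tail -c 1 -- "$file")" ]; then
            echo "No new line at end of $file" >&2
            status=1
        fi
    done
    return "$status"
}
```

A hook could then end with something like check_final_newlines "$@" || exit 1.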

Upvotes: 4

Manish Jain

Reputation: 91

Most solutions on this page do not work for me (FreeBSD 10.3 amd64). Ian Will's OSX solution does almost always work, but is pretty difficult to follow :-(

There is an easy solution that almost always works too (if $f is the file):

sed -i '' -e '$a\' "$f"

There is a major problem with the sed solution: it never gives you the opportunity to just check (and not append a newline).

Both of the above solutions fail for DOS files. I think the most portable/scriptable solution is probably the easiest one, which I developed myself :-)

Here is that elementary sh script, which combines file, dos2unix/unix2dos and tail. In production, you will likely need to put $f in double quotes ("$f") everywhere, including in the tail call whose output is stored in the shell variable last.

if file $f | grep 'ASCII text' > /dev/null; then
    if file $f | grep 'CRLF' > /dev/null; then
        type unix2dos > /dev/null || exit 1
        dos2unix $f
        last="`tail -c1 $f`"
        [ -n "$last" ] && echo >> $f
        unix2dos $f
    else
        last="`tail -c1 $f`"
        [ -n "$last" ] && echo >> $f
    fi
fi

Hope this helps someone.

Upvotes: 2

pelagic

Reputation: 71

If you are using 'ack' (http://beyondgrep.com) as an alternative to grep, you can just run this:

ack -v '\n$'

It actually searches for all lines that don't match (-v) a newline at the end of the line.

Upvotes: 6

Ian Will

Reputation: 1052

This example works for me on OSX (many of the above solutions did not)

for file in `find . -name "*.java"`
do
  result=`od -An -tc -j $(( $(ls -l $file  | awk '{print $5}') - 1 )) $file`
  last_char=`echo $result | sed 's/ *//'`
  if [ "$last_char" != "\n" ]
  then
    #echo "Last char is .$last_char."
    echo $file
  fi
done

Upvotes: 1

Andrei Sfrent

Reputation: 189

This should do the trick:

#!/bin/bash

for file in `find $1 -type f -name "*.txt"`;
do
        nlines=`tail -n 1 $file | grep '^$' | wc -l`
        if [ $nlines -eq 1 ]
                then echo $file
        fi
done;

Call it this way: ./script dir

E.g. ./script /home/user/Documents/ lists all .txt files under /home/user/Documents whose last line is empty (i.e. files that end with \n\n).

Upvotes: 3

marco

Reputation: 4675

Since your question has the perl tag, I'll post an answer which uses it:

find . -type f -name '*.txt' -exec perl check.pl {} +

where check.pl is the following:

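#!/bin/perl

use strict;
use warnings;

foreach my $path (@ARGV) {
    # Skip files that cannot be opened instead of reading from a bad handle.
    open(my $fh, '<', $path) or do { warn "$path: $!\n"; next; };

    # Position on the next-to-last byte (whence 2 means end of file).
    seek($fh, -2, 2);

    my $c = '';

    read($fh, $c, 1);
    if ( $c ne "\n" ) {
        print "$path\n";
    }
    close($fh);
}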

This Perl script opens the files passed as parameters one at a time and reads only the next-to-last character; if it is not a newline character, it prints the filename, otherwise it does nothing.

Upvotes: 2

Diego Torres Milano

Reputation: 69228

Another option:

$ find . -name "*.txt" -print0 | xargs -0I {} bash -c '[ -z "$(tail -n 1 {})" ] && echo {}'

Upvotes: 1
