user3759763

Reputation: 111

How to find non-printable characters in a file?

I am trying to find the non-printable characters in a data file on Unix. Code:

#!/bin/ksh
export SRCFILE='/data/temp1.dat'
while read line 
do
len=lenght($line)
for( $i = 0; $i < $len; $i++ ) {

        if( ord(substr($line, $i, 1)) > 127 )
        {
            print "$line\n";
            last;
        }
done < $SRCFILE

The code is not working. Please help me find a solution.

Upvotes: 11

Views: 34195

Answers (3)

paxdiablo

Reputation: 881403

You can use grep for finding non-printable characters in a file, something like the following, which finds all non-printable-ASCII and all non-ASCII:

grep -P -n "[\x00-\x1F\x7F-\xFF]" input_file

-P gives you the more powerful Perl regular expressions (PCREs) and -n shows line numbers.

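If your grep doesn't support PCREs, I'd just use Perl for this directly (the -l switch strips each line's trailing newline, which would otherwise fall inside the \x00-\x1F range and make every line match):

perl -lne '$x++;if($_=~/[\x00-\x1F\x7F-\xFF]/){print"$x:$_"}' input_file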
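
The byte ranges used above (0x00-0x1F and 0x7F-0xFF) are exactly the complement of printable ASCII, 0x20-0x7E, that is, space through tilde. As a rough sketch, assuming neither -P nor Perl is available, the same set can be written as a plain bracket expression under the C locale. The function name find_nonprintable and the sample data are made up for illustration.

```shell
# Hypothetical helper, not part of the answer above: print every line that
# contains a byte outside printable ASCII, prefixed with its line number.
find_nonprintable() {
    # LC_ALL=C makes grep work on single bytes; '[^ -~]' then matches any
    # byte that is not in the range space (0x20) through tilde (0x7E).
    LC_ALL=C grep -n '[^ -~]' "$1"
}

# Sample data: line 2 holds a BEL byte (octal 007), line 3 holds byte 0xE9.
sample=$(mktemp)
printf 'plain text\nbell\007here\ncaf\351\n' > "$sample"

# Show only the line numbers of the offending lines.
find_nonprintable "$sample" | cut -d: -f1
rm -f "$sample"
```

On this sample the helper reports lines 2 and 3. Like the original character class, it also flags tabs, since a tab is byte 0x09.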

Upvotes: 17

mike marchywka

Reputation: 11

This may sound trivial, but I wasn't sure how to do it offhand. I have become fond of "od"; depending on what you are doing, you may want a tool suited to printing arbitrary characters. The awk code below is not very elegant, but it is flexible if you are looking for specific bytes; the point is just to show the use of od. Note the quirks with awk comparisons, hence the spaces appended to each operand:

cat filename | od -A n -t x1z | awk '{ p=0; i=1; if ( NF>16) { while (i<17) {if ( $i!="0d"){ if ( $i!="0a") {if ( $i" " < "20 " ) {print $i ; p=1;}  if ( $i" "> "7f "){print $i;   p=1;}}}  i=i+1} if (p==1) print $0; }}' | more
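
A shorter variation on the same idea, offered as a sketch rather than a replacement: delete every byte considered acceptable and let od dump whatever is left. Here LF, CR and printable ASCII are treated as acceptable, roughly mirroring the 0a/0d exclusions in the awk code. The function name leftover_bytes and the sample data are made up for illustration.

```shell
# Hypothetical helper: show, in hex, the bytes of a file that are neither
# LF (012), CR (015), nor printable ASCII (040-176, octal).
leftover_bytes() {
    # tr -d removes the acceptable bytes; od prints the remainder as
    # one-byte hex values without the offset column.
    LC_ALL=C tr -d '\012\015\040-\176' < "$1" | od -An -tx1
}

# Sample data: one BEL byte (octal 007) and one 0xE9 byte (octal 351).
sample=$(mktemp)
printf 'ok line\nbell\007here\ncaf\351\n' > "$sample"

leftover_bytes "$sample"
rm -f "$sample"
```

This only tells you which bytes occur, not where they are, so it works best as a quick first check before one of the line-oriented commands.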

Upvotes: -3

blackSmith

Reputation: 3154

You may try something like this:

grep '[^[:print:]]' filePath
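
What [:print:] covers depends on the current locale, and it does not include the tab character. The sketch below, with made-up helper names and sample data, pins the C locale so that only ASCII 0x20-0x7E counts as printable, and shows one way to tolerate tabs by adding [:blank:] to the negated set.

```shell
# Hypothetical helpers for illustration only.
# strict_check: flag any byte that is not printable ASCII (tabs included).
strict_check()  { LC_ALL=C grep -n '[^[:print:]]' "$1"; }
# lenient_check: same, but [:blank:] (space and tab) is also accepted.
lenient_check() { LC_ALL=C grep -n '[^[:print:][:blank:]]' "$1"; }

# Sample data: line 1 contains a tab, line 2 contains a 0x01 control byte.
sample=$(mktemp)
printf 'a\tb\nc\001d\n' > "$sample"

strict_check  "$sample" | cut -d: -f1   # reports both lines
lenient_check "$sample" | cut -d: -f1   # reports only the control byte's line
rm -f "$sample"
```

Without LC_ALL=C, a UTF-8 locale would typically treat accented letters as printable, which may or may not be what you want.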

Upvotes: 14
