Reputation: 111
I am trying to find the unprintable characters in a data file in Unix. Code:
#!/bin/ksh
export SRCFILE='/data/temp1.dat'
while IFS= read -r line
do
    len=${#line}
    i=0
    while (( i < len ))
    do
        # printf '%d' "'c" prints the numeric value of character c
        ord=$(printf '%d' "'${line:i:1}")
        if (( ord > 127 ))
        then
            print -r -- "$line"
            break
        fi
        (( i++ ))
    done
done < "$SRCFILE"
This loops over every character in the shell, which is slow on large files. Is there a simpler way to find the lines that contain unprintable characters?
Upvotes: 11
Views: 34195
Reputation: 881403
You can use grep for finding non-printable characters in a file, something like the following, which finds all non-printable ASCII and all non-ASCII characters:
grep -P -n "[\x00-\x1F\x7F-\xFF]" input_file
-P gives you the more powerful Perl-compatible regular expressions (PCREs) and -n shows line numbers.
If your grep doesn't support PCREs, I'd just use Perl for this directly:
perl -ne '$x++;if($_=~/[\x00-\x1F\x7F-\xFF]/){print"$x:$_"}' input_file
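As a quick check of the Perl fallback above, here is a minimal session; the file name sample.dat and its contents are made up for illustration:

```shell
# Build a two-line sample file; \303\251 is the UTF-8 byte pair for "e-acute",
# i.e. two bytes above 0x7F
printf 'clean line\nbad\303\251line\n' > sample.dat

# Prints each offending line prefixed with its line number
perl -ne '$x++;if($_=~/[\x00-\x1F\x7F-\xFF]/){print"$x:$_"}' sample.dat
# → 2:badéline
```

Perl reads the file byte by byte here, so the test works regardless of locale, unlike some grep -P builds.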
Upvotes: 17
Reputation: 11
This sounds pretty trite, but I was not sure how to do it just now. I have become fond of od; depending on what you are doing, you may want something suited to printing arbitrary characters. The awk code is not very elegant, but it is flexible if you are looking for specific bytes; the point here is just to show the use of od. Note the quirks of the awk string comparisons and the appended spaces:
od -A n -t x1z filename | awk '{ p=0; i=1; if (NF>16) { while (i<17) { if ($i!="0d") { if ($i!="0a") { if ($i" " < "20 ") {print $i; p=1;} if ($i" " > "7f ") {print $i; p=1;} } } i=i+1 } if (p==1) print $0; } }' | more
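To see what the od part of that pipeline emits on its own, a tiny sketch (the input bytes are made up; exact column spacing varies between od implementations):

```shell
# Feed three bytes plus a newline to od: 'a', 'b', and 0xC3 (a byte above 0x7F)
printf 'ab\303\n' | od -A n -t x1z
# The hex column shows 61 62 c3 0a; the z modifier appends a printable-character view
```

Each output line then carries up to 16 hex byte values, which is what the awk script's field tests (fields 1 through 16) walk over.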
Upvotes: -3
Reputation: 3154
You may try something like this:
grep '[^[:print:]]' filePath
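One caveat: [:print:] is locale-dependent, so in a UTF-8 locale valid multibyte characters count as printable. Forcing the C locale makes every byte above 0x7F non-printable. A small sketch, with a made-up file name:

```shell
# Two lines; \303\251 is a byte pair above 0x7F (UTF-8 "e-acute")
printf 'plain ascii\nhigh\303\251byte\n' > sample.dat

# In the C locale, bytes 0x80-0xFF fall outside [:print:], so line 2 matches
LC_ALL=C grep -n '[^[:print:]]' sample.dat
```

Without LC_ALL=C, a UTF-8 grep would treat the second line as fully printable and report nothing.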
Upvotes: 14