How to find non-printable characters in the file?

user3759763 · Sep 8, 2014

I am trying to find the unprintable characters in a data file on Unix. My code:

#!/bin/ksh
export SRCFILE='/data/temp1.dat'
while read line 
do
len=lenght($line)
for( $i = 0; $i < $len; $i++ ) {

        if( ord(substr($line, $i, 1)) > 127 )
        {
            print "$line\n";
            last;
        }
done < $SRCFILE

The code is not working. Please help me find a solution.

Answer

paxdiablo · Sep 8, 2014

You can use grep for finding non-printable characters in a file, something like the following, which finds all non-printable-ASCII and all non-ASCII:

grep -P -n "[\x00-\x1F\x7F-\xFF]" input_file

-P gives you the more powerful Perl regular expressions (PCREs) and -n shows line numbers.

If your grep doesn't support PCREs, I'd just use Perl for this directly:

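perl -lne '$x++;if($_=~/[\x00-\x1F\x7F-\xFF]/){print"$x:$_"}' input_file

-l strips the trailing newline from each line before matching (otherwise the newline itself, \x0A, falls inside the \x00-\x1F range and every line matches) and adds it back on output.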
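
To sanity-check the pattern, here is a minimal sketch that builds a throwaway sample file and runs the Perl variant against it. The file contents and the variable names (sample, flagged) are made up for illustration, and it assumes perl and mktemp are available on your system. It prints only the line numbers, using Perl's built-in $. counter.

```shell
#!/bin/sh
# Hypothetical sample file: line 2 holds a BEL byte (octal 007),
# line 3 holds a non-ASCII byte (octal 351 = 0xE9).
sample=$(mktemp)
printf 'plain text\nbell\007here\ncaf\351\n' > "$sample"

# Print the line number ($.) of every line containing a control or non-ASCII byte.
# -l chomps the trailing newline so it is not itself matched as \x0A.
flagged=$(perl -lne 'print $. if /[\x00-\x1F\x7F-\xFF]/' "$sample")
echo "$flagged"

# Clean up the temporary file.
rm -f "$sample"
```

With this sample, only lines 2 and 3 should be reported. Note that the range \x00-\x1F also covers tab (\x09) and carriage return (\x0D), so tab-separated files or files with Windows line endings will be flagged unless you exclude those bytes from the class.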