Reputation: 12178
I have a text file with 1.3 million rows and 258 columns, delimited by semicolons (;). How can I find out which characters occur in the file, excluding letters of the alphabet (both upper and lower case), the semicolon (;), the single quote (') and the double quote (")? Ideally the result should be a list with no duplicates.
Upvotes: 0
Views: 85
Reputation: 19345
Use the following pipeline (replace file with the name of your file):
# Remove the characters you want to exclude
tr -d 'A-Za-z;"'\' <file |
# One character on each line
sed 's/\(.\)/\1\
/g' |
# Remove duplicates
sort -u
Example:
echo '2343abc34;ABC;;@$%"' |
tr -d 'A-Za-z;"'\' |
sed 's/\(.\)/\1\
/g' |
sort -u
$
%
2
3
4
@
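A variant of the same pipeline, as a rough sketch that assumes the file is single-byte text (ASCII or Latin-1). It wraps the steps in a helper function, here given the hypothetical name unique_chars, and swaps the multi-line sed step for fold -b -w 1, which also puts one byte on each line. The sed version appends an extra empty line after every input line, so an empty entry can show up in the sorted result; fold avoids that, although a blank line can still appear if tr deletes every character of some row. LC_ALL=C is an assumption added here so that the A-Za-z ranges and the sort order are byte-based and do not depend on the locale.

```shell
# unique_chars is a hypothetical helper name, not part of the original answer.
# It reads text on stdin and prints each distinct remaining byte once.
# Assumption: single-byte input; multi-byte (UTF-8) characters would be split.
unique_chars() {
    LC_ALL=C tr -d 'A-Za-z;"'\' |   # delete letters, semicolons and both kinds of quote
    LC_ALL=C fold -b -w 1 |         # one byte per line (swapped in for the sed step)
    LC_ALL=C sort -u                # remove duplicates
}

# Same sample input as in the example above
printf '%s\n' '2343abc34;ABC;;@$%"' | unique_chars
```

For the real file this would be called as unique_chars <file. On the sample input it should list the same six characters as the example above.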
Upvotes: 2