Reputation: 5456
I want to count the number of characters in a file, from the start to the EOF character. Can anyone tell me how to do this in a shell script?
Upvotes: 98
Views: 193366
Reputation: 8786
I'll cover the scenario where echo appears to count an extra character, because the newline it appends is counted too.
So, if I do
# echo foo | wc -m
4
I get 4, as it counts the newline character as well.
To avoid counting it, run echo with the -n parameter.
# echo -n foo | wc -m
3
Upvotes: 0
Reputation: 360105
This counts the bytes in a file:
wc -c filename
If you want only the count without the filename being repeated in the output:
wc -c < filename
This will count characters in multibyte files (Unicode etc.):
wc -m filename
(as shown in Sébastien's answer).
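A small sketch of the difference between the two counts, assuming a wc that supports both -c and -m (both are POSIX options). The function name count_bytes_and_chars is made up for illustration. Note that wc -m interprets the file according to the current locale, so multibyte characters are only counted as single characters under a matching locale, such as a UTF-8 one.

```shell
#!/bin/sh
# Hypothetical helper: print byte and character counts for a file.
count_bytes_and_chars() {
    # Redirecting with < keeps the filename out of wc's output;
    # the arithmetic expansion strips any padding some wc versions add.
    bytes=$(( $(wc -c < "$1") ))
    chars=$(( $(wc -m < "$1") ))
    echo "bytes=$bytes chars=$chars"
}
```

Call it as count_bytes_and_chars filename. For a pure ASCII file the two numbers are equal.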
Upvotes: 152
Reputation: 3031
Credits to user.py et al.
echo "ää" > /tmp/your_file.txt
cat /tmp/your_file.txt | wc -m
results in 3.
In my example the expected result is 2 (twice the letter ä). However, echo (or vi) adds a line break \n to the end of the output (or file). So two ä and one Linux line break \n are counted, which makes three.
Working with pipes | is not the shortest variant, but this way I have to know fewer wc parameters by heart. In addition, cat is bullet-proof in my experience.
Tested on Ubuntu 18.04.1 LTS (Bionic Beaver).
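If the line break should not be counted, one option is to delete newlines before counting. This is only a sketch: it removes every \n in the file, not just the final one, and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count characters, ignoring all newline characters.
count_chars_without_newlines() {
    # tr deletes every "\n"; wc -m counts what remains (locale-dependent).
    # The arithmetic expansion strips padding that some wc versions print.
    echo $(( $(tr -d '\n' < "$1" | wc -m) ))
}
```

Run as count_chars_without_newlines /tmp/your_file.txt. Under a UTF-8 locale, this should give 2 for the file from the example above.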
Upvotes: 0
Reputation: 89
To get the exact character count of a string, use printf rather than echo: echo appends a newline, and wc counts it. Similarly, running cat or wc directly on a file includes the trailing newline that most editors add at the end. So the text 'hello' gives 6 if you use echo, but exactly 5 if you use printf, because there is no newline to count.
How to use printf for counting characters within strings:
$ printf '6chars' | wc -m
6
To turn this into a script you can run on a text file to count characters, save the following in a file called print-character-amount.sh:
#!/bin/bash
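# Command substitution strips trailing newlines from the file contents
characters=$(cat "$1")
# '%s' prints the text verbatim instead of treating it as a format string
printf '%s' "$characters" | wc -m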
Run chmod +x on print-character-amount.sh, place the file in a directory on your PATH (e.g. /usr/bin/ or any directory exported in PATH in your .bashrc file), then run the script on a text file by typing:
print-character-amount.sh file-to-count-characters-of.txt
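A caveat when counting with printf: its first argument is a format string, so % signs and backslashes in the text are interpreted rather than printed. A sketch of a variant that avoids this, with a hypothetical function name, passes the text as data to a fixed '%s' format:

```shell
#!/bin/sh
# Hypothetical helper: count the characters in a string argument.
count_string_chars() {
    # '%s' prints the argument verbatim, so "%" and "\" are not interpreted.
    # The arithmetic expansion strips padding that some wc versions print.
    echo $(( $(printf '%s' "$1" | wc -m) ))
}
```

For example, count_string_chars '100%' prints 4.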
Upvotes: 7
Reputation: 207465
I would have thought that it would be better to use stat to find the size of a file, since the filesystem knows it already, rather than causing the whole file to be read with awk or wc, especially if it is a multi-GB file or one that may be non-resident in the file-system on an HSM.
stat -c%s file
Yes, I concede it doesn't account for multi-byte characters, but would add that the OP has never clarified whether that is/was an issue.
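Note that -c%s is the GNU coreutils spelling; BSD and macOS stat use -f%z instead. A hedged sketch that tries both and falls back to wc -c (the function name is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: print a file's size in bytes, asking the
# filesystem where possible instead of reading the file.
file_size_bytes() {
    [ -f "$1" ] || return 1
    if stat -c%s "$1" 2>/dev/null; then
        return 0      # GNU coreutils stat
    elif stat -f%z "$1" 2>/dev/null; then
        return 0      # BSD / macOS stat
    else
        # Fallback: reads the whole file, but needs only POSIX wc.
        echo $(( $(wc -c < "$1") ))
    fi
}
```

Like stat itself, this reports bytes, not characters.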
Upvotes: 0
Reputation: 1
The following script is tested and gives exactly the expected results. Note that it counts the occurrences of a given word in a file, not characters.
#!/bin/bash
echo "Enter the file name"
read -r file
echo "Enter the word to be found"
read -r word
count=0
# The unquoted substitution splits the file contents on whitespace
for i in $(cat "$file")
do
if [ "$i" = "$word" ]
then
count=$((count + 1))
fi
done
echo "The number of occurrences is $count"
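The same whitespace-separated word count can be sketched without a shell loop, assuming a grep that supports -c, -x and -F (all are POSIX options). The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count how often a word occurs in a file,
# where words are separated by whitespace.
# Usage: count_word_occurrences FILE WORD
count_word_occurrences() {
    # tr puts every word on its own line; grep -c counts the lines that
    # match WORD exactly (-x) as a fixed string (-F), not as a regex.
    tr -s '[:space:]' '\n' < "$1" | grep -c -x -F -e "$2"
}
```

Because of -x, a search for foo does not count foobar.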
Upvotes: 0
Reputation: 25599
awk only
awk 'BEGIN{FS=""}{for(i=1;i<=NF;i++)c++}END{print "total chars:"c}' file
shell only
var=$(<file)
echo ${#var}
Ruby(1.9+)
ruby -0777 -ne 'print $_.size' file
Upvotes: 1
Reputation: 26743
#!/bin/sh
wc -m "$1" | awk '{print $1}'
wc -m counts the number of characters; the awk command prints only the count, omitting the filename.
wc -c would give you the number of bytes (which can differ from the number of characters, because depending on the encoding a character may be encoded on several bytes).
Upvotes: 25