Reputation: 7263
I am writing a script that reads fields from a data file to create usernames and passwords.
Here is how the data is formatted:
MWS1990 XXX-XX-XXXX STASNY, MATTHEW W SO-II BISS CPSC BS INFO TECH 412/882-0581
Here is the program:
for linePosition in {11..22}
do
holder=`sed -n "${linePosition}p" $1|awk '{print $1}'`
holder2=`sed -n "${linePosition}p" $1|awk '{print $12}'`
holder3=`sed -n "${linePosition}p" $1|awk '{print $7}'`
echo "UserName"
echo "$holder"
echo "password"
echo "$holder2"
echo "$holder3"
done
It returns output like this:
UserName
MWS1990
password
412/882-0581
BISS
Two things are wrong:
I would like to remove the year after the username, so the example above would be just MWS. What can I add to holder=`sed -n "${linePosition}p" $1|awk '{print $1}'` to make it return just the first 3 letters (preferably in lower case, but that is not necessary)?
I would like to remove the first 8 characters of the phone number, so instead of 412/882-0581 the phone number would read 0581.
Upvotes: 1
Views: 192
Reputation: 8172
So here is a revised answer
for linePosition in {11..22}
do
holder=`sed -n "${linePosition}p" $1|awk '{print $1}'`
holder2=`sed -n "${linePosition}p" $1|awk '{print $12}'`
holder3=`sed -n "${linePosition}p" $1|awk '{print $7}'`
echo "UserName"
echo `expr match "$holder" '\([A-Z|a-z]*\)'`
echo "password"
echo ${holder2: -4}
echo "$holder3"
done
I am sticking with the bash string substitution described in the link I posted in the comment.
However, I would like to point out a caveat about this solution.
Here is a quick description of this line of the script:
`expr match "$holder" '\([A-Z|a-z]*\)'`
The backticks run the command in a subshell inside your for loop. That subshell runs the expr command, passing it match, which returns the part of the string $holder that matches the regular expression [A-Z|a-z]* at the start of the string. Note that inside a bracket expression | is a literal character rather than alternation, so [A-Za-z]* is the stricter pattern. Ref http://tldp.org/LDP/abs/html/string-manipulation.html
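For comparison, the same leading letters can be taken with parameter expansion alone, which runs inside the current shell and starts no extra process. This is only a minimal sketch: it assumes the username field is always letters followed by digits, as in the MWS1990 sample, and the variable name letters is made up for the illustration.

```shell
#!/usr/bin/env bash
holder="MWS1990"   # sample value from the question

# Remove the longest suffix that begins with a digit ("1990" here),
# which leaves only the leading letters. Assumes letters-then-digits input.
letters=${holder%%[0-9]*}

echo "$letters"
```

If a field could contain letters after the digits, they would be dropped as well, so the expr version is the safer choice when the input is less regular.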
Now if your data file is not too long then this will be OK.
However, if your script has to process a large data file then I would suggest that you look at Olaf's solution.
Why?
If you are processing a massive file, or you do not know the size of the file your script will process, then it is best to avoid executing subshells within for loops.
Olaf's solution, which uses awk to carry out the processing you require, has an important advantage: all the work takes place within a single process. The for loop, by contrast, forks subshells and execs external commands (sed, awk and expr) several times for each line of your file. That is an expensive operation, and it can be a risky one when placed in a for loop.
In your code the for loop is currently bounded to a small set of lines, but if that ever changed, or a bug were introduced whereby the loop ran forever, the script could adversely affect the performance of your machine.
So although my answer may be easier to adapt to your code, Olaf's answer is better if you have to process a large amount of data.
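If you would rather stay in bash than switch to awk, one way to avoid the per-line process cost is to read the file once and use only shell builtins. The following is a rough sketch, not a drop-in replacement: the function name print_credentials and its optional line-range arguments are hypothetical, it assumes whitespace-separated fields laid out as in the sample line, and the ${user,,} lowercase expansion needs Bash 4.0 or later.

```shell
#!/usr/bin/env bash
# print_credentials FILE [FIRST] [LAST]
# Hypothetical helper: prints username/password lines for lines FIRST..LAST
# (default 11..22, as in the question) without starting sed, awk or expr.
print_credentials() {
    local file=$1 first=${2:-11} last=${3:-22}
    local lineno=0 line user
    local -a fields

    # "|| [[ -n $line ]]" also handles a final line with no trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        lineno=$((lineno + 1))
        (( lineno < first )) && continue
        (( lineno > last )) && break

        # Split the line on whitespace into an array (index 0 = first field).
        read -r -a fields <<< "$line"

        user=${fields[0]:0:3}            # first 3 characters of field 1
        printf 'UserName\n%s\npassword\n%s\n%s\n' \
            "${user,,}" \
            "${fields[11]: -4}" \
            "${fields[6]}"               # last 4 of field 12, then field 7
    done < "$file"
}
```

Called as print_credentials "$1", it covers the same lines 11 to 22 as the original loop, but the file is opened once and no subshell is created per line.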
Upvotes: 2
Reputation: 74108
Since you already use awk, you can reduce the number of commands involved:
awk 'NR >= 11 && NR <= 22 {
print "UserName";
print tolower(substr($1, 1, 3));
print "password";
print substr($12, 9);
print $7;}' $1
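To check the field numbers against the sample line from the question, the same awk body can be fed that one line on standard input. This is just a sketch for experimenting: the NR range is dropped because there is only one input line, and substr($12, length($12) - 3) is a small variation that takes the last four characters instead of relying on the phone number always being 12 characters long.

```shell
#!/bin/sh
# Sample record from the question; fields are separated by whitespace.
line='MWS1990 XXX-XX-XXXX STASNY, MATTHEW W SO-II BISS CPSC BS INFO TECH 412/882-0581'

result=$(printf '%s\n' "$line" | awk '{
    print "UserName"
    print tolower(substr($1, 1, 3))       # first 3 characters of field 1, lowercased
    print "password"
    print substr($12, length($12) - 3)    # last 4 characters of field 12
    print $7                              # field 7 unchanged
}')

printf '%s\n' "$result"
```

With this sample, field 1 is MWS1990, field 7 is BISS and field 12 is the phone number, so the names split across fields 3 to 5 do not disturb the numbering as long as every record has the same number of words.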
Upvotes: 3
Reputation: 2585
If you're using Bash, you can do both of those things easily with Bash substring extraction (see also here).
In other words, something like:
echo "${holder:0:3}" # "MWS"
echo "${holder2:8:4}" # "0581"
# Or, to begin indexing from the right end:
echo "${holder2:(-4)}" # "0581"
As for converting a string to lowercase in Bash, see e.g. ghostdog74's answer here.
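As a rough illustration of the lowercase step, here are two common options side by side. The ${var,,} expansion is available in Bash 4.0 and later; the tr pipeline works in older shells at the cost of an extra process. The variable names prefix, lower and lower_tr are made up for this sketch.

```shell
#!/usr/bin/env bash
holder="MWS1990"          # sample value from the question
prefix=${holder:0:3}      # first 3 characters: "MWS"

# Option 1: case-modifying expansion (requires Bash 4.0+).
lower=${prefix,,}

# Option 2: tr, which also works in older shells but starts a subprocess.
lower_tr=$(printf '%s' "$prefix" | tr '[:upper:]' '[:lower:]')

echo "$lower" "$lower_tr"
```

Inside a loop over many lines, the first form is preferable when the Bash version allows it, because it stays within the current shell.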
Upvotes: 2