Reputation: 1105
I would like to create a bash array of names. I have a file of this form:
<span class="username js-action-profile-name">@paulburt07</span>
<span class="username js-action-profile-name">@DavidWBrown7</span>
<span class="username js-action-profile-name">@MikeLarkan</span>
<span class="username js-action-profile-name">@WeathermanABC</span>
<span class="username js-action-profile-name">@JoshHoltTEN</span>
<span class="username js-action-profile-name">@TonyAuden</span>
<span class="username js-action-profile-name">@Magdalena_Roze</span>
<span class="username js-action-profile-name">@janesweather7</span>
<span class="username js-action-profile-name">@VanessaOHanlon</span>
And I need an array like
array=( "paulburt07" "DavidWBrown7" "MikeLarkan" "WeathermanABC" "JoshHoltTEN" "TonyAuden" "Magdalena_Roze" "janesweather7" "VanessaOHanlon" )
Any idea?
Upvotes: 0
Views: 66
Reputation: 63912
One of many possible solutions:
array=($(grep -oP '@\K(.*)(?=<)' file))
EDIT:
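There is not much to explain. grep searches the file for the pattern defined by the regular expression (see man grep). The -o option prints only the matched part of each line, and -P tells grep to use Perl-style regexes.
The pattern @\K(.*)(?=<) means:
- @ matches a literal @
- \K drops everything matched so far from the reported match (but remembers the position)
- (.*) matches any characters
- (?=<) requires a following <, checked by a lookahead so it is not part of the match
The $(command) construct is called command substitution, and array=(...) assigns the resulting words to the array.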
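As a hedged variation on the one-liner above, the sketch below reads the grep output with mapfile instead of an unquoted command substitution, so each output line becomes exactly one array element with no word splitting or globbing. It assumes bash 4 or later and a grep that supports -P (GNU grep). The temporary sample file and the array name names are made up for illustration, and [^<]+ replaces (.*)(?=<) so the match stops at the first <.

```bash
#!/usr/bin/env bash
# Sketch only: the sample file is a hypothetical stand-in for the asker's file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<span class="username js-action-profile-name">@paulburt07</span>
<span class="username js-action-profile-name">@DavidWBrown7</span>
<span class="username js-action-profile-name">@Magdalena_Roze</span>
EOF

# -o prints only the matched text; -P enables Perl-style regexes.
# \K drops the leading @ from the match; [^<]+ stops at the first '<'.
# mapfile -t stores one line per element and strips the trailing newline.
mapfile -t names < <(grep -oP '@\K[^<]+' "$sample")
rm -f "$sample"

printf '%s\n' "${#names[@]}"   # number of elements
printf '%s\n' "${names[@]}"    # one name per line
```

Quoting "${names[@]}" when expanding the array keeps each element intact, even if a name ever contained spaces or glob characters.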
EDIT2: Because your original input probably contains more HTML tags, you can use an HTML parser, for example:
array=($(perl -Mojo -E 'say $_->text for x(b("filename.html")->slurp)->find(q{span[class~="username"]})->each'))
This prints the text content of every <span> whose class attribute contains username, in any HTML, regardless of its formatting. For the above you need to have Mojolicious installed.
Upvotes: 2
Reputation: 84561
It is fairly simple using sed and a tmp file:
#!/bin/bash
fname=${1:-htmlnames.txt} # original html file
tmp=${2:-htmltmp.txt} # temp file to use
sed -e 's/.*@//' "$fname" > "$tmp" # remove up to '@' and place in temp
sed -i 's/[<].*$//' "$tmp" # remove remainder in place in temp
namearray=( $(<"$tmp") ) # read temp file into array
rm "$tmp" # remove temp file
for i in "${namearray[@]}"; do # print out to verify
    printf " %s\n" "$i"
done
exit 0
output:
alchemy:~/scr/tmp/stack/tmp> bash htmlnames.sh
paulburt07
DavidWBrown7
MikeLarkan
WeathermanABC
JoshHoltTEN
TonyAuden
Magdalena_Roze
janesweather7
VanessaOHanlon
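The same two substitutions can probably be folded into a single sed call, which avoids the intermediate temp file. The following is only a sketch under assumptions: it needs bash 4 or later for mapfile, and the inline html variable is a made-up stand-in for the real input file.

```bash
#!/usr/bin/env bash
# Sketch only: html is hypothetical sample input, not the asker's real file.
html='<span class="username js-action-profile-name">@JoshHoltTEN</span>
<div>no user name here</div>
<span class="username js-action-profile-name">@TonyAuden</span>'

# The capture group keeps the text between the last '@' and the next '<'.
# -n plus the p flag prints only lines where the substitution succeeded,
# so lines without an @name are skipped.
mapfile -t namearray < <(sed -n 's/.*@\([^<]*\)<.*/\1/p' <<< "$html")

for name in "${namearray[@]}"; do   # print out to verify
    printf '%s\n' "$name"
done
```

To run it on a real file, replace the here-string with the file name as the last argument to sed.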
Upvotes: 1