Reputation: 3325
#!/bin/sh
URL1=http://runescape.com/title.ws
tot=`wget -qO- $URL1 | grep -i PlayerCount | cut -d\> -f4 | cut -d\< -f1 | sed -e's/,//'`
URL2=http://oldschool.runescape.com
b=`wget -qO- $URL2| grep "people playing" | awk '{print $4}'`
a=`expr $tot - $b`
export LC_ALL=en_US.UTF-8
a_with_comma=`echo $a | awk "{printf \"%'d\n\", \\$1}"`
echo "$a_with_comma people `date '+%r %b %d %Y'`"
This grabs one number from each of URL1 and URL2 and subtracts the URL2 count from the URL1 count. I'm trying to get the "48,877 Players Online Now" figure from http://www.runescape.com/title.ws (URL1).
URL2 works fine; I just can't get the number from URL1.
Upvotes: 0
Views: 71
Reputation: 189327
Here's a quick attempt at refactoring the original into just two awk
instances, to get rid of most of the contortions.
#!/bin/sh
#URL1=http://runescape.com/title.ws
#tot=$(wget -qO- "$URL1" | awk 'tolower($0) ~ /playercount/ {
# # Trim anything after this expression
# gsub(/<\/span> Players Online Now<\/span>.*/, "")
# # From the remainder, trim anything up through last tag close
# gsub(/.*>/, "")
# # Should be left with a number. Remove any thousands separator
# gsub(/,/, "")
# # Should have a computer-readable number now. Print it
# print }')
URL0='http://www.runescape.com/c=eWHvvLATbvs/player_count.js?varname=iPlayerCount&callback=jQuery17201610493347980082_1378103074657&_=1378103197632'
tot=$(wget -qO- "$URL0" | awk -F '[()]' '{ print $2 }')
URL2=http://oldschool.runescape.com
wget -qO- "$URL2" | awk -v tot="$tot" -v fmt="%'d people " '
/people playing/ { printf(fmt, tot-$4 )}'
date '+%r %b %d %Y'
The processing for URL1 should be somewhat more robust now, as it looks for a span followed by Players Online Now. They could change the formatting of the page at any time in such a way that this breaks again, of course. Thus, it would probably be even better to use a JSON API if they offer one. (Brief googling suggests this exists, but is undocumented. The main documentation seems to be at http://services.runescape.com/m=rswiki/en/Grand_Exchange_APIs but this has nothing about summary player stats.)
The comments are not strictly necessary, of course. They should help you figure out what to change if the page's source changes again, so it's not a good idea to trim them, unless you learn Awk well enough that you don't need them.
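If you want to see what each gsub step leaves behind, here is a minimal sketch that runs the same three substitutions on a made-up line. The sample markup and the strip_to_count helper are assumptions for illustration; the real page source may look different.

```shell
#!/bin/sh
# Invented sample line; the real markup on title.ws may differ.
sample='<span id="playerCount"><span>48,877</span> Players Online Now</span></p>'

# strip_to_count is a hypothetical wrapper around the same three gsub calls.
strip_to_count() {
    printf '%s\n' "$1" | awk 'tolower($0) ~ /playercount/ {
        gsub(/<\/span> Players Online Now<\/span>.*/, "")  # drop the label and everything after it
        gsub(/.*>/, "")                                    # drop everything up to the last ">"
        gsub(/,/, "")                                      # drop thousands separators
        print }'
}

strip_to_count "$sample"
```

Feeding it a saved copy of the page line by line is a cheap way to check whether a layout change has broken the match before you touch the live script.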
EDIT: Updated to use the JSON API for the total player count -- this should be a lot more robust, and a lot simpler, too. I left the original code commented out just in case.
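The reason -F '[()]' works is that the response wraps the number in a callback call, so splitting on either parenthesis puts the number in field 2. A small sketch, assuming a response of roughly this shape (the payload below is made up, and extract_count is a hypothetical helper, not part of the script above):

```shell
#!/bin/sh
# Hypothetical JSONP payload; the real response from player_count.js may differ.
sample='jQuery17201610493347980082_1378103074657(48877);'

# extract_count is an illustrative helper.
# With -F '[()]' awk treats "(" and ")" as field separators, so field 2
# is whatever sits between the parentheses.
extract_count() {
    printf '%s\n' "$1" | awk -F '[()]' '{ print $2 }'
}

extract_count "$sample"
```

Note that this relies on the payload containing exactly one pair of parentheses; if the service ever returns something richer, field 2 may no longer be the bare number.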
Upvotes: 1
Reputation: 10460
You could change ...
tot=`wget -qO- $URL1 | grep -i PlayerCount | cut -d\> -f4 | cut -d\< -f1 | sed -e's/,//'`
... to ...
tot=`wget -qO- $URL1 | grep -i playercount | cut -d\> -f5 | cut -d\< -f1 | sed -e's/,//'`
... that is, if you are in a hurry. Otherwise you might want to follow tripleee's advice. Who knows, you might get an award from the ACM :-)
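To illustrate why the field number matters, here is a rough sketch on an invented line of markup (not the actual page source). Each extra tag in front of the number shifts the count of ">"-delimited fields by one, which is presumably what happened when the page changed. The field_count helper is hypothetical.

```shell
#!/bin/sh
# Invented markup, for illustration only; the real page source may differ.
line='<li><p class="stat"><span id="playerCount"><span>48,877</span> Players Online Now</span></p></li>'

# field_count is a hypothetical helper: it takes the Nth ">"-delimited
# field, keeps what precedes the next "<", and drops the first comma.
field_count() {
    printf '%s\n' "$2" | grep -i playercount | cut -d\> -f"$1" | cut -d\< -f1 | sed -e 's/,//'
}

field_count 5 "$line"   # with this sample markup, field 5 holds the count
field_count 4 "$line"   # one field too early: only a tag fragment, so nothing useful
```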
Upvotes: 1