Nicholas Furusho

Reputation: 11

Program argument is 100 but returns the value as 0100

Right now I am trying to do an assignment where I have to:

  - Extract information from an HTML file
  - Save it to a scalar
  - Run a regular expression to find the number of seats available in the designated course (the program argument is the course number, for example 100 for ICS 100)
  - If the course has multiple sessions, find the sum of the seats available and print it
  - The output is just the number of seats available

The problem is that when I was debugging, to check that the variable holding the program argument stored the correct value, it appeared to store the value with an extra 0 in front of it.

Example: perl filename.pl 100

$ARGV[0] returns as 0100

I've tried storing the regular expression matches in an array, using multiple scalar variables, and changing my regular expression, but none of these worked.

die "Usage: perl NameHere_seats.pl course_number" if (@ARGV < 1);
# This variable will store the .html file contents
my $fileContents;
# This variable will store the sum of seats available in the array @seatAvailable
my $sum = 0;
# This variable will store the program argument
my $courseNum = $ARGV[0];

# Open the file to read contents all at once
open (my $fh, "<", "fa19_ics_class_availability.html") or die ("Couldn't open 'fa19_ics_class_availability.html'\n");
  # Use a bare block to limit the scope of the change to $/
  {
  # Use local $/ to get <$fh> to read the whole file, and not one line
  local $/;
  $fileContents = <$fh>;
  }
# Close the file handle
close $fh;
# Uncomment the line below to check if you've successfully extracted the text
# print $fileContents;
# Check if the course exists
die "No courses matched...\n" if ($ARGV[0] !~ m/\b(1[0-9]{2}[A-Z]?|2[0-8][0-9][A-Z]?|29[0-3])[A-Z]?\b/);
while ($fileContents =~ m/$courseNum(.+?)align="center">(\d)</) {
  my $num = $2;
  $sum = $sum + $num;
}
print $sum;
# Use this line as error checking to make sure $ARGV[0] is storing the proper number
print $courseNum;

The current output when the program argument is 100 is just 0. I assume this is because the regular expression is not matching anything, so the sum stays at 0. The output should be 15. This is a link to the .html page: https://laulima.hawaii.edu/access/content/user/emeyer/ics/215/FA19/01/perl/fa19_ics_class_availability.html

Upvotes: 1

Views: 71

Answers (1)

Dave Cross

Reputation: 69314

You're getting "0100" because you have two print() statements.

print $sum;
...
print $courseNum;

And because there are no newlines or other output between them, you get the two values printed out next to each other. $sum is '0' and $courseNum is '100'.
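A minimal sketch of that effect, using hard-coded stand-ins for the two variables (the values are taken from the question, not read from the HTML file):

```perl
use strict;
use warnings;

# Hard-coded stand-ins for the values in the question.
my $sum       = 0;
my $courseNum = 100;

# Two prints with nothing between them run together on one line,
# which is the same as printing the concatenated string.
my $joined = $sum . $courseNum;
print $joined, "\n";

# Labelling each value and ending with a newline keeps them apart.
print "sum: $sum\n";
print "course: $courseNum\n";
```

The first line printed is 0100; the labelled lines make it obvious which value is which while debugging.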

So why is $sum zero? Well, that's because your regex isn't picking up the data you want it to match. Your regex looks like this:

m/$courseNum(.+?)align="center">(\d)</

You're looking for $courseNum followed by a number of other characters, followed by 'align="center">' and then your digit. This doesn't work for a number of reasons.

  1. The string "100" appears many times in your text. Many times it doesn't even mean a course number (e.g. "100%"). Perhaps you should look for something more precise (e.g. "ICS $courseNum").
  2. The .+? doesn't do what you think it does. The dot doesn't match newline characters unless you use the /s option on the match operator.
  3. But even if you fix those first two problems, it still won't work as there are a number of numeric table cells for each course and you're doing nothing to ensure that you're grabbing the last one. Your current code will get the "Curr. Enrolled" column, not the "Seats Avail" one.
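The second point can be checked in isolation. The sketch below uses a made-up two-line fragment (an assumption for illustration; the real page's markup will differ) and runs the same pattern with and without /s. It uses (\d+) rather than (\d) so that a multi-digit count would also be captured.

```perl
use strict;
use warnings;

# A made-up two-line fragment; the real page's markup will differ.
my $text = "ICS 100\n<td align=\"center\">7</td>";

# Without /s the dot stops at the newline, so this fails to match.
my $without_s = $text =~ m/ICS 100(.+?)align="center">(\d+)</ ? 1 : 0;

# With /s the dot also matches newlines, so the match can span lines.
my $with_s = $text =~ m/ICS 100(.+?)align="center">(\d+)</s ? 1 : 0;

print "without /s: $without_s\n";
print "with /s:    $with_s\n";
```

This prints 0 for the first match and 1 for the second. It only illustrates the newline behaviour; it does nothing about the third point, which is choosing the right table cell.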

This is a non-trivial HTML parsing problem. It shouldn't be addressed using regexes (HTML should never be parsed using regexes). You should look at one of the HTML parsing modules from CPAN - I think I'd use Web::Query.

Update: An example solution using Web::Query:

#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';

use File::Basename;
use Web::Query;

my $course_num = shift
  or die 'Usage: perl ' . basename($0) . " course_number\n";

my $source = 'fa19_ics_class_availability.html';
open my $fh, '<', $source
  or die "Cannot open '$source': $!\n";

my $html = do { local $/; <$fh> };

# Start at 0 so that a course with no matching rows prints 0 without a warning
my $count_free = 0;

wq($html)
  # Get each table row in the table
  ->find('table.listOfClasses tr')
  ->each(sub {
      my ($i, $elem) = @_;

      my @tds;

      # Get each <td> in the <tr>
      $elem->find('td')->each(sub { push @tds, $_[1] });

      # Ignore rows that don't have 13 columns
      return if @tds != 13;
      # Ignore rows that aren't about the right course
      return if $tds[2]->text ne "ICS $course_num";

      # Add the number of available places
      $count_free += $tds[8]->text;
    });

say $count_free;

Upvotes: 1
