Reputation: 125
I've got a Perl program. Its output looks something like this:
http://www.site.com/file1.html
http://www.site.com/file2.html
http://www.site.com/file3.html
.
.
.
v
I've got an unfinished Python program. Here it is:
import subprocess
pipe = subprocess.Popen(["perl", "perl_program.pl"])
I run the python program in my terminal like this:
whatever:~ whatever$ python python_program.py
I get the following:
http://www.site.com/file1.html
http://www.site.com/file2.html
http://www.site.com/file3.html
.
.
.
v
I want to pop these URLs into an array in my Python code and manipulate them within Python. How do I do that?
Here is the Perl program I am working with:
use LWP::Simple;
use HTML::TreeBuilder;
use Data::Dumper;

my $tree = url_to_tree( 'http://www.registrar.ucla.edu/schedule/schedulehome.aspx' );

my @selects = $tree->look_down( _tag => 'select' );
my @quarters = map { $_->attr( 'value' ) } $selects[0]->look_down( _tag => 'option' );
my @courses = map { my $s = $_->attr( 'value' ); $s =~ s/&/%26/g; $s =~ s/ /+/g; $s } $selects[1]->look_down( _tag => 'option' );

my $n = 0;

my %hash;

for my $quarter ( @quarters )
{
    for my $course ( @courses )
    {
        my $tree_b = url_to_tree( "http://www.registrar.ucla.edu/schedule/crsredir.aspx?termsel=$quarter&subareasel=$course" );

        my @options = map { my $s = $_->attr( 'value' ); $s =~ s/&/%26/g; $s =~ s/ /+/g; $s } $tree_b->look_down( _tag => 'option' );

        for my $option ( @options )
        {


            print "trying: http://www.registrar.ucla.edu/schedule/detselect.aspx?termsel=$quarter&subareasel=$course&idxcrs=$option\n";

            my $content = get( "http://www.registrar.ucla.edu/schedule/detselect.aspx?termsel=$quarter&subareasel=$course&idxcrs=$option" );

            next if $content =~ m/No classes are scheduled for this subject area this quarter/;

            $hash{"$course-$option"} = 1;
            #my $tree_c = url_to_tree( "http://www.registrar.ucla.edu/schedule/detselect.aspx?termsel=$quarter&subareasel=$course&idxcrs=$option" );

            #my $table = ($tree_c->look_down( _tag => 'table' ))[2]->as_HTML;

            #print "$table\n\n\n\n\n\n\n\n\n\n";

            $n++;
        }
    }
}

my $hash_count = keys %hash;
print "$n, $hash_count\n";

sub url_to_tree
{
    my $url = shift;

    my $content = get( $url );

    my $tree = HTML::TreeBuilder->new_from_content( $content );

    return $tree;
}
Upvotes: 2
Views: 313
Reputation: 6466
Try this:
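import subprocess
# Capture the Perl program's stdout instead of letting it go to the terminal
pipe = subprocess.Popen(["perl", "perl_program.pl"], stdout=subprocess.PIPE)
# stderr is None here because only stdout is piped
urls, stderr = pipe.communicate()
urls = urls.split("\n")
# urls is the array that you can now manipulate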
Alternatively, with Python 2.7 or higher, you can use check_output:
urls = subprocess.check_output(["perl", "perl_program.pl"]).split("\n")
If you use split with no arguments, as suggested by @J.F.Sebastian, the URL splitting works even better, because superfluous newlines and whitespace no longer produce empty entries in the resulting list.
urls = urls.split()  # use this in place of urls.split("\n")
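On Python 3 the captured output is bytes unless text mode is requested, so split("\n") on it would fail. The following is a minimal sketch for Python 3.7+, not a definitive implementation. The names collect_lines and extract_urls are hypothetical helpers, and the child process is a Python stand-in for ["perl", "perl_program.pl"] so that the example runs without Perl. It also assumes, based on the Perl script shown in the question, that lines may carry a "trying: " prefix and that a final count line may follow the URLs.

```python
import subprocess
import sys


def collect_lines(cmd):
    """Run cmd and return its stdout as a list of text lines."""
    # text=True decodes stdout to str; check=True raises if the child fails.
    result = subprocess.run(cmd, stdout=subprocess.PIPE, text=True, check=True)
    return result.stdout.splitlines()


def extract_urls(lines, prefix="trying: "):
    """Keep only URL lines, dropping the assumed prefix and non-URL lines."""
    urls = []
    for line in lines:
        line = line.strip()
        if line.startswith(prefix):
            line = line[len(prefix):]
        if line.startswith("http"):
            urls.append(line)
    return urls


# Stand-in child process (an assumption for the demo); in the real program
# the command would be ["perl", "perl_program.pl"].
child_code = (
    "print('trying: http://www.site.com/file1.html')\n"
    "print()\n"
    "print('http://www.site.com/file2.html')\n"
    "print('2, 2')\n"
)
demo_cmd = [sys.executable, "-c", child_code]

urls = extract_urls(collect_lines(demo_cmd))
for url in urls:
    print(url)
```

Because the Perl script fetches many pages, the call blocks until the script exits. If the URLs are needed as they arrive, iterating over the stdout of a Popen object line by line is the usual alternative.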
Upvotes: 3