Reputation: 1805
I was under the impression that you can convert HTML to XHTML using TagSoup. I have the TagSoup jar file saved as tagsoup.jar, and I used the following command:
wget -O usa_stock.html "http://markets.usatoday.com/custom/usatoday-com/new/html-mktscreener.asp#" | java -jar tagsoup.jar usa_stock.html
When I use this command, it generates both the HTML and the XHTML file, but when I open the XHTML file in Firefox it's empty. I suspect that when I pipe the commands, TagSoup just doesn't know which file I'm trying to convert.
Can someone help me out with this one?
Thanks.
Upvotes: 1
Views: 975
Reputation: 6751
The pipe (|) used in your command is definitely wrong; replacing it with && could possibly solve your problem.
With -O usa_stock.html, wget writes the retrieved page to that file rather than to stdout, so you piped nothing into tagsoup. In addition, both sides of a pipe are started at the same time: java -jar starts executing while wget is still running, so the input file you specified for tagsoup isn't ready yet. You need wget to exit with status 0 before tagsoup starts, and && serves exactly this purpose.
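To make the difference concrete, here is a minimal sketch that reproduces the timing issue without needing the network or the jar. The functions fetch_page and convert_page are hypothetical stand-ins, not real tools: the first mimics wget -O FILE (slow, writes to a file, prints nothing to stdout), the second mimics a converter that reads the file named on its command line.

```shell
#!/bin/sh
# Hypothetical stand-in for `wget -O FILE URL`: takes a moment,
# writes the page to FILE, and sends nothing to stdout.
fetch_page() {
    sleep 1
    printf '<html><body>quote</body></html>\n' > "$1"
}

# Hypothetical stand-in for `java -jar tagsoup.jar FILE`: reads the
# named file and ignores whatever arrives on stdin.
convert_page() {
    cat "$1" 2>/dev/null
}

workdir=$(mktemp -d)

# Pipe: both commands start together, so convert_page looks for the
# file before fetch_page has created it.
piped=$(fetch_page "$workdir/a.html" | convert_page "$workdir/a.html")

# AND list: convert_page runs only after fetch_page has exited with status 0.
chained=$(fetch_page "$workdir/b.html" && convert_page "$workdir/b.html")

echo "pipe:    '$piped'"     # expected to be empty
echo "chained: '$chained'"   # expected to contain the page

rm -rf "$workdir"
```

Applied to the command in the question, the same change would read: wget -O usa_stock.html "http://markets.usatoday.com/custom/usatoday-com/new/html-mktscreener.asp#" && java -jar tagsoup.jar usa_stock.html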
Upvotes: 3