Reputation: 10865
Could you please help me generate a .cpp/.h file from the following HTML file programmatically (using any scripting or programming language, or even an editor such as vi or emacs)?
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US">
<head>
<title>Class</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body link="blue" vlink="purple" bgcolor="#FFFABB" text="black">
<h2><font face="Helvetica">Code Fragment: Class</font></h2>
</center><br><dl><dd><pre>
<font color=#A000A0>template</font> <<font color=#A000A0>typename</font> G>
<font color=#A000A0>class</font> Components : <font color=#A000A0>public</font> DFS<G> { <font color=#0000FF>// count components</font>
<font color=#A000A0>private</font>:
<font color=#A000A0>int</font> nComponents; <font color=#0000FF>// num of components</font>
<font color=#A000A0>public</font>:
<font color=#000000>Components</font>(<font color=#A000A0>const</font> G& g): DFS<G>(g) {} <font color=#0000FF>// constructor</font>
<font color=#A000A0>int</font> <font color=#A000A0>operator</font>()(); <font color=#0000FF>// count components</font>
};
</dl>
</body>
</html>
If you could also point out how to go in the other direction (from source code to HTML like this), that would be great. Thanks a lot.
Upvotes: 0
Views: 368
Reputation: 78033
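PHP script:
<?php
// Parse the HTML page into a DOM tree.
$doc = new DOMDocument();
$doc->loadHTMLFile("file.html");
$xpath = new DOMXPath($doc);
$str = '';
// Concatenate every text node under <dl>, i.e. the code without its <font> tags.
// The ' ' appended after each node is what causes the extra spaces in the output.
foreach ($xpath->query("//dl//text()") as $node) {
    $str .= $node->nodeValue . ' ';
}
file_put_contents('file.cpp', $str);
Contents of file.cpp: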
template < typename G>
class Components : public DFS<G> { // count components
private :
int nComponents; // num of components
public :
Components ( const G& g): DFS<G>(g) {} // constructor
int operator ()(); // count components
};
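For comparison, the same idea (walk the document and keep only the text nodes) can be sketched in Python with the standard library's html.parser. This is only a rough sketch, not a tested converter: the class name PreTextExtractor and the function html_to_source are made up for illustration, and it assumes the angle brackets in the code are stored as entities (&lt;, &gt;) in the real file. Because the pieces are joined without a separator, the extra spaces seen above should not appear.

```python
from html.parser import HTMLParser


class PreTextExtractor(HTMLParser):
    """Collect the text found inside <pre> elements, ignoring all markup."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode entities such as &lt;
        super().__init__(convert_charrefs=True)
        self._depth = 0    # number of <pre> elements currently open
        self._chunks = []  # text fragments collected so far

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        # Only keep text that sits inside a <pre> block.
        if self._depth > 0:
            self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def html_to_source(html_text):
    """Return the plain text contained in the page's <pre> blocks."""
    parser = PreTextExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.text()


# Possible usage (file names taken from the PHP script above):
#   with open("file.html", encoding="iso-8859-1") as f:
#       source = html_to_source(f.read())
#   with open("file.cpp", "w") as f:
#       f.write(source)
```

Note that the page in the question never closes its pre element, so with that exact input everything after the opening tag would be collected, including trailing whitespace.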
Upvotes: 2
Reputation: 126957
Another option for going from HTML to the source code is the html2text
utility, which is often installed on Linux distributions.
matteo@teomint:~/Desktop$ html2text out.html
***** Code Fragment: Class *****
template <typename G>
class Components : public DFS<G> { // count components
private:
int nComponents; // num of components
public:
Components(const G& g): DFS<G>(g) {} // constructor
int operator()(); // count components
};
Upvotes: 1
Reputation: 11626
Does this work for you?
[18:56:44 jaidev@~]$ lynx --dump foo.html
Code Fragment: Class
template <typename G>
class Components : public DFS<G> { // count components
private:
int nComponents; // num of components
public:
Components(const G& g): DFS<G>(g) {} // constructor
int operator()(); // count components
};
[18:56:49 jaidev@~]$
Edit:
For the reverse direction: if you use vim as your editor, you can enter :TOhtml to generate a syntax-highlighted HTML version of your code in a new buffer. The HTML is generated from your vim colorscheme; to change the colorscheme, use the :colorscheme <name> command.
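If vim is not an option, the reverse direction can also be roughed out with a small script. The following is a minimal sketch and not how the page in the question was actually produced: the keyword list is deliberately incomplete, the function names highlight_line and cpp_to_html are made up for illustration, and the colors are simply copied from the question's HTML. It only understands // comments, so a // inside a string literal or a /* */ block comment would be handled incorrectly.

```python
import html
import re

# A small, deliberately incomplete keyword list (an assumption for this sketch).
KEYWORDS = ("class", "const", "int", "operator", "private", "public",
            "template", "typename")
KEYWORD_RE = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b")


def highlight_line(line):
    """Return one line of C++ as HTML, coloring keywords and // comments."""
    code, sep, comment = line.partition("//")
    # Escape first so that <, > and & in the code survive as entities.
    escaped = html.escape(code, quote=False)
    out = KEYWORD_RE.sub(r"<font color=#A000A0>\1</font>", escaped)
    if sep:
        # Everything from // to the end of the line is one comment span.
        out += ("<font color=#0000FF>"
                + html.escape(sep + comment, quote=False)
                + "</font>")
    return out


def cpp_to_html(source):
    """Wrap the highlighted lines in a <pre> block."""
    body = "\n".join(highlight_line(line) for line in source.splitlines())
    return "<pre>\n" + body + "\n</pre>"
```

A real syntax highlighter tokenizes the source properly; this only shows the shape of the transformation (escape, then wrap).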
Upvotes: 8
Reputation: 1178
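You could use regular expressions to strip everything outside the <body> of the HTML page, remove all tags (everything matching <.*> should be removed from the file), and convert entities such as &lt;, &gt;, &amp; etc. back to plain characters. What's left should be the code you're looking for.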
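The regex idea above could look roughly like this in Python. It is a sketch under a few assumptions: the function name strip_html_regex is made up, the tag pattern uses <[^>]*> instead of <.*> so that each match stops at the nearest closing bracket rather than swallowing a whole line, and html.unescape from the standard library stands in for hand-written entity replacements. Regex-based HTML handling is fragile in general, so treat it as a quick hack for simple pages like this one.

```python
import html
import re


def strip_html_regex(page):
    """Recover plain text from a highlighted HTML page using regexes."""
    # 1. Keep only what is inside <body>...</body>, if that element is present.
    match = re.search(r"<body[^>]*>(.*?)</body>", page,
                      flags=re.DOTALL | re.IGNORECASE)
    body = match.group(1) if match else page
    # 2. Drop every tag; [^>]* stops at the first '>' after the '<'.
    no_tags = re.sub(r"<[^>]*>", "", body)
    # 3. Turn entities such as &lt;, &gt; and &amp; back into characters.
    return html.unescape(no_tags)
```

Any other text in the body, such as the "Code Fragment: Class" heading, survives this treatment and would still need to be removed by hand.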
Upvotes: 1
Reputation: 1344
If you're trying to strip all HTML tags to get back the original, non-highlighted source code, then you have two options that I can think of:
Upvotes: 0
Reputation: 385405
Extract the pre code block with DOMDocument, then remove the remaining tags from the result with strip_tags().
Upvotes: 0