Reputation: 1233
public String transform_XML(String type, InputStream file) throws TransformerException {
    TransformerFactory tf = TransformerFactory.newInstance();
    // The stylesheet is looked up on the classpath by its type name.
    String xslfile = "/StyleSheets/" + type + ".xsl";
    Transformer t = tf.newTemplates(new StreamSource(this.getClass().getResourceAsStream(xslfile))).newTransformer();
    Source source = new StreamSource(file);
    // Collect the transformation output in memory.
    CharArrayWriter wr = new CharArrayWriter();
    StreamResult result = new StreamResult(wr);
    t.transform(source, result);
    return wr.toString();
}
The above method takes an XSL file and an XML file as input and returns the transformed result as a String. Classes from the package javax.xml.transform have been used to accomplish this.
Now, can I use the same package to transform an HTML file? (Since the package name has "xml" in it, I seriously doubt it.) What should I do to transform an HTML file?
Upvotes: 1
Views: 376
Reputation: 10094
As I understand your comment, this is mainly for scraping and getting information back.
You can have a look at JSoup, which is very handy for parsing and scraping a DOM from HTML.
Otherwise, if you want to keep your XSLTs, stemm's solution should be fine.
Upvotes: 1
Reputation: 6050
As you understand, HTML documents aren't necessarily valid XML. But you can convert HTML to XML first and then work with the valid XML (the conversion gives you a DOM tree).
I'd suggest using the CyberNeko HTML Parser to convert HTML into XML.
Draft example:
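import java.io.InputStream;
import org.cyberneko.html.parsers.DOMParser;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
...
public Document parseHtml(InputStream is) throws Exception {
    // NekoHTML's DOMParser tolerates malformed HTML and builds a DOM tree from it.
    DOMParser parser = new DOMParser();
    parser.parse(new InputSource(is));
    return parser.getDocument();
}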
If you use Maven, you can simply add CyberNeko to your project from the repository: http://mvnrepository.com/artifact/nekohtml/nekohtml
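To connect this back to the XSLT code in the question: once you have a DOM Document, it can be handed to the Transformer through a DOMSource instead of a StreamSource. The following is only a sketch using JDK classes. The class name DomTransformSketch, the inline stylesheet, and the standInDocument helper are made up for illustration; the helper builds a DOM from a well-formed string and merely stands in for the Document that parseHtml would return. The tree produced by the real parser may differ in details such as element-name case, so the XPath expressions in your stylesheet may need adjusting.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

public class DomTransformSketch {

    // Hypothetical stylesheet: emits the text of every <p> element, followed by "|".
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='//p'><xsl:value-of select='.'/>|</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies a stylesheet to an already-parsed DOM tree and returns the result as a String.
    static String transform(Document doc, String xsl) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Stand-in (assumption) for the Document returned by parseHtml(...):
    // built here from a well-formed string with the JDK's own parser.
    static Document standInDocument() throws Exception {
        String xhtml = "<html><body><p>first</p><p>second</p></body></html>";
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(standInDocument(), XSL));
    }
}
```

In your own method you would replace standInDocument() with the result of parseHtml(file) and keep loading the stylesheet from /StyleSheets/ as before.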
Upvotes: 1
Reputation: 3508
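import java.io.File;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SimpleXSLT {
    public static void main(String[] args) {
        // Note: the input is read by an XML parser, so temp.html must be well-formed markup.
        String inXML = "C:/tmp/temp.html";
        String inXSL = "C:/tmp/temp.xsl";
        String outTXT = "C:/tmp/temp_copy.html";
        SimpleXSLT st = new SimpleXSLT();
        try {
            st.transform(inXML, inXSL, outTXT);
        } catch (TransformerConfigurationException e) {
            System.err.println("Invalid factory configuration");
            System.err.println(e);
        } catch (TransformerException e) {
            System.err.println("Error during transformation");
            System.err.println(e);
        }
    }

    public void transform(String inXML, String inXSL, String outTXT)
            throws TransformerConfigurationException,
                   TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Wrap the paths in File objects so they are resolved as local files.
        StreamSource xslStream = new StreamSource(new File(inXSL));
        Transformer transformer = factory.newTransformer(xslStream);
        // MyErrorListener is not shown here; it is assumed to be your own
        // implementation of javax.xml.transform.ErrorListener.
        transformer.setErrorListener(new MyErrorListener());
        StreamSource in = new StreamSource(new File(inXML));
        StreamResult out = new StreamResult(new File(outTXT));
        transformer.transform(in, out);
        System.out.println("The generated file is: " + outTXT);
    }
}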
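One caveat with feeding an .html file straight into a Transformer: the input goes through an XML parser, so this only works when the HTML happens to be well-formed. The sketch below tries to show the difference using nothing but JDK classes. The class name WellFormedCheck, the canTransform helper, and the identity-style stylesheet are all invented for illustration and are not part of the code above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class WellFormedCheck {

    // Illustrative identity-style stylesheet: copies the input and serializes it as HTML.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Returns true if the markup could be parsed and transformed, false otherwise.
    static boolean canTransform(String markup) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            t.transform(new StreamSource(new StringReader(markup)),
                        new StreamResult(new StringWriter()));
            return true;
        } catch (TransformerException e) {
            // Input that is not well-formed XML ends up here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Every element is closed, so the XML parser accepts it.
        System.out.println(canTransform("<html><body><p>closed<br/></p></body></html>"));
        // Typical "tag soup": <p> and <br> are never closed.
        System.out.println(canTransform("<html><body><p>unclosed<br></body></html>"));
    }
}
```

If your HTML falls into the second category, it needs to be cleaned up by an HTML-aware parser first, as suggested in the other answers. The JDK may also print the parser's error message to stderr for the failing case.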
Upvotes: 1