Reputation: 789
I want to select everything from the HTML except the <blockquote> element. What is the simplest way to do this using Jsoup?
I know there is a :not syntax, but how do I use it in this example?
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class AppMain {
    public static void main(String[] args) throws IOException {
        String html = "<body> <blockquote> ...remove.this... </blockquote> ...get.this... </body>";
        Document d = Jsoup.parse(html);
        Element element = d.select(:not("blockquote").first(); // doesn't work
        System.out.println(element.text()); // here I want to get only: `...get.this...`
    }
}
Upvotes: 0
Views: 1341
Reputation: 5943
You have a syntax error in this line (your compiler should have complained about it):
d.select(:not("blockquote"); // doesn't work
This would be the valid syntax:
d.select(":not(blockquote)");
Because select is a Java method that takes a String argument, you must pass it a String, e.g.:
d.select("something");
And this "something" has to be a selector. In your case: ":not(blockquote)".
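To see what that selector actually matches, here is a minimal sketch (the class name NotSelectorDemo and the helper bodyTextIfMatched are made up for illustration). As far as I can tell, :not(blockquote) matches every element that is not a blockquote, and that includes enclosing elements such as <body>, whose text still contains the blockquote's text. So the selector is valid, but on its own it may not give you only the text you want:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class NotSelectorDemo {

    // Hypothetical helper: returns the text of <body> if the selector
    // ":not(blockquote)" matched it, or null if it did not.
    static String bodyTextIfMatched(String html) {
        Document d = Jsoup.parse(html);
        // select() takes the selector as a plain Java String
        Elements matched = d.select(":not(blockquote)");
        for (Element e : matched) {
            if (e.tagName().equals("body")) {
                // text() of an element includes the text of its descendants
                return e.text();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String html = "<body> <blockquote> ...remove.this... </blockquote> ...get.this... </body>";
        System.out.println(bodyTextIfMatched(html));
    }
}
```

The printed text should still contain the blockquote content, because <body> is itself a match and the blockquote is one of its descendants.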
Another approach would be to select all <blockquote> elements and remove them:
d.select("blockquote").remove();
// after that, work with d
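Put together with the HTML from the question, the remove approach could look roughly like this. It is only a sketch: the class name RemoveBlockquoteDemo and the helper textWithoutBlockquotes are names I made up, and it assumes you want the remaining text of <body>:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class RemoveBlockquoteDemo {

    // Hypothetical helper: parses the HTML, drops every <blockquote>,
    // and returns the text that is left in <body>.
    static String textWithoutBlockquotes(String html) {
        Document d = Jsoup.parse(html);
        // Elements.remove() detaches each matched element from the document
        d.select("blockquote").remove();
        return d.body().text();
    }

    public static void main(String[] args) {
        String html = "<body> <blockquote> ...remove.this... </blockquote> ...get.this... </body>";
        System.out.println(textWithoutBlockquotes(html));
    }
}
```

Note that remove() modifies the parsed Document in place, so parse the HTML again (or clone the document first) if you still need the original content.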
Upvotes: 1