Reputation: 779
I am using this code to extract SPAN tags and report how many there are.
public class HtmlparserExampleActivity extends Activity {
    String outputtext;
    TagFindingVisitor visitor;
    Parser parser = null;
    private static final String TAG = "TVGuide";
    private static final boolean D = true;
    TextView outputTextView;
    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        outputTextView = (TextView) findViewById(R.id.outputTextView);
        if(D) Log.e(TAG, "+++ ON CREATE +++");
        try {
            Log.e(TAG, "In doInBackground");
            parser = new Parser ("http://www.johandegraeve.net/android");
            String tags[] = { "SPAN" };
            visitor = new TagFindingVisitor(tags);
            try {
                parser.visitAllNodesWith (visitor);
                outputtext = "there are " + visitor.getTags(0).length + " SPAN nodes.\n";
                for (int i = 0;i<visitor.getTags(0).length;i++) {
                    outputtext = outputtext + visitor.getTags(0)[i].toHtml();
                }
                outputTextView.setText(outputtext);
            } catch (ParserException e) {
                if(D) Log.e(TAG, "Exception in +++ ON CREATE +++ \n" +
                        "parser.visitAllNodesWith (visitor) failed\n" +
                        e.toString());
            }
        } catch (ParserException e1) {
            if(D) Log.e(TAG, "Exception in +++ ON CREATE +++ \n" +
                    "creation of parser failed\n" +
                    e1.toString());
        }
    }
}
How do I change this code to get the text and images, and display just the text and images in their widgets?
EDIT: What would the tags be for an HTML page like this one, to get the text and the image URLs?
http://movies.ign.com/articles/100/1002569p1.html
EDIT: Source code.
public class HtmlparserExampleActivity extends Activity {
    String outputtext;
    TagFindingVisitor visitor;
    Parser parser = null;
    private static final String TAG = "TVGuide";
    TextView outputTextView;
    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        outputTextView = (TextView)findViewById(R.id.outputTextView);
        String id = "main-article-content";
        Document doc = null;
        try {
            doc = Jsoup.connect("http://movies.ign.com/articles/100/1002569p1.html").get();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        Log.i("DOC", doc.toString().toString());
        Elements elementsHtml = doc.getElementsByTag(id);
        String[] temp1 = new String[99];;
        int i =0;
        for(Element element: elementsHtml)
        {
            temp1 = element.text();
            i++;
            outputTextView.setText(temp1[1]);
        }
    }
}
I tried this, but it didn't work. Maybe I'm doing something wrong. No text showed up in the TextView, but I did see some tags from the web page in the debug output.
Upvotes: 1
Views: 4908
Reputation: 50578
Use the JSoup parser and parse the elements by tag. JSoup is efficient and simple for this kind of small parsing job.
Edit: I don't know your exact situation, but I'll give it a try:
Document doc = Jsoup.connect("someurl").get();
Log.i("DOC", doc.toString());
// Here you specify the HTML tag where the text is located
Elements elementsHtml = doc.getElementsByTag("tr");
// Size the array to the number of matches so it cannot overflow
String[] temp1 = new String[elementsHtml.size()];
int i = 0;
for (Element element : elementsHtml)
{
    temp1[i] = element.text();
    i++;
}
// After you have collected all the elements, you set the TextView
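As a rough sketch of that last step (collect first, set the TextView once afterwards), something like the following could work. It is only an illustration: the class name `CollectTextSketch`, the helper `joinTextByTag`, and the sample markup are made up for this example, and it parses an in-memory string with `Jsoup.parse` so it can run outside Android.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class CollectTextSketch {
    // Hypothetical helper: returns the text of every element with the given
    // tag, one entry per line. Elements without any text are skipped.
    static String joinTextByTag(String html, String tag) {
        Document doc = Jsoup.parse(html);
        List<String> parts = new ArrayList<String>();
        for (Element element : doc.getElementsByTag(tag)) {
            String text = element.text();
            if (!text.isEmpty()) {
                parts.add(text);
            }
        }
        // Join the collected pieces with newlines.
        StringBuilder joined = new StringBuilder();
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                joined.append('\n');
            }
            joined.append(parts.get(i));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Made-up markup; a real page would come from Jsoup.connect(url).get().
        String html = "<p>first</p><p>second</p><p></p>";
        System.out.println(joinTextByTag(html, "p"));
    }
}
```

In the Activity, the joined string would be passed to `outputTextView.setText(...)` a single time after the loop, rather than calling `setText` on every iteration.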
More: Go to the page you want, view the page source, and search for the content you want. Then you can see which tag, class, or id you are going to use.
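To make the tag/class/id distinction concrete, here is a small sketch using JSoup's `getElementsByTag`, `getElementsByClass`, and `getElementById`. The class name `SelectorSketch`, the helper names, and the `SAMPLE` markup are invented for illustration; only the id `main-article-content` is taken from the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class SelectorSketch {
    // Made-up page source standing in for what "view source" would show.
    static final String SAMPLE =
            "<html><body>"
            + "<div id=\"main-article-content\">"
            + "<p class=\"intro\">Hello</p><p>World</p>"
            + "</div>"
            + "<span>footer</span>"
            + "</body></html>";

    // Number of elements whose tag name matches (e.g. "p", "span").
    static int countByTag(String html, String tag) {
        return Jsoup.parse(html).getElementsByTag(tag).size();
    }

    // Text of the first element carrying the given class, or "" if none.
    static String firstTextByClass(String html, String className) {
        Elements matches = Jsoup.parse(html).getElementsByClass(className);
        return matches.isEmpty() ? "" : matches.first().text();
    }

    // Text of the element with the given id, or "" if it does not exist.
    static String textById(String html, String id) {
        Element match = Jsoup.parse(html).getElementById(id);
        return match == null ? "" : match.text();
    }

    public static void main(String[] args) {
        System.out.println(countByTag(SAMPLE, "p"));
        System.out.println(firstTextByClass(SAMPLE, "intro"));
        System.out.println(textById(SAMPLE, "main-article-content"));
        // An id is not a tag name, so looking it up by tag finds nothing.
        System.out.println(countByTag(SAMPLE, "main-article-content"));
    }
}
```

The last call is worth noting: passing an id such as `main-article-content` to `getElementsByTag` matches no elements, so a loop over that result never runs.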
I've parsed the HTML for you:
try {
    Document doc = Jsoup.connect("http://movies.ign.com/articles/100/1002569p1.html").get();
    // Match on the id attribute, not on a tag name
    Elements elementsHtml = doc.getElementsByAttributeValue("id", "main-article-content");
    for (Element element : elementsHtml)
    {
        // element.text() is already plain text, so no URL decoding is needed
        Log.i("PARSED ELEMENTS:", element.text());
        outputTextView.setText(element.text());
    }
} catch (IOException e) {
    // The page could not be fetched or read
    e.printStackTrace();
}
Is this the text you wanted to parse?
08-11 21:08:02.095: INFO/PARSED ELEMENTS(200): It's the end of an era, as Harry Potter and the Deathly Hallows - Part 2 opens this week, bringing to a close the epic film series that has spanned eight films and ten years. To mark the occasion, we have decided to take another look at the wonderful characters in the series, once more ranking our Top 25. You'll notice some tweaks and changes to this list since we first ran it a couple of years ago, as we examined and re-evaluated all that we've seen of the characters. Before we reveal our picks, a quick word about the selection process...
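The question also asked about image URLs, which the snippet above does not cover. Here is a minimal sketch, assuming the images you want are `img` tags inside the same container; I have not checked that against the actual page. The class name `ImageUrlSketch`, the helper `imageUrls`, the sample markup, and the `example.com` base URI are all made up. It relies on JSoup's `select("img[src]")` and `absUrl("src")`, which resolves relative paths against the document's base URI.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class ImageUrlSketch {
    // Hypothetical helper: absolute URLs of all <img src=...> elements found
    // inside the element with the given id. Returns an empty list if the
    // container is missing.
    static List<String> imageUrls(String html, String baseUri, String containerId) {
        // The base URI lets JSoup turn relative src values into absolute URLs.
        Document doc = Jsoup.parse(html, baseUri);
        List<String> urls = new ArrayList<String>();
        Element container = doc.getElementById(containerId);
        if (container == null) {
            return urls;
        }
        // "img[src]" skips <img> tags that have no src attribute.
        for (Element img : container.select("img[src]")) {
            urls.add(img.absUrl("src"));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up markup and base URI, for illustration only.
        String html = "<div id=\"main-article-content\">"
                + "<img src=\"/a.png\"><img src=\"pics/c.gif\"><img alt=\"no source\">"
                + "</div><img src=\"/outside.png\">";
        for (String url : imageUrls(html, "http://example.com/articles/", "main-article-content")) {
            System.out.println(url);
        }
    }
}
```

When the document comes from `Jsoup.connect(url).get()`, the base URI is already set to the page URL, so `absUrl("src")` can be used directly. Downloading each URL and showing it in an ImageView is a separate step that this sketch leaves out.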
Upvotes: 4