Reputation: 2896

HTML parser to extract text from the body (in Java)

I am working on a project that requires me to do some text manipulation on text that I obtain from web pages. The first step is to find a parser that extracts the body text of a page while ignoring the redundant information around it. I am very new to programming and am not sure how to do this in Java. I would really appreciate any help. Thanks in advance.
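To make the goal concrete, here is a minimal sketch of the kind of extraction I mean. It assumes the HTML parser bundled with the JDK (`javax.swing.text.html.parser.ParserDelegator`) is acceptable and that "redundant information" means everything outside `<body>` plus script and style content. The class name `BodyTextExtractor` and the method `extractBodyText` are made up for illustration, and a dedicated HTML library may well handle real-world pages better.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects the visible text found inside <body>.
final class BodyTextExtractor {

    static String extractBodyText(String html) throws IOException {
        StringBuilder text = new StringBuilder();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private boolean inBody = false; // true between <body> and </body>
            private boolean skip = false;   // true right after <script> or <style>

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.BODY) {
                    inBody = true;
                }
                // Script and style content is not readable text, so skip it.
                skip = (tag == HTML.Tag.SCRIPT || tag == HTML.Tag.STYLE);
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if (tag == HTML.Tag.BODY) {
                    inBody = false;
                }
                skip = false;
            }

            @Override
            public void handleText(char[] data, int pos) {
                if (inBody && !skip) {
                    if (text.length() > 0) {
                        text.append(' '); // separate text chunks with a space
                    }
                    text.append(data);
                }
            }
        };

        // 'true' tells the parser to ignore any charset declared in the page.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><head><title>Page title</title></head>"
                + "<body><p>Hello</p><p>World</p></body></html>";
        System.out.println(extractBodyText(html));
    }
}
```

In this sketch the page is passed in as a string; for a real web page the `StringReader` would be replaced by a reader over the downloaded content.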

Upvotes: 0