Reputation: 39
I am using jsoup to extract only the text of web pages for my research, but I get a lot of "garbage" along with the text and I don't know how to get rid of it. My code:
Document doc = Jsoup.connect(link).userAgent("Mozilla").ignoreHttpErrors(true).timeout(0).get();
String plainText = doc.body().text();
For this URL, for example (stored in the variable link in the code above): http://www.besthealthmag.ca/best-eats/healthy-eating/5-reasons-to-eat-more-avocados/
the output is:
Best Health Magazine Canada Live Better. Feel Great Best Looks Beauty Hair Nails Skin Style Best You Cold and Flu Diabetes Health Heart Health Oral Health Relationships Sleep Wellness Best Eats Cooking Diet Digestion Healthy Eating Nutrition Recipes Smoothies Swap and Drop Contests and Games Contests Games Coupons Subscribe Give a gift You are here : Home / Best Eats / Healthy Eating / 5 reasons to eat more avocados 5 reasons to eat more avocados Forget the fat'avocados are a super-healthy way to add valuable nutrients and fibre (yes, and healthy fat!) to your diet. Here's why By Best Health Some people, in their attempts to be health-conscious, avoid avocados due to the relatively high fat and calorie content of these fruits (138 calories and 14.1g fat in half a medium-sized avocado). Yet avocados are one of the best foods you can eat, packed with nutrients and heart-healthy compounds. Here are five great reasons to eat them regularly. 1. Avocados are packed with carotenoids Avocados are a great source of lutein, a carotenoid that works as an antioxidant and helps protect against eye disease. They also contain the related carotenoids zeaxanthin, alpha-carotene and beta-carotene, as well as tocopherol (vitamin E). But avocados aren’t just a rich source of carotenoids by themselves’they also help you get more of these nutrients from other foods. Carotenoids are lipophilic (soluble in fat, not water), so eating carotenoid-packed foods like fruits and vegetables along with monounsaturated-fat-rich avocados helps your body absorb the carotenoids. An easy way to do this is to add sliced avocado to a mixed salad. 2. Avocados make you feel full Half an avocado contains 3.4 grams of fibre, including soluble and insoluble, both of which your body needs to keep the digestive system running smoothly. Plus, soluble fibre slows the breakdown of carbohydrates in your body, helping you feel full for longer. 
Avocados also contain oleic acid, a fat that activates the part of your brain that makes you feel full. Healthier unsaturated fats containing oleic acid have been shown to produce a greater feeling of satiety than less-healthy saturated fats and trans fats found in processed foods. 3. Avocados can protect your unborn baby’and your heart One cup of avocado provides almost a quarter of your recommended daily intake of folate, a vitamin which cuts the risk of birth defects. If you’re pregnant’or planning to be’avocados will help protect your unborn baby. A high folate intake is also associated with a lower risk of heart attacks and heart disease. Does your family have a history of heart problems, or do you have risk factors (such as being overweight or smoking) for heart disease? Avocados could help keep your heart healthy. 4. Avocados can help lower your cholesterol As well as increasing feelings of fullness, the oleic acid in avocados can help reduce cholesterol levels. In one study, individuals eating an avocado-rich diet had a significant decrease in total cholesterol levels, including a decrease in LDL cholesterol. Their levels of HDL cholesterol (the healthy type) increased by 11%. High cholesterol is one of the main risk factors for heart disease. The cholesterol-lowering properties of avocado, along with its folate content, help keep your heart healthy. 5. Avocados taste great The last reason is simple’avocados are a healthy way to boost the flavour and texture of your meals. Toss chopped avocado on a salad or bowl of soup, serve guacamole as an appetizer or condiment, or try one of these healthy avocado recipes to get more healthy avocado into your diet. ‘ Avocado and Chicken Club Sandwich ‘ Avocado and Shrimp Cups ‘ Avocado Carpaccio with Wild Blueberry Cottage Cheese Mix ‘ Broiled Salmon with Avocado-Mango Salsa ‘ Classic Guacamole Don’t miss out! 
Sign up for our free weekly newsletters and get nutritious recipes, healthy weight-loss tips, easy ways to stay in shape and all the health news you need, delivered straight to your inbox. Filed Under: Healthy Eating Tagged With: Diet & Nutrition Secrets to Staying Healthy & Happy Spine stretch Source: Best Health Magazine, Summer 2011; Illustration by Kagan McLeodWho doesn't want better posture? This quick move can help you stand (and sit) tall with ease, as it aligns the spine. It can relieve tension in the hips and shoulders, says Margot … [Read More...] Seared Beef with Garden Vegetables Source: Cook Smart for a Healthy Heart, Reader's Digest CanadaIngredients2 garlic cloves, finely chopped1?2 teaspoon pepper1 bone-in beef sirloin steak 2 cm thick, about 1.2 kg500 g potatoes, quartered500 g green beans1 large red onion, very … [Read More...] Tropical Fruit Smoothie Ingredients1/4 cup (60 mL) fresh or canned pineapple chunks1/4 cup (60 mL) fresh mango chunks2 fresh or frozen strawberries, hulled1 cup (250 mL) plain low-fat soy milk1 Tbsp (15 mL) lime juiceDirections Blend until smooth.Serves one.Nutritions Per … [Read More...] Savoury Rosemary "Squaffles" Best Health Magazine, March/April 2012; Photo by Michael AlberstatIngredients1 cup (250 mL) whole-wheat flour1 cup (250 mL) all-purpose flour1 1/2 tsp (7 mL) dried rosemary (or 1 Tbsp/15 mL chopped fresh)1 tsp (5 mL) baking powder1 tsp (5 mL) grated … [Read More...] Frittata with Asparagus and Scallions Ingredients1 pound fresh, thin asparagus spears4 ounces prosciutto or bacon, thick slices with ample fat (about 4 slices)1/2 pound scallions3 tablespoons extra-virgin olive oil1/2 teaspoon coarse sea salt or kosher salt, or more to taste8 large … [Read More...] We Also Think You’ll Like: Kaitlyn Bristowe Talks Big Hair, Beyonc? and The Bachelor Botox, Fillers and General Skin Treatments: What to Know Before You Book Bloated? How to Know if it’s Irritable Bowel Syndrome What If No One Believed You Were Sick? 
#PhysiotherapyHelpsLives: Stroke Recovery Follow us on Social Networks ? ? ? ? ? ? Featured Content 5 ways to relieve back pain Suffering from an aching back? We dug into the latest research to find out what you can do to help relieve back pain Willow: A Natural Remedy for Pain Foods and Supplements to Ease Muscle and Joint Pain 5 autoimmune diseases affecting Canadians CONTESTS Best for Last Enter to win one of three (3) Prize Packs! Win A Holiday to Jamaica Enter for a chance to win a trip to Jamaica! Load thickbox... ABOUT BEST HEALTH Best Health is a health & wellness magazine from renowned publisher Reader’s Digest that brings an inspiring voice to today’s contemporary Canadian woman. Click here to learn about advertising opportunities. Your Privacy Rights | Customer Care | Subscribe | Sitemap Recent Posts Kaitlyn Bristowe Talks Big Hair, Beyonc? and The Bachelor Botox, Fillers and General Skin Treatments: What to Know Before You Book Bloated? How to Know if it’s Irritable Bowel Syndrome What If No One Believed You Were Sick? #PhysiotherapyHelpsLives: Stroke Recovery Don't miss a day of Best Health Sign up for our newsletters CONNECT WITH US ON SOCIAL ? ? ? ? ? ? Show Full Site Copyright © 2016 The Reader's Digest Association, Inc.
There is a lot of garbage in it: ? question marks, [Read More...], and more.
What can I do about it?
Upvotes: 0
Views: 258
Reputation: 43013
You should narrow the selection down so that you extract only the text you want.
For the website in your example, try this:
String plainText = doc.select("article[itemProp=blogPost]").text();
Since you're targeting 500+ sites, the CSS query above may not work everywhere. You would need to adapt it for each site :-\
Upvotes: 0
Reputation: 6119
You can use a Scanner and skip the tokens you don't like. Iterate over the tokens like this:
Scanner s = new Scanner(plainText); // plainText is the string returned by doc.body().text() in the question
StringBuilder nogarbage = new StringBuilder();
StringBuilder trashbin = new StringBuilder();
StringBuilder receiver;
while (s.hasNext())
{
    String token = s.next();
    // isGood is your own filter; implement it somewhere
    if (isGood(token))
    {
        receiver = nogarbage;
    }
    else
    {
        receiver = trashbin;
    }
    // Scanner drops the whitespace between tokens, so put a space back
    receiver.append(token).append(' ');
}
System.out.println(nogarbage);
System.out.println("Trashbin contents " + trashbin);
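The isGood method is left open above. As a rough sketch only, one possible version could reject the "[Read More...]" fragments and any token that contains no letter or digit (which covers the lone ? marks). The class name TokenFilter, the clean helper, and the contents of NOISE are assumptions made for illustration, not something taken from the question or from jsoup:

```java
import java.util.Scanner;
import java.util.Set;

// Hypothetical helper: one possible isGood, under the assumptions stated above.
class TokenFilter {
    // Tokens treated as noise in this sketch; adjust the set to your own pages.
    private static final Set<String> NOISE = Set.of("[Read", "More...]");

    // A token is "good" if it is not listed as noise and contains a letter or digit.
    static boolean isGood(String token) {
        if (NOISE.contains(token)) {
            return false;
        }
        return token.chars().anyMatch(Character::isLetterOrDigit);
    }

    // Splits the text on whitespace and re-joins only the good tokens with single spaces.
    static String clean(String text) {
        StringBuilder kept = new StringBuilder();
        try (Scanner s = new Scanner(text)) {
            while (s.hasNext()) {
                String token = s.next();
                if (isGood(token)) {
                    if (kept.length() > 0) {
                        kept.append(' ');
                    }
                    kept.append(token);
                }
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        String sample = "says Margot \u2026 [Read More...] Follow us ? ? ? Featured Content";
        System.out.println(clean(sample));
    }
}
```

Note that a filter like this also drops legitimate standalone symbols (the & in "Diet & Nutrition", for instance), and it cannot remove menu or footer text, because those are ordinary words. Token filtering is probably best used as a clean-up step after narrowing the selection, not as a replacement for it.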
Upvotes: 1