Questions

Reputation: 301

Android: How do I parse an HTML page?

I am new to Android programming. I would like to know how to parse a webpage and extract specific content into a ListView. What is the easiest and best way to do it? Can someone show me an example using what's given below?

URL = "Something.com".

I want to extract the names of the cities and href link for each one.

ann arbor
battle creek
central michigan
detroit metro
flint
grand rapids

Thank you guys and sorry for asking this basic question.

Upvotes: 1

Views: 8365

Answers (3)

Khurram W. Malik

Reputation: 2905

I remember doing this once, and luckily I found the code. You just need to start this IntentService from your activity and set the website address at the top (in the url variable).

    import android.app.IntentService;
    import android.content.Intent;
    import android.util.Log;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class parser extends IntentService {

        public String url = "http://www.mywebsite.com";

        // The constructor name must match the class name.
        public parser() {
            super("name");
        }

        // onHandleIntent already runs on a worker thread, so the network
        // call can be made here directly, without a Timer.
        @Override
        protected void onHandleIntent(Intent arg0) {
            // try-with-resources closes the reader even if reading fails
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(
                            new URL(url).openStream()))) {

                // Reads only the first line of the response
                String line = reader.readLine();
                if (line != null) {
                    Log.v("line was ", line);  // printing is done here ;)
                }
            } catch (IOException e) {
                // Also covers MalformedURLException, which is an IOException
                e.printStackTrace();
            }
        }
    }
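
The service above logs only the first line of the response. To parse a page you usually need all of it, so here is a minimal sketch of reading a whole stream into a String. `PageReader` and `readAll` are hypothetical names made up for this illustration; in the service you would presumably pass `new InputStreamReader(new URL(url).openStream())` as the source.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of the Android SDK.
class PageReader {

    /** Reads every line from the source and joins them with '\n'. */
    static String readAll(Reader source) {
        StringBuilder page = new StringBuilder();
        // try-with-resources closes the reader when done
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                page.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return page.toString();
    }

    public static void main(String[] args) {
        // A StringReader stands in for the network stream here.
        String html = readAll(new StringReader("<html>\n<body>hello</body>\n</html>"));
        System.out.print(html);
    }
}
```

On Android this must still run off the main thread, which onHandleIntent already takes care of.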

Upvotes: 0

Janmejoy

Reputation: 2731

Look at the code below and let me know if you have any doubts. This link may also help you:

http://wptrafficanalyzer.in/blog/android-lazy-loading-images-and-text-in-listview-from-http-json-data/

public class MainActivity extends Activity {

    ListView mListView;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        // URL to the JSON data
        String strUrl = "ur url/countries";

        // Creating a new non-ui thread task to download json data
        DownloadTask downloadTask = new DownloadTask();

        // Starting the download process
        downloadTask.execute(strUrl);

        // Getting a reference to ListView of activity_main
        mListView = (ListView) findViewById(R.id.lv_countries);

    }

    /** A method to download json data from url */
    private String downloadUrl(String strUrl) throws IOException{
        String data = "";
        InputStream iStream = null;
        try{
            URL url = new URL(strUrl);

            // Creating an http connection to communicate with url
            HttpURLConnection urlConnection = (HttpURLConnection) url.openConnection();

            // Connecting to url
            urlConnection.connect();

            // Reading data from url
            iStream = urlConnection.getInputStream();

            BufferedReader br = new BufferedReader(new InputStreamReader(iStream));

            StringBuffer sb  = new StringBuffer();

            String line = "";
            while( ( line = br.readLine())  != null){
                sb.append(line);
            }

            data = sb.toString();

            br.close();

        }catch(Exception e){
            Log.d("Exception while downloading url", e.toString());
        }finally{
            // iStream is still null if the connection could not be opened
            if(iStream != null){
                iStream.close();
            }
        }

        return data;
    }

    /** AsyncTask to download json data */
    private class DownloadTask extends AsyncTask<String, Integer, String>{
        String data = null;
        @Override
        protected String doInBackground(String... url) {
            try{
                data = downloadUrl(url[0]);
            }catch(Exception e){
                Log.d("Background Task",e.toString());
            }
            return data;
        }

        @Override
        protected void onPostExecute(String result) {

            // The parsing of the json data is done in a non-ui thread
            ListViewLoaderTask listViewLoaderTask = new ListViewLoaderTask();

            // Start parsing json data
            listViewLoaderTask.execute(result);
        }
    }

    /** AsyncTask to parse json data and load ListView */
    private class ListViewLoaderTask extends AsyncTask<String, Void, SimpleAdapter>{

        JSONObject jObject;
        // Doing the parsing of json data in a non-ui thread
        @Override
        protected SimpleAdapter doInBackground(String... strJson) {
            // Instantiating json parser class
            CountryJSONParser countryJsonParser = new CountryJSONParser();

            // A list object to store the parsed countries list
            List<HashMap<String, Object>> countries = null;

            try{
                jObject = new JSONObject(strJson[0]);
                // Getting the parsed data as a List construct
                countries = countryJsonParser.parse(jObject);
            }catch(Exception e){
                Log.d("JSON Exception1",e.toString());
            }

            // Keys used in Hashmap
            String[] from = { "country" };

            // Ids of views in lv_layout
            int[] to = { R.id.tv_country};

            // Instantiating an adapter to store each item
            // R.layout.lv_layout defines the layout of each item
            SimpleAdapter adapter = new SimpleAdapter(getBaseContext(), countries, R.layout.lv_layout, from, to);

            return adapter;
        }

        /** Invoked by Android on the UI thread after doInBackground has finished */
        @Override
        protected void onPostExecute(SimpleAdapter adapter) {

            // Setting adapter for the listview
            mListView.setAdapter(adapter);

            // The linked tutorial additionally starts an image-loading task per
            // row to download flag images; a text-only list does not need it.
        }
    }

    @Override
    public boolean onCreateOptionsMenu(Menu menu) {
        getMenuInflater().inflate(R.menu.activity_main, menu);
        return true;
    }
}

Upvotes: 3

Hades

Reputation: 3936

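Use something like http://jsoup.org/ to get the HTML content.

Then use its selector syntax to extract the URLs:

http://jsoup.org/cookbook/extracting-data/selector-syntax

Then use a regular expression for the URL you are looking for. The cookbook lists this selector:

:matches(regex): find elements whose text matches the specified regular expression; e.g. div:matches((?i)login)

Note that :matches tests the element's text. To match on the link itself, the attribute counterpart [attr~=regex] (for example on the href of an a element) is probably what you want.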

I'm not sure if this is what you want.
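
In case you want to try the idea without adding a library first, here is a minimal sketch that uses plain java.util.regex instead of jsoup to pull the link text and href out of each anchor. `LinkExtractor`, `extractLinks`, and the sample markup are made up for this illustration, and the real page structure is an assumption. Regular expressions are brittle on real-world HTML, so jsoup's selectors remain the sturdier choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup or the Android SDK.
class LinkExtractor {

    // Matches <a ... href="...">text</a>; group 1 is the href, group 2 the link text.
    // Assumes double-quoted href values.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a\\s+[^>]*?href\\s*=\\s*\"([^\"]*)\"[^>]*>(.*?)</a>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns one {name, href} pair per anchor found in the HTML. */
    static List<String[]> extractLinks(String html) {
        List<String[]> links = new ArrayList<>();
        Matcher m = ANCHOR.matcher(html);
        while (m.find()) {
            String href = m.group(1);
            // Strip any nested tags from the link text.
            String name = m.group(2).replaceAll("<[^>]+>", "").trim();
            links.add(new String[] { name, href });
        }
        return links;
    }

    public static void main(String[] args) {
        // Made-up sample markup using two of the city names from the question.
        String html = "<ul>"
                + "<li><a href=\"/annarbor\">ann arbor</a></li>"
                + "<li><a href=\"/detroit\">detroit metro</a></li>"
                + "</ul>";
        for (String[] link : extractLinks(html)) {
            System.out.println(link[0] + " -> " + link[1]);
        }
    }
}
```

The names in the returned list could then be handed to an adapter for the ListView, with the hrefs kept alongside for the click handler.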

Upvotes: 2
