Brad Elliott
Brad Elliott

Reputation: 35

Use Python to return data from a Webpage's Ajax call

I'm writing a program in Python that needs to use a site's advanced search options. Specifically, the search page is the NVC advanced search page . I know the names of the projects and versions I need to search for, so ideally the program would select the project names and versions numbers from the dropdown lists, then return the results page(s).

I'm totally unfamiliar with HTML and Javascript, and I'm fairly new to Python, so I don't know if there's a way to 'click' these dropdown menus via Python, then return the results. The fact that the Javascript makes an Ajax call further complicates the situation, since I can't just load the page's source and parse out the list of project names and version.

Can anyone with some Python/Javascript/Ajax experience send me in the right direction?

An example use of this program would be that I start out with the project "glibc' and its version number '2.3.6' The program would make sure that this combination is listed at all (which isn't guaranteed), then return the results page (which has about 13 results).

Upvotes: 1

Views: 362

Answers (3)

beerbajay
beerbajay

Reputation: 20270

If a human user is using that search page, they click on one of the product links, which then load the list of products from another page, e.g.:

http://web.nvd.nist.gov/view/vuln/cpe/cpe-chooser?index=0&component=Vendor

This page is unfortunately not using JSON, so they have some custom javascript parsing for the response. The data from this response is then displayed as a drop-down for the user. When the user selects a product, the browser selects the correct value, so that when the form is submitted, it will be part of the query. e.g.:

http://web.nvd.nist.gov/view/vuln/search-results?adv_search=true&cves=on&cpe_vendor=cpe%3A%2F%3Aa-a-s_application_access_server

In this, cpe_vendor=cpe%3A%2F%3Aa-a-s_application_access_server is the important part. The part before the = sign is the field name, the part after is the selected value (which originally came from the ajax request). The funny %3A bits are URL-encoding.

So you don't actually need to interact with the page, since you know the names of the vendors and products for which you want to search; you just need to look up the field name (cpe_vendor for vendors) and the value for the specific products/vendors (cpe:/:a-a-s_application_access_server for my example above), then do a request to the normal search URL.

Upvotes: 0

Drakekin
Drakekin

Reputation: 1348

The advanced search options page sends the options via GET to the results page, giving you the URL (linebreaks mine to make it clearer):

http://web.nvd.nist.gov/view/vuln/search-results?
adv_search=true&
cves=on&
cve_id=&
query=&
cwe_id=&
cpe_vendor=cpe%3A%2F%3Aian_bezanson&
cpe_product=cpe%3A%2Fa%3Aian_bezanson%3Adropbox&
cpe_version=cpe%3A%2Fa%3Aian_bezanson%3Adropbox%3A0.0.3_beta&
pub_date_start_month=0&
pub_date_start_year=2005&
pub_date_end_month=2&
pub_date_end_year=2009&
mod_date_start_month=2&
mod_date_start_year=2007&
mod_date_end_month=9&
mod_date_end_year=2009&
cvss_sev_base=&cvss_av=&
cvss_ac=&
cvss_au=&
cvss_c=&
cvss_i=&
cvss_a=

It would then take a bit of sleuthing to figure out what bit of the url is what information from the form but that should let you then just scrape the results page.

Upvotes: 0

enderskill
enderskill

Reputation: 7674

The Mechanize Python library is perfect for form automation. There is an example of how to edit and submit forms on the examples page.

Upvotes: 1

Related Questions