Benjamin Tamasi

Reputation: 712

How to extract all form info from HTML with PHP

I need a way to extract all form information from a webpage via a PHP script. So I have:

$url = "http://somewebpage.com/";

The info I need is a list of all the forms on the webpage, along with their options/attributes.
A sample output would be as follows:

Form1: Form name: "login", action: "login.php", method: "GET"

  1. Input type: "text", name: "usrname"
  2. Input type: "password", name: "pass"

Form2: Form name: "login2", action: "login2.php", method: "POST"

  1. Input type: "text", name: "usr"
  2. Input type: "password", name: "pwd"

I use the following method to put the HTML contents of the webpage into a variable:


// cURL
$browser_id = "some crazy browser";
$curl_handle = curl_init();
$options = array
(
    CURLOPT_URL => $url,
    CURLOPT_HEADER => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_USERAGENT => $browser_id
);
curl_setopt_array($curl_handle, $options);
$server_output = curl_exec($curl_handle);
curl_close($curl_handle);

Then I use this to remove the header info and keep just the HTML, because otherwise DOM always gives me errors:

$server_output2 = substr($server_output, stripos($server_output, "<html"));

Then, for finding the forms, I use DOM:

$dom = new DOMDocument;
$dom->preserveWhiteSpace = FALSE;
$dom->loadHTML($server_output2);
$params = $dom->getElementsByTagName('form'); // Find all form elements
$forms = array();
$k = 0;
foreach ($params as $param) {
    $forms[$k][0] = $param->getAttribute('name');
    $forms[$k][1] = $param->getAttribute('action');
    $forms[$k][2] = $param->getAttribute('method');
    $k++;
}

However, my problem is that I often get errors from DOM about unclosed tags and other markup issues, and I don't want to see these messages. How can I suppress them? Also, my current code only outputs the form info, not the inputs inside each form, which I also want to know. How can I make this work?

Thank you for your help. You can view my project, Remote Attack Vector (this is what I need it for), at http://sourceforge.net/projects/rav/files/ or check out my website: http://tamasiweb.hu

Upvotes: 1

Views: 4109

Answers (1)

XoR

Reputation: 132

Well, download this PHP lib:

http://sourceforge.net/projects/snoopy/

Class usage:

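    // Adjust the path to wherever you unpacked Snoopy
    include "Snoopy.class.php";

    $uri = "http://anysite.com/form";

    $snoopy = new Snoopy;

    // fetchform() puts the form elements found at $uri into $snoopy->results
    if ($snoopy->fetchform($uri)) {
        $result = $snoopy->results;
        echo $result;
    }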
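
If you would rather stay with DOMDocument instead of adding a library, here is a minimal sketch of one way to do it, not a definitive implementation. It assumes the warnings come from libxml complaining about malformed markup, so it turns on `libxml_use_internal_errors()` while parsing, and it reads the inputs by calling `getElementsByTagName('input')` on each form element. The function name `extract_forms` and the sample `$html` string are made up for illustration; in your script you would pass in the cURL response instead.

```php
<?php
// Hypothetical helper (not from the original post): returns every form in
// an HTML string together with the type and name of each of its inputs.
function extract_forms($html)
{
    // Collect libxml parse errors internally instead of emitting PHP
    // warnings for unclosed tags and other malformed markup.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected errors and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $forms = array();
    foreach ($dom->getElementsByTagName('form') as $form) {
        $inputs = array();
        // Searching from the form element only returns its descendants.
        foreach ($form->getElementsByTagName('input') as $input) {
            $inputs[] = array(
                'type' => $input->getAttribute('type'),
                'name' => $input->getAttribute('name'),
            );
        }
        $forms[] = array(
            'name'   => $form->getAttribute('name'),
            'action' => $form->getAttribute('action'),
            'method' => $form->getAttribute('method'),
            'inputs' => $inputs,
        );
    }
    return $forms;
}

// Sample markup standing in for the fetched page (an assumption).
$html = '<html><body>'
      . '<form name="login" action="login.php" method="GET">'
      . '<input type="text" name="usrname">'
      . '<input type="password" name="pass">'
      . '</form></body></html>';

foreach (extract_forms($html) as $i => $form) {
    printf("Form%d: Form name: \"%s\", action: \"%s\", method: \"%s\"\n",
        $i + 1, $form['name'], $form['action'], $form['method']);
    foreach ($form['inputs'] as $j => $input) {
        printf("  %d. Input type: \"%s\", name: \"%s\"\n",
            $j + 1, $input['type'], $input['name']);
    }
}
```

Note that `getAttribute()` returns an empty string when an attribute is missing, so forms without a name simply show up with an empty name. Also, if you set `CURLOPT_HEADER` to `false` in your cURL options, `curl_exec()` returns only the response body, so the `substr()` step is no longer needed.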

Hope that helps.

Upvotes: 1
