Gustavo Filgueiras

Reputation: 411

How to get href value from string

I'm trying to extract the href values from the HTML below, but I can't because the links contain spaces. I tried doing it with a regex, but I'm not an expert in regex; an example I found online didn't return the value I was after.

<table class="grid border" cellspacing="0" border="0" id="ctl00_ContentBody_grvStudentResult" style="width:100%;border-collapse:collapse;">
<tbody>
    <tr>
        <th align="left" valign="middle" scope="col">Code</th>
        <th align="left" valign="middle" scope="col">Subject</th>
        <th align="left" valign="middle" scope="col">Status</th>
        <th align="center" valign="middle" scope="col">Score</th>
        <th align="center" valign="middle" scope="col">Result Date</th>
    </tr>
    <tr class="detail1">
        <td align="left" valign="middle">
            DipPM15PQ
        </td>
        <td align="left" valign="middle">
            <span class="">
            1561|
            <a onclick="return hs.htmlExpand( this, {  objectType: 'iframe', width: 800, height: 600,  outlineWhileAnimating: true, preserveContent: false } )" href="DetailResults.aspx?sid=90651&amp;id=1769095&amp;nsub= [Project Quality] &amp;Subjectid=1561" title="Approved "> 
            <img alt="" style="display: online" src="../Images/Common/r_Approved.gif" border="0">
            [Project Quality]   </a>
            </span>
            <span class="selected">
            </span>
        </td>
        <td align="left" valign="middle">
            <span class="enable">
            Competent
            </span>
            <center style="display: none">
                <span disabled="disabled"><input id="ctl00_ContentBody_grvStudentResult_ctl02_chkAP" type="checkbox" name="ctl00$ContentBody$grvStudentResult$ctl02$chkAP" checked="checked" disabled="disabled"><label for="ctl00_ContentBody_grvStudentResult_ctl02_chkAP"> </label></span>
            </center>
        </td>
        <td align="center" valign="middle">
            75.00
        </td>
        <td align="center" valign="middle">
            11/11/2018
        </td>
    </tr>
    <tr class="detail1">
        <td align="left" valign="middle">
            DipPM15PC
        </td>
        <td align="left" valign="middle">
            <span class="">
            1559|
            <a onclick="return hs.htmlExpand( this, {  objectType: 'iframe', width: 800, height: 600,  outlineWhileAnimating: true, preserveContent: false } )" href="DetailResults.aspx?sid=90898&amp;id=1769088&amp;nsub= [Project Costs] &amp;Subjectid=1559" title="NAN "> 
            <img alt="" style="display: online" src="../Images/Common/r_.gif" border="0">
            [Project Costs]   </a>
            </span>
            <span class="selected">
            [progress]
            </span>
        </td>
        <td align="left" valign="middle">
            <span class="disable">
            </span>
            <center style="display: none">
            </center>
        </td>
        <td align="center" valign="middle">
        </td>
        <td align="center" valign="middle">
        </td>
    </tr>
</tbody>
</table>

Upvotes: 0

Views: 143

Answers (2)

Nick

Reputation: 147236

A more robust way to parse HTML is to use DOMDocument: load the markup, then read the href attribute of every <a> element. I'm assuming your HTML is in a variable called $html:

$doc = new DOMDocument();
$doc->loadHTML($html);
$urls = []; // initialise so the second loop is safe when no links are found
foreach ($doc->getElementsByTagName('a') as $a) {
    // skip anchors that have no href attribute
    if ($a->hasAttribute('href')) {
        $urls[] = $a->getAttribute('href');
    }
}
foreach ($urls as $url) {
    echo $url . "\n";
}

Output:

DetailResults.aspx?sid=90651&id=1769095&nsub= [Project Quality] &Subjectid=1561 
DetailResults.aspx?sid=90898&id=1769088&nsub= [Project Costs] &Subjectid=1559

Demo on 3v4l.org
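
As a possible variation on the DOMDocument approach, here is a minimal sketch that handles two things real pages often have: markup that makes libxml emit warnings, and anchors with no href at all. The sample $html string is a shortened, hypothetical stand-in for the table in the question, and the second anchor was added only to show the filter at work.

```php
<?php
// Hypothetical sample input, shortened from the question's table.
$html = '<table><tr><td>'
      . '<a onclick="return false" href="DetailResults.aspx?sid=90651&amp;id=1769095&amp;nsub= [Project Quality] &amp;Subjectid=1561" title="Approved ">[Project Quality]</a>'
      . '<a name="no-href">anchor without href</a>'
      . '</td></tr></table>';

// Collect parser warnings internally instead of printing them.
$previous = libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Select only <a> elements that actually carry an href attribute.
$xpath = new DOMXPath($doc);
$urls = [];
foreach ($xpath->query('//a[@href]') as $a) {
    $urls[] = $a->getAttribute('href');
}

print_r($urls);
```

The XPath filter means the loop never touches an anchor without an href, and the attribute value comes back with its spaces intact and its entities already decoded.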

If you have to use regex, this will work for your sample data:

preg_match_all('/href="([^"]+)/', $html, $matches);
print_r($matches[1]);

Output:

Array
(
    [0] => DetailResults.aspx?sid=90651&amp;id=1769095&amp;nsub= [Project Quality] &amp;Subjectid=1561
    [1] => DetailResults.aspx?sid=90898&amp;id=1769088&amp;nsub= [Project Costs] &amp;Subjectid=1559
)

Demo on 3v4l.org
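
Note that the regex returns the raw attribute text, so the matches still contain &amp;, whereas DOMDocument returns decoded values with a plain &. If you want the regex result to look like the DOMDocument one, a small sketch along these lines should do it; the one-link $html string is a hypothetical sample cut from the question's markup.

```php
<?php
// Hypothetical one-link sample taken from the question's markup.
$html = '<a href="DetailResults.aspx?sid=90898&amp;id=1769088&amp;nsub= [Project Costs] &amp;Subjectid=1559" title="NAN ">[Project Costs]</a>';

preg_match_all('/href="([^"]+)/', $html, $matches);

// Decode &amp; (and other entities) so the values match what a DOM parser returns.
$urls = array_map('html_entity_decode', $matches[1]);

print_r($urls);
```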

Upvotes: 1

Mahfuzar Rahman

Reputation: 303

I'm not an expert, but this works for me:

$string ='<a onclick="return hs.htmlExpand( this, {  objectType: \'iframe\', width: 800, height: 600,  outlineWhileAnimating: true, preserveContent: false } )" href="DetailResults.aspx?sid=90651&amp;id=1769095&amp;nsub= [Project Quality] &amp;Subjectid=1561" title="Approved "> 
            <img alt="" style="display: online" src="../Images/Common/r_Approved.gif" border="0">
            [Project Quality]   </a>';
preg_match_all('~<a .*?href=[\'"](.*?)[\'"].*?>~', $string, $match);

// $match[0] holds the full <a ...> tags; $match[1] holds the captured href values
$urls = array(); // array of links
foreach ($match[1] as $href) {
    $urls[] = $href;
}
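
One limitation of a pattern like the one above is that it accepts either quote character as the closing delimiter, so a single-quoted href that contains a double quote (or the reverse) gets cut short. A possible refinement, offered only as a sketch, is to capture the opening quote and require the same one to close via a backreference. The $html sample and its two links are made up for illustration.

```php
<?php
// Hypothetical sample: one double-quoted href with a space,
// one single-quoted href that contains double quotes.
$html = '<a onclick="x()" href="a b.php?x=1">one</a> <a href=\'c.php?q="z"\'>two</a>';

// Group 1 captures the opening quote; \1 requires the same quote to close.
// Group 2 is the href value itself.
preg_match_all('~<a\b[^>]*?\shref=(["\'])(.*?)\1~is', $html, $match);

$urls = $match[2];

print_r($urls);
```

This is still a regex working on HTML, so it will miss unquoted attributes and can be fooled by unusual markup; a DOM parser remains the safer choice.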

Upvotes: 0
