user3018583

Reputation: 15

PHP: Given two strings, display only the text between them

I'm trying to build a small dashboard for my ops crew that displays the current NOTAM Runway Surface Conditions (RSC) for a given airport. Our operational needs change depending on the weather and the runway conditions.

Unfortunately, the agency that publishes the RSC data uses generic IDs, names, and classes for the DIVs for all the data on the page. So I'm left with ALL the data, even the stuff I don't want.

Here's my current script, which pulls the page with the data and then hides all the elements I can:

<script type="text/javascript" src="jquery-1.5.2.min.js"></script>
<script type="text/javascript">
$(function() {
    $("input.printcheckbox").remove();

    $("div#notam_station_whole_section").appendTo("div#NOTAMRSC");

    $("div#RAW").remove();
});
</script>

<?php

$str = file_get_contents("http://www.flightplanning.navcanada.ca/cgi-bin/Fore-obs/ewx_traiter_notam.cgi?Recall=ni_File&Langue=anglais&TypeBrief=L&Rayon=50&Station=CYVR");

echo "<div id=\"RAW\">";

echo $str;

echo "</div>";

?>

<div id="NOTAMRSC"></div>

If you run the script yourself, you'll get a heap of text, nearly all of it irrelevant.

For this example, I'm trying to pull the RSC data for CYXX (Abbotsford Airport), where the only relevant information is this text:

CYXX RSC 01/19 100 PCT DRY SN TRACE. 1312091630
CYXX RSC 07/25 100 PCT DRY SN TRACE. 1312091630
RMK: TWY ALPHA, DRY SNOW 100 PCT TRACE  ALPHA 1, DRY SNOW 100 PCT 
TRACE  BRAVO, DRY SNOW 100 PCT TRACE  CHARLIE, DRY SNOW 100 PCT 
TRACE  CHARLIE 1, DRY SNOW 100 PCT TRACE  CHARLIE 4, DRY SNOW 100 
PCT TRACE  DELTA, DRY SNOW 100 PCT TRACE  GOLF, DRY SNOW 100 PCT 
TRACE
RMK: APN APRON I, DRY SNOW 100 PCT TRACE APRON RUN-UP, DRY SNOW 100 
PCT TRACE

I've been trying to figure out a way to pull only the text above in PHP, but since all the DIVs are the same, I can't get any DOM or regex approach to work.

What I'm after is a script to display all the text between two given strings, and ignore the rest.

For this example, the first string would be:

000000 CYXX ABBOTSFORD

which ALWAYS precedes the RSC data I am looking for, for any airport (with the airport identifier, CYXX in this case, and the name, "ABBOTSFORD" in this case, changing).

What I'd then want is to display any of the text after that string, and before the next instance of:

</pre>
</span></div>

This would then allow me to use this script for future airports as we expand, just by changing the first string to match the new airport.

Any and all help is greatly appreciated.

Upvotes: 1

Views: 131

Answers (1)

elixenide

Reputation: 44841

Try this:

echo preg_replace('#(?s)^.*?000000\s+CYXX\s+ABBOTSFORD(.+?)</pre>\s*</span>\s*</div>.*$#', '\1', $str);


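Explanation: match anything and everything up to and through 000000 CYXX ABBOTSFORD, then capture everything (in a non-greedy way) until you hit </pre></span></div> (with or without whitespace between the tags), and match everything else until the end. Because the pattern consumes the whole input and the replacement is just \1, only the captured text is left.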
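To see what the pattern keeps, here is a minimal sketch that runs it against a small made-up string. The sample markup (including the CYVR block) is an assumption standing in for the real page, which may be structured differently.

```php
<?php
// Hypothetical sample standing in for the fetched page; the real markup may differ.
$str = "<div><span><pre>\n000000 CYVR VANCOUVER INTL\nCYVR RSC 08L/26R 100 PCT WET\n</pre>\n</span></div>\n"
     . "<div><span><pre>\n000000 CYXX ABBOTSFORD\nCYXX RSC 01/19 100 PCT DRY SN TRACE. 1312091630\n</pre>\n</span></div>\n";

// Same pattern as above: (?s) lets "." match newlines as well.
$pattern = '#(?s)^.*?000000\s+CYXX\s+ABBOTSFORD(.+?)</pre>\s*</span>\s*</div>.*$#';

// The whole input is replaced by capture group 1.
$result = preg_replace($pattern, '\1', $str);

// trim() drops the newlines that surround the captured block.
echo trim($result), "\n";
```

With this sample, only the CYXX line is printed; the CYVR block before it is discarded.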

Note that you can replace CYXX and ABBOTSFORD with whatever you want.
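If you want to swap airports without editing the pattern by hand, something along these lines might work. The function name extractRsc and its signature are hypothetical, and it uses preg_match instead of preg_replace so that a missing airport gives null; preg_replace returns the input unchanged when nothing matches, which here would be the entire page.

```php
<?php
/**
 * Hypothetical helper (name and signature are illustrative only).
 * Returns the text between "000000 <ident> <name>" and the next
 * </pre></span></div>, or null when that header is not found.
 */
function extractRsc(string $page, string $ident, string $name): ?string
{
    // preg_quote() escapes regex metacharacters in the supplied values;
    // its second argument also escapes the "#" delimiter.
    $header  = '000000\s+' . preg_quote($ident, '#') . '\s+' . preg_quote($name, '#');
    $pattern = '#' . $header . '(.+?)</pre>\s*</span>\s*</div>#s';

    if (preg_match($pattern, $page, $m) === 1) {
        return trim($m[1]);
    }
    return null; // header not present on the page
}

// Made-up sample input; the real page markup may differ.
$page = "<div><span><pre>\n000000 CYXX ABBOTSFORD\n"
      . "CYXX RSC 01/19 100 PCT DRY SN TRACE. 1312091630\n</pre>\n</span></div>\n";

echo extractRsc($page, 'CYXX', 'ABBOTSFORD') ?? 'No RSC data found', "\n";
```

In the original script you would pass $str (the result of file_get_contents) as the first argument.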

EDIT That should have had a \1 above.

Upvotes: 2
