user1222589

Reputation: 87

How to get <script> code from the page source using a regular expression?

I want to get the script code, excluding scripts that reference an external file. For example, I want to get the code of:

<script type="text/javascript">alert("test");</script> 

from the following string s:

var s ='<script type="text/javascript" src="image/plupload/plupload.js"></script><script type="text/javascript" src="image/plupload/plupload.flash.js"></script><script type="text/javascript" src="image/plupload/plupload.html4.js"></script><script type="text/javascript" src="image/plupload/plupload.html5.js"></script><script type="text/javascript">alert("test");</script>'

p = /<(script)\s+((language=['"]?javascript['"]?)|(type=['"]?text\/javascript['"]?))?\s*\/?>.*(?:<\/\1>)?/gi;
var arr = new Array();
while(arr = p.exec(s)) 
alert(arr[1]+','+arr[1]+','+arr[2]);

But the regular expression is wrong, and I can't get the right result.

Upvotes: 0

Views: 927

Answers (2)

Scott Weaver

Reputation: 7351

Parsing non-nested script tags out of an arbitrary string with a regular expression in JavaScript is easy:

var input = '<div>random html in here</div><script type="text/javascript" src="image/plupload/plupload.js"></script><script type="text/javascript" src="image/plupload/plupload.flash.js"></script><script type="text/javascript" src="image/plupload/plupload.html4.js"></script><script type="text/javascript" src="image/plupload/plupload.html5.js"><div>random html in here</div></script><script type="text/javascript">alert("test");</script><div>random html in here</div>';
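var pattern = /<script\b[^>]*>[\s\S]*?<\/script>/gi; // each whole <script>...</script> element, including multi-line bodies
var matches = input.match(pattern) || []; // match() returns null when nothing matches
var result = "";
for (var i = 0; i < matches.length; i++) result += "Match:" + matches[i] + "\n"; // index loop; for...in is meant for object keys
alert(result);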
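
The pattern above returns every script element, including the external ones. To get only the inline code the question asks for, one possible refinement is a negative lookahead that rejects opening tags with a src attribute, plus a capture group around the body. This is a rough sketch rather than a robust parser: the helper name extractInlineScripts and the sample string are made up for illustration, and the lookahead would also reject a tag that has an attribute such as data-src.

```javascript
// Hypothetical helper: return the code of every inline <script> in an HTML string.
function extractInlineScripts(html) {
  // (?![^>]*\bsrc\s*=) rejects opening tags that carry a src attribute;
  // ([\s\S]*?) lazily captures the script body, including line breaks.
  var pattern = /<script\b(?![^>]*\bsrc\s*=)[^>]*>([\s\S]*?)<\/script>/gi;
  var code = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    code.push(match[1]); // group 1 is the text between the tags
  }
  return code;
}

// Shortened version of the string from the question.
var sample = '<script type="text/javascript" src="image/plupload/plupload.js"></script>' +
             '<script type="text/javascript">alert("test");</script>';
var inlineCode = extractInlineScripts(sample); // one entry: alert("test");
```

Returning only group 1 avoids having to strip the surrounding tags afterwards.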

Upvotes: 2

Manishearth

Reputation: 16188

Never parse HTML with regex. The <center> cannot hold.

Either give each script a unique id and use document.getElementById(MyId).textContent, or loop over the scripts and read document.getElementsByTagName('script')[i].textContent for each one.
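
A minimal sketch of the loop approach, assuming the scripts are already part of a parsed document rather than sitting in a string. The helper name getInlineScriptCode is hypothetical, and skipping elements that have a src attribute is an addition here to match the question's requirement of excluding external scripts.

```javascript
// Hypothetical helper: collect the code of every inline <script> in a document.
function getInlineScriptCode(doc) {
  var scripts = doc.getElementsByTagName('script');
  var code = [];
  for (var i = 0; i < scripts.length; i++) {
    // External scripts carry a src attribute and have no inline code to read.
    if (!scripts[i].hasAttribute('src')) {
      code.push(scripts[i].textContent);
    }
  }
  return code;
}
```

In a browser this would be called as getInlineScriptCode(document).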

Upvotes: 3
