Reputation: 456

Telling if a requested file is JavaScript

I have a program that logs every GET/POST request a website makes while the page loads. I want to go through these requests one by one, execute each of them, and determine whether the file that was returned is JavaScript. Given that it won't necessarily have a .js extension (because of scripts like this, yanked from google.com a minute ago), how can I parse the file returned by the request and identify whether it is a JavaScript file?

Thanks!

EDIT: It is better to get a false positive than a false negative. That is, I would rather have some non-JS included in the JS-list than cut some real JS from the list.

Upvotes: 1

Answers (2)

cnvzmxcvmcx

Reputation: 1111

The JavaScript link that you referred to does not have a content type, nor does it have the .js extension. Any text file can be considered JavaScript if it can be executed, which makes detection from scratch very difficult. There are two methods that come to mind.

1. Run a linter on the file contents. If it reports a syntax error or a parsing error, the file is not JavaScript. If it reports neither, the file should be considered JavaScript.
2. Parse the file contents into an AST (abstract syntax tree). A JavaScript file would parse without errors. There should be a number of JS parser libraries available. I haven't worked with JS ASTs, so I can't recommend any one of them, but a quick search should give you some options.

I am not sure, but a linter probably also builds an AST before doing its syntax checks. If so, parsing to an AST directly seems like the lighter option.
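
As a rough illustration of the parse-and-see-if-it-fails idea, here is a minimal sketch for Node.js. It swaps a dedicated AST library for the built-in `vm` module, which compiles source without running it. The function name `parsesAsScript` is made up for this example, and the sketch assumes the file should be treated as a classic script rather than an ES module.

```javascript
const vm = require('vm');

// Hypothetical helper: returns true when `source` compiles as a classic script.
// `new vm.Script(...)` only compiles; nothing is executed unless a run* method is called.
function parsesAsScript(source) {
  try {
    new vm.Script(source);
    return true;
  } catch (err) {
    // A syntax error means the text is not valid script source.
    if (err && err.name === 'SyntaxError') {
      return false;
    }
    // Anything else is unexpected, so surface it instead of guessing.
    throw err;
  }
}

console.log(parsesAsScript('var x = 1; function f() { return x; }'));
console.log(parsesAsScript('<!DOCTYPE html><html></html>'));
```

Two caveats. Files that use `import`/`export` at the top level fail to compile this way, so module code would need a parser that supports module syntax. In the other direction, some non-JS text (an empty file, a single word, a JSON array) still compiles, which produces false positives rather than false negatives.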

Upvotes: 1

Webber

Reputation: 5514

The easiest way would be to identify JavaScript files by their URI, because the alternatives are a lot heavier. But since you said this isn't an option, you can check the syntax of the contents of each file using some heuristic tool. You can also check the response headers for the Content-Type.
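
For the header check, a minimal sketch in Node.js might look like the following. The function name `isJavaScriptContentType` is made up for illustration, and the list of types is the set of JavaScript MIME types from the WHATWG MIME Sniffing standard as I understand it. Since a server may omit or mislabel the header, a `false` result here probably means "fall back to a syntax check" rather than "not JS".

```javascript
// JavaScript MIME types (essence only, lowercase), per the WHATWG MIME Sniffing standard.
const JS_MIME_TYPES = new Set([
  'application/ecmascript',
  'application/javascript',
  'application/x-ecmascript',
  'application/x-javascript',
  'text/ecmascript',
  'text/javascript',
  'text/javascript1.0',
  'text/javascript1.1',
  'text/javascript1.2',
  'text/javascript1.3',
  'text/javascript1.4',
  'text/javascript1.5',
  'text/jscript',
  'text/livescript',
  'text/x-ecmascript',
  'text/x-javascript',
]);

// Hypothetical helper: takes the raw Content-Type header value (or undefined).
function isJavaScriptContentType(headerValue) {
  if (typeof headerValue !== 'string') {
    return false; // header missing
  }
  // Drop parameters such as "; charset=UTF-8" and normalise case.
  const essence = headerValue.split(';')[0].trim().toLowerCase();
  return JS_MIME_TYPES.has(essence);
}

console.log(isJavaScriptContentType('text/javascript; charset=UTF-8'));
console.log(isJavaScriptContentType('text/html'));
```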

Upvotes: 0
