Reputation: 13
Small PDF files (under 200KB) are being used to test the performance of SophosLabs Intelix.
The official example code has been used: https://github.com/sophoslabs/intelix-lambda-example/blob/master/resources/intelix_file_check.py
According to the steps outlined in the example code, any score that is neither clearly malicious nor clearly clean requires the next level of analysis. Based on the scores returned, every PDF file submitted has needed the full sequence, from "Cloud Lookup" through to "Dynamic Analysis".
This makes scanning rather slow (each file takes 5 minutes to complete scanning). Is there a better way to perform scanning?
# 1. Cloud Lookup
# 2. Static Analysis
# 3. Dynamic Analysis
# If the score is <20 then the file is malicious
# If the score is >70 then the file is clean
# Any other score and the next level of analysis is required
One of the PDFs used is a small PDF from a government website: https://www.housingauthority.gov.hk/en/common/pdf/global-elements/forms/flat-application/HD300.pdf
Upvotes: 1
Views: 119
Reputation: 1636
Great question. To answer this, it is worth understanding the different APIs, their benefits, and their limitations.
Cloud Lookups
File Hash Lookup is a simple lookup of the file's SHA-256 hash against a database run by SophosLabs.
File Hash Lookup is naturally very fast but relies on SophosLabs having seen the file previously. Cloud Lookups are not very helpful for document files (including PDFs) because the hash changes each time the document is modified.
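To illustrate why a hash lookup misses modified documents, here is a minimal sketch using Python's standard `hashlib`. The byte strings are made-up stand-ins for a PDF's contents, and `sha256_hex` is a helper name chosen for this example, not part of any Sophos API.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in content; a real check would hash the bytes read from the file.
original = b"%PDF-1.4\n% placeholder document body\n"
# A single extra byte, such as the kind of change re-saving a document can cause.
edited = original + b" "

# The two digests differ, so a lookup keyed on the first hash
# would not recognise the edited copy.
print(sha256_hex(original) == sha256_hex(edited))  # False
```

Any change to the file, however small, produces an unrelated digest, which is why documents that are filled in or re-saved are unlikely to have been seen before.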
Static Analysis
Static Analysis examines the file's features and structure, combining traditional anti-malware technology with newer technologies from SophosAI.
Because Static Analysis scans the file itself, it doesn't rely on Sophos having seen the file before to deliver a verdict. However, because the file is not detonated, the result can be ambiguous (for example, Likely Clean or Suspicious).
Dynamic Analysis
Dynamic Analysis submits the file to the SophosLabs Sandbox to analyse its behaviour as it detonates.
Dynamic Analysis will always give a definite verdict (Clean or Malicious) as the file will either exhibit malicious behaviours or not. However, this takes the longest time to run. Because the file has to detonate in a VM, the scan time will always be measured in minutes.
Picking the correct API
Now, back to the original question. Yes, scanning a simple file with all the services will take minutes. There is a tradeoff here between performance and security. Many people find that Static Analysis is sufficient to pick up malicious files in their environment, so they do not run Dynamic Analysis at all, which dramatically reduces the scan time. Other people run Dynamic Analysis out of band (asynchronously) so that it does not hold up the user.
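As a rough sketch of the second option, the sandbox run can be handed to a background worker while the static verdict is returned straight away. Everything named here is hypothetical: `static_verdict` and `dynamic_verdict` stand in for whatever functions wrap your Intelix calls, and the stubs below simply return fixed strings.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Tuple

# Hypothetical signature: takes a file path, returns a verdict string.
Verdict = Callable[[str], str]


def scan_with_deferred_sandbox(
    path: str,
    static_verdict: Verdict,
    dynamic_verdict: Verdict,
    executor: ThreadPoolExecutor,
) -> Tuple[str, "Future[str]"]:
    """Return the static verdict now and a Future for the sandbox verdict."""
    quick = static_verdict(path)                      # fast, blocks the caller briefly
    pending = executor.submit(dynamic_verdict, path)  # slow, runs in the background
    return quick, pending


# Stubs standing in for real API wrappers (assumptions, not Sophos code).
def fake_static(path: str) -> str:
    return "Likely Clean"


def fake_dynamic(path: str) -> str:
    return "Clean"


with ThreadPoolExecutor(max_workers=2) as pool:
    quick, pending = scan_with_deferred_sandbox(
        "HD300.pdf", fake_static, fake_dynamic, pool
    )
    print(quick)             # the caller can act on this immediately
    print(pending.result())  # collected later, once the sandbox run finishes
```

The caller decides what to do with the later result, for example quarantining a file after the fact if the sandbox verdict turns out to be Malicious.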
Of course, there will always be an overhead in sending a file to SophosLabs and running an analysis. If this overhead is too high for your application, consider running an anti-malware solution inside your infrastructure.
In summary, if you want to scan document files quickly, submit them to the static analysis API only.
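For illustration, the tiered flow from the example code's comments can be written so that the sandbox step is optional. This is only a sketch under assumptions: the thresholds are the ones quoted in the question (<20 malicious, >70 clean), and `lookup`, `static` and `dynamic` are hypothetical callables that take a file path and return a 0-100 score. They are not the functions from the Sophos example.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signature: takes a file path, returns a 0-100 score.
Scorer = Callable[[str], int]


def classify(score: int) -> str:
    """Map a score to a verdict using the thresholds quoted in the question."""
    if score < 20:
        return "malicious"
    if score > 70:
        return "clean"
    return "inconclusive"


def tiered_scan(
    path: str,
    lookup: Scorer,
    static: Scorer,
    dynamic: Optional[Scorer] = None,
) -> Tuple[str, str]:
    """Run the tiers in order and stop at the first conclusive verdict.

    Passing dynamic=None skips the sandbox, so the scan ends after
    static analysis even if the verdict is still inconclusive.
    Returns (name of the last tier run, verdict).
    """
    tiers = [("cloud_lookup", lookup), ("static", static)]
    if dynamic is not None:
        tiers.append(("dynamic", dynamic))

    for name, scorer in tiers:
        verdict = classify(scorer(path))
        if verdict != "inconclusive":
            return name, verdict
    return tiers[-1][0], "inconclusive"


# Stub scorers with made-up values: unknown to the lookup, clean on static analysis.
print(tiered_scan("HD300.pdf", lambda p: 50, lambda p: 85))
```

With the sandbox left out, an inconclusive static result has to be handled by your own policy, for example by treating it as suspicious.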
Upvotes: 1