Reputation: 4688
For an excercisement i had to look for a certain byte pattern within an byte array, easy enough but i wonder if the code can be simplified or even be optimized:
package anti_virus;
import java.nio.file.Files;
import java.nio.file.Paths;
public class Main {
public static void main(String[] args) throws Exception {
byte[] virus = Files.readAllBytes(Paths.get("C:/Users/Nick/Desktop/Uni/infected.com"));
byte[] payload = new byte[]{0x56, 0x69, 0x72, 0x75, 0x73, (byte)0xB4, 0x40, (byte) 0xBB, 0x01,
0x00, (byte) 0xB9, 0x05, 0x00, (byte) 0xBA, 0x0, 0x0, (byte) 0xCD, 0x21};
// payload[14] and payload[14] have varying values
for (int i = 0; i < virus.length; i++) {
if ((virus[i] == payload[0]) && (virus[i+1] == payload[1]) && (virus[i+2] == payload[2]) &&
(virus[i+3] == payload[3]) && (virus[i+4] == payload[4]) && (virus[i+5] == payload[5]) &&
(virus[i+6] == payload[6]) && (virus[i+7] == payload[7]) && (virus[i+8] == payload[8]) &&
(virus[i+9] == payload[9]) && (virus[i+10] == payload[10]) && (virus[i+11] == payload[11]) &&
(virus[i+12] == payload[12]) && (virus[i+13] == payload[13]) && (virus[i+16] == payload[16]) &&
(virus[i+17] == payload[17])) {
System.out.println("This file is probably a Virus!");
return;
}
}
System.out.println("This file is no Virus.");
}
}
Upvotes: 1
Views: 1613
Reputation: 2275
Your code is quite good, It gives a reasonable 21 ms on a non-virus 6 MB file. But I found it better to either make some pre-loop for the first 14 bytes. Moreover, you have to take care at the ending bytes.
begin = System.currentTimeMillis();
for (i = 0; i < virus.length-payload.length; i++) {
for (j = 0; j < 14; j++) {
// payload[14] and payload[15] have varying values
if (virus[i+j] != payload[j]) {
bFound = false;
break;
}
}
if ((bFound) && (virus[i+16] == payload[16]) && (virus[i+17] == payload[17])) {
end = System.currentTimeMillis();
System.out.println("time : "+(end-begin)+" ms");
System.out.println("This file is probably a Virus!");
return;
}
}
end = System.currentTimeMillis();
System.out.println("time : "+(end-begin)+" ms");
System.out.println("This file is not a Virus.");
This first optim give a a reasonnable 14 ms (-33% of CPU).
Another optimization if you can afford reading your file as integer, is to make wide comparison (4 bytes) at a time. You should pad also the payload to a multiple of 4.
begin = System.currentTimeMillis();
for (i = 0; i < virusInt.length-payloadInt.length; i++) {
if ((virusInt[i] == payloadInt[0]) &&
(virusInt[i+1] == payloadInt[1]) &&
(virusInt[i+2] == payloadInt[2]) &&
((virusInt[i+3]&0xFFFF0000) == payloadInt[3]) &&
((virusInt[i+4]&0xFFFF0000) == payloadInt[4])) {
end = System.currentTimeMillis();
System.out.println("time : "+(end-begin)+" ms");
System.out.println("This file is probably a Virus!");
return;
}
}
end = System.currentTimeMillis();
System.out.println("time : "+(end-begin)+" ms");
System.out.println("This file is not a Virus.");
This give me an even more reasonable 2 ms (-90% of CPU). Of course, I do not count the time to transform into array of int as I suppose you load as array of int and your payload is an array of int too. I have not tried with long (which are 64 bits in JAVA) but it could be a little bit faster.
Upvotes: 1
Reputation: 82461
Yes, it can be simplified/optimized:
O(payload.length + virus.length)
for arbitrary payload
instead of O(payload.length * virus.length)
. (Your code is working more efficiently than O(payload.length * virus.length)
for only one reason: 0x56
occurs only as the first element of your array)ArrayIndexOutOfBoundsException
in your loop (you may access indices i, ..., i+13, i+16, i+17
of the virus
array and your loop condition allows i
to get as large as virus.length-1
).Upvotes: 3
Reputation: 16605
Something like this would check for the signature anywhere within the array, it has not been thoroughly tested though
public static void main(String[] args) throws Exception {
byte[] virus = FileUtil.readBytes(new File("c:/x.txt"));
byte[] payload = "def".getBytes();
for (int i = 0; i < virus.length; i++) {
if ((i + payload.length) <= virus.length) {
boolean found = true;
for (int j = 0; j < payload.length; j++) {
if (virus[i + j] != payload[j]) {
found = false;
break;
}
}
if (found) {
System.out.println("This file is probably a Virus!");
return;
}
} else {
break;
}
}
System.out.println("This file is no Virus.");
}
Upvotes: 0
Reputation: 109547
(Here I assume that virus is the virus signature, and payload any data. I might be wrong seeing your code.)
One has to walk the payload array for paylöadIndex in [0, payload.length - virus.length] (!) and at every step check the virus array again, in a for-loop, with a virusIndex.
Problem solution strategy. Think how you would do it on paper. You would shift the virus array over the payload array.
Upvotes: 0