Shawn

Reputation: 13

How to segment file input into portions in Java

I need to separate each rule in the file below. How can I do that in Java?

These are the file contents:

rule apt_regin_2011_32bit_stage1 {
meta:
copyright = "Kaspersky Lab"
 description = "Rule to detect Regin 32 bit stage 1 loaders"
 version = "1.0"
 last_modified = "2014-11-18"
strings:
$key1={331015EA261D38A7}
$key2={9145A98BA37617DE}
$key3={EF745F23AA67243D}
$mz="MZ"
condition:
($mz at 0) and any of ($key*) and filesize < 300000
}


rule apt_regin_rc5key {
meta:
copyright = "Kaspersky Lab"
 description = "Rule to detect Regin RC5 decryption keys"
 version = "1.0"
 last_modified = "2014-11-18"
strings:
$key1={73 23 1F 43 93 E1 9F 2F 99 0C 17 81 5C FF B4 01}
$key2={10 19 53 2A 11 ED A3 74 3F C3 72 3F 9D 94 3D 78}
condition:
any of ($key*)
}



rule apt_regin_vfs {
meta:
copyright = "Kaspersky Lab"
 description = "Rule to detect Regin VFSes"
 version = "1.0"
 last_modified = "2014-11-18"
strings:
$a1={00 02 00 08 00 08 03 F6 D7 F3 52}
$a2={00 10 F0 FF F0 FF 11 C7 7F E8 52}
$a3={00 04 00 10 00 10 03 C2 D3 1C 93}
$a4={00 04 00 10 C8 00 04 C8 93 06 D8}
condition:
($a1 at 0) or ($a2 at 0) or ($a3 at 0) or ($a4 at 0)
}


rule apt_regin_dispatcher_disp_dll {
meta:
copyright = "Kaspersky Lab"
 description = "Rule to detect Regin disp.dll dispatcher"
 version = "1.0"
 last_modified = "2014-11-18"
strings:
$mz="MZ"
 $string1="shit"
 $string2="disp.dll"
 $string3="255.255.255.255"
 $string4="StackWalk64"
 $string5="imagehlp.dll"
condition:
($mz at 0) and (all of ($string*))
}

As seen in the file, I need to separate each of the 4 rules found in the input. Any idea how I can do this? Please be patient with me, I am a newbie. Thanks in advance!

After separating all 4 rules, I need to put each rule into an ArrayList.

For example: Arraylist[0]

rule apt_regin_2011_32bit_stage1 {
meta:
copyright = "Kaspersky Lab"
 description = "Rule to detect Regin 32 bit stage 1 loaders"
 version = "1.0"
 last_modified = "2014-11-18"
strings:
$key1={331015EA261D38A7}
$key2={9145A98BA37617DE}
$key3={EF745F23AA67243D}
$mz="MZ"
condition:
($mz at 0) and any of ($key*) and filesize < 300000
}

Arraylist[1]

rule apt_regin_rc5key {
meta:
copyright = "Kaspersky Lab"
 description = "Rule to detect Regin RC5 decryption keys"
 version = "1.0"
 last_modified = "2014-11-18"
strings:
$key1={73 23 1F 43 93 E1 9F 2F 99 0C 17 81 5C FF B4 01}
$key2={10 19 53 2A 11 ED A3 74 3F C3 72 3F 9D 94 3D 78}
condition:
any of ($key*)
}

Arraylist[2]

rule apt_regin_vfs {
meta:
copyright = "Kaspersky Lab"
 description = "Rule to detect Regin VFSes"
 version = "1.0"
 last_modified = "2014-11-18"
strings:
$a1={00 02 00 08 00 08 03 F6 D7 F3 52}
$a2={00 10 F0 FF F0 FF 11 C7 7F E8 52}
$a3={00 04 00 10 00 10 03 C2 D3 1C 93}
$a4={00 04 00 10 C8 00 04 C8 93 06 D8}
condition:
($a1 at 0) or ($a2 at 0) or ($a3 at 0) or ($a4 at 0)
}

and so on.

How can I do this?

Upvotes: 1

Views: 185

Answers (1)

GhostCat

Reputation: 140525

Just for the record: if your problem is only to "segment" the "rules" in your input, you can start a new section whenever a line begins with "rule " and collect every non-empty line that follows:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;

// Goes inside a method that declares "throws IOException";
// "file" is the path (or File) of your rules file.
List<List<String>> sections = new ArrayList<>();
List<String> currentSection = null;

try (BufferedReader br = new BufferedReader(new FileReader(file))) {
  String line;
  while ((line = br.readLine()) != null) {
    if (line.startsWith("rule ")) {
      if (currentSection != null) {
        // we are finished with the previous section!
        sections.add(currentSection);
      }
      currentSection = new ArrayList<>();
      currentSection.add(line);
    } else if (currentSection != null && !line.trim().isEmpty()) {
      // any non-empty line after the first "rule" goes into the current section
      currentSection.add(line);
    }
  }
}
if (currentSection != null) {
  // make sure to add the final section, too!
  sections.add(currentSection);
}
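
The loop above keeps each rule as a list of lines. The question asks for each rule as one ArrayList entry, so here is a minimal sketch of that variant, assuming (as above) that every rule starts on a line beginning with "rule ". The class name RuleSplitter and the method splitRules are made up for illustration, and the file path is taken from the first command-line argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: one String per rule instead of one list of lines per rule.
final class RuleSplitter {

    // Assumption: a rule begins at a line starting with "rule "; blank lines are skipped.
    static List<String> splitRules(List<String> lines) {
        List<String> rules = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            if (line.startsWith("rule ")) {
                if (current != null) {
                    // the previous rule is complete
                    rules.add(current.toString());
                }
                current = new StringBuilder(line);
            } else if (current != null && !line.trim().isEmpty()) {
                current.append('\n').append(line);
            }
        }
        if (current != null) {
            // do not forget the last rule
            rules.add(current.toString());
        }
        return rules;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be the path of the rules file
        List<String> rules = splitRules(Files.readAllLines(Paths.get(args[0])));
        for (int i = 0; i < rules.size(); i++) {
            System.out.println("Arraylist[" + i + "]");
            System.out.println(rules.get(i));
            System.out.println();
        }
    }
}
```

With the file from the question, rules.get(0) would hold the whole apt_regin_2011_32bit_stage1 rule, rules.get(1) the apt_regin_rc5key rule, and so on.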

But then: you are not very precise about your real requirements. I am pretty sure that your real problem is not just "segmenting" that input file.

Most likely, your actual task is to read that file, and for each of the sections within that file, you need to fetch some/all of its content for further processing.

In other words: you are actually asking "how do I parse/process this input?". And we can't answer that question, as you didn't tell us what exactly you want to do with that data.

In essence, this is your option space:

  1. If there really is such a fixed layout, then "parsing" boils down to understanding "first comes rule, then comes meta, which looks like ...". Meaning: you "hard-code" the structure of your data into your code. Example: you "know" exactly that the third line contains copyright = "some value". Then you start using regular expressions (or simple String methods like indexOf(), substring()) to extract the information you are interested in.
  2. If the file format is actually some kind of "standard" (such as XML, JSON, YAML, ...) then you might simply pick up some 3rd party library to parse such files. For your example ... I can't say; this is definitely not a format I am familiar with.
  3. Worst case, you need to write your own parser. Writing parsers is a complex, but "well researched" topic, see here for example.
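
To make option 1 a bit more concrete, here is a rough sketch of pulling the key = "value" pairs out of the meta: part of one section with a regular expression. It assumes the layout shown in the question (a meta: line, followed by key = "value" lines, ended by strings: or condition:). The names MetaExtractor and extractMeta are invented for this example, and the pattern is not a complete parser for the format.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that hard-codes the "meta:" layout seen in the question.
final class MetaExtractor {

    // Matches lines such as:  description = "Rule to detect Regin VFSes"
    private static final Pattern META_LINE =
            Pattern.compile("\\s*(\\w+)\\s*=\\s*\"(.*)\"\\s*");

    static Map<String, String> extractMeta(List<String> section) {
        Map<String, String> meta = new LinkedHashMap<>();
        boolean inMeta = false;
        for (String line : section) {
            String trimmed = line.trim();
            if (trimmed.equals("meta:")) {
                inMeta = true;            // key/value lines start after this one
            } else if (trimmed.equals("strings:") || trimmed.equals("condition:")) {
                inMeta = false;           // assumption: these headers end the meta part
            } else if (inMeta) {
                Matcher m = META_LINE.matcher(line);
                if (m.matches()) {
                    meta.put(m.group(1), m.group(2));
                }
            }
        }
        return meta;
    }

    public static void main(String[] args) {
        List<String> section = Arrays.asList(
                "rule apt_regin_vfs {",
                "meta:",
                "copyright = \"Kaspersky Lab\"",
                " description = \"Rule to detect Regin VFSes\"",
                "strings:",
                "$a1={00 02 00 08 00 08 03 F6 D7 F3 52}",
                "condition:",
                "($a1 at 0)",
                "}");
        System.out.println(extractMeta(section));
    }
}
```

The map keeps the keys in file order. Anything that deviates from the assumed layout (for example a value without quotes) is silently skipped, which is exactly the kind of limitation that comes with hard-coding the structure.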

Upvotes: 1
