Shira'H

Reputation: 33

Pattern matching parser

I wrote some Java code to parse the following text file:

     (FAMIX.Attribute (id: 22)
(name 'obj_I')
(parentType (ref: 11))
(declaredType (ref: 27))
(isPrivate true)
   )

   (FAMIX.Attribute (id: 38)
(name 'obj_k')
(parentType (ref: 34))
(declaredType (ref: 43))
(isPrivate true)
   )

  (FAMIX.Attribute (id: 56)
(name 'obj_K')
(parentType (ref: 46))
(declaredType (ref: 43))
(isPrivate true)
    )

  (FAMIX.Attribute (id: 73)
(name 'obj_L')
(parentType (ref: 64))
(declaredType (ref: 45))
(isPrivate true)
    )

 (FAMIX.Attribute (id: 67)
(name 'obj_G')
(parentType (ref: 64))
(declaredType (ref: 46))
(isPrivate true)
    )

 (FAMIX.Attribute (id: 93)
(name 'classD')
(parentType (ref: 85))
(declaredType (ref: 94))
(isPrivate true)
   )

  (FAMIX.Attribute (id: 99)
(name 'classC')
(parentType (ref: 86))
(declaredType(ref: 86))
(isPackage true)
    )

 (FAMIX.Attribute (id: 114)
(name 'classB')
(parentType (ref: 94))
(declaredType (ref: 11))
(isPrivate true)
    )

  (FAMIX.Attribute (id: 107)
(name 'obj_c')
(parentType (ref: 94))
(declaredType (ref: 86))
(isPrivate true)
     )

The Java code:

// Find Attributes

Pattern p111 = Pattern.compile("FAMIX.Attribute");

Matcher m111 = p111.matcher(line);
while (m111.find()) {

    FAMIXAttribute obj = new FAMIXAttribute();              
    Pattern p222 = Pattern.compile("id:\\s*([0-9]+)");
    Matcher m222 = p222.matcher(line);

    while (m222.find()) {
        System.out.print(m222.group(1));
    }

    while ((line = br.readLine()) != null && !(line.contains("FAMIX"))) {

        Pattern p333 = Pattern.compile("name\\s*'([\\w]+)\\s*'");
        Matcher m333 = p333.matcher(line);

        while (m333.find()) {       

            System.out.print(m333.group(1));
        }

        Pattern p555 = Pattern.compile("parentType\\s*\\(ref:\\s*([0-9]+)\\)");
        Matcher m555 = p555.matcher(line);
        while (m555.find()) {
           System.out.print(m555.group(1));
        }

        Pattern p666 =   Pattern.compile("declaredType\\s*\\(ref:\\s*([0-9]+)\\)");
        Matcher m666 = p666.matcher(line);
        while (m666.find()) {
           System.out.print(m666.group(1));
        } 

    }

} // exit from finding Attribute

The output:

     ***************** Attributes *****************
       obj_k    38   34   43
       obj_L    73   64   45
       classD   93   85   94
       classB   114  94   11   

Based on the output, the problem is that the parser skips some records: only every other attribute is printed.

Please let me know if the problem is unclear, and I will try to further clarify it.

Upvotes: 0

Views: 372

Answers (2)

nhahtdh

Reputation: 56819

The modification below works if you are sure that the file contains the lines in exactly the format shown in the question:

  • Each of id, name, parentType, and declaredType must be fully declared on a single line, i.e. you don't have input like this:

    (FAMIX.Attribute (id:
    38)
    (name 
    'obj_k')
    (parentType 
      (ref: 34))
    (declaredType (ref: 43))
    (isPrivate true)
    )
    

    But this is allowed:

    (FAMIX.Attribute (id: 38)
    (name 'obj_k') (parentType (ref: 34)) (declaredType (ref: 43)) (isPrivate true))
    

This is the precondition for the modification below to work. The assumption is derived from your current code, which matches each field against one line at a time.

String line;

FAMIXAttribute obj = new FAMIXAttribute();
boolean isModified = false;

while ((line = br.readLine()) != null) {
    if (line.contains("FAMIX.Attribute")) {
        if (isModified) {
            // TODO: Save the previous obj

            obj = new FAMIXAttribute();
            isModified = false;
        } 
    }

    // TODO: Add the block of code to parse id here
    // TODO: Add id attribute to obj, set isModified to true

    // TODO: Add the block of code to parse the other fields (name, parentType, declaredType) here
    // TODO: Add those attributes to obj, set isModified to true
}

if (isModified) {
    // TODO: Save the last obj
}
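
For reference, here is one way the TODOs above could be filled in. It is only a sketch under a few assumptions: `Attr` is a hypothetical stand-in for your `FAMIXAttribute` class, the `first` helper and the `parse` method are names made up for this example, the regexes are the ones from your question, and the input is assumed to contain only `FAMIX.Attribute` records like the excerpt you posted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FamixAttributeParser {

    // Hypothetical stand-in for the asker's FAMIXAttribute class.
    static class Attr {
        String id, name, parentType, declaredType;

        @Override
        public String toString() {
            return name + " " + id + " " + parentType + " " + declaredType;
        }
    }

    // Patterns taken from the question, compiled once instead of on every line.
    static final Pattern ID = Pattern.compile("id:\\s*([0-9]+)");
    static final Pattern NAME = Pattern.compile("name\\s*'(\\w+)\\s*'");
    static final Pattern PARENT = Pattern.compile("parentType\\s*\\(ref:\\s*([0-9]+)\\)");
    static final Pattern DECLARED = Pattern.compile("declaredType\\s*\\(ref:\\s*([0-9]+)\\)");

    // Returns group 1 of the first match on the line, or null if there is none.
    static String first(Pattern p, String line) {
        Matcher m = p.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    static List<Attr> parse(Reader in) {
        List<Attr> saved = new ArrayList<>();
        Attr obj = new Attr();
        boolean isModified = false;
        try {
            BufferedReader br = new BufferedReader(in);
            String line;
            while ((line = br.readLine()) != null) { // the only readLine() call
                if (line.contains("FAMIX.Attribute") && isModified) {
                    saved.add(obj); // save the previous obj
                    obj = new Attr();
                    isModified = false;
                }
                String v;
                if ((v = first(ID, line)) != null) { obj.id = v; isModified = true; }
                if ((v = first(NAME, line)) != null) { obj.name = v; isModified = true; }
                if ((v = first(PARENT, line)) != null) { obj.parentType = v; isModified = true; }
                if ((v = first(DECLARED, line)) != null) { obj.declaredType = v; isModified = true; }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (isModified) {
            saved.add(obj); // save the last obj
        }
        return saved;
    }

    public static void main(String[] args) {
        String input = String.join("\n",
            "(FAMIX.Attribute (id: 22)", "(name 'obj_I')", "(parentType (ref: 11))",
            "(declaredType (ref: 27))", "(isPrivate true)", ")", "",
            "(FAMIX.Attribute (id: 38)", "(name 'obj_k')", "(parentType (ref: 34))",
            "(declaredType (ref: 43))", "(isPrivate true)", ")");
        for (Attr a : parse(new StringReader(input))) {
            System.out.println(a);
        }
    }
}
```

Because every line is read in exactly one place, no header line can be swallowed by a second read. If the real file also contains other FAMIX elements, the `id:` and `name` patterns would match those too, so the parsing would need to be restricted to lines inside a `FAMIX.Attribute` block.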

Upvotes: 0

ilomambo

Reputation: 8350

You forgot the regex to check for the isPrivate or isPackage part.

Edit: A few steps will tell you what went wrong. Add a printout of the line to see exactly which lines are failing and how the Pattern sees them:

    // Find Attributes
    System.out.print("***" + line + "***");
    Pattern p111 = Pattern.compile("FAMIX.Attribute");
    Matcher m111 = p111.matcher(line);
    while (m111.find()) {

The "***" markers show exactly where the line begins and ends as Java sees it. Sometimes characters that look identical to the eye are different for the matcher.
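
If the plain printout still looks fine, a slightly more explicit variant can make look-alike characters stand out. This is just a sketch: `LineDebug` and `visible` are hypothetical names, and the choice to escape everything outside printable ASCII is an assumption about what is worth flagging.

```java
class LineDebug {

    // Wraps the line in *** markers and replaces every character outside
    // printable ASCII (tabs, non-breaking spaces, ...) with a backslash-u
    // escape showing its four-digit hex code.
    static String visible(String line) {
        StringBuilder sb = new StringBuilder("***");
        for (char c : line.toCharArray()) {
            if (c >= 0x20 && c < 0x7f) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.append("***").toString();
    }

    public static void main(String[] args) {
        // A non-breaking space after "name" and a trailing tab, both hard to spot by eye.
        System.out.println(visible("(name\u00a0'obj_k')\t"));
    }
}
```

Calling `System.out.println(LineDebug.visible(line))` in place of the plain print shows at a glance whether a line that looks right contains something the regex's `\\s` or literal characters will not match.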

Edit 2: Your code is missing the outer loop, where line gets its first read. Do you realize that this code:

    while ((line = br.readLine()) != null && !(line.contains("FAMIX"))) {

consumes the next line where "FAMIX.Attribute" appears? If the (missing) outer loop then does another read, that header line is thrown away and you will be missing every other record.
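
To make the effect concrete, here is a small self-contained sketch. Since the outer loop is not in the question, `skipping` only guesses at its shape (a plain `while ((line = br.readLine()) != null)`), and `carryingOver` shows one possible fix that keeps the nested structure: the outer loop reuses the header the inner loop stopped on instead of reading again. The class and method names are made up for this example, and only the id is collected to keep it short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeaderCarryOver {

    static final Pattern ID = Pattern.compile("id:\\s*([0-9]+)");

    static String idOf(String line) {
        Matcher m = ID.matcher(line);
        return m.find() ? m.group(1) : "?";
    }

    // Assumed shape of the missing outer loop: it reads a fresh line even
    // though the inner loop has already consumed the next header.
    static List<String> skipping(String text) {
        List<String> ids = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.contains("FAMIX.Attribute")) {
                    ids.add(idOf(line));
                    while ((line = br.readLine()) != null && !line.contains("FAMIX")) {
                        // body lines (name, parentType, ...) would be parsed here
                    }
                    // line holds the next header now, but the outer condition overwrites it
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ids;
    }

    // Possible fix: read in the outer loop only when the inner loop did not run.
    static List<String> carryingOver(String text) {
        List<String> ids = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line = br.readLine();
            while (line != null) {
                if (line.contains("FAMIX.Attribute")) {
                    ids.add(idOf(line));
                    while ((line = br.readLine()) != null && !line.contains("FAMIX")) {
                        // body lines would be parsed here
                    }
                    // no read here: line is the next header, or null at end of input
                } else {
                    line = br.readLine();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ids;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= 4; i++) {
            sb.append("(FAMIX.Attribute (id: ").append(i).append(")\n(name 'x')\n)\n");
        }
        System.out.println(skipping(sb.toString()));     // [1, 3]
        System.out.println(carryingOver(sb.toString())); // [1, 2, 3, 4]
    }
}
```

Which records get dropped depends on where the first read happens. In your output it is the 1st, 3rd, 5th and so on, which would fit a header having been consumed before the first attribute was reached.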

Upvotes: 1
