Reputation: 33
I wrote some Java code to parse the following text file:
(FAMIX.Attribute (id: 22)
(name 'obj_I')
(parentType (ref: 11))
(declaredType (ref: 27))
(isPrivate true)
)
(FAMIX.Attribute (id: 38)
(name 'obj_k')
(parentType (ref: 34))
(declaredType (ref: 43))
(isPrivate true)
)
(FAMIX.Attribute (id: 56)
(name 'obj_K')
(parentType (ref: 46))
(declaredType (ref: 43))
(isPrivate true)
)
(FAMIX.Attribute (id: 73)
(name 'obj_L')
(parentType (ref: 64))
(declaredType (ref: 45))
(isPrivate true)
)
(FAMIX.Attribute (id: 67)
(name 'obj_G')
(parentType (ref: 64))
(declaredType (ref: 46))
(isPrivate true)
)
(FAMIX.Attribute (id: 93)
(name 'classD')
(parentType (ref: 85))
(declaredType (ref: 94))
(isPrivate true)
)
(FAMIX.Attribute (id: 99)
(name 'classC')
(parentType (ref: 86))
(declaredType(ref: 86))
(isPackage true)
)
(FAMIX.Attribute (id: 114)
(name 'classB')
(parentType (ref: 94))
(declaredType (ref: 11))
(isPrivate true)
)
(FAMIX.Attribute (id: 107)
(name 'obj_c')
(parentType (ref: 94))
(declaredType (ref: 86))
(isPrivate true)
)
The Java code:
// Find Attributes
Pattern p111 = Pattern.compile("FAMIX.Attribute");
Matcher m111 = p111.matcher(line);
while (m111.find()) {
    FAMIXAttribute obj = new FAMIXAttribute();
    Pattern p222 = Pattern.compile("id:\\s*([0-9]+)");
    Matcher m222 = p222.matcher(line);
    while (m222.find()) {
        System.out.print(m222.group(1));
    }
    while ((line = br.readLine()) != null && !(line.contains("FAMIX"))) {
        Pattern p333 = Pattern.compile("name\\s*'([\\w]+)\\s*'");
        Matcher m333 = p333.matcher(line);
        while (m333.find()) {
            System.out.print(m333.group(1));
        }
        Pattern p555 = Pattern.compile("parentType\\s*\\(ref:\\s*([0-9]+)\\)");
        Matcher m555 = p555.matcher(line);
        while (m555.find()) {
            System.out.print(m555.group(1));
        }
        Pattern p666 = Pattern.compile("declaredType\\s*\\(ref:\\s*([0-9]+)\\)");
        Matcher m666 = p666.matcher(line);
        while (m666.find()) {
            System.out.print(m666.group(1));
        }
    }
} // exit from finding Attribute
The output:
***************** Attributes *****************
obj_k 38 34 43
obj_L 73 64 45
classD 93 85 94
classB 114 94 11
As the output shows, the parser skips some records: only every other attribute in the file is printed.
Please let me know if the problem is unclear, and I will try to further clarify it.
Upvotes: 0
Views: 372
Reputation: 56819
The following assumes that the file contains lines in the exact format you showed. Each of id, name, parentType and declaredType must be fully declared on a single line, i.e. you don't have input like this:
(FAMIX.Attribute (id:
38)
(name
'obj_k')
(parentType
(ref: 34))
(declaredType (ref: 43))
(isPrivate true)
)
But this is allowed:
(FAMIX.Attribute (id: 38)
(name 'obj_k') (parentType (ref: 34)) (declaredType (ref: 43)) (isPrivate true))
This is the pre-condition for the modification below to work. This assumption is derived from your current code.
String line;
FAMIXAttribute obj = new FAMIXAttribute();
boolean isModified = false;
while ((line = br.readLine()) != null) {
    if (line.contains("FAMIX.Attribute")) {
        if (isModified) {
            // TODO: Save the previous obj
            obj = new FAMIXAttribute();
            isModified = false;
        }
    }
    // TODO: Add the block of code to parse id here
    // TODO: Add id attribute to obj, set isModified to true
    // TODO: Add the block of code to parse other stuffs here
    // TODO: Add those attributes to obj, set isModified to true
}
if (isModified) {
    // TODO: Save the last obj
}
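To make the skeleton above concrete, here is a minimal sketch of the same single-loop idea with the TODOs filled in. Several things are assumptions, not part of the original code: the `Attr` class is a hypothetical stand-in for `FAMIXAttribute` (whose fields are not shown in the question), `parseText` is a convenience helper for trying the parser on a string, and a `null` check on the current object is used in place of the `isModified` flag. It still relies on the precondition that each field sits entirely on one line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleLoopParserSketch {
    // Hypothetical stand-in for the asker's FAMIXAttribute class.
    static class Attr {
        String id, name, parentType, declaredType;

        @Override
        public String toString() {
            return name + " " + id + " " + parentType + " " + declaredType;
        }
    }

    // Patterns compiled once; the dot in the header is escaped so it matches literally.
    private static final Pattern HEADER = Pattern.compile("FAMIX\\.Attribute");
    private static final Pattern ID = Pattern.compile("id:\\s*([0-9]+)");
    private static final Pattern NAME = Pattern.compile("name\\s*'(\\w+)\\s*'");
    private static final Pattern PARENT =
            Pattern.compile("parentType\\s*\\(ref:\\s*([0-9]+)\\)");
    private static final Pattern DECLARED =
            Pattern.compile("declaredType\\s*\\(ref:\\s*([0-9]+)\\)");

    // Returns the first capture group of the first match, or null if there is none.
    static String first(Pattern p, String line) {
        Matcher m = p.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    static List<Attr> parse(BufferedReader br) throws IOException {
        List<Attr> result = new ArrayList<>();
        Attr current = null;
        String line;
        while ((line = br.readLine()) != null) { // the only readLine() call
            if (line.contains("FAMIX.")) {
                // Any entity header closes the record in progress.
                if (current != null) {
                    result.add(current);
                }
                current = HEADER.matcher(line).find() ? new Attr() : null;
            }
            if (current == null) {
                continue; // not inside an attribute record
            }
            String v;
            if ((v = first(ID, line)) != null) current.id = v;
            if ((v = first(NAME, line)) != null) current.name = v;
            if ((v = first(PARENT, line)) != null) current.parentType = v;
            if ((v = first(DECLARED, line)) != null) current.declaredType = v;
        }
        if (current != null) {
            result.add(current); // save the last record
        }
        return result;
    }

    // Convenience wrapper (an assumption for this sketch) to parse from a string.
    static List<Attr> parseText(String text) {
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            return parse(br);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String input = "(FAMIX.Attribute (id: 22)\n(name 'obj_I')\n"
                + "(parentType (ref: 11))\n(declaredType (ref: 27))\n(isPrivate true)\n)\n"
                + "(FAMIX.Attribute (id: 38)\n(name 'obj_k')\n"
                + "(parentType (ref: 34))\n(declaredType (ref: 43))\n(isPrivate true)\n)\n";
        for (Attr a : parseText(input)) {
            System.out.println(a);
        }
    }
}
```

Because there is only one place where a line is read, no header line can be consumed and then thrown away, which is what causes records to be skipped.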
Upvotes: 0
Reputation: 8350
You forgot the regex to check for the isPrivate or isPackage part.
Edit: A few steps will tell you what went wrong. Add a printout of the line to see exactly which lines are failing and how the Pattern sees them:
// Find Attributes
System.out.print("***"+line+"***");
Pattern p111 = Pattern.compile("FAMIX.Attribute");
Matcher m111 = p111.matcher(line);
while (m111.find()) {
The "***" markers give you a sense of exactly where the line begins and ends as Java sees it.
Sometimes characters that seem identical to the eye are different for the matcher.
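As one illustration of that last point, here is a small sketch under a purely hypothetical assumption: that a line contains a non-breaking space (U+00A0) where a normal space is expected. Nothing in the question confirms this is the case in your file. By default, `\s` in `java.util.regex` does not match U+00A0, so the id pattern fails on such a line even though it looks identical when printed. The `reveal` helper is an invented name for this example.

```java
import java.util.regex.Pattern;

class InvisibleCharSketch {
    // Same id pattern as in the question.
    static final Pattern ID = Pattern.compile("id:\\s*([0-9]+)");

    // Renders every char outside printable ASCII as a backslash-u hex escape,
    // so that look-alike characters become visible in the printout.
    static String reveal(String line) {
        StringBuilder sb = new StringBuilder();
        for (char c : line.toCharArray()) {
            if (c >= 0x20 && c < 0x7f) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String normal = "(FAMIX.Attribute (id: 38)";
        // Hypothetical line with U+00A0 in place of the space after "id:".
        String odd = "(FAMIX.Attribute (id:\u00A038)";

        System.out.println(ID.matcher(normal).find()); // true
        System.out.println(ID.matcher(odd).find());    // false
        System.out.println("***" + reveal(odd) + "***");
    }
}
```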
Edit 2: Your code is missing the outer loop, where line gets its first read. Do you realize that the code:
while ((line = br.readLine()) != null && !(line.contains("FAMIX"))) {
consumes the next line where "FAMIX.Attribute" appears? The inner loop only stops after it has already read that header line. If the (missing) outer loop then does another read, the header is discarded and you will be missing every other record.
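The effect can be reproduced with a short sketch. The outer loop here is an assumption, since it is not shown in the question, and the records are shortened to made-up `name 'x'` fields. `buggyIds` mirrors the nested reads; `fixedIds` shows one possible fix, which is to reuse the line the inner loop already read instead of reading again.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SkippedRecordSketch {
    private static final Pattern ID = Pattern.compile("id:\\s*([0-9]+)");

    static String idOf(String line) {
        Matcher m = ID.matcher(line);
        return m.find() ? m.group(1) : "?";
    }

    // Assumed shape of the original: the outer loop reads, the inner loop reads too.
    static List<String> buggyIds(String text) {
        List<String> ids = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = br.readLine()) != null) { // overwrites the pending header
                if (line.contains("FAMIX.Attribute")) {
                    ids.add(idOf(line));
                    while ((line = br.readLine()) != null && !line.contains("FAMIX")) {
                        // field parsing omitted
                    }
                    // 'line' now holds the NEXT header, which is about to be lost
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ids;
    }

    // One possible fix: only read again when no header is pending.
    static List<String> fixedIds(String text) {
        List<String> ids = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line = br.readLine();
            while (line != null) {
                if (line.contains("FAMIX.Attribute")) {
                    ids.add(idOf(line));
                    while ((line = br.readLine()) != null && !line.contains("FAMIX")) {
                        // field parsing omitted
                    }
                    // no extra read: 'line' is the next header or null
                } else {
                    line = br.readLine();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ids;
    }

    // Four shortened records with the ids of the first four in the question.
    static String sample() {
        StringBuilder sb = new StringBuilder();
        for (int id : new int[] {22, 38, 56, 73}) {
            sb.append("(FAMIX.Attribute (id: ").append(id).append(")\n")
              .append("(name 'x')\n(isPrivate true)\n)\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buggyIds(sample())); // [22, 56]
        System.out.println(fixedIds(sample())); // [22, 38, 56, 73]
    }
}
```

Which half of the records goes missing depends on what precedes the attributes in the file. If an earlier section of the parser has already swallowed the first attribute header in the same way, the surviving records are the second, fourth, and so on, which would match the output in the question.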
Upvotes: 1