Reputation: 21
After running my ListFiles() method I get the list of file names from a directory, which I need as input. All of them are XML files; the list is shown below.
The code that produces the list of file names is:
String dirPath = "D:\\Input_Split_xml";
File dir = new File(dirPath);
String[] files = dir.list();
for (String aFile : files)
{
System.out.println("file names are "+aFile);
}
Each file name is held in "aFile" in turn, and the loop prints:
file names are 51090323-005_low_level.xml
file names are 90406990_low_level.xml
file names are 90406991_low_level.xml
file names are TC_CADBOM_51090323-005_low_level_BOM.xml
file names are TC_CADBOM_90406990_low_level_BOM.xml
file names are TC_CADBOM_90406991_low_level_BOM.xml
file names are TC_CADDESIGN_51090323-005_low_level.xml
file names are TC_CADDESIGN_90406990_low_level.xml
file names are TC_CADDESIGN_90406991_low_level.xml
Now I need to group these file names as described below, so that I can use them as input for parsing the XML files.
1) For example, based on the number "51090323-005", I need to group all the file names that contain that number, take them as input one after the other, and get the node count of each XML file. These are the 3 XML files that belong to this number, so I want to collect them and process them one after the other:
a)51090323-005_low_level.xml
b)TC_CADBOM_51090323-005_low_level_BOM.xml
c)TC_CADDESIGN_51090323-005_low_level.xml
I would appreciate any help on how to solve this.
Upvotes: 0
Views: 152
Reputation: 1772
for (String aFile : files)
{
if(aFile.contains("51090323-005")) {
System.out.println("file names are " + aFile);
}
}
Output:
file names are 51090323-005_low_level.xml
file names are TC_CADBOM_51090323-005_low_level_BOM.xml
file names are TC_CADDESIGN_51090323-005_low_level.xml
// Extract the numbers
// This HashSet will contain all the numbers. HashSet -> To avoid duplicate numbers
Set<String> baseFiles = new HashSet<>();
System.out.println("Files numbers:");
// Iterate all files to extract the numbers
// Assumption: the base files have the number at the beginning, so we use a pattern that matches digits at the start of the name
for (String aFile : files)
{
// Create a pattern that matches names starting with digits and/or '-'
// The matcher splits the string into groups based on the given pattern
Matcher matcher = Pattern.compile("^([0-9-]+)(.*)").matcher(aFile);
// Verify if the string has the wanted pattern
if(matcher.matches()) {
// Group 0 is the original string
// Group 1 is the number
// Group 2 the rest of the filename
String number = matcher.group(1);
System.out.println(number);
// Add the number to the HashSet
baseFiles.add(number);
}
}
// Iterate all the numbers to create the groups
for (String baseFile : baseFiles)
{
System.out.println("Group " + baseFile);
// Search the filenames that contain the given number
for (String aFile : files)
{
// Verify if the current filename has the given number
if(aFile.contains(baseFile)) {
System.out.println("file names are " + aFile);
}
}
}
Output:
Files numbers:
51090323-005
90406990
90406991
Group 90406991
file names are 90406991_low_level.xml
file names are TC_CADBOM_90406991_low_level_BOM.xml
file names are TC_CADDESIGN_90406991_low_level.xml
Group 51090323-005
file names are 51090323-005_low_level.xml
file names are TC_CADBOM_51090323-005_low_level_BOM.xml
file names are TC_CADDESIGN_51090323-005_low_level.xml
Group 90406990
file names are 90406990_low_level.xml
file names are TC_CADBOM_90406990_low_level_BOM.xml
file names are TC_CADDESIGN_90406990_low_level.xml
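The two loops above can also be folded into a single pass that collects the groups into a map, which makes the groups easy to hand on to a parser. This is only a sketch: the class name FileGrouper and the method group are made up for illustration, and it assumes the number is always the run of digits and hyphens directly before "_low_level".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FileGrouper {
    // Assumption: the number sits at the start of the name or right after an
    // underscore, and is always followed by "_low_level".
    private static final Pattern NUMBER =
            Pattern.compile("(?:^|_)([0-9][0-9-]*)_low_level");

    // Returns number -> file names containing that number, keys in sorted order.
    public static Map<String, List<String>> group(String[] files) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String name : files) {
            Matcher m = NUMBER.matcher(name);
            if (m.find()) {
                // Create the list on first sight of a number, then append.
                groups.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(name);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] files = {
            "51090323-005_low_level.xml",
            "90406990_low_level.xml",
            "TC_CADBOM_51090323-005_low_level_BOM.xml",
            "TC_CADBOM_90406990_low_level_BOM.xml",
            "TC_CADDESIGN_51090323-005_low_level.xml",
            "TC_CADDESIGN_90406990_low_level.xml"
        };
        group(files).forEach((number, names) ->
                System.out.println("Group " + number + ": " + names));
    }
}
```

A TreeMap is used here so that the groups come out in a stable order; with a HashSet or HashMap the group order is not guaranteed, as the output above shows.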
Upvotes: 0
Reputation: 4475
This function returns a map in which each entry corresponds to one set of related files. A regular expression makes it easy both to verify the file name pattern and to extract the number part (see group(1)).
// key=number, value=array of matching files, sorted
public static Map<String, File[]> process(String fileLocation) {
Map<String, File[]> fileMap = new HashMap<>();
Pattern startFileNamePattern = Pattern.compile("([0-9-]+)_low_level\\.xml");
File dir = new File(fileLocation);
File[] startFiles = dir.listFiles((File file, String name) -> startFileNamePattern.matcher(name).matches());
for (File f : startFiles) {
Matcher m = startFileNamePattern.matcher(f.getName());
if (m.matches()) {
String number = m.group(1);
File[] allFiles = dir.listFiles((File arg0, String name) -> name.contains(number));
Arrays.sort(allFiles);
fileMap.put(number, allFiles);
}
}
return fileMap;
}
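Since the question asks for the node count of each XML file, here is a rough sketch of that step using the JDK's DOM parser. The names NodeCounter and countElements are hypothetical, and "node count" is assumed to mean the number of element nodes; adjust if text or attribute nodes should be counted too.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class NodeCounter {
    // Counts every element node in the document, including the root element.
    public static int countElements(InputStream xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(xml);
            // The special tag name "*" matches all elements.
            return doc.getElementsByTagName("*").getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document: bom, part, part, child = 4 elements.
        String sample = "<bom><part id=\"1\"/><part id=\"2\"><child/></part></bom>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(countElements(in)); // prints 4
    }
}
```

To combine it with the map returned by process, you could loop over each File[] value and pass a FileInputStream for every file to countElements, closing the stream afterwards.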
Upvotes: 1
Reputation: 2850
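Convert your String[] files to a List and remove the entries that do not contain the number. Arrays.asList alone returns a fixed-size list on which removeIf throws UnsupportedOperationException, so copy it into an ArrayList first:
List<String> fileNames = new ArrayList<>(Arrays.asList(files));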
public static List<String> groupFiles(String number, List<String> fileNames){
fileNames.removeIf(n -> (!n.contains(number)));
return fileNames;
}
Output:
[51090323-005_low_level.xml, TC_CADBOM_51090323-005_low_level_BOM.xml, TC_CADDESIGN_51090323-005_low_level.xml]
Additionally, if you need to get the numbers programmatically, you can use something like:
public static List<String> getNumbers(List<String> fileNames){
List<String> numbers = new ArrayList<>();
// Drop names that do not start with a digit (this modifies the list passed in)
fileNames.removeIf(n -> !Character.isDigit(n.charAt(0)));
fileNames.forEach(name -> {
// The first 8 characters hold the number
numbers.add(name.substring(0, 8));
});
return numbers;
}
Output:
[51090323, 90406990, 90406991]
This removes the file names that do not start with a digit from the list and then takes the first 8 characters of each remaining name. Because the length is fixed, a suffix such as "-005" is not included.
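A fixed-length substring cannot capture numbers of different lengths such as 51090323-005 and 90406990. One possible alternative, sketched below, is to take the leading run of digits and hyphens with a regular expression and leave the input list untouched. The class name LeadingNumbers is hypothetical, and the sketch assumes the base files always start with the number.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class LeadingNumbers {
    // A digit at the start, followed by any further digits or hyphens.
    private static final Pattern LEADING = Pattern.compile("^[0-9][0-9-]*");

    // Returns the distinct leading numbers without modifying fileNames.
    public static List<String> getNumbers(List<String> fileNames) {
        return fileNames.stream()
                .map(name -> LEADING.matcher(name))
                .filter(m -> m.find())      // keep only names that start with a digit
                .map(m -> m.group())        // the matched number, whatever its length
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "51090323-005_low_level.xml",
                "90406990_low_level.xml",
                "TC_CADBOM_90406990_low_level_BOM.xml");
        System.out.println(getNumbers(names));
    }
}
```

Because the stream only reads from the list, this version also works on a list obtained directly from Arrays.asList.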
Upvotes: 0
Reputation: 1
Adding to Cray's answer: you could obtain the numbers by using
String prefix = aFile.split("_")[0];
if (Character.isDigit(prefix.charAt(0))) {
// prefix contains a number that we can filter.
}
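Note that split("_")[0] yields "TC" for the prefixed files, so it only identifies the base files. If the number is needed from any of the file names, a small variation is to scan all the tokens for the first one that starts with a digit. This is a sketch with hypothetical names (PrefixExtractor, numberOf), assuming the number is always delimited by underscores.

```java
import java.util.Optional;

public class PrefixExtractor {
    // Returns the first underscore-separated token that starts with a digit.
    public static Optional<String> numberOf(String fileName) {
        for (String token : fileName.split("_")) {
            if (!token.isEmpty() && Character.isDigit(token.charAt(0))) {
                return Optional.of(token);
            }
        }
        return Optional.empty(); // no numeric token in this name
    }

    public static void main(String[] args) {
        System.out.println(numberOf("TC_CADBOM_51090323-005_low_level_BOM.xml"));
        System.out.println(numberOf("90406990_low_level.xml"));
    }
}
```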
Upvotes: 0