Reputation: 1215
I am looking for a regex that recognizes strings representing a two-dimensional array of integers whose columns all have the same length.
For example, this is a string that I want to convert to a two-dimensional array:
0 4 8 4\n9 6 5 7\n9 5 5 1
which represents:
0 4 8 4
9 6 5 7
9 5 5 1
So I came up with this: "(([0-9]+[ \t]?)+(\n|\r)?){1,}"
However, it does not check whether the columns have the same length.
Thank you for your help.
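To illustrate the limitation described above, here is a minimal sketch that runs the pattern from the question against the sample string and against a jagged variant. The class name NaivePatternCheck, the helper accepts, and the jagged input are made up for this illustration.

```java
import java.util.regex.Pattern;

class NaivePatternCheck {
    // The pattern from the question, copied unchanged.
    static final Pattern NAIVE = Pattern.compile("(([0-9]+[ \t]?)+(\n|\r)?){1,}");

    // Hypothetical helper: true if the whole string matches the pattern.
    static boolean accepts(String s) {
        return NAIVE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String rectangular = "0 4 8 4\n9 6 5 7\n9 5 5 1";
        // Assumed test input: the second row has only two items.
        String jagged = "0 4 8 4\n9 6\n9 5 5 1";

        System.out.println(accepts(rectangular)); // prints "true"
        System.out.println(accepts(jagged));      // prints "true" as well
    }
}
```

Both inputs are accepted, because nothing in the pattern ties the number of items on one line to the number on the next.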
Upvotes: 2
Views: 2705
Reputation: 89629
You can do it with this kind of pattern (add optional CR if needed):
(?m)^(?>(?>\\d+([ \\t]|$)(?=.*\\n(\\2?+\\d+\\1)))+\\n(?=\\2$))+.*
For each item in the first line, the lookahead checks whether an item exists in the same column of the next line. To track the columns, capture group 2 contains an optional self-reference \\2?+. In this way, capture group 2 grows each time the "item" group is repeated (and reaches the next column).
details:
(?m)                       # use the multiline mode
^                          # start of the line
(?>                        # group for a complete line
    (?>                    # group for an item
        \\d+ ([ \\t]|$)    # a number followed by a space/tab or the end of the line
        (?=                # lookahead
            .*\\n          # reach the next line
            (\\2?+\\d+\\1) # capture group 2
        )
    )+                     # repeat the item group
    \\n
    (?=\\2$)               # check that there are no more columns in the next line
)+                         # repeat the line group
.*                         # match the next line
Note: with ([ \\t]|$) and \\1 (in capture group 2), this pattern checks that separators are unique (not repeated) and always the same. Leading and trailing whitespace isn't allowed. But you can write it in a more flexible way:
(?m)^(?>[ \\t]*(?>\\d+[ \\t]*(?=.*\\r?\\n(\\1?+\\d+(?:[ \\t]+|[ \\t]*$))))+\\r?\\n(?=\\1$))+.*
These patterns can be used either with matches() to check a whole string or with find() to find possible arrays in a larger string.
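As a rough usage sketch, the first pattern above can be compiled once and applied with matches(). The class name SelfReferencePatternDemo, the helper isGrid, and the jagged inputs are assumptions made for this illustration; the pattern itself is copied as-is, without the optional CR.

```java
import java.util.regex.Pattern;

class SelfReferencePatternDemo {
    // First pattern from this answer, copied unchanged (LF line endings only).
    static final Pattern GRID = Pattern.compile(
        "(?m)^(?>(?>\\d+([ \\t]|$)(?=.*\\n(\\2?+\\d+\\1)))+\\n(?=\\2$))+.*");

    // Hypothetical helper: true if the whole string matches the pattern.
    static boolean isGrid(String s) {
        return GRID.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Sample string from the question.
        System.out.println(isGrid("0 4 8 4\n9 6 5 7\n9 5 5 1")); // prints "true"
        // Assumed jagged input: the second row is one item short.
        System.out.println(isGrid("0 4 8 4\n9 6 5\n9 5 5 1"));   // prints "false"
    }
}
```

This sketch exercises only the inputs shown. Other shapes (for example repeated identical rows or CRLF line endings) would be worth testing before relying on the pattern.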
Upvotes: 3
Reputation: 9041
If you wanted to do straight regex for validation of a 2d array, you can build patterns that validate specific "x by y" 2d arrays.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ArrayPatternCheck {
    public static void main(String[] args) throws Exception {
        String data = "0 4 8 4\n9 6 5 7\n9 5 5 1";
        // Check whether the data is an array of any size from 2 x 2 up to 10 x 10
        for (int row = 2; row <= 10; row++) {
            for (int col = 2; col <= 10; col++) {
                Matcher matcher = Pattern.compile(buildPattern(row, col)).matcher(data);
                if (matcher.matches()) {
                    System.out.printf("Valid %d x %d array%n", row, col);
                    return;
                }
            }
        }
        System.out.println("Invalid 2d array");
    }

    // Build a pattern that matches exactly `row` lines of `col` integers each
    public static String buildPattern(int row, int col) {
        StringBuilder patternBuilder = new StringBuilder();
        for (int r = 0; r < row; r++) {
            for (int c = 0; c < col; c++) {
                patternBuilder.append("\\d+");
                // Single space between items, none after the last one
                if (c + 1 < col) patternBuilder.append("[ ]");
            }
            // Newline between rows, none after the last one
            if (r + 1 < row) patternBuilder.append("\n");
        }
        return patternBuilder.toString();
    }
}
Results:
Valid 3 x 4 array
Alternatively, I would do 2 splits: split 1 on line breaks to get the rows, and split 2 on spaces to get the columns of a row.
From there, I would count the rows that have the same number of columns as the first row. If that count equals the number of rows from split 1, then we know it's a 2d array. Otherwise, it's a jagged array.
import java.util.Arrays;

public class ArraySplitCheck {
    public static void main(String[] args) throws Exception {
        String data = "0 4 8 4\n9 6 5 7\n9 5 5 1";
        // Get the rows
        String[] rows = data.split("[\r]?[\n]");
        // Get the number of columns in the first row
        int colCount = rows[0].split(" ").length;
        // Check if all rows have the same number of columns as the first row
        if (Arrays.stream(rows)
                .filter(row -> row.split(" ").length == colCount)
                .count() == rows.length) {
            System.out.println("Valid 2d array");
        } else {
            System.out.println("Jagged array");
        }
    }
}
Results:
Valid 2d array
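Since the question's end goal is to convert the string into a two-dimensional array, the split approach can be extended to validate and parse in one pass. This is only a sketch: the class GridParser and the method parseGrid are hypothetical names, and it assumes that items are separated by runs of spaces or tabs and that a jagged input should raise an exception.

```java
import java.util.Arrays;

class GridParser {
    // Hypothetical helper: parses the text into an int[][] and
    // throws IllegalArgumentException if the rows differ in length.
    static int[][] parseGrid(String data) {
        String[] rows = data.split("\\r?\\n");
        int[][] grid = new int[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            // Assumption: items are separated by one or more spaces or tabs.
            String[] cells = rows[r].trim().split("[ \\t]+");
            if (r > 0 && cells.length != grid[0].length) {
                throw new IllegalArgumentException(
                    "Row " + r + " has " + cells.length
                    + " columns, expected " + grid[0].length);
            }
            grid[r] = new int[cells.length];
            for (int c = 0; c < cells.length; c++) {
                // Integer.parseInt throws NumberFormatException for non-integers.
                grid[r][c] = Integer.parseInt(cells[c]);
            }
        }
        return grid;
    }

    public static void main(String[] args) {
        int[][] grid = parseGrid("0 4 8 4\n9 6 5 7\n9 5 5 1");
        System.out.println(Arrays.deepToString(grid));
        // prints "[[0, 4, 8, 4], [9, 6, 5, 7], [9, 5, 5, 1]]"
    }
}
```

Throwing on the first mismatched row keeps validation and conversion in the same loop, so the string is only split once.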
Upvotes: 2