Reputation: 271
I am trying to use a regex with lookahead and lookbehind to extract data backwards from a known pattern.
In the string below, I am interested only in "column store error", i.e., given the pattern ": search table error:", I need to extract the text back to the previous ":".
Error processing. Reason: Exception: Job aborted due to failure: xxxxx (asasdasd): com.db.jdbc.exceptions.JDBCDriverException: DBTech JDBC: [2048]: column store error: search table error: [123]
I am currently stuck with (?<=:)(.*?)(?=(: search table error)), which extracts everything from the first occurrence of ":" in the string rather than from the nearest preceding one.
Thank you for any help.
Upvotes: 2
Views: 334
Reputation: 18611
Use
:\s*([^:]*?)\s*: search table error
See regex proof.
EXPLANATION
--------------------------------------------------------------------------------
  :                        ':'
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [^:]*?                   any character except: ':' (0 or more
                             times (matching the least amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  : search table error     ': search table error'
import re
regex = r":\s*([^:]*?)\s*: search table error"
test_str = "Error processing. Reason: Exception: Job aborted due to failure: xxxxx (asasdasd): com.db.jdbc.exceptions.JDBCDriverException: DBTech JDBC: [2048]: column store error: search table error: [123]"
match = re.search(regex, test_str)
if match:
    print(match.group(1))
Results: column store error
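To see why the attempt from the question over-captures, it may help to run both patterns side by side. This is only a sketch; the variable names `lazy` and `fixed` are made up for illustration. The lazy `.*?` is still allowed to cross colons, and the engine starts at the leftmost position where the lookbehind succeeds, so the match begins right after the first `:`. Restricting the field to `[^:]` forces the engine to move on until it finds the last colon before the marker.

```python
import re

text = (
    "Error processing. Reason: Exception: Job aborted due to failure: "
    "xxxxx (asasdasd): com.db.jdbc.exceptions.JDBCDriverException: "
    "DBTech JDBC: [2048]: column store error: search table error: [123]"
)

# Pattern from the question: lazy, but free to span several ':' characters.
lazy = re.search(r"(?<=:)(.*?)(?=(: search table error))", text)

# Pattern from this answer: the captured field cannot contain ':'.
fixed = re.search(r":\s*([^:]*?)\s*: search table error", text)

print(repr(lazy.group(1)))   # starts at ' Exception: Job aborted ...'
print(repr(fixed.group(1)))  # only the field directly before the marker
```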
Alternatively, using lookarounds only:
(?<=:)[^:]*(?=:\s+search\s+table\s+error:)
See this regex proof.
EXPLANATION
--------------------------------------------------------------------------------
  (?<=                     look behind to see if there is:
--------------------------------------------------------------------------------
    :                        ':'
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  [^:]*                    any character except: ':' (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    :                        ':'
--------------------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    search                   'search'
--------------------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    table                    'table'
--------------------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    error:                   'error:'
--------------------------------------------------------------------------------
  )                        end of look-ahead
With Python code like:
import re
regex = r"(?<=:)[^:]*(?=:\s+search\s+table\s+error:)"
test_str = "Error processing. Reason: Exception: Job aborted due to failure: xxxxx (asasdasd): com.db.jdbc.exceptions.JDBCDriverException: DBTech JDBC: [2048]: column store error: search table error: [123]"
match = re.search(regex, test_str)
if match:
    print(match.group().strip())
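If the marker text varies, the lookaround pattern could be wrapped in a small helper. The sketch below is an assumption-laden illustration, not part of the original answer: the function name `segment_before` is hypothetical, the marker is passed through `re.escape` so it is matched literally, and the whitespace after the colon is relaxed to `\s*`. Because the lookbehind requires a preceding `:`, a field at the very start of the string would not be found.

```python
import re

def segment_before(text, marker):
    """Return the colon-delimited field right before `marker`, or None.

    Hypothetical helper: builds (?<=:)[^:]*(?=:\\s*<marker>) with the
    marker escaped so regex metacharacters in it are treated literally.
    """
    pattern = r"(?<=:)[^:]*(?=:\s*" + re.escape(marker) + r")"
    m = re.search(pattern, text)
    # The lookbehind keeps the leading space in the match, so strip it.
    return m.group().strip() if m else None

test_str = (
    "Error processing. Reason: Exception: Job aborted due to failure: "
    "xxxxx (asasdasd): com.db.jdbc.exceptions.JDBCDriverException: "
    "DBTech JDBC: [2048]: column store error: search table error: [123]"
)

print(segment_before(test_str, "search table error"))
```

The helper returns `None` when the marker is absent, so callers can distinguish "no match" from an empty field.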
Upvotes: 2