allamiro

Reputation: 58

RegEx for Formatting Rsyslog Logs to Work with an ArcSight Template

I have been trying to strip the extra spaces and characters from my logs so that the ArcSight connector can read them. I tried an rsyslog template with a regular expression, with no luck. The problem is that ArcSight parses everything into a single field because it doesn't recognize the format as CEF.

I have tried two output modules, omfwd and omfile; neither worked.

OMFWD RAW LOG SAMPLE

 CEF:0|Symantec|Messaging Gateway||ASA|CEF: 0\|CISCO\|ASA\|\|305011\|Built dynamic TCP translation\|Low\| eventId=41069435 proto=TCP

OMFILE RAW LOG SAMPLE

2019-05-08T20:55:04.913701+00:00  CEF: 0|CISCO|ASA||302013|Built outbound TCP connection|Low| eventId=17363056 externalId=116395008 proto=TCP 

I would like to format the message this way:

CEF:0|CISCO|ASA||302013|Built outbound TCP connection|Low| eventId=17363056 externalId=116395008 proto=TCP

with no extra spaces or other characters.

Here are the templates we attempted to use:

$template outfmt,"%msg:R,ERE,1:(.*) CEF: --end% CEF: %msg:R,ERE,1: CEF: (.*)--end%\n"


$template outfmt,"%msg:R,ERE,1,\?(.*)\sCEF\:\s\?(.*)--end% CEF: %msg:R,ERE,1,\?(.*)CEF\:\?(.*)--end%\n"

Can anyone help with this? The documentation on the rsyslog website is really poor.

Upvotes: 0

Views: 1931

Answers (2)

Jan Sláma

Reputation: 11

Couldn't you just use the CEF Syslog connector to cut out the syslog header and process the rest? Maybe set it up as a forwarder.

Upvotes: 1

Emma

Reputation: 27723

If you wish to design an expression that removes the undesired spaces, this one might give you an idea:

^(.+)([A-Z]{3}:)(\s+)([A-Z0-9|=]+)(.*\S\s*?)(.*)

You can simplify it or add more boundaries to it if you want.

I've assumed some extra undesired spaces in your input strings. I only saw two places with undesired spaces: one between the CEF: tag and the version (the $3 group) and some at the end of the line. Both are captured in their own groups (), so you can remove them simply by leaving those groups out of the replacement. If there may be more spaces, you can add further capturing groups wherever extra spaces may exist.
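As a rough illustration of the group-by-group idea, here is a minimal sketch that applies the expression, with a final (.*) group to absorb trailing whitespace, to the OMFILE sample from the question. The function name dropSpaceGroups is hypothetical and only used for this example.

```javascript
// Hypothetical helper for illustration; not part of rsyslog or ArcSight.
function dropSpaceGroups(line) {
  // $1: everything before the tag (timestamp, leading spaces)   -> dropped
  // $2: the three-letter tag and colon, e.g. "CEF:"             -> kept
  // $3: spaces between the tag and the version                   -> dropped
  // $4: first run of upper-case letters, digits, "|" and "="     -> kept
  // $5: the rest, ending at the last non-space character         -> kept
  // $6: whatever is left at the end (trailing spaces)            -> dropped
  const pattern = /^(.+)([A-Z]{3}:)(\s+)([A-Z0-9|=]+)(.*\S\s*?)(.*)/;
  return line.replace(pattern, "$2$4$5");
}

const omfileSample =
  "2019-05-08T20:55:04.913701+00:00  CEF: 0|CISCO|ASA||302013|Built outbound TCP connection|Low| eventId=17363056 externalId=116395008 proto=TCP ";

console.log(JSON.stringify(dropSpaceGroups(omfileSample)));
```

Note that (.+) needs at least one character in front of the tag, so a line that already starts with CEF: does not match and is returned unchanged.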

My boundaries are relaxed: for example, ([A-Z0-9|=]+) simply matches a run of upper-case letters, digits, pipes and equals signs without any further logic. I did so because I don't know what all your inputs may look like. You can tighten them if you wish.
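One possible tightening, sketched under the assumption that the payload always begins with CEF:, a numeric version and a pipe (as in the samples from the question). The name toCef and the pattern are illustrative choices, not something taken from rsyslog or ArcSight.

```javascript
// Hypothetical stricter variant; assumes "CEF:" is followed by a numeric
// version and a "|" as in the question's samples.
function toCef(line) {
  // ^.*?CEF:   skip anything up to the first "CEF:" (timestamp, spaces)
  // \s*        drop spaces between the tag and the version
  // (\d+\|.*?) keep the version, the first pipe and the rest of the payload
  // \s*$       drop trailing whitespace
  return line.replace(/^.*?CEF:\s*(\d+\|.*?)\s*$/, "CEF:$1");
}

const spacedSample =
  "2019-05-08T20:55:04.913701+00:00   CEF:    0|CISCO|ASA||302013|Built outbound TCP connection|Low| eventId=17363056 externalId=116395008 proto=TCP               ";

console.log(JSON.stringify(toCef(spacedSample)));
```

Because the prefix is matched lazily and optionally, a line that is already in the desired form passes through unchanged.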


Graph

[Figure: diagram showing how the expression works.]

Performance Test

This JavaScript snippet measures the performance of that expression with a simple loop of one million iterations.

const repeat = 1000000;

// Sample input with extra spaces after the timestamp, after "CEF:" and at the end.
const string = '2019-05-08T20:55:04.913701+00:00   CEF:    0|CISCO|ASA||302013|Built outbound TCP connection|Low| eventId=17363056 externalId=116395008 proto=TCP               ';
// Groups 2, 4 and 5 are kept; groups 1, 3 and 6 (prefix and stray spaces) are dropped.
const regex = /^(.+)([A-Z]{3}:)(\s+)([A-Z0-9|=]+)(.*\S\s*?)(.*)/;
let match;

const start = Date.now();

// Run the replacement exactly `repeat` times.
for (let i = 0; i < repeat; i++) {
	match = string.replace(regex, "$2$4$5");
}

const elapsed = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is the result 💚💚💚 ");
console.log(elapsed / 1000 + " is the runtime in seconds of " + repeat + " times benchmark test. 😳 ");

Upvotes: 0
