Reputation: 187
I am working with AWS S3, where files are being added to our buckets. Access information for those buckets is logged to another bucket as plain text.
I would like to convert the log information stored in these text files to JSON. However, the file contains no key-value information, only the values.
The contents of the log file are as follows:
fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341 s3Samplebucket [10/Mar/2021:03:27:29 +0000] 171.60.235.108 fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341 MX1XP335Q5YFS06H REST.HEAD.BUCKET - "HEAD /s3Samplebucket HTTP/1.1" 200 - - - 13 13 "-" "S3Console/0.4, aws-internal/3 aws-sdk-java/1.11.964 Linux/4.9.230-0.1.ac.224.84.332.metal1.x86_64 OpenJDK_64-Bit_Server_VM/25.282-b08 java/1.8.0_282 vendor/Oracle_Corporation" - AMNo4/b/T+5JdEVQpLkqz0SV8VDXyd3odEFmK+5LvanuzgIXW2Lv87OBl5r5tbSZ/yjW5zfFQsA= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader s3-us-west-2.amazonaws.com TLSv1.2
The individual fields of a log entry are as follows:
Log fields
Bucket Owner: fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341
Bucket: S3SampleBucket
Time: [11/Mar/2021:06:52:33 +0000]
Remote IP: 183.87.60.172
Requester: arn:aws:iam::486031527132:user/jdoe
Request ID: 9YQ1MWABKNRPX3MP
Operation: REST.GET.LOCATION
Key: - (BLANK)
Request-URI: "GET /?location HTTP/1.1"
HTTP status: 200
Error Code: - (BLANK)
Bytes Sent: 137
Object Size: - (BLANK)
Total Time: 17
Turn-Around Time: - (BLANK)
Referer: "-" (BLANK)
User-Agent: "AWSPowerShell/4.1.9.0 .NET_Runtime/4.0 .NET_Framework/4.0 OS/Microsoft_Windows_NT_10.0.18363.0 WindowsPowerShell/5.0 ClientSync"
Version Id: - (BLANK)
Host Id: Q5WBxJNrwsspFmtOG+d2YN0xAtvbq1sdqm9vh6AflXdMCmny5VC3bZmyTBZavKGpO3J/uz+IfK0=
Signature Version: SigV4
Cipher Suite: ECDHE-RSA-AES128-GCM-SHA256
Authentication Type: AuthHeader
Host Header: S3SampleBucket.s3.us-west-2.amazonaws.com
TLS version: TLSv1.2
The only approach I can think of is to keep the field names in a configuration file. I would like to do this in either PowerShell or Python.
Any assistance would be of great help.
Upvotes: 0
Views: 293
Reputation: 174485
The log format can be interpreted as a CSV (with a whitespace delimiter), so you could parse it using Import-Csv/ConvertFrom-Csv:
$columns = 'Bucket Owner', 'Bucket', 'Time', 'Remote IP', 'Requester', 'Request ID', 'Operation', 'Key', 'Request-URI', 'HTTP status', 'Error Code', 'Bytes Sent', 'Object Size', 'Total Time', 'Turn-Around Time', 'Referer', 'User-Agent', 'Version Id', 'Host Id', 'Signature Version', 'Cipher Suite', 'Authentication Type', 'Host Header', 'TLS version'
$data = @'
fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341 s3Samplebucket [10/Mar/2021:03:27:29 +0000] 171.60.235.108 fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341 MX1XP335Q5YFS06H REST.HEAD.BUCKET - "HEAD /s3Samplebucket HTTP/1.1" 200 - - - 13 13 "-" "S3Console/0.4, aws-internal/3 aws-sdk-java/1.11.964 Linux/4.9.230-0.1.ac.224.84.332.metal1.x86_64 OpenJDK_64-Bit_Server_VM/25.282-b08 java/1.8.0_282 vendor/Oracle_Corporation" - AMNo4/b/T+5JdEVQpLkqz0SV8VDXyd3odEFmK+5LvanuzgIXW2Lv87OBl5r5tbSZ/yjW5zfFQsA= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader s3-us-west-2.amazonaws.com TLSv1.2
'@
# The bracketed timestamp contains a space, so wrap it in quotes to keep it in a single field
$quoted = $data -replace '\[(\d+/\w+/\d+:\d+:\d+:\d+ [+-]\d+)\]', '"$1"'
$parsedLog = $quoted |ConvertFrom-Csv -Delimiter ' ' -Header $columns
Now the resulting object is easily converted to JSON:
PS ~> $parsedLog |ConvertTo-Json
{
"Bucket Owner": "fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341",
"Bucket": "s3Samplebucket",
"Time": "10/Mar/2021:03:27:29 +0000",
"Remote IP": "171.60.235.108",
"Requester": "fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341",
"Request ID": "MX1XP335Q5YFS06H",
"Operation": "REST.HEAD.BUCKET",
"Key": "-",
"Request-URI": "HEAD /s3Samplebucket HTTP/1.1",
"HTTP status": "200",
"Error Code": "-",
"Bytes Sent": "-",
"Object Size": "-",
"Total Time": "13",
"Turn-Around Time": "13",
"Referer": "-",
"User-Agent": "S3Console/0.4, aws-internal/3 aws-sdk-java/1.11.964 Linux/4.9.230-0.1.ac.224.84.332.metal1.x86_64 OpenJDK_64-Bit_Server_VM/25.282-b08 java/1.8.0_282 vendor/Oracle_Corporation",
"Version Id": "-",
"Host Id": "AMNo4/b/T+5JdEVQpLkqz0SV8VDXyd3odEFmK+5LvanuzgIXW2Lv87OBl5r5tbSZ/yjW5zfFQsA=",
"Signature Version": "SigV4",
"Cipher Suite": "ECDHE-RSA-AES128-GCM-SHA256",
"Authentication Type": "AuthHeader",
"Host Header": "s3-us-west-2.amazonaws.com",
"TLS version": "TLSv1.2"
}
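In the question's field list a lone - marks a blank value, and every value comes out of the CSV parser as a string. If whatever consumes the JSON would rather see null and real numbers, a small post-processing pass can do that. The sketch below is in Python, since the question allows either language. The normalize helper and the choice of which fields count as numeric are assumptions made for illustration, and the sample record is hand-copied from a few fields of the log line in the question.

```python
import json

# Fields assumed to hold integers when present (a choice made for this sketch).
NUMERIC_FIELDS = {"HTTP status", "Bytes Sent", "Object Size",
                  "Total Time", "Turn-Around Time"}


def normalize(record):
    """Return a copy with '-' mapped to None and numeric fields cast to int."""
    cleaned = {}
    for name, value in record.items():
        if value == "-":
            cleaned[name] = None          # blank field
        elif name in NUMERIC_FIELDS and value.isdigit():
            cleaned[name] = int(value)    # e.g. "200" becomes 200
        else:
            cleaned[name] = value
    return cleaned


# A few fields copied by hand from the sample log line.
raw = {"Bucket": "s3Samplebucket", "Key": "-", "HTTP status": "200",
       "Bytes Sent": "-", "Total Time": "13"}
normalized = normalize(raw)
print(json.dumps(normalized))
```

json.dumps writes None as null, so blank fields stay visible in the output instead of being dropped.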
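In your case, to read the file from disk, replace the $data = ... and ConvertFrom-Csv statements with Get-Content piped into ConvertFrom-Csv. Import-Csv on its own would split the bracketed timestamp at its space and shift every later field by one column, so quote the timestamp first:
$parsedLog = (Get-Content -Path .\path\to\s3requests.log) -replace '\[(\d+/\w+/\d+:\d+:\d+:\d+ [+-]\d+)\]', '"$1"' |ConvertFrom-Csv -Delimiter ' ' -Header $columns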
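Since the question also mentions Python, here is a minimal sketch of the same idea there. It is an illustration under stated assumptions rather than an official parser: the regular expression, the FIELDS list, and the parse_line helper are names made up for this example, and it assumes every line carries exactly the 24 fields listed in the question (zip would silently drop anything beyond that). Bracketed and quoted values are treated as single tokens, so the timestamp stays in one piece.

```python
import json
import re

# Field names taken from the question, in log order.
FIELDS = [
    "Bucket Owner", "Bucket", "Time", "Remote IP", "Requester", "Request ID",
    "Operation", "Key", "Request-URI", "HTTP status", "Error Code",
    "Bytes Sent", "Object Size", "Total Time", "Turn-Around Time", "Referer",
    "User-Agent", "Version Id", "Host Id", "Signature Version",
    "Cipher Suite", "Authentication Type", "Host Header", "TLS version",
]

# One token is a [bracketed] value, a "quoted" value, or a run of non-spaces.
TOKEN = re.compile(r'\[([^\]]*)\]|"([^"]*)"|(\S+)')


def parse_line(line):
    """Split one access-log line into a dict keyed by FIELDS."""
    # Exactly one of the three groups is non-empty for each match.
    values = [a or b or c for a, b, c in TOKEN.findall(line)]
    return dict(zip(FIELDS, values))


# The sample line from the question, split only for readability.
sample = (
    'fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341 '
    's3Samplebucket [10/Mar/2021:03:27:29 +0000] 171.60.235.108 '
    'fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341 '
    'MX1XP335Q5YFS06H REST.HEAD.BUCKET - "HEAD /s3Samplebucket HTTP/1.1" '
    '200 - - - 13 13 "-" "S3Console/0.4, aws-internal/3 '
    'aws-sdk-java/1.11.964 Linux/4.9.230-0.1.ac.224.84.332.metal1.x86_64 '
    'OpenJDK_64-Bit_Server_VM/25.282-b08 java/1.8.0_282 '
    'vendor/Oracle_Corporation" - '
    'AMNo4/b/T+5JdEVQpLkqz0SV8VDXyd3odEFmK+5LvanuzgIXW2Lv87OBl5r5tbSZ/yjW5zfFQsA= '
    'SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader '
    's3-us-west-2.amazonaws.com TLSv1.2'
)
record = parse_line(sample)
print(json.dumps(record, indent=2))
```

To process a whole log file, the same helper can be applied to each non-empty line and the resulting list passed to json.dumps.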
Upvotes: 2