NottyHead

Reputation: 187

Convert a Text File to JSON using PowerShell

I am working with AWS S3, where files are being added to our buckets. The bucket information is logged to another bucket in text format.

I would like to convert the log information stored in these text files to JSON. However, the file contains no key-value information, only the values.

The contents of the log file are as follows:

fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341 s3Samplebucket [10/Mar/2021:03:27:29 +0000] 171.60.235.108 fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341 MX1XP335Q5YFS06H REST.HEAD.BUCKET - "HEAD /s3Samplebucket HTTP/1.1" 200 - - - 13 13 "-" "S3Console/0.4, aws-internal/3 aws-sdk-java/1.11.964 Linux/4.9.230-0.1.ac.224.84.332.metal1.x86_64 OpenJDK_64-Bit_Server_VM/25.282-b08 java/1.8.0_282 vendor/Oracle_Corporation" - AMNo4/b/T+5JdEVQpLkqz0SV8VDXyd3odEFmK+5LvanuzgIXW2Lv87OBl5r5tbSZ/yjW5zfFQsA= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader s3-us-west-2.amazonaws.com TLSv1.2

The individual fields in the log file are as follows:
Log fields

Bucket Owner: fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341
Bucket: S3SampleBucket
Time: [11/Mar/2021:06:52:33 +0000]
Remote IP: 183.87.60.172
Requester: arn:aws:iam::486031527132:user/jdoe
Request ID: 9YQ1MWABKNRPX3MP
Operation: REST.GET.LOCATION
Key: - (BLANK)
Request-URI: "GET /?location HTTP/1.1"
HTTP status: 200
Error Code: - (BLANK)
Bytes Sent: 137
Object Size: - (BLANK)
Total Time: 17
Turn-Around Time: - (BLANK)
Referer: "-" (BLANK)
User-Agent: "AWSPowerShell/4.1.9.0 .NET_Runtime/4.0 .NET_Framework/4.0 OS/Microsoft_Windows_NT_10.0.18363.0 WindowsPowerShell/5.0 ClientSync"
Version Id: - (BLANK)
Host Id: Q5WBxJNrwsspFmtOG+d2YN0xAtvbq1sdqm9vh6AflXdMCmny5VC3bZmyTBZavKGpO3J/uz+IfK0=
Signature Version: SigV4
Cipher Suite: ECDHE-RSA-AES128-GCM-SHA256
Authentication Type: AuthHeader
Host Header: S3SampleBucket.s3.us-west-2.amazonaws.com
TLS version: TLSv1.2

The only approach I can think of is to keep the field names in a configuration file. I would like to do this in either PowerShell or Python.

Any assistance would be of great help.

Upvotes: 0

Views: 293

Answers (1)

Mathias R. Jessen

Reputation: 174485

The log format can be interpreted as a CSV with a space delimiter, so you could parse it using ConvertFrom-Csv. One caveat: quoted fields are read as single values, but the bracketed timestamp contains a space and is not quoted, so it would be split in two and shift every later value one column to the right. Wrapping the bracketed field in quotes before parsing avoids that:

$columns = 'Bucket Owner', 'Bucket', 'Time', 'Remote IP', 'Requester', 'Request ID', 'Operation', 'Key', 'Request-URI', 'HTTP status', 'Error Code', 'Bytes Sent', 'Object Size', 'Total Time', 'Turn-Around Time', 'Referer', 'User-Agent', 'Version Id', 'Host Id', 'Signature Version', 'Cipher Suite', 'Authentication Type', 'Host Header', 'TLS version'

$data = @'
fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341 s3Samplebucket [10/Mar/2021:03:27:29 +0000] 171.60.235.108 fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341 MX1XP335Q5YFS06H REST.HEAD.BUCKET - "HEAD /s3Samplebucket HTTP/1.1" 200 - - - 13 13 "-" "S3Console/0.4, aws-internal/3 aws-sdk-java/1.11.964 Linux/4.9.230-0.1.ac.224.84.332.metal1.x86_64 OpenJDK_64-Bit_Server_VM/25.282-b08 java/1.8.0_282 vendor/Oracle_Corporation" - AMNo4/b/T+5JdEVQpLkqz0SV8VDXyd3odEFmK+5LvanuzgIXW2Lv87OBl5r5tbSZ/yjW5zfFQsA= SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader s3-us-west-2.amazonaws.com TLSv1.2
'@

# Quote the bracketed timestamp so its embedded space is not treated as a delimiter
$parsedLog = $data -replace '\[[^\]]*\]', '"$0"' |ConvertFrom-Csv -Delimiter ' ' -Header $columns

Now the resulting object is easily converted to JSON:

PS ~> $parsedLog |ConvertTo-Json
{
    "Bucket Owner":  "fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341",
    "Bucket":  "s3Samplebucket",
    "Time":  "[10/Mar/2021:03:27:29 +0000]",
    "Remote IP":  "171.60.235.108",
    "Requester":  "fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341",
    "Request ID":  "MX1XP335Q5YFS06H",
    "Operation":  "REST.HEAD.BUCKET",
    "Key":  "-",
    "Request-URI":  "HEAD /s3Samplebucket HTTP/1.1",
    "HTTP status":  "200",
    "Error Code":  "-",
    "Bytes Sent":  "-",
    "Object Size":  "-",
    "Total Time":  "13",
    "Turn-Around Time":  "13",
    "Referer":  "-",
    "User-Agent":  "S3Console/0.4, aws-internal/3 aws-sdk-java/1.11.964 Linux/4.9.230-0.1.ac.224.84.332.metal1.x86_64 OpenJDK_64-Bit_Server_VM/25.282-b08 java/1.8.0_282 vendor/Oracle_Corporation",
    "Version Id":  "-",
    "Host Id":  "AMNo4/b/T+5JdEVQpLkqz0SV8VDXyd3odEFmK+5LvanuzgIXW2Lv87OBl5r5tbSZ/yjW5zfFQsA=",
    "Signature Version":  "SigV4",
    "Cipher Suite":  "ECDHE-RSA-AES128-GCM-SHA256",
    "Authentication Type":  "AuthHeader",
    "Host Header":  "s3-us-west-2.amazonaws.com",
    "TLS version":  "TLSv1.2"
}

In your case, to read the file from disk, replace the $data here-string with Get-Content and apply the same replacement to each line before parsing:

$parsedLog = (Get-Content -Path .\path\to\s3requests.log) -replace '\[[^\]]*\]', '"$0"' |ConvertFrom-Csv -Delimiter ' ' -Header $columns
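
Since the question also mentions Python, here is a minimal sketch of the same positional mapping in that language. It assumes the 24 fields always appear in the order listed in the question, and that a field is either bracketed, double-quoted, or free of spaces. The names COLUMNS, TOKEN, parse_line and SAMPLE_LINE are hypothetical and only serve the illustration; the key names follow the question's field list, with User-Agent spelled out in full. Unlike the PowerShell version, this sketch drops the surrounding brackets and quotes from the values.

```python
import json
import re

# Field names in the order given in the question (assumed to match the log layout).
COLUMNS = [
    "Bucket Owner", "Bucket", "Time", "Remote IP", "Requester", "Request ID",
    "Operation", "Key", "Request-URI", "HTTP status", "Error Code", "Bytes Sent",
    "Object Size", "Total Time", "Turn-Around Time", "Referer", "User-Agent",
    "Version Id", "Host Id", "Signature Version", "Cipher Suite",
    "Authentication Type", "Host Header", "TLS version",
]

# A token is [bracketed text], "quoted text", or a run of non-space characters.
TOKEN = re.compile(r'\[([^\]]*)\]|"([^"]*)"|(\S+)')

# The sample record from the question, split across lines for readability.
SAMPLE_LINE = (
    'fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341 s3Samplebucket '
    '[10/Mar/2021:03:27:29 +0000] 171.60.235.108 '
    'fd89d80d676948bd913040b667965ef6a50a9c80a12f38c504f497953aedc341 MX1XP335Q5YFS06H '
    'REST.HEAD.BUCKET - "HEAD /s3Samplebucket HTTP/1.1" 200 - - - 13 13 "-" '
    '"S3Console/0.4, aws-internal/3 aws-sdk-java/1.11.964 '
    'Linux/4.9.230-0.1.ac.224.84.332.metal1.x86_64 OpenJDK_64-Bit_Server_VM/25.282-b08 '
    'java/1.8.0_282 vendor/Oracle_Corporation" - '
    'AMNo4/b/T+5JdEVQpLkqz0SV8VDXyd3odEFmK+5LvanuzgIXW2Lv87OBl5r5tbSZ/yjW5zfFQsA= '
    'SigV4 ECDHE-RSA-AES128-GCM-SHA256 AuthHeader s3-us-west-2.amazonaws.com TLSv1.2'
)


def parse_line(line):
    """Split one log line into fields and pair them with COLUMNS by position."""
    # m.lastindex is the number of the single capture group that matched.
    values = [m.group(m.lastindex) for m in TOKEN.finditer(line)]
    return dict(zip(COLUMNS, values))


record = parse_line(SAMPLE_LINE)
print(json.dumps(record, indent=4))
```

To process a whole log file, the same function could be applied to each non-empty line and the resulting list of dictionaries passed to json.dumps.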

Upvotes: 2
