Reputation: 13
I am trying to extract/capture all 7-character alphanumeric strings that come after a specific pattern of text from a text file. I am getting the 7-character strings but I am also getting the other "pattern" characters with them which I don't need. I am new to regex and PowerShell and I really tried before posting here.
Here is what the text file looks like:
{"hash":"hvwRn2V","title":"","description":null,"has_sound":false,"width":1920,"height":1080,"size":444751,"ext":".jpg","animated":false,"prefer_video":false,"looping":false,"datetime":"2016-02-08 09:27:18","edited":"0"},{"hash":"GakvoVT","title":"","description":null,"has_sound":false,"width":1920,"height":1080,"size":189987,"ext":".jpg","animated":false,"prefer_video":false,"looping":false,"datetime":"2016-02-08 09:27:14","edited":"0"},{"hash":"bn0lqId","title":"","description":null,"has_sound":false,"width":1920,"height":1080,"size":466105,"ext":".jpg","animated":false,"prefer_video":false,"looping":false,"datetime":"2016-02-08 09:27:11","edited":"0"},
I need to get all the 7-character strings that fall between two double quotes, but only if they come after hash": (for example, from the text above I need to get hvwRn2V from hash":"hvwRn2V", and so on).
I am using this PowerShell code and it works, but it also gives me the pattern text hash":" in the result, which I don't want:
$input_path = 'C:\Users\Jack\textfile.txt'
$output_file = 'C:\Users\Jack\output.txt'
$regex = 'hash":"([a-zA-Z_0-9]){7}'
Select-String -Path $input_path -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } > $output_file
What I am getting is:
hash":"hvwRn2V
hash":"GakvoVT
hash":"bn0lqId
What am I doing wrong?
Thank you in advance for your help.
Upvotes: 0
Reputation: 27443
That text is JSON. If you enclose the text in square brackets [ ], it becomes an array of objects. Print the hash property of each object like this:
cat file.json | convertfrom-json | % hash
hvwRn2V
GakvoVT
bn0lqId
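The wrapping step can also be done in the script rather than by editing the file. The following is a minimal sketch, not a definitive implementation: the $raw sample is a shortened stand-in for the file contents (in practice it could come from Get-Content with -Raw), and the trailing comma is removed on the assumption that the parser in use rejects it, as strict JSON does.

```powershell
# Shortened sample in the same shape as the file: comma-separated objects
# followed by a trailing comma. (Illustrative data, not the full records.)
$raw = '{"hash":"hvwRn2V","title":""},{"hash":"GakvoVT","title":""},{"hash":"bn0lqId","title":""},'

# Drop the trailing comma and wrap the remainder in [ ] so it parses as an array.
$json = '[' + $raw.Trim().TrimEnd(',') + ']'

# Parse the array, then read the hash property of each object.
$hashes = @(($json | ConvertFrom-Json) | ForEach-Object { $_.hash })
$hashes
```

Parsing the text as JSON avoids depending on the exact character layout around each hash value, which a regex has to spell out.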
Upvotes: 0
Reputation: 626896
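$_.Value is the entire match, so it includes the hash":" prefix. In your pattern the parentheses also wrap only a single character, so the group holds just the last of the 7 characters. You need to wrap the whole 7-character pattern with capturing parentheses and write $_.Groups[1].Value to the output file: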
$input_path = 'C:\Users\Jack\textfile.txt'
$output_file = 'C:\Users\Jack\output.txt'
$regex = 'hash":"([a-zA-Z0-9_]{7})"'
Select-String -Path $input_path -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Groups[1].Value } > $output_file
Note that I also added " at the end of the pattern to make sure the values you extract are exactly 7-character strings inside double quotation marks.
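To see the difference between the whole match and the capture group, here is a small sketch that runs both patterns against an in-memory string with [regex]::Matches instead of a file. The $text sample and the variable names are illustrative assumptions, not part of the original script.

```powershell
# Shortened sample in the same shape as the file (illustrative data).
$text = '{"hash":"hvwRn2V","title":""},{"hash":"GakvoVT","title":""}'

# Original pattern: the group wraps ONE character and is repeated 7 times.
$old = [regex]::Matches($text, 'hash":"([a-zA-Z_0-9]){7}')
$oldWhole = $old[0].Value            # whole match, prefix included
$oldGroup = $old[0].Groups[1].Value  # only the last repetition of the group

# Corrected pattern: the {7} quantifier sits inside the group,
# and the closing quote is part of the match but not of the group.
$new = [regex]::Matches($text, 'hash":"([a-zA-Z0-9_]{7})"')
$ids = @($new | ForEach-Object { $_.Groups[1].Value })

$oldWhole   # hash":"hvwRn2V
$oldGroup   # V
$ids        # hvwRn2V, then GakvoVT
```

Group 0 (what $_.Value returns) is always the full match, while Groups[1] is the first pair of parentheses, which is why moving the quantifier inside the group matters.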
Upvotes: 1