kaztrofic

Reputation: 13

PowerShell: how to capture the text that follows a certain pattern without including the pattern itself in the results

I am trying to extract all 7-character alphanumeric strings that come after a specific pattern in a text file. I get the 7-character strings, but I also get the "pattern" characters with them, which I don't need. I am new to regex and PowerShell, and I really tried before posting here.

Here is what the text file looks like:

{"hash":"hvwRn2V","title":"","description":null,"has_sound":false,"width":1920,"height":1080,"size":444751,"ext":".jpg","animated":false,"prefer_video":false,"looping":false,"datetime":"2016-02-08 09:27:18","edited":"0"},{"hash":"GakvoVT","title":"","description":null,"has_sound":false,"width":1920,"height":1080,"size":189987,"ext":".jpg","animated":false,"prefer_video":false,"looping":false,"datetime":"2016-02-08 09:27:14","edited":"0"},{"hash":"bn0lqId","title":"","description":null,"has_sound":false,"width":1920,"height":1080,"size":466105,"ext":".jpg","animated":false,"prefer_video":false,"looping":false,"datetime":"2016-02-08 09:27:11","edited":"0"},

I need to get all the 7-character strings that fall between two double quotes, but only if they come after hash":. For example, from the text above I need to get hvwRn2V from hash":"hvwRn2V", and so on.

I am using this PowerShell code. It works, but it also includes the pattern text hash":" in the result, which I don't want:

$input_path = 'C:\Users\Jack\textfile.txt'
$output_file = 'C:\Users\Jack\output.txt'
$regex = 'hash":"([a-zA-Z_0-9]){7}'
Select-String -Path $input_path -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Value } > $output_file

What I am getting is:

hash":"hvwRn2V
hash":"GakvoVT
hash":"bn0lqId

What am I doing wrong?

Thank you in advance for your help.

Upvotes: 0

Views: 1226

Answers (2)

js2010

Reputation: 27443

That text is JSON. If you enclose it in square brackets [ ], it becomes an array of objects (strict JSON does not allow a trailing comma, so the final comma in the sample may need to be removed first). Print the hash property of each object like this:

Get-Content file.json | ConvertFrom-Json | ForEach-Object hash

hvwRn2V
GakvoVT
bn0lqId
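
For anyone who wants to try the JSON route without editing the file by hand, here is a minimal sketch. The $text value is a shortened stand-in for the file contents, and the variable names are only illustrative; it assumes the real data has the same shape, with a trailing comma after the last object.

```powershell
# Shortened stand-in for the file contents (assumption: same shape as the real data)
$text = '{"hash":"hvwRn2V","title":""},{"hash":"GakvoVT","title":""},{"hash":"bn0lqId","title":""},'

# Drop the trailing comma (and surrounding whitespace), then wrap in [ ] to form a JSON array
$json = '[' + $text.Trim().TrimEnd(',') + ']'

# Parse first, then enumerate; assigning to a variable before piping makes this
# behave the same in Windows PowerShell 5.1 and PowerShell 7
$objects = $json | ConvertFrom-Json
$hashes = $objects | ForEach-Object { $_.hash }

$hashes
```

With a real file, $text could come from Get-Content -Raw on the input path instead of a literal string.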

Upvotes: 0

Wiktor Stribiżew

Reputation: 626896

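$_.Value is the whole match, so it includes the hash":" prefix. In your pattern the {7} quantifier also sits outside the parentheses, so group 1 would hold only the last of the seven characters. Wrap the entire 7-character pattern in capturing parentheses and write $_.Groups[1].Value to the output file: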

$input_path = 'C:\Users\Jack\textfile.txt'
$output_file = 'C:\Users\Jack\output.txt'
$regex = 'hash":"([a-zA-Z0-9_]{7})"'
Select-String -Path $input_path -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Groups[1].Value } > $output_file

Note that I also added " at the end of the pattern to make sure the values you extract are exactly 7-character strings inside double quotation marks.
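
To see the difference between the whole match and the capture group without touching a file, here is a minimal sketch that runs the same pattern against an in-memory string. The $sample text is a shortened stand-in for the file contents and the variable names are only illustrative. The second half shows a lookaround variant, which is an alternative and not part of the code above: it keeps the prefix out of the match, so plain .Value is enough.

```powershell
# Shortened stand-in for the file contents (assumption: same shape as the real data)
$sample = '{"hash":"hvwRn2V","title":""},{"hash":"GakvoVT","title":""},{"hash":"bn0lqId","title":""},'

# Capture-group approach: Groups[0] is the whole match, Groups[1] the parenthesized part
$regex = 'hash":"([a-zA-Z0-9_]{7})"'
$hashes = $sample | Select-String -Pattern $regex -AllMatches |
    ForEach-Object { $_.Matches } |
    ForEach-Object { $_.Groups[1].Value }

# Alternative: lookbehind/lookahead assert the surrounding text without consuming it,
# so the match value itself is only the 7 characters
$lookaround = '(?<=hash":")[a-zA-Z0-9_]{7}(?=")'
$viaLookaround = [regex]::Matches($sample, $lookaround) | ForEach-Object { $_.Value }

$hashes
$viaLookaround
```

Both variables should end up holding the same three hashes. The capture-group form is usually easier to read, while the lookaround form lets you keep the original $_.Value pipeline unchanged.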

Upvotes: 1
