Reputation: 5487
I'm trying to extract text from a set of files on Windows using PowerShell (version 4):
PS > Select-String -AllMatches -Pattern <mypattern-with(capture)> -Path file.jsp | Format-Table
So far, so good. That gives a nice set of MatchInfo objects:
IgnoreCase LineNumber Line Filename Pattern Matches
---------- ---------- ---- -------- ------- -------
True 30 ... file.jsp ... {...}
Next, I see that the captures are in the Matches member, so I extract it:
PS > Select-String -AllMatches -Pattern <mypattern-with(capture)> -Path file.jsp | ForEach-Object -MemberName Matches | Format-Table
Which gives:
Groups Success Captures Index Length Value
------ ------- -------- ----- ------ -----
{...} True {...} 49 47 ...
or, as a list with | Format-List:
Groups : {matched text, captured group}
Success : True
Captures : {matched text}
Index : 39
Length : 33
Value : matched text
This is where I get stuck: I have no idea how to go further and obtain a list of the captured groups.
I've tried adding another | ForEach-Object -MemberName Groups, but it seems to return the same as the above.
The closest I get is with | Select-Object -Property Groups, which indeed gives me what I'd expect (a list of sets):
Groups
------
{matched text, captured group}
{matched text, captured group}
...
But then I'm unable to extract the captured group from each of them. When I tried | Select-Object -Index 1, I got only one of those sets.
It seems that by adding | ForEach-Object { $_.Groups.Groups[1].Value } I get what I was looking for, but I don't understand why, so I can't be sure I would get the right result when extending this method to whole sets of files.
Why is it working?
As a side note, | ForEach-Object { $_.Groups[1].Value } (i.e. without the second .Groups) gives the same result.
I'd like to add that, upon further attempts, it seems the command can be shortened by removing the piped | Select-Object -Property Groups.
Upvotes: 116
Views: 107763
Reputation: 721
This worked for my situation.
Using the file test.txt:
// autogenerated by script
char VERSION[21] = "ABCDEFGHIJKLMNOPQRST";
char NUMBER[16] = "123456789012345";
Get the NUMBER and VERSION from the file.
PS C:\> Select-String -Path test.txt -Pattern 'VERSION\[\d+\]\s=\s\"(.*)\"' | %{$_.Matches.Groups[1].value}
ABCDEFGHIJKLMNOPQRST
PS C:\> Select-String -Path test.txt -Pattern 'NUMBER\[\d+\]\s=\s\"(.*)\"' | %{$_.Matches.Groups[1].value}
123456789012345
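If both values are needed at once, a single pattern with two capture groups can return them in one pass. The following is only a sketch: the in-memory `$lines` array stands in for test.txt, and the `Name`/`Value` property names are made up for illustration.

```powershell
# Hypothetical sample content mirroring test.txt from this answer.
$lines = @(
    '// autogenerated by script'
    'char VERSION[21] = "ABCDEFGHIJKLMNOPQRST";'
    'char NUMBER[16] = "123456789012345";'
)

# Group 1 captures the identifier, group 2 the quoted value.
$pattern = 'char\s+(\w+)\[\d+\]\s=\s"(.*)"'

$result = $lines | Select-String -Pattern $pattern | ForEach-Object {
    # Without -AllMatches there is one Match per line; Groups[0] is the whole match.
    $groups = $_.Matches[0].Groups
    [pscustomobject]@{
        Name  = $groups[1].Value
        Value = $groups[2].Value
    }
}

$result
```

To read from the real file, `$lines |` can be dropped in favour of `Select-String -Path test.txt -Pattern $pattern`.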
Upvotes: 9
Reputation: 30287
According to the PowerShell docs on Regular Expressions > Groups, Captures, and Substitutions:
When using the -match operator, PowerShell creates an automatic variable named $Matches:
PS> "The last logged on user was CONTOSO\jsmith" -match "(.+was )(.+)"
The value returned from this expression is just $true or $false, but PowerShell also populates the $Matches hashtable.
So if you output $Matches, you'll get all capture groups:
PS> $Matches
Name Value
---- -----
2 CONTOSO\jsmith
1 The last logged on user was
0 The last logged on user was CONTOSO\jsmith
And you can access each capture group individually with dot notation like this:
PS> "The last logged on user was CONTOSO\jsmith" -match "(.+was )(.+)"
PS> $Matches.2
CONTOSO\jsmith
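Numeric keys get harder to follow as a pattern grows, so named groups may be worth considering. Here is a minimal sketch; the group names `domain` and `user` and the modified pattern are illustrative and do not come from the docs quoted above.

```powershell
$text = 'The last logged on user was CONTOSO\jsmith'

# $Matches is only populated when -match succeeds on a single string,
# so read it inside the if block.
if ($text -match 'was (?<domain>[^\\]+)\\(?<user>.+)$') {
    $domain = $Matches['domain']   # index syntax
    $user   = $Matches.user        # dot notation works for named keys too
}

"$domain / $user"
```

Key 0 still holds the full match when named groups are used.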
Additional Resources:
[regex] type
Upvotes: 26
Reputation: 99041
Late answer, but to loop multiple matches and groups I use:
$pattern = "Login:\s*([^\s]+)\s*Password:\s*([^\s]+)\s*"
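# Use a name other than $matches, which -match populates as an automatic variable.
$allMatches = [regex]::Matches($input_string, $pattern)
foreach ($match in $allMatches)
{
    Write-Host $match.Groups[1].Value   # first capture group (login)
    Write-Host $match.Groups[2].Value   # second capture group (password)
}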
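For a version that runs end to end, here is a sketch with a made-up input string (the answer leaves `$input_string` undefined). It collects the pairs instead of writing them to the host, and uses `\S` as a shorthand for `[^\s]`.

```powershell
# Hypothetical input containing two Login/Password pairs.
$sample  = "Login: alice Password: s3cret`nLogin: bob Password: hunter2"
$pattern = 'Login:\s*(\S+)\s*Password:\s*(\S+)'

# [regex]::Matches returns every match; each Match exposes its own Groups.
$pairs = foreach ($m in [regex]::Matches($sample, $pattern)) {
    '{0} / {1}' -f $m.Groups[1].Value, $m.Groups[2].Value
}

$pairs
```

Emitting strings from the loop rather than calling Write-Host keeps the results available for further pipeline processing.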
Upvotes: 9
Reputation: 6978
This script will grab a regex's specified capture group from a file's content and output its matches to console.
$file is the file you want to load
$cg is the capture group you want to grab
$regex is the regular expression pattern
Example file and its content to load:
This is the especially special text in the file.
Example Use: .\get_regex_capture.ps1 -file "C:\some\file.txt" -cg 1 -regex '\b(special\W\w+)'
Output: special text
Param(
    [string]$file,
    [int]$cg,
    [string]$regex
)
# -Raw already returns the whole file as a single string, so no join is needed.
$file_content = Get-Content -Raw $file
# Emit each Match, then pick the requested capture group from it.
Select-String -InputObject $file_content -Pattern $regex -AllMatches | % { $_.Matches } | % { $_.Groups[$cg].Value }
Upvotes: -1
Reputation: 72680
Have a look at the following:
$a = "http://192.168.3.114:8080/compierews/" | Select-String -Pattern '^http://(.*):8080/(.*)/$'
$a is now a MatchInfo (check with $a.GetType()); it contains a Matches property.
PS ps:\> $a.Matches
Groups : {http://192.168.3.114:8080/compierews/, 192.168.3.114, compierews}
Success : True
Captures : {http://192.168.3.114:8080/compierews/}
Index : 0
Length : 37
Value : http://192.168.3.114:8080/compierews/
In the Groups member you'll find what you are looking for, so you can write:
"http://192.168.3.114:8080/compierews/" | Select-String -Pattern '^http://(.*):8080/(.*)/$' | % {"IP is $($_.matches.groups[1]) and path is $($_.matches.groups[2])"}
IP is 192.168.3.114 and path is compierews
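The same access path should carry over to a whole set of files. Each MatchInfo has a Matches collection (one Match per hit on the line when -AllMatches is used), and each Match has Groups, where index 0 is the full match and indices 1 and up are the captures. The sketch below assumes throwaway files in a temp directory; the file names, contents, and the tightened pattern are all made up for illustration.

```powershell
# Create two throwaway files (hypothetical names and content).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.IO.Path]::GetRandomFileName())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'a.txt') -Value 'http://10.0.0.1:8080/alpha/ http://10.0.0.2:8080/beta/'
Set-Content -Path (Join-Path $dir 'b.txt') -Value 'http://10.0.0.3:8080/gamma/'

# Non-greedy-by-construction variant of the pattern, so several URLs on one line match separately.
$pattern = 'http://([^:/\s]+):8080/([^/\s]+)/'

$hits = Get-ChildItem -Path $dir -Filter *.txt | Sort-Object Name |
    Select-String -Pattern $pattern -AllMatches |
    ForEach-Object {
        $info = $_   # one MatchInfo per matching line
        foreach ($m in $info.Matches) {
            '{0}: {1} -> {2}' -f $info.Filename, $m.Groups[1].Value, $m.Groups[2].Value
        }
    }

$hits

# Clean up the temp directory.
Remove-Item -Recurse -Force $dir
```

Looping over `$info.Matches` explicitly keeps each capture paired with the file it came from.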
Upvotes: 124