Reputation: 11
I have a text file with thousands of lines, which looks something like this:
# RandomLocation.xaml:1234
msgid "RandomString"
msgstr ""
# AnotherLocation.cs:123
msgstr ""
I need to find and remove every block that doesn't have msgid in it, and I'm trying to accomplish that with a regex.
$temp | ForEach-Object {
    Select-String -Path $($DestinationPath + $culturename + ".po") -Pattern '#[: ](.)\w+.[cx][sa][m]{0,1}[l]{0,1}:\d+\nmsgstr ".*"' -AllMatches | ForEach-Object {
        $_.Matches | ForEach-Object {
            $temp2 = $_.Value
            $delete.Add($_.Value)
        }
    }
}
If I remove \nmsgstr ".*" from the pattern, it works correctly and detects every # RandomLocation:1234 line; however, it doesn't work when I try to match two lines. Any ideas what I'm doing wrong?
@edit: It works now; however, I can't remove these lines from the file. The file content is an ArrayList, and while removing a single line with $file.Remove($_.Value) works, it doesn't work when $_.Value spans two lines.
Upvotes: 1
Views: 247
Reputation: 338158
Select-String will break up the file into lines and apply the regex to each line separately, so there is no \n left for the pattern to match.
If you need \n to be present, read the file into one large string using Get-Content -Raw (without -Raw, Get-Content will also break the file into lines), and then pass that string to Select-String.
Get-Content -Path "..." -Raw | Select-String -Pattern "...\n"
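To make the difference concrete, here is a minimal sketch that runs the same two-line pattern both ways. The temp file name, its contents, and the simplified pattern are assumptions made up for illustration; \r?\n is used instead of \n so the pattern also tolerates CRLF line endings.

```powershell
# Hypothetical sample file; the path and contents are made up for illustration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample.po'
$lines = @(
    '# RandomLocation.xaml:1234'
    'msgid "RandomString"'
    'msgstr ""'
    '# AnotherLocation.cs:123'
    'msgstr ""'
)
# Join with "`n" so the file has LF line endings regardless of platform.
[System.IO.File]::WriteAllText($path, ($lines -join "`n") + "`n")

# Simplified two-line pattern (an assumption, not the asker's exact regex).
$pattern = '#.+:\d+\r?\nmsgstr ".*"'

# Line by line: each line is tested on its own, so a two-line pattern cannot match.
$perLine = @(Select-String -Path $path -Pattern $pattern -AllMatches)

# Whole file as one string: the newline is still there for the regex to match.
$raw = Get-Content -Path $path -Raw
$whole = @($raw | Select-String -Pattern $pattern -AllMatches | ForEach-Object { $_.Matches })

$perLine.Count   # 0
$whole.Count     # 1 (the AnotherLocation block)
```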
That being said, your regex looks a bit suspect:
#[: ](.)\w+.[cx][sa][m]{0,1}[l]{0,1}:\d+\nmsgstr ".*"
- [m] is the same as m; the character class [] has no effect on a single letter.
- {0,1} is the same as ?
- [cx] is "c OR x", not sure if you had that in mind.
- . means "any character", not "dot"; a literal dot would be \.
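These points can be checked quickly with the -match operator. The file names below are made-up test strings, and the ^ and $ anchors are added only so that each test has to match the whole string.

```powershell
# An unescaped . matches any character, so "X" is accepted where a dot was intended.
'fileXcs' -match '^\w+.cs$'     # True
'fileXcs' -match '^\w+\.cs$'    # False
'file.cs' -match '^\w+\.cs$'    # True

# [m]{0,1} and m? are equivalent.
'xaml' -match '^[cx][sa][m]{0,1}[l]{0,1}$'   # True
'xaml' -match '^[cx][sa]m?l?$'               # True

# The character classes also accept combinations that were probably not intended.
'ca' -match '^[cx][sa]m?l?$'                 # True
```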
If you meant something like ".cs or .xml or .xaml or .saml", it's much better to just write that instead of making it complicated.
#[: ](.)\w+\.(cs|xml|xaml|saml):\d+\nmsgstr ".*"
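As for the follow-up about removing the matches: ArrayList.Remove compares against individual elements, so a two-line value will not equal any single line. One possible approach, sketched here under assumptions, is to skip the list entirely and run -replace on the raw string. The sample text is made up, the pattern is a variation on the one above (it drops the (.) group, uses \r?\n, and also consumes the trailing newline), and $cleaned is a hypothetical variable name.

```powershell
# Sample text standing in for the file contents
# (with a real file this would come from Get-Content -Path ... -Raw).
$raw = @(
    '# RandomLocation.xaml:1234'
    'msgid "RandomString"'
    'msgstr ""'
    '# AnotherLocation.cs:123'
    'msgstr ""'
) -join "`n"

# A location comment immediately followed by msgstr, with no msgid in between.
# \r?\n tolerates LF and CRLF; (?:\r?\n)? also removes the newline after the block.
$pattern = '#[: ]\w+\.(cs|xml|xaml|saml):\d+\r?\nmsgstr ".*"(?:\r?\n)?'

# Replace every such block with nothing; blocks that contain msgid are left alone.
$cleaned = $raw -replace $pattern, ''
$cleaned
```

The cleaned string could then be written back with Set-Content; that step is left out here because it overwrites the file.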
Upvotes: 1