Reputation: 1341
I am loading HTML emails. First I remove the HTML tags, then I replace each &nbsp; with a space and reduce double spaces to a single space. That works.
But now I have a lot of empty lines which I cannot remove. I have seen the examples which remove empty lines while reading a file, but I don't have any empty lines before I remove the HTML tags and the spaces.
I do:
$m = [IO.File]::ReadAllText("$emailFolder\$fName")
$m = $m -replace "<((?!@).)*?>" # removes all HTML tags, but not addresses such as <[email protected]>
$m = $m -replace "&nbsp;"," "
$m = $m.Replace('  ',' ').Replace('  ',' ').Replace('  ',' ')
$m = $m.Replace('`r','').Replace('`n`n','`n').Replace('`n`n','`n') # does nothing :(
I tried various versions; none of them removed the empty lines. Any idea how I can achieve that?
Besides that, I tried to use a regex quantifier to match several spaces in a row and failed.
What am I doing wrong?
$m = $m.Replace(' +',' ') # does not work
$m = $m.Replace('\s+',' ') # does not work either
Upvotes: 6
Views: 16421
Reputation: 354
You need to include the -Raw switch, so that Get-Content returns the file as a single string instead of an array of lines:
$m = Get-Content "$emailFolder\$fName" -Raw #<- You need to include this
$m = $m -creplace '\s+', ' '
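To see what -Raw changes, here is a small sketch. It assumes a throwaway temp file created only for the demonstration; the file and its contents are hypothetical and not taken from the question.

```powershell
# Hypothetical sample file, created only for this demonstration.
$tmp = [IO.Path]::GetTempFileName()
[IO.File]::WriteAllText($tmp, "Line 1`r`n`r`n   `r`nLine 4`r`n")

$lines = Get-Content $tmp        # array of strings, one per line, line breaks stripped
$raw   = Get-Content $tmp -Raw   # one string, line breaks preserved

$lines.Count                     # 4
$raw -is [string]                # True
$raw.Contains("`n")              # True: the line breaks are still in the string

Remove-Item $tmp
```

Note that \s+ also matches line breaks, so replacing it with a single space flattens the whole text onto one line. If the line breaks should survive, a character class such as [ \t]+ is probably closer to what the question asks for.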
Upvotes: 0
Reputation: 123
I know this is an old post; however, I found another post with an easier method that others may benefit from. Say you imported your array using Get-Content, for example:
$array = Get-content C:\list.txt
$array displays
Name 1
Name 2
Name 3
Name 4
Do this...
$array = $array | where-object {$_}
This will give you the output you were after.
Source is http://techibee.com/powershell/remove-empty-items-from-array-in-powershell/2431
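A minimal sketch of how this filter behaves, assuming a hypothetical in-memory array in place of the lines from C:\list.txt. One caveat: a line containing only spaces is not an empty string, so it passes the filter unless you test for a non-whitespace character.

```powershell
# Hypothetical sample data standing in for the lines returned by Get-Content.
$array = @('Name 1', '', 'Name 2', '   ', 'Name 3', '', 'Name 4')

# An empty string is falsy, so this drops '' but keeps the whitespace-only '   '.
$nonEmpty = $array | Where-Object { $_ }

# Requiring at least one non-whitespace character drops both kinds of line.
$nonBlank = $array | Where-Object { $_ -match '\S' }

$nonEmpty.Count   # 5
$nonBlank.Count   # 4
```

If you need a single string again afterwards, something like $nonBlank -join "`r`n" should put the lines back together.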
Upvotes: 1
Reputation: 47832
If I understand you correctly, you don't want to remove all line breaks, just "empty" lines (lines that consist of nothing but whitespace).
Consider this sample string:
$multiLine = "Line 1`r`nLine 2`nLine 3`r`n`r`n `n `t `r`nLine 7`r`n"
When displayed, it will look like this on screen:
Line 1
Line 2
Line 3



Line 7
Line 4 is actually a blank line, with nothing but a CRLF. Line 5 is a space followed by a single LF, Line 6 is a space, a tab, a space, then a CRLF. I mixed line endings because HTML can be a mess; it's good to be prepared for anything!
To handle all of these, you can do a replace like this:
$multiLine -creplace '(?m)^\s*\r?\n',''
-creplace is just the case-sensitive version of -replace (I like to be explicit).
(?m) is an inline way to set regular expression modes. The m mode stands for multi-line, and it lets the ^ and $ anchors match the beginning/end of each line in a string (rather than the beginning and end of the string). This is the key to your issue, I think.
The pattern uses ^ to match the beginning of each line, then matches 0 or more whitespace characters using the \s class, which includes tab.
\r?\n then matches the line break itself: an optional CR followed by an LF.
Because multi-line mode is on, ^ will catch blank lines throughout the string.
The result:
Line 1
Line 2
Line 3
Line 7
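Putting it together, here is a small sketch that runs the replacement on the sample string and checks the result. The second step, collapsing runs of spaces and tabs with [ \t]+, is my own addition for the other half of the question and only one way to do it.

```powershell
$multiLine = "Line 1`r`nLine 2`nLine 3`r`n`r`n `n `t `r`nLine 7`r`n"

# Remove every line that consists of nothing but whitespace.
$clean = $multiLine -creplace '(?m)^\s*\r?\n', ''
$clean -eq "Line 1`r`nLine 2`nLine 3`r`nLine 7`r`n"   # True

# Collapse runs of spaces and tabs to one space without touching line breaks.
$collapsed = "a  b`t`tc   d" -creplace '[ \t]+', ' '
$collapsed -eq 'a b c d'                              # True
```

The original line endings of the remaining lines are left alone, so a file with mixed CRLF and LF endings stays mixed.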
Upvotes: 21
Reputation: 327
This works for me (that is, using -replace).
$message.Body = (Get-Content "C:\Documents\Folder\email.txt") | ForEach-Object {
    $_ -replace ('\[NAME\]' , $name) `
       -replace ('\[AGE\]' , $age) `
       -replace ('\[CITY\]' , $city) `
       -replace ('\[STATE\]' , $state) `
       -replace ('\[POSTAL\]' , $postal)
}
Upvotes: 0
Reputation: 7046
You are passing the backtick inside single quotes. I got the same failure until I tried double quotes. The problem is that a single-quoted string is taken literally, so '`n' is just a backtick followed by the letter n, whereas inside double quotes `r and `n are interpreted as the escape sequences for CR and LF.
I'll say this is a feature and not a bug.
$m = "`r`n`n`r`r`n`r`n"
$m = $m.Replace("`r",'')
$m = $m.Replace("`n",'')
$m
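A short sketch that makes the difference visible and applies it to the line from the question. The sample string is hypothetical, and using -replace with the {2,} quantifier for the last step is my own variation.

```powershell
# Single quotes: the backtick is literal, so this is two characters, ` and n.
$literalLength = ('`n').Length    # 2
# Double quotes: `n is the escape sequence for a single line feed character.
$escapedLength = ("`n").Length    # 1

$m = "A`r`n`r`n`r`nB"             # hypothetical sample with two empty lines

# With single quotes, Replace searches for literal backtick text and finds nothing.
$unchanged = $m.Replace('`r', '').Replace('`n`n', '`n')
$unchanged -eq $m                 # True

# With double quotes the CRs are removed; -replace then collapses any run of LFs in one pass.
$fixed = $m.Replace("`r", '') -replace "`n{2,}", "`n"
$fixed -eq "A`nB"                 # True
```

This also explains the two lines marked "does not work" in the question: the .Replace() method does a plain text replacement, so ' +' and '\s+' are searched for literally. Regular expressions need the -replace operator.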
Upvotes: 1