Reputation: 4517
I have a working script in PowerShell:
$file = Get-Content -Path HKEY_USERS.txt -Raw
foreach($line in [System.IO.File]::ReadLines("EXCLUDE_HKEY_USERS.txt"))
{
    $escapedLine = [Regex]::Escape($line)
    $pattern = $("(?sm)^$escapedLine.*?(?=^\[HKEY)")
    $file -replace $pattern, ' ' | Set-Content HKEY_USERS-filtered.txt
    $file = Get-Content -Path HKEY_USERS-filtered.txt -Raw
}
For each line in EXCLUDE_HKEY_USERS.txt, it makes some changes to the content of HKEY_USERS.txt. So on every loop iteration it writes the result to HKEY_USERS-filtered.txt and then re-reads that same file to pick up the changes. However, Get-Content is notorious for memory leaks, so I wanted to refactor it to use StreamReader and StreamWriter, but I'm having a hard time making it work.
As soon as I do:
$filePath = 'HKEY_USERS-filtered.txt';
$sr = New-Object IO.StreamReader($filePath);
$sw = New-Object IO.StreamWriter($filePath);
I get:
New-Object : Exception calling ".ctor" with "1" argument(s): "The process cannot access the file
'HKEY_USERS-filtered.txt' because it is being used by another process."
So it looks like I cannot use StreamReader and StreamWriter on same file simultaneously. Or can I?
Upvotes: 1
Views: 894
Reputation: 439238
tl;dr
Get-Content -Raw reads a file as a whole, is fast, and consumes little unwanted memory.
[System.IO.File]::ReadLines() is a faster and more memory-efficient alternative to line-by-line reading with Get-Content (without -Raw), but you must pass the input file as a full path, because .NET's working directory usually differs from PowerShell's.
Convert-Path resolves a given relative path to a full, file-system-native one.
A PowerShell-native alternative to [System.IO.File]::ReadLines() is the switch statement with the -File parameter, which performs similarly well while avoiding the working-directory pitfall, and offers additional features.
There is no need to save the modified file content to disk after each iteration: just update the variable that holds the content and, after exiting the loop, save its value to the output file.
$fileContent = Get-Content -Path HKEY_USERS.txt -Raw
# Be sure to specify a *full* path.
$excludeFile = Convert-Path -LiteralPath 'EXCLUDE_HKEY_USERS.txt'
foreach($line in [System.IO.File]::ReadLines($excludeFile)) {
    $escapedLine = [Regex]::Escape($line)
    $pattern = "(?sm)^$escapedLine.*?(?=^\[HKEY)"
    # Modify the content and save the result back to variable $fileContent
    $fileContent = $fileContent -replace $pattern, ' '
}
# After all modifications have been performed, save to the output file
$fileContent | Set-Content HKEY_USERS-filtered.txt
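The switch -File alternative mentioned in the tl;dr can be sketched roughly as follows. This is only an illustration: the function name Remove-ExcludedKeys and its parameters are hypothetical, and the regex is carried over unchanged from the code above.

```powershell
# Hypothetical helper; the name and parameters are illustrative only.
function Remove-ExcludedKeys {
    param(
        [Parameter(Mandatory)] [string] $Path,         # exported registry text
        [Parameter(Mandatory)] [string] $ExcludePath,  # one key header per line
        [Parameter(Mandatory)] [string] $Destination   # filtered output file
    )

    # Read the whole input once; all replacements happen in memory.
    $fileContent = Get-Content -LiteralPath $Path -Raw

    # switch -File reads the exclusion list line by line and resolves a
    # relative path against PowerShell's current location, so no
    # Convert-Path call is needed here.
    switch -File $ExcludePath {
        default {
            # $_ holds the current line of the exclusion file.
            $escapedLine = [Regex]::Escape($_)
            $pattern = "(?sm)^$escapedLine.*?(?=^\[HKEY)"
            $fileContent = $fileContent -replace $pattern, ' '
        }
    }

    # Write the result once, after all replacements have been made.
    $fileContent | Set-Content -LiteralPath $Destination
}

# Example call, assuming the file names used in the question:
# Remove-ExcludedKeys -Path HKEY_USERS.txt -ExcludePath EXCLUDE_HKEY_USERS.txt -Destination HKEY_USERS-filtered.txt
```

The action block of a switch statement runs in the caller's scope, so the assignment to $fileContent inside it updates the same variable that is written out at the end.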
Building on Santiago Squarzon's helpful comments:
Get-Content does not cause memory leaks, but it can consume a lot of memory that isn't garbage-collected until an unpredictable later point in time.
The reason is that, unless the -Raw switch is used, it decorates each line read with PowerShell ETS (Extended Type System) properties containing metadata about the file of origin, such as its path (.PSPath) and the line number (.ReadCount).
Reading with -Raw is efficient, because the entire file content is read into a single, multi-line string, which means that the decoration is only performed once.
"So it looks like I cannot use StreamReader and StreamWriter on same file simultaneously. Or can I?"
No, you cannot. You cannot simultaneously read from a file and overwrite it.
To update / replace an existing file you have two options (note that, for a fully robust solution, all attributes of the original file (except the last write time and size) should be retained, which requires extra work):
Read the old content into memory in full, perform the desired modification in memory, then write the modified content back to the original file, as shown in the top section.
More safely, write the modified content to a temporary file and, upon successful completion, replace the original file with the temporary one.
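The second, safer option might look roughly like the sketch below. The function name Update-FileSafely, its parameters, and the ".tmp" suffix are all assumptions made for illustration. Note that it does not preserve the original file's encoding or attributes, which is the extra work mentioned above.

```powershell
# Hypothetical helper; the name, parameters and ".tmp" suffix are illustrative only.
function Update-FileSafely {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [scriptblock] $Transform  # receives the old text, returns the new text
    )

    # .NET methods need a full path, because .NET's working directory
    # usually differs from PowerShell's.
    $fullPath = Convert-Path -LiteralPath $Path
    # Keep the temporary file next to the original (assumed naming scheme).
    $tempPath = "$fullPath.tmp"

    try {
        # Read the old content in full and modify it in memory.
        $content = [System.IO.File]::ReadAllText($fullPath)
        [string] $newContent = & $Transform $content

        # Write to the temporary file first; the original stays untouched
        # if anything fails up to this point.
        [System.IO.File]::WriteAllText($tempPath, $newContent)

        # Only after a successful write, replace the original.
        Move-Item -LiteralPath $tempPath -Destination $fullPath -Force
    }
    finally {
        # Clean up a leftover temporary file if the move never happened.
        if (Test-Path -LiteralPath $tempPath) {
            Remove-Item -LiteralPath $tempPath
        }
    }
}

# Example call with a placeholder transformation:
# Update-FileSafely -Path HKEY_USERS-filtered.txt -Transform { param($text) $text -replace 'old', 'new' }
```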
Upvotes: 3