datowlcs

Reputation: 101

Powershell - Efficient way to keep content and append to the same file?

I want to keep the leading comment section of a file and overwrite everything else. Currently this section is 27 lines long.

Each line begins with a # (think of it as a giant comment section).

What I want to do is keep the initial comment section, delete everything following the comment section, then append a new string to this file just below this comment section.

I found a way to hardcode it, but I think this is pretty inefficient, and I don't think it's best to hardcode 27 as a literal.

The way I've handled it is:

$fileProc = Get-Content $someFile
                   
$keep = $fileProc[0..27]

$keep | Set-Content $someFile
                   
Add-Content $someFile "`n`n# Insert new string here"
Add-Content $someFile "`n EMPTY_PROCESS.EXE"

Is there a more efficient way to handle this?

Upvotes: 1

Views: 577

Answers (3)

TessellatingHeckler

Reputation: 29033

Efficient way [...] pretty inefficient [...] a more efficient way

  • Don't open the file many times, paying the cost of ACL security and AntiVirus checks and disk access delays.
  • Avoid PowerShell cmdlets and scriptblocks.
  • Avoid loops in PowerShell, push work to lower layers.
  • Avoid heavyweight searches like regex and wildcard.
  • Avoid making arrays of string for the lines.

Open the file once, do a single linear scan, truncate where the pattern is found, then write the new data. Assuming there are no other comment lines in the data, the pattern is: the last "\n#" marks the start of the last comment line, and the newline after it is the cutoff. For example:

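$f = [System.IO.FileStream]::new('d:\test.txt', 'Open')

# Note: character indexes equal byte offsets only for single-byte text without a BOM.
$content = [System.IO.StreamReader]::new($f).ReadToEnd()

# Ordinal comparison keeps the search a plain, culture-independent character scan.
$lastComment = $content.LastIndexOf("`n#", [System.StringComparison]::Ordinal)
$nextLine    = $content.IndexOf("`n", 1 + $lastComment, [System.StringComparison]::Ordinal)
$f.SetLength($nextLine + 1) # truncate, keeping the newline that ends the last comment line

$w = [System.IO.StreamWriter]::new($f)
$w.WriteLine("new next Line")
$w.Close() # also closes the underlying FileStream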

If there could be other comment lines, redesign the file so that there is a sentinel value to find; that is easier than finding the absence of a thing.
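A minimal sketch of the sentinel idea, under stated assumptions: the marker text `# END-HEADER` and the sample file are hypothetical, and the file is assumed to be ASCII or UTF-8 without a BOM so that character indexes match byte offsets.

```powershell
# Hypothetical marker line that closes the comment header.
$sentinel = '# END-HEADER'

# Throwaway sample file (WriteAllText writes UTF-8 without a BOM).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sentinel-sample.txt'
[System.IO.File]::WriteAllText($path, "# one`n# two`n$sentinel`nold data`n# stray comment`n")

$f = [System.IO.FileStream]::new($path, 'Open')
try {
  $content = [System.IO.StreamReader]::new($f).ReadToEnd()

  $start = $content.IndexOf($sentinel, [System.StringComparison]::Ordinal)
  if ($start -lt 0) { throw "Sentinel '$sentinel' not found." }

  # Cut just after the newline that ends the sentinel line (or at end of file).
  $eol = $content.IndexOf([char]10, $start)
  $cut = if ($eol -lt 0) { $content.Length } else { $eol + 1 }

  $f.SetLength($cut)
  $f.Position = $cut   # make the write position explicit

  $w = [System.IO.StreamWriter]::new($f)
  $w.Write("new next line`n")
  $w.Close()
}
finally { $f.Dispose() }

$result = [System.IO.File]::ReadAllText($path)
```

With a marker line, later comment lines in the data (such as `# stray comment` in the sample) no longer affect where the file is cut.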

Compared to mklement0's answer this doesn't cost any PowerShell cmdlet startup time, uses no subshells, no wildcard pattern matching, no arrays of string, and doesn't open the file twice. On a file with 10,000 comment lines:

  • your original code takes ~0.4 seconds
  • mklement0's code takes ~0.04 seconds
  • this code takes ~0.02 seconds.

A more efficient way - QED.

Upvotes: 0

mklement0

Reputation: 439193

You can use a switch statement to efficiently extract the section of comment lines at the start.

Set-Content out.txt -Value $(
  @(
    switch -Wildcard -File $someFile {
      '#*' { $_ }
      default { break } # End of comments section reached.
    }
  ) + "`n`n# Insert new string here", "`n EMPTY_PROCESS.EXE"
)

Note:

  • To be safe, the above writes to a new file, out.txt, but you can write directly back to $someFile, if desired.

  • Wildcard expression '#*' assumes that each line in the comment section starts with #, with no preceding whitespace; if you need to account for preceding whitespace, use the -Regex switch in lieu of -Wildcard, and use regex '^\s*#' in lieu of '#*'.
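A minimal sketch of the -Regex variant from the last note, assuming comment lines may be indented. The function name Get-LeadingCommentBlock and the sample file are hypothetical and used only for illustration:

```powershell
# Hypothetical helper: returns the leading block of comment lines,
# allowing whitespace before the '#'.
function Get-LeadingCommentBlock {
  param([Parameter(Mandatory)] [string] $Path)

  # switch -File reads the file line by line;
  # 'break' leaves the switch at the first non-comment line.
  switch -Regex -File $Path {
    '^\s*#' { $_ }
    default { break }
  }
}

# Usage with a throwaway sample file.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'comment-block-sample.txt'
Set-Content -LiteralPath $sample -Value '# first', '  # indented', 'data', '# later comment'

$header = @(Get-LeadingCommentBlock -Path $sample)
```

Here $header holds only the lines before the first non-comment line, so a comment that appears later in the data is not picked up.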

Upvotes: 1

vIra glugulan

Reputation: 34

Not sure about limiting it to the first set of 27 or so lines, but this should work.

The first line below keeps only the lines of the file that start with '#', wherever they appear in the file.

(Get-Content $somefile) | Where { $_ -match "^#" } | Set-Content $somefile

Add-Content $somefile "`n`nblah blah"
Add-Content $somefile "`nglug glug blug glug"

You can then use Add-Content for additional lines. Hope this helps :]

Upvotes: 0
