Reputation: 371
I am looking to extract data from a text file and write it to other text files. Here is the content of the text file:
HAC 06: CATHETHER-ASSOCIATED URINARY TRACT INFECTION (UTI)
SECONDARY DIAGNOSIS
T8351XA CC Infection and inflammatory reaction due to indwelling urinary catheter, initial encounter
OR SECONDARY DIAGNOSIS
B3741 Candidal cystitis and urethritis
B3749 Other urogenital candidiasis
N10 CC Acute tubulo-interstitial nephritis
N340 CC Urethral abscess
N390 CC Urinary tract infection, site not specified
WITH SECONDARY DIAGNOSIS
T8351XA CC Infection and inflammatory reaction due to indwelling urinary catheter, initial encounter
HAC 07: VASCULAR CATHETHER-ASSOCIATED INFECTION
SECONDARY DIAGNOSIS
T80211A CC Bloodstream infection due to central venous catheter, initial encounter
T80212A CC Local infection due to central venous catheter, initial encounter
T80218A CC Other infection due to central venous catheter, initial encounter
T80219A CC Unspecified infection due to central venous catheter, initial encounter
HAC 08: SURGICAL SITE INFECTION-MEDIASTINITIS AFTER CORONARY BYPASS GRAFT (CABG)
PROCEDURES
0210093 Bypass Coronary Artery, One Site from Coronary Artery with Autologous Venous Tissue, Open Approach
0210098 Bypass Coronary Artery, One Site from Right Internal Mammary with Autologous Venous Tissue, Open Approach
I want to split it into three files, one each for the contents under HAC 06, HAC 07 and HAC 08.
The HAC 06 file will contain:
HAC 06: CATHETHER-ASSOCIATED URINARY TRACT INFECTION (UTI)
SECONDARY DIAGNOSIS
T8351XA CC Infection and inflammatory reaction due to indwelling urinary catheter, initial encounter
OR SECONDARY DIAGNOSIS
B3741 Candidal cystitis and urethritis
B3749 Other urogenital candidiasis
N10 CC Acute tubulo-interstitial nephritis
N340 CC Urethral abscess
N390 CC Urinary tract infection, site not specified
WITH SECONDARY DIAGNOSIS
T8351XA CC Infection and inflammatory reaction due to indwelling urinary catheter, initial encounter
The HAC 07 file will contain the following, and so on for HAC 08:
HAC 07: VASCULAR CATHETHER-ASSOCIATED INFECTION
SECONDARY DIAGNOSIS
T80211A CC Bloodstream infection due to central venous catheter, initial encounter
T80212A CC Local infection due to central venous catheter, initial encounter
T80218A CC Other infection due to central venous catheter, initial encounter
T80219A CC Unspecified infection due to central venous catheter, initial encounter
I started with this code:
$filename = "HAC.txt"
$output_file = "extract_$HAC06"
$extract = @()
select-string -path $filename -pattern "HAC" -context 0,1 |
foreach-object {
$extract += $_.line
$extract += $_.context.postcontext
}
$extract | out-file $output_file
But I am stuck. Any help would be appreciated.
Upvotes: 0
Views: 271
Reputation: 36332
You can read all of the text into one multi-line string, split it just before each HAC heading, and then write each section to a file named after the HAC number in its first line. Something like this:
# Join the lines into one string, then split before every "HAC <digit>" (zero-width lookahead keeps the heading)
$AllText = (Get-Content "HAC.txt") -join "`r`n"
$AllText -Split "(?=HAC \d)" | Where-Object { $_ -match "^(HAC \d+)" } |
    ForEach-Object { Set-Content -Value $_ -Path ($Matches[1] + '.txt') }
That writes three files to the current directory, HAC 06.txt, HAC 07.txt and HAC 08.txt, each containing its own section of the input.
Edit: OK, if you want to change where the files are written, prepend an output folder to the path like this:
$OutFolder = 'C:\Path\For\Output\'
$AllText = (Get-Content "HAC.txt") -join "`r`n"
# Same split as before; only the target path changes
$AllText -Split "(?=HAC \d)" | Where-Object { $_ -match "^(HAC \d+)" } |
    ForEach-Object { Set-Content -Value $_ -Path ($OutFolder + $Matches[1] + '.txt') }
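If you would rather have this as a reusable function, the same idea can be sketched as below. This is only a sketch under a few assumptions: the name Split-HacFile and its parameters are made up for illustration, it needs PowerShell 3.0 or later (for Get-Content -Raw), and it anchors the split to the start of a line so that a description which happens to contain "HAC" followed by a digit does not start a new file.

```powershell
# Hypothetical helper (not part of the original answer): splits a HAC listing
# into one file per section and returns the paths it wrote.
function Split-HacFile {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $OutFolder
    )

    # Create the output folder if it does not exist yet.
    if (-not (Test-Path -LiteralPath $OutFolder)) {
        New-Item -ItemType Directory -Path $OutFolder | Out-Null
    }

    # Read the whole file as a single string (-Raw needs PowerShell 3.0+).
    $allText = Get-Content -LiteralPath $Path -Raw

    # (?m)^ restricts the split to "HAC <digit>" at the start of a line.
    foreach ($section in ($allText -split '(?m)^(?=HAC \d)')) {
        if ($section -match '^(HAC \d+)') {
            $target = Join-Path $OutFolder ($Matches[1] + '.txt')
            # TrimEnd drops the trailing line break; Set-Content adds one back.
            Set-Content -LiteralPath $target -Value $section.TrimEnd()
            $target
        }
    }
}
```

You would call it as Split-HacFile -Path 'HAC.txt' -OutFolder 'C:\Path\For\Output'. Join-Path takes care of the separator, so the folder does not need a trailing backslash.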
Upvotes: 1