Reputation: 787
I have a text file that I'm trying to process with VBScript. It looks like this:
111 , , ,Yes ,Yes
222 , , ,Yes ,Yes
333 , , ,Yes ,Yes
444 , , ,Yes ,Yes
555 , , ,Yes ,Yes
666 , , ,Yes ,Yes
What I want is to remove the carriage returns, tabs, commas and 'Yes' (or whatever matches the regex "\t,\t,\t\t,Yes\t,Yes") to give this output:
('111','222','333','444','555','666')
I'm using this code:
Const ForReading = 1
Const ForWriting = 2
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile(filePath, ForReading)
strText = objFile.ReadAll
objFile.Close
'chr(010) = line feed chr(013) = carriage return
strNewText = Replace(strText, "\t,\t,\t\t,Yes\t,Yes" & chr(013) & chr(010), "','")
Set objFile = objFSO.OpenTextFile(filePath, ForWriting)
objFile.WriteLine strNewText
objFile.Close
This isn't giving the desired output, however. If I take the "\t,\t,\t\t,Yes\t,Yes" & out of the Replace call, it removes the carriage returns, which is fine, but I also need the commas, tabs and 'Yes' removed, as well as a (' at the start and a ') at the end. I'm guessing the problem is the way I've used the regex, but I haven't used much VBScript, so I'm not sure.
Upvotes: 0
Views: 2783
Reputation: 38745
Instead of hunting down what you don't want, it's easier and less error-prone to concentrate on what you do want:
Dim sExp : sExp = "('111','222','333','444','555','666')"
Dim aLines : aLines = Array( _
"111 , , ,Yes ,Yes" _
, "222 , , ,Yes ,Yes" _
, "333 , , ,Yes ,Yes" _
, "444 , , ,Yes ,Yes" _
, "555 , , ,Yes ,Yes" _
, "666 , , ,Yes ,Yes" _
)
Dim sAll : sAll = Join( aLines, vbCrLf )
WScript.Echo sAll
Dim reCut : Set reCut = New RegExp
reCut.Global = True
reCut.MultiLine = True
reCut.Pattern = "^\d+"
Dim oMTS : Set oMTS = reCut.Execute( sAll )
If 0 = oMTS.Count Then
   WScript.Echo "Bingo A!"
Else
   ReDim aNums( oMTS.Count - 1 )
   Dim nI
   For nI = 0 To UBound( aNums )
       aNums( nI ) = oMTS( nI ).Value
   Next
   Dim sRes : sRes = "('" & Join( aNums, "','" ) & "')"
   If sRes = sExp Then
      WScript.Echo "QED:", sRes
   Else
      WScript.Echo "Bingo B!"
   End If
End If
output:
111 , , ,Yes ,Yes
222 , , ,Yes ,Yes
333 , , ,Yes ,Yes
444 , , ,Yes ,Yes
555 , , ,Yes ,Yes
666 , , ,Yes ,Yes
QED: ('111','222','333','444','555','666')
Annotations:
I use an array to build my string to process (sAll). Your string (strText) comes from a file. So:
Dim sAll : sAll = Join( aLines, vbCrLf )
==>
Dim sAll : sAll = objFile.ReadAll
The string is parsed by a RegExp (reCut). Its pattern ^\d+ looks for a sequence (+) of digits (\d) at the start (^) of a line (not of the whole string; that's why the MultiLine property is set to True). The result of .Execute is a Match collection (oMTS) containing Matches.
To make the concatenation of the expected result easier, the values of the Matches are copied to an array (aNums).
The "('" & Join( aNums, "','" ) & "')" expression combines the array's elements using the separator ','. To complete the result, we just need a suitable head (' and tail ').
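Putting the annotations together, a file-based version could look roughly like the sketch below. BuildList and ConvertFile are hypothetical names introduced here for illustration, and returning "()" for input without any matches is an assumption, not something the question specifies. The file is read completely before it is reopened for writing, as in the question's code.

```vbscript
Const ForReading = 1
Const ForWriting = 2

' Hypothetical helper: builds ('111','222',...) from the digits
' found at the start of each line of sAll
Function BuildList(sAll)
    Dim reCut : Set reCut = New RegExp
    reCut.Global = True
    reCut.MultiLine = True
    reCut.Pattern = "^\d+"
    Dim oMTS : Set oMTS = reCut.Execute(sAll)
    If 0 = oMTS.Count Then
        ' Assumption: an empty list is acceptable when nothing matches
        BuildList = "()"
        Exit Function
    End If
    ReDim aNums(oMTS.Count - 1)
    Dim nI
    For nI = 0 To UBound(aNums)
        aNums(nI) = oMTS(nI).Value
    Next
    BuildList = "('" & Join(aNums, "','") & "')"
End Function

' Hypothetical wrapper: replaces the content of the file at sPath
' with the list built from it
Sub ConvertFile(sPath)
    Dim oFSO : Set oFSO = CreateObject("Scripting.FileSystemObject")
    Dim oFile : Set oFile = oFSO.OpenTextFile(sPath, ForReading)
    Dim sAll : sAll = oFile.ReadAll
    oFile.Close
    Set oFile = oFSO.OpenTextFile(sPath, ForWriting)
    oFile.WriteLine BuildList(sAll)
    oFile.Close
End Sub
```

Calling ConvertFile with the path held in your filePath variable should then rewrite the file in place. Note that ReadAll raises an error on an empty file, so a real script may want to check the file size first.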
Upvotes: 1
Reputation: 92986
Try this
(.*?)(?:\s*,){3}Yes\s*,Yes\r?
You need to take care of the line breaks; with Regexr, \r was fine. I put the line break into the regex because I wanted to make it optional using the ? afterwards. Otherwise, the last row will not be replaced if it does not end with a line break.
and replace it with
'$1',
Here you will get an additional comma at the end. I am not sure at the moment how to handle this. $1 is the content of the first capturing group; in your case, it should hold the part before the first comma.
See it here on Regexr
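One possible way to deal with the trailing comma in VBScript is to strip it after the replacement and then add the parentheses. The sketch below is only an illustration: ToQuotedList is a hypothetical name, and the pattern is a variation of the one above, not the original. It captures a run of characters that are neither whitespace nor commas, and ends in \s* so that \r\n, a bare \n, or a missing final line break are all consumed.

```vbscript
' Hypothetical helper: turns the raw file text into ('111','222',...)
Function ToQuotedList(sText)
    Dim re : Set re = New RegExp
    re.Global = True
    ' Variation of the pattern above (an assumption, not the original):
    ' group 1 is the value before the first comma, the trailing \s*
    ' swallows the line break if there is one
    re.Pattern = "([^\s,]+)(?:\s*,){3}Yes\s*,Yes\s*"
    Dim s : s = re.Replace(sText, "'$1',")
    ' Drop the single trailing comma left by the last replacement
    If Right(s, 1) = "," Then s = Left(s, Len(s) - 1)
    ToQuotedList = "(" & s & ")"
End Function
```

The result of ToQuotedList(strText) could then be written back to the file in place of strNewText.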
Upvotes: 0