Reputation: 482
I am trying to use grep on a text file in OS X as a test. The file is known to contain the following text, including the whitespace characters:
(10) Business Day
My regex search pattern is as follows:
[\(][0-9]{1,3}[\)] business day
However, this doesn't work:
$ grep -Eoi '[\(][0-9]{1,3}[\)] business day' *.txt
If I remove "day" from the above, I get this:
$ grep -Eoi '[\(][0-9]{1,3}[\)] business' *.txt
(10) Business
Which is the expected output from egrep -oi or grep -Eoi for the above line.
Neither this:
$ grep -Eoi '[\(][0-9]{1,3}[\)]\sbusiness\sday' *.txt
Nor this:
$ grep -Eoi '[\(][0-9]{1,3}[\)] business\sday' *.txt
Nor this:
$ grep -Eoi '[\(][0-9]{1,3}[\)][[:space:]]business[[:space:]]day' *.txt
Nor this:
$ grep -Eoi '[\(][0-9]{1,3}[\)] business[[:space:]]day' *.txt
yield the desired result, which is:
(10) Business Day
Instead, they yield this:
$
(nothing)
I have wasted hours pounding my head on my desk over this. Grep is clearly not rocket surgery, so what am I missing here?
Upvotes: 1
Views: 1674
Reputation: 482
Solved it. I need to thank vielmetti and suku for pointing me in the right direction, though.
The problem had more than one part.
First, there was the encoding of the text file when it is saved from a Word document on the Mac. You need to save it in MS-DOS format, and DO NOT insert line breaks.
Once that was resolved, the command started finding the desired text. Then, once I had figured out the MacScript approach so I could call the grep command from VBA properly, everything fell into place.
So, to review: when saving an MS Word document on the Mac as a text file, make sure to use MS-DOS formatting withOUT inserted line breaks.
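For anyone wanting to check whether they are hitting the same thing, here is a minimal sketch of why an inserted line break can defeat the pattern. grep matches one line at a time, so if the export puts a newline between "Business" and "Day", no single line contains the whole phrase. The file names and the temporary directory are made up for illustration, and the newline in the "broken" file is an assumption about what Word inserted, not something I captured from the original file.

```shell
# Hypothetical sample files; names and contents are illustrative only.
tmpdir=$(mktemp -d)
printf '(10) Business Day\n' > "$tmpdir/ok.txt"        # phrase on one line
printf '(10) Business\nDay\n' > "$tmpdir/broken.txt"   # assumed: line break inserted mid-phrase

pattern='[(][0-9]{1,3}[)] business day'

# grep works line by line, so only the first file can match.
ok_match=$(grep -Eoi "$pattern" "$tmpdir/ok.txt")
broken_match=$(grep -Eoi "$pattern" "$tmpdir/broken.txt" || true)

echo "ok.txt:     '${ok_match}'"
echo "broken.txt: '${broken_match}'"

# Dump the raw bytes to see what actually sits between the two words.
od -c "$tmpdir/broken.txt"
```

Running od -c on the real exported file shows whether the character between the words is a plain space, a \n, a \r, or something else entirely.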
Here's the VBA command to save it:
Document.SaveAs FileName:=filePath & docName & ".txt", _
FileFormat:=wdFormatText, _
LockComments:=False, _
Password:="", _
AddToRecentFiles:=False, _
WritePassword:="", _
ReadOnlyRecommended:=False, _
EmbedTrueTypeFonts:=False, _
SaveNativePictureFormat:=False, _
SaveFormsData:=False, _
SaveAsAOCELetter:=False, _
Encoding:=437, _
InsertLineBreaks:=False, _
AllowSubstitutions:=False, _
LineEnding:=wdCROnly
The key settings are InsertLineBreaks:=False and, potentially, LineEnding:=wdCROnly.
Upvotes: 1