Steven C. Britton

Reputation: 482

Mac OS/X, Grep and Whitespace issues

I am trying to use grep on a text file in Mac OS X as a test. The file is known to contain the following text, including whitespace characters:

(10) Business Day

My regex search pattern is as follows:

[\(][0-9]{1,3}[\)] business day

However, this doesn't work:

$ grep -Eoi '[\(][0-9]{1,3}[\)] business day' *.txt

If I remove "day" from the above, I get this:

$ grep -Eoi '[\(][0-9]{1,3}[\)] business' *.txt
(10) Business

That is the expected output from egrep -oi or grep -Eoi for that shorter pattern.

Neither this:

$ grep -Eoi '[\(][0-9]{1,3}[\)]\sbusiness\sday' *.txt

Nor this:

$ grep -Eoi '[\(][0-9]{1,3}[\)] business\sday' *.txt

Nor this:

$ grep -Eoi '[\(][0-9]{1,3}[\)][[:space:]]business[[:space:]]day' *.txt

Nor this:

$ grep -Eoi '[\(][0-9]{1,3}[\)] business[[:space:]]day' *.txt

yield the desired result, which is:

(10) Business Day

Instead, they yield this:

$

(nothing)

I have wasted hours pounding my head on my desk over this. Grep is clearly not rocket surgery, so what am I missing here?

Upvotes: 1

Views: 1674

Answers (1)

Steven C. Britton

Reputation: 482

Solved it. I need to thank vielmetti and suku for pointing me in the right direction, though.

The problem had more than one part.

First, the problem was the encoding of the text file when it was saved from a Word document on the Mac. You need to save it in MS-DOS format, and DO NOT insert line breaks.
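One way to see why inserted line breaks matter: grep matches one line at a time, so if a break falls between "Business" and "Day", the pattern cannot match, even with [[:space:]]. Here is a minimal sketch, assuming the break lands between those two words (an assumption, not something confirmed from the original file). The file names and sample text are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample files; names and contents are illustrative only.
tmpdir=$(mktemp -d)

# Case 1: the whole phrase sits on one line.
printf 'Within (10) Business Day of notice\n' > "$tmpdir/one_line.txt"

# Case 2: a line break falls between "Business" and "Day".
printf 'Within (10) Business\nDay of notice\n' > "$tmpdir/wrapped.txt"

# Same idea as the pattern in the question, with the parentheses escaped.
pattern='\([0-9]{1,3}\) business[[:space:]]day'

# "|| true" keeps the script going when grep finds nothing (exit status 1).
one_line_result=$(grep -Eoi "$pattern" "$tmpdir/one_line.txt" || true)
wrapped_result=$(grep -Eoi "$pattern" "$tmpdir/wrapped.txt" || true)

echo "one line: [$one_line_result]"
echo "wrapped:  [$wrapped_result]"

# Dump the bytes to see which characters actually separate the words.
od -c "$tmpdir/wrapped.txt"

rm -rf "$tmpdir"
```

Running od -c (or hexdump -C) on the real file is the quickest way to check what actually sits between the two words.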

Once that was resolved, the command started finding the desired text, and once I had figured out the MacScript approach so I could put the grep command into VBA properly, everything fell into place.

So, to review: when saving an MS Word document on the Mac as a text file, make sure to use MS-DOS formatting WITHOUT inserting line breaks.

Here's the VBA command to save it:

        ' Save as plain text using code page 437 (MS-DOS), with no inserted line breaks.
        Document.SaveAs FileName:=filePath & docName & ".txt", _
                        FileFormat:=wdFormatText, _
                        LockComments:=False, _
                        Password:="", _
                        AddToRecentFiles:=False, _
                        WritePassword:="", _
                        ReadOnlyRecommended:=False, _
                        EmbedTrueTypeFonts:=False, _
                        SaveNativePictureFormat:=False, _
                        SaveFormsData:=False, _
                        SaveAsAOCELetter:=False, _
                        Encoding:=437, _
                        InsertLineBreaks:=False, _
                        AllowSubstitutions:=False, _
                        LineEnding:=wdCROnly

The key settings are InsertLineBreaks:=False and, potentially, LineEnding:=wdCROnly.
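To sanity-check a saved file before running grep on it, it can help to count the bytes that are not plain ASCII and the carriage returns. This is only a sketch: the sample file is hypothetical, and byte 0xA0 is used purely as a stand-in for some non-ASCII separator, not as a claim about what Word actually wrote.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"   # hypothetical file name

# Write "(10) Business<0xA0>Day" followed by a lone carriage return.
# Octal 240 is byte 0xA0; it is an assumed stand-in for a non-ASCII separator.
printf '(10) Business\240Day\r' > "$sample"

# Count bytes that are not tab, LF, CR, or printable ASCII (space through ~).
odd_bytes=$(LC_ALL=C tr -d '\11\12\15\40-\176' < "$sample" | wc -c | tr -d ' ')

# Count carriage returns (octal 15), which CR-only line endings produce.
cr_count=$(LC_ALL=C tr -cd '\15' < "$sample" | wc -c | tr -d ' ')

echo "non-ASCII bytes: $odd_bytes"
echo "carriage returns: $cr_count"

rm -rf "$tmpdir"
```

A non-zero count of non-ASCII bytes near the words you are searching for suggests that what looks like a space in an editor may not be one as far as grep is concerned.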

Upvotes: 1
