SRK

Reputation: 53

Brace needs to be escaped with \ inside single quotes

I expect the following to work:

ls -l | grep '^.{38}<some date>'

It should give me the files which have said date in modification time. But it does not work. The following works:

ls -l | grep '^.\{38\}<some date>'

Isn't '...' supposed to turn off special meaning for all the meta characters? Why should we have to escape braces?

Upvotes: 1

Views: 253

Answers (2)

Gordon Davisson

Reputation: 126088

There are many variants of regular expression syntax. By default, grep uses the "basic" ("BRE" or "obsolete") regular expression syntax, in which braces must be escaped to be treated as repetition bounds (what you're trying to do here); without the escapes, they're treated as just literal characters. In the "extended" ("ERE" or "modern"), Perl-compatible ("PCRE"), and ... well, pretty much all other variants, it's the other way around: escaped braces are treated as literal characters, and unescaped ones define repetition bounds.

grep '^.{38}<some date>'      # Matches any one character followed by the literal text "{38}"
grep '^.\{38\}<some date>'    # Matches 38 characters
grep -E '^.{38}<some date>'   # Matches 38 characters (-E invokes "extended" syntax)
egrep '^.{38}<some date>'     # Matches 38 characters (egrep uses "extended" syntax)
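If you want to check this on your own system, here is a minimal sketch. The sample line and the count_matches helper are made up for illustration, and it assumes a grep that supports the standard -E and -c options.

```shell
# Hypothetical sample line: 38 filler characters followed by a date-like string.
line="$(printf '%038d' 0)Jan 15"

# Illustrative helper: print how many input lines match, passing all
# arguments straight to grep. "|| true" hides grep's exit status 1 on no match.
count_matches() {
    printf '%s\n' "$line" | grep -c "$@" || true
}

bre_unescaped=$(count_matches '^.{38}Jan 15')     # BRE: braces are literal characters
bre_escaped=$(count_matches '^.\{38\}Jan 15')     # BRE: escaped braces are a repetition bound
ere_unescaped=$(count_matches -E '^.{38}Jan 15')  # ERE: unescaped braces are a repetition bound

printf 'BRE unescaped: %s\n' "$bre_unescaped"
printf 'BRE escaped:   %s\n' "$bre_escaped"
printf 'ERE unescaped: %s\n' "$ere_unescaped"
```

Only the first pattern should fail to match, because in the basic syntax it is looking for the literal text {38} in the line.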

BTW, parentheses are the same: literal unless escaped in the basic syntax, literal if escaped in the extended syntax. And there are a few other differences; see the re_format man page. There are also many other syntax variants (Perl-compatible, etc). It's important to know what variant the tool you're using accepts, and format your RE appropriately for it.
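The same kind of check works for parentheses. This is only a sketch with made-up input strings and variable names; each command counts matching lines with grep -c.

```shell
# BRE: escaped parentheses form a group, here repeated twice.
bre_group=$(printf 'abab\n' | grep -c '^\(ab\)\{2\}$' || true)

# BRE: unescaped parentheses are literal characters.
bre_literal=$(printf '(ab)\n' | grep -c '^(ab)$' || true)

# ERE: unescaped parentheses form a group, here repeated twice.
ere_group=$(printf 'abab\n' | grep -E -c '^(ab){2}$' || true)

# ERE: escaped parentheses are literal characters.
ere_literal=$(printf '(ab)\n' | grep -E -c '^\(ab\)$' || true)

printf '%s %s %s %s\n' "$bre_group" "$bre_literal" "$ere_group" "$ere_literal"
```

All four counts should come out as 1, since each pattern is written in the form its syntax variant expects.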

BTW2, as @Charles Duffy pointed out in a comment, parsing ls output isn't a good idea. In this case, the number of characters before the date depends on the widths of the other fields (user, group, size), which are not consistent, so skipping 38 characters might skip part of the date field or not skip enough. You'd be much better off using something like find with the -mtime or -mmin tests, or at least using stat instead of ls, since its format string lets you control the fields and, e.g., put the date at the beginning of the line. Note that stat will still have some of ls's other problems.
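As a rough sketch of the find approach: the scratch directory and file names below are invented for the demo, and it assumes mktemp -d is available (it is not required by POSIX, but is common).

```shell
# Scratch directory with one old file and one fresh file (illustrative names).
dir=$(mktemp -d)
touch -t 200001010000 "$dir/old.txt"   # set mtime to 2000-01-01 00:00
touch "$dir/new.txt"                   # mtime is "now"

# -mtime -1 selects files modified less than one day (24 hours) ago,
# without depending on how ls happens to lay out its columns.
recent=$(find "$dir" -type f -mtime -1)
printf '%s\n' "$recent"

rm -rf "$dir"
```

Only the path ending in new.txt should be listed. For a specific calendar date you would need a different test or a pair of tests, which depends on what your find supports.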

Upvotes: 1

chepner

Reputation: 532418

In extended regular expression syntax (e.g. grep -E), the regular expression .{38} matches an arbitrary string of exactly 38 characters. To match literal braces there, you need to escape them.

.\{38\}

In order to ensure that that exact 7-character sequence is seen by grep, you need to quote the string so that the shell doesn't perform quote removal and reduce it to .{38} before grep gets a chance to see it.


I misunderstood the question at first. It appears grep is using basic regular expressions, in which unescaped braces are literal characters and escaped ones introduce a repetition bound (an "interval expression" in POSIX terms). In extended regular expressions, it's the other way around. In either case, though, the single quotes protect all enclosed characters from special treatment by the shell; whether grep then treats them specially is a separate question.
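To see the shell's half of this separately from grep's, you can print the argument instead of passing it to grep. A small sketch (the variable names are just for illustration):

```shell
# Unquoted: the shell removes the backslashes before the command sees the argument.
unquoted=$(printf '%s\n' .\{38\})

# Single-quoted: the argument reaches the command exactly as typed.
quoted=$(printf '%s\n' '.\{38\}')

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

The unquoted form arrives as .{38} and the quoted form as .\{38\}. What grep then does with either string depends on which regex syntax it is using.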

Upvotes: 1
