carlspring

Reputation: 32617

Regex for uppercase matches with exclusions

I'm trying to come up with a regex for the following case: I need to find any matching paths using grep for the following paths:

Is this possible with grep?

Upvotes: 2

Views: 167

Answers (4)

Johann Chang

Reputation: 1391

If you only want the matching files, I'd do it like this:

find . -type f -regex '.*[A-Z].*' | while read -r line; do echo "$line" | sed 's/SNAPSHOT//g' | grep -q '.*[A-Z].*' && echo "$line"; done

Upvotes: 1

Tamas Rev

Reputation: 7166

Well, you need find to list all the paths. Then you can do it with two grep passes: the first keeps every path that contains a capital letter, and the second drops those whose only capitals are in SNAPSHOT:

find . | grep '[A-Z]' | grep -v '.*\/[^A-Z]*SNAPSHOT[^A-Z]*$'

I think only the last grep needs some explanation:

  • grep -v excludes the matching lines
  • .*\/ greedily matches everything up to a slash (the last one that still lets the rest of the pattern match). There'll always be a slash because of find .
  • [^A-Z]* matches any run of characters that are not capital letters. It is applied before and after the SNAPSHOT literal, up to the end of the string.
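
To see the two passes at work without touching the filesystem, here is a minimal sketch that feeds a few sample paths through the same pipeline. `two_pass_filter` is a hypothetical helper name, the `printf` stands in for `find .`, and only the first two paths come from this page (with `./` prepended, as find would print them); the third is made up. The slash is left unescaped, which is equivalent because `/` is not special in grep, and `LC_ALL=C` is an added assumption so that `[A-Z]` means ASCII capitals in every locale.

```shell
# Hypothetical helper wrapping the two grep passes described above.
# LC_ALL=C (an assumption, not in the original) pins [A-Z] to ASCII capitals.
two_pass_filter() {
  LC_ALL=C grep '[A-Z]' | LC_ALL=C grep -v '.*/[^A-Z]*SNAPSHOT[^A-Z]*$'
}

# printf stands in for `find .`; the last path is invented for illustration.
printf '%s\n' \
  './com/foo/Bar/1.2.3-SNAPSHOT/Bar-1.2.3-SNAPSHOT.jar' \
  './com/foo/bar/1.2.3-SNAPSHOT/bar-1.2.3-SNAPSHOT.jar' \
  './com/foo/bar/1.2.3/bar-1.2.3.jar' \
  | two_pass_filter
# prints only ./com/foo/Bar/1.2.3-SNAPSHOT/Bar-1.2.3-SNAPSHOT.jar
```

One thing to watch: .*/ can stop at any slash, so a hypothetical path such as ./com/Foo/bar-1.0-SNAPSHOT.jar is dropped by the second pass even though Foo contains a capital.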

Upvotes: 1

Ed Morton

Reputation: 203597

Just use awk:

$ cat file
com/foo/Bar/1.2.3-SNAPSHOT/Bar-1.2.3-SNAPSHOT.jar
com/foo/bar/1.2.3-SNAPSHOT/bar-1.2.3-SNAPSHOT.jar

With GNU awk for gensub():

$ awk 'gensub(/SNAPSHOT/,"","g")~/[[:upper:]]/' file
com/foo/Bar/1.2.3-SNAPSHOT/Bar-1.2.3-SNAPSHOT.jar

With other awks:

$ awk '{r=$0; gsub(/SNAPSHOT/,"",r)} r~/[[:upper:]]/' file
com/foo/Bar/1.2.3-SNAPSHOT/Bar-1.2.3-SNAPSHOT.jar

Upvotes: 1

Andreas Louv

Reputation: 47099

Something like this might do:

grep -vE '^([^[:upper:]]*(SNAPSHOT)?)*$'

Breakdown:

-v inverts the match (shows all non-matching lines). -E enables Extended Regular Expressions.

^                             # Start of line
 (                        )*  # Capturing group repeated zero or more times
  [^[:upper:]]*               # Match all but uppercase zero or more times
               (SNAPSHOT)?    # Followed by literal SNAPSHOT zero or one time
                            $ # End of line
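
As a quick check of the breakdown, here is a small sketch that runs the pattern over sample input instead of real files. `exclude_snapshot_only` is a hypothetical helper name; the first two paths are the ones from the awk answer on this page, and the third is made up for illustration.

```shell
# Hypothetical helper: drop lines whose only uppercase letters
# belong to the literal SNAPSHOT (or that have no uppercase at all).
exclude_snapshot_only() {
  grep -vE '^([^[:upper:]]*(SNAPSHOT)?)*$'
}

printf '%s\n' \
  'com/foo/Bar/1.2.3-SNAPSHOT/Bar-1.2.3-SNAPSHOT.jar' \
  'com/foo/bar/1.2.3-SNAPSHOT/bar-1.2.3-SNAPSHOT.jar' \
  'com/foo/bar/1.2.3/bar-1.2.3.jar' \
  | exclude_snapshot_only
# prints only com/foo/Bar/1.2.3-SNAPSHOT/Bar-1.2.3-SNAPSHOT.jar
```

The second and third lines consist entirely of non-uppercase runs and SNAPSHOT literals, so the anchored pattern matches them and -v hides them. The B in Bar fits neither alternative, so the first line fails to match and is shown.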

Upvotes: 5
