Amin Z

Reputation: 37

how to grep everything between single quotes?

I am having trouble figuring out how to grep the characters between two single quotes.

I have this line in a file:

version: '8.x-1.0-alpha1'

and I would like the output to look like this (the version number can vary):

8.x-1.0-alpha1

I wrote the following but it does not work:

cat myfile.txt | grep -e 'version' | sed 's/.*\?'\(.*?\)'.*//g'

Thank you for your help.

Addition: I used the sed command sed -n "s#version:\s*'\(.*\)'#\1#p". I would also like to remove the 8.x- prefix, so I edited it to sed -n "s#version:\s*'8.x-\(.*\)'#\1#p".

This command works on Linux but not on Mac. How can I change it so that it also works on Mac?

sed -n "s#version:\s*'8.x-\(.*\)'#\1#p"

Upvotes: 3

Views: 6727

Answers (5)

John

Reputation: 2425

Try something like this: sed -n "s#version:\s*'\(.*\)'#\1#p" myfile.txt. This avoids the redundant cat and grep by finding the "version" line and extracting the contents between the single quotes.

Explanation:

The -n flag tells sed not to print lines automatically. The p flag at the end of the substitution then prints the line explicitly, and only when the version line has been matched.

Search for pattern: version:\s*'\(.*\)'

  • version:\s* Match "version:" followed by any amount of whitespace
  • '\(.*\)' Match a single ', then capture everything up to the last ' on the line (.* is greedy)

Replace with: \1. This is the first (and only) capture group above, containing the contents between the single quotes.
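Since the question asks about Mac: \s is a GNU extension, so a sed without GNU extensions (such as the BSD sed shipped with macOS) may not understand it. A minimal sketch of the same substitution using the POSIX class [[:space:]] instead is shown here; the sample input lines are made up for illustration.

```shell
# Hypothetical sample input, fed through a pipe instead of a file.
sample() {
  printf '%s\n' "name: 'demo'" "version: '8.x-1.0-alpha1'"
}

# [[:space:]] is the POSIX spelling of GNU's \s, so this form should be
# accepted by both GNU sed and BSD/macOS sed.
version=$(sample | sed -n "s#version:[[:space:]]*'\(.*\)'#\1#p")
echo "$version"

# Same idea, additionally dropping the leading "8.x-" as in the question.
short=$(sample | sed -n "s#version:[[:space:]]*'8\.x-\(.*\)'#\1#p")
echo "$short"
```

The first command prints 8.x-1.0-alpha1 and the second prints 1.0-alpha1. The dot in 8\.x is escaped so that it matches only a literal dot.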

Upvotes: 2

Walter A

Reputation: 20022

When you only want to look at the quotes, you can use cut.

grep -e 'version' myfile.txt | cut -d "'" -f2
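To see why field 2 is the wanted text, here is a small sketch that prints the first two fields cut produces when the line is split on the single quote. The variable names are only for illustration.

```shell
line="version: '8.x-1.0-alpha1'"

# Splitting on ' gives: field 1 = "version: " (text before the first quote),
# field 2 = the text between the first and second quote.
field1=$(printf '%s\n' "$line" | cut -d "'" -f1)
field2=$(printf '%s\n' "$line" | cut -d "'" -f2)

printf '[%s] [%s]\n' "$field1" "$field2"
```

This prints [version: ] [8.x-1.0-alpha1].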

Upvotes: 2

glenn jackman

Reputation: 247042

I'd use GNU grep with pcre regexes:

grep -oP "version: '\\K.*(?=')" file

where we are looking for "version: '" and then the \K directive will forget what it just saw, leaving .*(?=') to match up to the last single quote.

Upvotes: 5

Hkoof

Reputation: 796

grep can almost do this alone:

grep -o "'.*'" file.txt

But this may also print lines you don't want: it will print every line that has 2 single quotes (') in it. And the output still has the single quotes (') around it:

'8.x-1.0-alpha1'

But sed alone can do it properly:

sed -rn "s/^version: +'([^']+)'.*/\1/p" file.txt
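The [^']+ in this pattern matters when a line holds more than one quoted string. The sketch below compares it with a greedy .* on a made-up line; it uses -E rather than -r on the assumption that both GNU sed and BSD/macOS sed accept -E for extended regular expressions.

```shell
# Hypothetical line with a second quoted string after the version.
line="version: '8.x-1.0-alpha1' # see 'CHANGELOG'"

# Greedy .* runs on to the last quote on the line.
greedy=$(printf '%s\n' "$line" | sed -En "s/^version: +'(.*)'.*/\1/p")

# [^']+ cannot cross a quote, so it stops at the first closing quote.
strict=$(printf '%s\n' "$line" | sed -En "s/^version: +'([^']+)'.*/\1/p")

printf '%s\n%s\n' "$greedy" "$strict"
```

The greedy form prints 8.x-1.0-alpha1' # see 'CHANGELOG, while the [^']+ form prints only 8.x-1.0-alpha1.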

Upvotes: 1

kvantour

Reputation: 26521

If you just want that information from the file, and nothing else, you can quickly do:

awk -F"'" '/version/{print $2}' file

Example:

$ echo "version: '8.x-1.0-alpha1'" | awk -F"'" '/version/{print $2}'
8.x-1.0-alpha1

How does this work?

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action a series of commands.

  1. -F"'": This sets the field separator FS to a single quote '. Every line is then split into fields $1, $2, ..., $NF, with a ' between each pair of adjacent fields. $1 refers to the first field, $2 to the second, and so on up to $NF, where NF is the total number of fields on the line.

  2. /version/{print $2}: This is the condition-action pair.

    • condition: /version/: If a substring of the current record/line matches the regular expression /version/, then do the action. Here this simply means: if the current line contains the substring version.

    • action: {print $2}: If the condition is satisfied, print the second field. In this case, the second field is what the OP asks for.

There are now several things that can be done.

  1. Improve the condition to be /^version:/ && NF==3, which reads: if the current line starts with the substring version: and the current line has 3 fields, then do the action.

  2. If you only want the first occurrence, you can tell awk to exit immediately after the first match by updating the action to {print $2; exit}
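Both refinements can be combined. The sketch below runs the loose and the stricter program on a made-up three-line input to show the difference; the input text and variable names are assumptions for illustration.

```shell
# Hypothetical input: a near-miss comment line and two version lines.
input="# version of 'core' is pinned below
version: '8.x-1.0-alpha1'
version: '8.x-2.0'"

# Any line containing "version": prints field 2 of all three lines.
loose=$(printf '%s\n' "$input" | awk -F"'" '/version/{print $2}')

# Anchored condition, exactly 3 fields, stop after the first hit.
strict=$(printf '%s\n' "$input" | awk -F"'" '/^version:/ && NF==3 {print $2; exit}')

printf 'loose:\n%s\nstrict: %s\n' "$loose" "$strict"
```

The loose program prints core, 8.x-1.0-alpha1 and 8.x-2.0 on separate lines, while the stricter one prints only 8.x-1.0-alpha1. A line such as version: '8.x-1.0-alpha1' has 3 fields because the text after the closing quote counts as an (empty) third field.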

Upvotes: 8
