Reputation: 2641
My output:
docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
jenkins/jenkins     lts                 806f56c84444        8 days ago          703MB
mongo               latest              0da05d84b1fe        2 weeks ago         394MB
I would like to just cut the image ID alone from the output.
I tried using cut:
docker images | cut -d " " -f1
REPOSITORY
jenkins/jenkins
The -f1 just gives me the repository names; if I use -f3, the result is empty. Since the delimiter is not a single space, I don't see how to get the desired output.
Can we cut based on field names?
I read the documentation and did not see anything relevant. I also saw that there is a way to achieve this using sed/AWK, which I'm still figuring out.
In the meantime, is there an easier way to achieve this using the cut command?
I'm new to Unix/Linux. How can I determine which of sed, AWK, or cut to prefer?
Upvotes: 3
Views: 3241
Reputation: 37464
Can we cut based on field names? No.
How can I determine which of Sed/AWK/Cut to prefer? YMMV. For this particular input, where fields are separated by two or more spaces, you could use awk: set the field separator to "  +" (two or more spaces), look for the desired field name (IMAGE ID below) in the header, and print only that particular field:
$ awk -F"  +" '                # set field separator: two or more spaces
{
    if(f=="")                  # while we have not determined the desired field
        for(i=1;i<=NF;i++)     # ... keep looking
            if($i=="IMAGE ID")
                f=i
    if(f!="")                  # once found
        print $f               # start printing it
}' file
Output:
IMAGE ID
806f56c84444
0da05d84b1fe
As one-liner:
$ awk -F"  +" '{if(f=="")for(i=1;i<=NF;i++)if($i=="IMAGE ID")f=i;if(f!="")print $f}' file
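If you need this for more than one column, the same idea can be wrapped in a small shell function. This is only a sketch: the function name pick_column and the hard-coded sample variable are made up for illustration, and it assumes columns are always separated by at least two spaces while values never contain two spaces in a row.

```shell
# Hard-coded copy of the `docker images` output from the question,
# so the sketch runs without Docker installed.
sample='REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
jenkins/jenkins     lts                 806f56c84444        8 days ago          703MB
mongo               latest              0da05d84b1fe        2 weeks ago         394MB'

# pick_column NAME: print the values of the column whose header is NAME.
# (Hypothetical helper; fields are split on runs of two or more spaces.)
pick_column() {
    awk -F'  +' -v col="$1" '
        NR == 1 {                      # header row: find the column index
            for (i = 1; i <= NF; i++)
                if ($i == col) f = i
            next                       # do not print the header itself
        }
        f { print $f }                 # data rows: print that column
    '
}

ids=$(printf '%s\n' "$sample" | pick_column "IMAGE ID")
created=$(printf '%s\n' "$sample" | pick_column "CREATED")
printf '%s\n' "$ids"
```

With real data you would run something like docker images | pick_column "IMAGE ID". Unlike the version above, this one skips the header line.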
Upvotes: 0
Reputation: 189749
In the general case, avoid parsing output meant for human consumption. Many modern utilities offer an option to produce output in some standard format like JSON or XML, or even CSV (though that is less strictly specified, and exists in multiple "dialects").
docker in particular has a generalized --format option which allows you to specify your own output format:
docker images --format "{{.ID}}"
If you cannot avoid writing your own parser (are you really sure!? Look again!), cut is suitable for output with a specific single-character delimiter, or otherwise fairly regular output. For everything else, I would go with Awk. Out of the box, it parses columns from sequences of whitespace, so it does precisely what you specifically ask for:
docker images | awk 'NR>1 { print $3 }'
(NR>1 skips the first line, which contains the column headers.)
In the case of fixed-width columns, it allows you to pull out a string by index:
docker images | awk 'NR>1 { print substr($0, 41, 12) }'
... though you could do that with cut, too:
docker images | cut -c41-52
... but notice that Docker might adjust column widths depending on your screen size!
Awk lets you write regular expression extractions, too:
docker images | awk 'NR>1 { sub(/^([^[:space:]]*[[:space:]]+){2}/, ""); sub(/[[:space:]].*/, ""); print }'
This is where it overlaps with sed:
docker images | sed -n '2,$s/^[^ ]\+[ ]\+[^ ]\+[ ]\+\([^ ]\+\)[ ].*/\1/p'
though sed is significantly less human-readable, especially for nontrivial scripts. (This is still pretty trivial.)
If you haven't used regex before, the above will seem cryptic, but it really isn't very hard to pick apart. We are looking for sequences of non-spaces (a field in a column) followed by sequences of spaces (a column separator) - two before the ID field and whatever comes after it, starting from the first space after the ID column.
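To make that breakdown concrete, here is a sketch of the same extraction written with extended regular expressions (sed -E, accepted by both GNU and BSD sed), which drops most of the backslashes. The sample variable is a hard-coded copy of the question's output, used only so the example runs without Docker. Reading the pattern left to right: [^ ]+ is the repository, " +" the padding, [^ ]+ the tag, " +" more padding, ([^ ]+) the captured ID, and " .*" everything from the next space onward.

```shell
# Hard-coded copy of the `docker images` output from the question.
sample='REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
jenkins/jenkins     lts                 806f56c84444        8 days ago          703MB
mongo               latest              0da05d84b1fe        2 weeks ago         394MB'

# -n suppresses default printing; "2,$" skips the header line;
# the trailing "p" prints only lines where the substitution succeeded.
ids=$(printf '%s\n' "$sample" |
    sed -n -E '2,$ s/^[^ ]+ +[^ ]+ +([^ ]+) .*/\1/p')
printf '%s\n' "$ids"
```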
If you want to learn shell scripting, you should probably also learn at least the basics of Awk (and a passing familiarity with sed). If you just want to get the job done, and perhaps aren't specifically interested in learning U*x tools (though you probably should be anyway!), perhaps instead learn a modern scripting language like Python or Ruby.
... Here's a Python docker library:
import docker

client = docker.from_env()
for image in client.images.list():
    print(image.id)  # full ID, prefixed with "sha256:"
Upvotes: 0
Reputation: 433
With Procedural Text Edit it's:
forEach line {
    if (contains ci "REPOSITORY") { remove }
    keepRange word 2 1
}
removeEmptyLines // <- optional
Upvotes: 0
Reputation: 446
Try this:
docker images | tr -s ' ' | cut -f3 -d' '
The command tr -s ' ' converts each run of multiple spaces into a single space, and after that you can grab your field with cut. This works fine as long as the values in your fields don't contain spaces.
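Here is a small sketch of both the happy path and the caveat, run against a hard-coded copy of the question's output (the sample variable is only there so it works without Docker). tail -n +2 drops the header line. Field 3 comes out right, but the CREATED column ("8 days ago") contains spaces itself, so field 4 yields only its first word.

```shell
# Hard-coded copy of the `docker images` output from the question.
sample='REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
jenkins/jenkins     lts                 806f56c84444        8 days ago          703MB
mongo               latest              0da05d84b1fe        2 weeks ago         394MB'

# Squeeze every run of spaces down to one space.
squeezed=$(printf '%s\n' "$sample" | tr -s ' ')

# Field 3 is the image ID: no embedded spaces, so this is safe.
ids=$(printf '%s\n' "$squeezed" | tail -n +2 | cut -d' ' -f3)

# Field 4 is only the first word of CREATED, because the value has spaces.
created=$(printf '%s\n' "$squeezed" | tail -n +2 | cut -d' ' -f4)

printf 'ids:\n%s\ncreated (truncated):\n%s\n' "$ids" "$created"
```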
Upvotes: 1
Reputation: 4004
You have to "squeeze" the space padding in the default output down to a single space.
1  2
== 1-space-space-2
== Field 1 before the 1st space, Field 2 between the 1st and 2nd space, Field 3 after the 2nd space.
cut -d' ' -f1 ==> '1'
cut -d' ' -f2 ==> '' (empty field between the 1st and 2nd delimiter)
cut -d' ' -f3 ==> '2'
So, in your case use sed to replace each run of consecutive spaces with a single space:
docker images | sed 's/  */ /g' | cut -d " " -f1,3
If the output has fixed column widths, then you can use this variant of cut:
docker images | cut -c1-20,41-60
This will cut out character columns 1 to 20 (the repository) and 41 to 60, where we find the Image ID.
If the output ever uses TAB for padding, you should use expand -t n to make the output consistently space-padded, then apply the appropriate cut -cx,y, e.g. (numbers may need adjusting):
docker images | expand -t 4 | cut -c1-20,41-60
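To see what expand does before trusting the character positions, here is a tiny sketch on a made-up tab-separated line (the input "ab<TAB>cd<TAB>ef" is purely hypothetical, not real Docker output). With tab stops every 4 columns, each TAB is replaced by enough spaces to reach the next stop, after which cut -c can address fixed positions.

```shell
# A hypothetical tab-padded line: three two-letter fields separated by TABs.
line=$(printf 'ab\tcd\tef')

# Replace each TAB with spaces up to the next tab stop (every 4 columns).
expanded=$(printf '%s\n' "$line" | expand -t 4)

# The second field now always sits at character positions 5-6.
second=$(printf '%s\n' "$expanded" | cut -c5-6)

printf '[%s]\n[%s]\n' "$expanded" "$second"
```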
Upvotes: 2
Reputation: 50795
Your input seems to have a fixed width of 20 chars for each field, so you can make use of gawk's FIELDWIDTHS feature.
$ awk -v FIELDWIDTHS="20 20 20 20 20" '{ print $3 }' file
IMAGE ID
806f56c84444
0da05d84b1fe
$
$ awk -v FIELDWIDTHS="20 20 20 20 20" '{ printf "%20s%20s\n", $1, $3 }' file
REPOSITORY          IMAGE ID
jenkins/jenkins     806f56c84444
mongo               0da05d84b1fe
From man gawk:
If the FIELDWIDTHS variable is set to a space-separated list of numbers, each field is expected to have fixed width, and gawk splits up the record using the specified widths. Each field width may optionally be preceded by a colon-separated value specifying the number of characters to skip before the field starts. The value of FS is ignored. Assigning a new value to FS or FPAT overrides the use of FIELDWIDTHS.
Upvotes: 2