Reputation: 439
I have a text file in the format below, and I would like to capture only the numbers after .txt. I ran awk '{print $2}' filename and it gave me the wrong result. For some of the lines, it printed : instead of the number. For example, for the second line I get : instead of 914.
Is there any other way to extract the numbers after .txt? I am not referring to the numbers in the rgb part.
image/Subject01.txt:1310 : image/Subject01/Scene4/Color/rgb7
image/Subject01.txt: 914 : image/Subject01/Scene4/Color/rgb3
...
Upvotes: 1
Views: 2871
Reputation: 133428
Could you please try the following.
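awk '
# Find ".txt:" followed by everything up to the next colon.
match($0,/\.txt:[^:]*/){
  val=substr($0,RSTART,RLENGTH)
  # Strip every non-digit so that only the number remains (no stray spaces).
  gsub(/[^0-9]+/,"",val)
  print val
}
' Input_file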
2nd solution: Using field separator.
awk 'BEGIN{FS="[.:]"} $2=="txt"{print $3+0}' Input_file
Upvotes: 1
Reputation: 26
You can also use the cut command
cut -d ':' -f 2 filename
This sets the [d]elimiter to : and then takes the [2]nd [f]ield.
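One thing to watch for: cut returns the field exactly as it appears between the delimiters, so the spaces around the number come along with it. A minimal sketch of one way to drop them, assuming the two sample lines from the question are saved in a file called sample.txt (a hypothetical name used only for this example):

```shell
# Recreate the two sample lines from the question.
# sample.txt is a hypothetical file name used only for this sketch.
printf '%s\n' \
  'image/Subject01.txt:1310 : image/Subject01/Scene4/Color/rgb7' \
  'image/Subject01.txt: 914 : image/Subject01/Scene4/Color/rgb3' > sample.txt

# Take the 2nd colon-delimited field, then delete the spaces around it.
numbers=$(cut -d ':' -f 2 sample.txt | tr -d ' ')
printf '%s\n' "$numbers"
```

This relies on the number itself never containing a space, which holds for the sample shown.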
Upvotes: 1
Reputation: 52112
You could use GNU grep for this:
$ grep -Po '\.txt: *\K[[:digit:]]+' infile
1310
914
-P enables Perl-compatible regular expressions (required for the \K), and -o retains only the match.
The regular expression looks for the string .txt:, followed by any number of blanks (including zero); this part of the match is discarded (\K), and then we match as many digits as we can find.
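Not every grep supports -P, so here is a rough equivalent with sed that uses only basic regular expressions. This is a sketch rather than a drop-in replacement: it assumes each line contains .txt: at most once, and sample.txt is a hypothetical file holding the two sample lines from the question.

```shell
# Recreate the two sample lines from the question.
# sample.txt is a hypothetical file name used only for this sketch.
printf '%s\n' \
  'image/Subject01.txt:1310 : image/Subject01/Scene4/Color/rgb7' \
  'image/Subject01.txt: 914 : image/Subject01/Scene4/Color/rgb3' > sample.txt

# -n suppresses the default output; the trailing "p" prints only the lines
# where the substitution succeeded. The captured group \1 is the run of
# digits that follows ".txt:" and any spaces.
numbers=$(sed -n 's/.*\.txt: *\([0-9][0-9]*\).*/\1/p' sample.txt)
printf '%s\n' "$numbers"
```

Lines without a number after .txt: are skipped, which matches what grep -o does.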
Upvotes: 0
Reputation: 50750
You forgot to specify a custom field separator. For example:
awk -F ' *: *' '{print $2}' file
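To see why the separator matters, here is a small side-by-side sketch run on the first sample line from the question. The variable names are made up for illustration. By default awk splits on runs of whitespace, so a colon surrounded by spaces becomes a field of its own.

```shell
# First sample line from the question.
line='image/Subject01.txt:1310 : image/Subject01/Scene4/Color/rgb7'

# Default separator (whitespace): $1 is "image/Subject01.txt:1310"
# and $2 is the lone colon.
default_field=$(printf '%s\n' "$line" | awk '{print $2}')

# Custom separator (a colon with optional spaces on either side):
# $1 is the file name and $2 is the number.
custom_field=$(printf '%s\n' "$line" | awk -F ' *: *' '{print $2}')

printf 'default: %s\ncustom: %s\n' "$default_field" "$custom_field"
```

Because the separator swallows the spaces on both sides of each colon, the number comes out without any padding.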
Upvotes: 4