Reputation: 71
I want to create a bash script to parse the data returned by this command:
openvpn --show-pkcs11-ids /usr/lib/libeTPkcs11.so
The typical output is:
The following objects are available for use.
Each object shown below may be used as parameter to
--pkcs11-id option please remember to use single quote mark.
Certificate
DN: XXX
Serial: XXXX
Serialized id: XXXX
Certificate
DN: XXXX
Serial: XXXX
Serialized id: XXXX
Certificate
DN: XXXXX
Serial: XXXX
Serialized id: XXXX
I want to get an array in bash containing 3 elements: the 3 "Certificate" blocks. I have tried a lot of splitting methods, but all of them only echo the result instead of building an actual array.
Any ideas?
Thanks!
Upvotes: 0
Views: 143
Reputation: 84579
This is one where it would be much simpler (and much, much faster) to use awk. awk provides arrays and is much more capable of processing input records than read. With awk you simply write rules to be applied to each line of input. In your case you just need to recognize whether the line begins with "DN:", "Serial:", or "Serialized". You can then store the associated value in a separate array, say dn, serial, and serid. To accomplish this in awk you need nothing more than:
awk '
$1 == "Certificate" {n++}; # increment n
NF == 2 { # fill dn & serial array
$1 == "DN:" && dn[n]=$2
$1 == "Serial:" && serial[n]=$2
}
NF == 3 { # fill serid array
$1 == "Serialized" && serid[n]=$3
}
END { # output results
print "\nDN:\t\tSerial:\t\tSerialized id:"
for (i in dn) print dn[i], "\t\t", serial[i], "\t\t", serid[i]
}' file
Above, if the first field ($1) is "Certificate" you just increment a counter. If there are 2 fields in the line (NF == 2), you check whether the line begins with "DN:" or "Serial:" and add the 2nd field to the proper array. If the line has 3 fields ("Serialized", "id:" and your value), you store the value in the serid array.
With all values stored, you can iterate over the arrays in the END rule, providing any output you need. Above it simply outputs the content in tabular form. You can just copy/middle-mouse-paste it at the command line to test.
Example Use/Output
$ awk '
> $1 == "Certificate" {n++}; # increment n
> NF == 2 { # fill dn & serial array
> $1 == "DN:" && dn[n]=$2
> $1 == "Serial:" && serial[n]=$2
> }
> NF == 3 { # fill serid array
> $1 == "Serialized" && serid[n]=$3
> }
> END { # output results
> print "\nDN:\t\tSerial:\t\tSerialized id:"
> for (i in dn) print dn[i], "\t\t", serial[i], "\t\t", serid[i]
> }' file
DN: Serial: Serialized id:
XXX XXXX XXXX
XXXX XXXX XXXX
XXXXX XXXX XXXX
For large files, awk will be orders of magnitude faster than looping in a shell script. Let me know if this satisfies your needs or if you need additional help.
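If you then want the values in bash arrays, here is a minimal sketch (not part of the answer above): it assumes the openvpn output has been saved to a file named file, and it has the END rule print one plain line per certificate instead of the table, so the lines can be read into three parallel bash arrays. Like the awk script itself, it assumes the values contain no embedded spaces.
while read -r d s i; do                         # one certificate per line: DN SERIAL SERIALIZED-ID
    dn+=("$d"); serial+=("$s"); serid+=("$i")
done < <(awk '
    $1 == "Certificate"           { n++ }
    NF == 2 && $1 == "DN:"        { dn[n] = $2 }
    NF == 2 && $1 == "Serial:"    { serial[n] = $2 }
    NF == 3 && $1 == "Serialized" { serid[n] = $3 }
    END { for (i = 1; i <= n; i++) print dn[i], serial[i], serid[i] }
' file)
echo "${#dn[@]} certificates found"             # expected: 3
echo "${dn[0]} ${serial[0]} ${serid[0]}"        # fields of the first certificate
The process substitution keeps the while loop in the current shell, so the arrays remain populated afterwards.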
Edit Per-Comment
If you are dealing with a file that has a mix of tabs and spaces being used as separators, that can present a problem for awk parsing with a default field separator (space). To treat a sequence of mixed spaces/tabs as a single separator, with GNU awk you can provide a regular expression for the separator. For instance, a sequence of one or more spaces or tabs can be specified as -F'[ \t]+'. The example below makes use of that separator. (Note: the field numbers change as a result.)
awk -F'[ \t]+' '
$1 == "Certificate" {n++}; # increment n
NF == 3 { # fill dn & serial array
$2 == "DN:" && dn[n]=$3
$2 == "Serial:" && serial[n]=$3
}
NF == 4 { # fill serid array
$2 == "Serialized" && serid[n]=$4
}
END { # output results
print "\nDN:\t\tSerial:\t\tSerialized id:"
for (i in dn) print dn[i], "\t\t", serial[i], "\t\t", serid[i]
}' f
Example Use/Output
With your same data you would then have:
$ awk -F'[ \t]+' '
> $1 == "Certificate" {n++}; # increment n
> NF == 3 { # fill dn & serial array
> $2 == "DN:" && dn[n]=$3
> $2 == "Serial:" && serial[n]=$3
> }
> NF == 4 { # fill serid array
> $2 == "Serialized" && serid[n]=$4
> }
> END { # output results
> print "\nDN:\t\tSerial:\t\tSerialized id:"
> for (i in dn) print dn[i], "\t\t", serial[i], "\t\t", serid[i]
> }' f
DN: Serial: Serialized id:
XXX XXXX XXXX
XXXX XXXX XXXX
XXXXX XXXX XXXX
Not knowing what the space/tab makeup of your posted text actually is, this should handle either case.
Further Update: Input Contents Taken From the Question
The following is the input file f (or file) used with the examples above. It was taken from your question, but there is no guarantee the space/tab translation is the same given the copy/paste into the question. The last example above should handle it regardless. The only other caveat is if the file you are feeding to awk has DOS line endings -- in that case it won't work. You can check by running the utility file yourfilename, which will report whether DOS CRLF line endings are present. You can then use dos2unix yourfilename to correct the problem and convert the file to Unix/POSIX line endings.
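For example, with the placeholder filename f:
$ file f          # reports "with CRLF line terminators" if DOS endings are present
$ dos2unix f      # rewrites the file in place with Unix (LF-only) line endings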
Example Input File
$ cat f
The following objects are available for use.
Each object shown below may be used as parameter to
--pkcs11-id option please remember to use single quote mark.
Certificate
DN: XXX
Serial: XXXX
Serialized id: XXXX
Certificate
DN: XXXX
Serial: XXXX
Serialized id: XXXX
Certificate
DN: XXXXX
Serial: XXXX
Serialized id: XXXX
Hexdump of Contents
$ hexdump -Cv f
00000000 54 68 65 20 66 6f 6c 6c 6f 77 69 6e 67 20 6f 62 |The following ob|
00000010 6a 65 63 74 73 20 61 72 65 20 61 76 61 69 6c 61 |jects are availa|
00000020 62 6c 65 20 66 6f 72 20 75 73 65 2e 0a 45 61 63 |ble for use..Eac|
00000030 68 20 6f 62 6a 65 63 74 20 73 68 6f 77 6e 20 62 |h object shown b|
00000040 65 6c 6f 77 20 6d 61 79 20 62 65 20 75 73 65 64 |elow may be used|
00000050 20 61 73 20 70 61 72 61 6d 65 74 65 72 20 74 6f | as parameter to|
00000060 0a 2d 2d 70 6b 63 73 31 31 2d 69 64 20 6f 70 74 |.--pkcs11-id opt|
00000070 69 6f 6e 20 70 6c 65 61 73 65 20 72 65 6d 65 6d |ion please remem|
00000080 62 65 72 20 74 6f 20 75 73 65 20 73 69 6e 67 6c |ber to use singl|
00000090 65 20 71 75 6f 74 65 20 6d 61 72 6b 2e 0a 0a 43 |e quote mark...C|
000000a0 65 72 74 69 66 69 63 61 74 65 0a 20 20 20 20 20 |ertificate. |
000000b0 20 20 44 4e 3a 20 20 20 20 20 20 20 20 20 20 20 | DN: |
000000c0 20 20 58 58 58 0a 20 20 20 20 20 20 20 53 65 72 | XXX. Ser|
000000d0 69 61 6c 3a 20 20 20 20 20 20 20 20 20 58 58 58 |ial: XXX|
000000e0 58 0a 20 20 20 20 20 20 20 53 65 72 69 61 6c 69 |X. Seriali|
000000f0 7a 65 64 20 69 64 3a 20 20 58 58 58 58 0a 0a 43 |zed id: XXXX..C|
00000100 65 72 74 69 66 69 63 61 74 65 0a 20 20 20 20 20 |ertificate. |
00000110 20 20 44 4e 3a 20 20 20 20 20 20 20 20 20 20 20 | DN: |
00000120 20 20 58 58 58 58 0a 20 20 20 20 20 20 20 53 65 | XXXX. Se|
00000130 72 69 61 6c 3a 20 20 20 20 20 20 20 20 20 58 58 |rial: XX|
00000140 58 58 0a 20 20 20 20 20 20 20 53 65 72 69 61 6c |XX. Serial|
00000150 69 7a 65 64 20 69 64 3a 20 20 58 58 58 58 0a 0a |ized id: XXXX..|
00000160 43 65 72 74 69 66 69 63 61 74 65 0a 20 20 20 20 |Certificate. |
00000170 20 20 20 44 4e 3a 20 20 20 20 20 20 20 20 20 20 | DN: |
00000180 20 20 20 58 58 58 58 58 0a 20 20 20 20 20 20 20 | XXXXX. |
00000190 53 65 72 69 61 6c 3a 20 20 20 20 20 20 20 20 20 |Serial: |
000001a0 58 58 58 58 0a 20 20 20 20 20 20 20 53 65 72 69 |XXXX. Seri|
000001b0 61 6c 69 7a 65 64 20 69 64 3a 20 20 58 58 58 58 |alized id: XXXX|
000001c0 0a |.|
000001c1
Let me know the results of your file examination.
Upvotes: 2
Reputation: 2611
You can use AWK to do that. It is a tool specifically created for transforming table-like output.
openvpn --show-pkcs11-ids /usr/lib/libeTPkcs11.so | grep 'Certificate\|DN:\|Serial:\|Serialized id:' | awk -v RS="Certificate" 'NR > 1 {print $2,$4,$7}'
Explanation:
grep 'Certificate\|DN:\|Serial:\|Serialized id:' - selects only the interesting lines of the output
awk -v RS="Certificate" 'NR > 1 {print $2,$4,$7}' - see the comment below
Comment: awk lets you change the record separator with the -v RS= parameter. By default it is a newline, so each line of the file is a record, but it can be changed to any string, e.g. "Certificate" (POSIX awk only uses the first character of RS; GNU awk treats a longer RS as a regular expression). The NR > 1 guard skips the empty record that precedes the first "Certificate".
The output is not an array, but every certificate is described on a separate line, which you can further pipe to another tool.
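If you need the result as an actual bash array rather than lines, one possibility (a minimal sketch, assuming bash's mapfile builtin is available) is to capture the pipeline's output directly:
mapfile -t certs < <(openvpn --show-pkcs11-ids /usr/lib/libeTPkcs11.so \
    | grep 'Certificate\|DN:\|Serial:\|Serialized id:' \
    | awk -v RS="Certificate" 'NR > 1 {print $2,$4,$7}')
echo "${#certs[@]} certificates found"   # expected: 3
printf '%s\n' "${certs[0]}"              # first certificate: DN SERIAL SERIALIZED-ID
certs then holds one element per "Certificate" block.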
Upvotes: 1