Reputation: 59
I am using a SIPP server simulator to verify incoming calls. What I need to verify is the caller ID and the dialed digits. I've logged this information to a file, which now contains, for example, the following:
From: <sip:972526134661@server>;tag=60=.To: <sip:972526134662@server>}
on each line.
What I want is to convert it to a CSV file containing just the two phone numbers, as follows:
972526134661,972526134662
and so on.
I've tried using the awk -F command, but then I can only use sip:, @ or / as delimiters.
Basically, what I want to do is take every string that begins with < and ends with >, and then take the part of each that follows the sip: delimiter.
Using the cut command is also not an option, as I understand that it cannot use strings as delimiters.
I guess it should be really simple, but I haven't found quite the right thing to use. I would appreciate the help, thanks!
Upvotes: 1
Views: 3344
Reputation: 13076
OK, for fun, picking some random data (from your original post) and using awk -F as you originally wanted.
Note that because your file is generated, we can assume the data has a regular format, so the short separator patterns should not cause false matches.
[g]awk -F'sip:|@' -v OFS="," '{print $2,$4}' yourlogfile
It uses both sip: and @ as the Field Separator, by means of the alternation operator |. It can easily be extended to allow further characters or strings to separate fields in the input if required. The built-in variable FS can contain a regular expression like this.
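To see why $2 and $4 are the fields to print, it can help to dump every field with its index. A quick sketch using the sample line from the question (show_fields is just a name made up for this illustration, not something awk provides):

```shell
# Print each field awk sees when FS is the regexp "sip:|@".
# show_fields is a hypothetical helper name used only for this example.
show_fields() {
    awk -F'sip:|@' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

# Sample line copied from the question.
printf '%s\n' 'From: <sip:972526134661@server>;tag=60=.To: <sip:972526134662@server>}' | show_fields
```

The two phone numbers land in fields 2 and 4; the odd-numbered fields hold the text between the separators.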
For that first sample in your question, it yields this:
972526134661,972526134662
For the latest (revision 8) version, and guessing what you want:
[g]awk -F'sip:|@|to_number:' -v OFS="," '{print $2,$5}' yourlogfile
Yields this:
from_number,972526134662
The [g]awk is because I used gawk on my machine, and got the same behaviour with awk.
Slight amendment in style, suggested by @fedorqui: use the command-line option -v to set the Output Field Separator (OFS, an AWK built-in variable which can be set with -v like any other variable) and separate the print fields with a comma. That way they are treated as fields in the output, rather than being built into one string with a hard-coded "," and treated as a single field.
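A small sketch of what OFS changes, assuming the sample line from the question (the function names with_ofs and without_ofs are invented for this comparison):

```shell
# With OFS set, the comma in "print $2, $4" is emitted as OFS (a literal ",").
with_ofs() { awk -F'sip:|@' -v OFS="," '{ print $2, $4 }'; }

# Without OFS, the same print statement joins the fields with the default single space.
without_ofs() { awk -F'sip:|@' '{ print $2, $4 }'; }

line='From: <sip:972526134661@server>;tag=60=.To: <sip:972526134662@server>}'
printf '%s\n' "$line" | with_ofs      # comma-separated
printf '%s\n' "$line" | without_ofs   # space-separated
```

Only the separator differs between the two runs, which is why switching the output format later means changing one variable rather than the print statement.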
Upvotes: 2
Reputation: 1754
You can use a regex replace, as long as the format stays the same (order is always From/To):
sed -E "s/^.*sip:([0-9]+)@.*sip:([0-9]+)@.*$/\1,\2/"
It's not a very specific or perfect solution, but in most cases an approach like this is enough.
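One thing to watch for: without -n, sed prints lines that do not match the pattern unchanged, so stray log lines would end up in the CSV. A possible variation, assuming you only want matching lines, adds -n and the p flag (extract_numbers is a made-up wrapper name, and the second input line is invented to show the filtering):

```shell
# Same regex as above, but -n plus the trailing "p" prints only lines
# where the substitution succeeded. extract_numbers is a hypothetical name.
extract_numbers() {
    sed -n -E 's/^.*sip:([0-9]+)@.*sip:([0-9]+)@.*$/\1,\2/p'
}

# First line is the sample from the question; the second is an invented non-matching line.
printf '%s\n' \
    'From: <sip:972526134661@server>;tag=60=.To: <sip:972526134662@server>}' \
    'some unrelated log line' | extract_numbers
```

Only the first input line produces output here; the unrelated line is dropped.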
Upvotes: 0
Reputation: 74595
I would suggest using sed to extract the two numbers:
$ sed -n 's/^From: <sip:\([0-9]*\).*To: <sip:\([0-9]*\).*/\1,\2/p' file
972526134661,972526134662
The regular expression matches a line beginning with From and captures the two numbers after <sip:. If the amount of whitespace varies, you may want to add * after the spaces in the pattern.
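For instance, a sketch of the variable-whitespace version might look like this (the wrapper name extract_loose and the input line with its short numbers are invented purely to exercise the spacing):

```shell
# " *" after each colon allows zero or more spaces before "<sip:".
# extract_loose is a hypothetical wrapper name for this example.
extract_loose() {
    sed -n 's/^From: *<sip:\([0-9]*\).*To: *<sip:\([0-9]*\).*/\1,\2/p'
}

# Invented input: three spaces after "From:" and none after "To:".
printf '%s\n' 'From:   <sip:111@server>;tag=1.To:<sip:222@server>' | extract_loose
```

The original sample line from the question still matches this looser pattern, since a single space is one valid case of " *".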
Upvotes: 1