Denys

Reputation: 4557

Convert timestamp column in a csv

I have a tab-separated CSV file. The rows look like the following:

57760234    [email protected]  3791    text_value  2016-04-25 07:56:59+02  2
57767500    [email protected]  3784    text_value  2016-04-25 07:30:49+02  2

How do I remove the +02 part (I assume it can be any number, not only +02) from the timestamp column in all rows?

P.S. What if there were two timestamps in one row? Like

57760234    [email protected]  3791    text_value  2016-04-25 07:56:59+02  2016-04-25 07:56:59+02  2  

?

Upvotes: 0

Views: 149

Answers (3)

Ed Morton

Reputation: 203334

Since + followed by a number doesn't occur in any other field (column) we don't have to worry about which field we affect:

$ cat file
57760234    [email protected]  3791    text_value  2016-04-25 07:56:59+02  2
57767500    [email protected]  3784    text_value  2016-04-25 07:30:49+02  2
57760234    [email protected]  3791    text_value  2016-04-25 07:56:59+02  2016-04-25 07:56:59+02  2
$
$ sed 's/+[0-9]*//' file
57760234    [email protected]  3791    text_value  2016-04-25 07:56:59  2
57767500    [email protected]  3784    text_value  2016-04-25 07:30:49  2
57760234    [email protected]  3791    text_value  2016-04-25 07:56:59  2016-04-25 07:56:59+02  2
$
$ sed 's/+[0-9]*//g' file
57760234    [email protected]  3791    text_value  2016-04-25 07:56:59  2
57767500    [email protected]  3784    text_value  2016-04-25 07:30:49  2
57760234    [email protected]  3791    text_value  2016-04-25 07:56:59  2016-04-25 07:56:59  2
$
$ awk '{sub(/[+][0-9]*/,"")}1' file
57760234    [email protected]  3791    text_value  2016-04-25 07:56:59  2
57767500    [email protected]  3784    text_value  2016-04-25 07:30:49  2
57760234    [email protected]  3791    text_value  2016-04-25 07:56:59  2016-04-25 07:56:59+02  2
$
$ awk '{gsub(/[+][0-9]*/,"")}1' file
57760234    [email protected]  3791    text_value  2016-04-25 07:56:59  2
57767500    [email protected]  3784    text_value  2016-04-25 07:30:49  2
57760234    [email protected]  3791    text_value  2016-04-25 07:56:59  2016-04-25 07:56:59  2

If that's not what you need then edit your question to include some more truly representative sample input and expected output.
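The commands above rely on + followed by a digit never appearing outside the timestamps. One way to check that assumption before editing the file is sketched below. The function name find_stray_plus is made up for illustration, and it assumes every timestamp ends in HH:MM:SS followed directly by +NN.

```shell
# Hypothetical check: print lines where +digit appears anywhere
# other than directly after an HH:MM:SS time.
find_stray_plus() {
    awk '{
        line = $0
        # Drop every time-plus-offset, then look for any +digit left over.
        gsub(/[0-9]+:[0-9]+:[0-9]+[+][0-9]+/, "", line)
        if (line ~ /[+][0-9]/) print FNR ": " $0
    }' "$@"
}

# Usage: find_stray_plus file
```

If this prints nothing, the unanchored s/+[0-9]*//g should be safe for that file. Any line it does print needs a closer look first.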

Upvotes: 0

Slava Semushin

Reputation: 15204

Try this:

sed -i 's|+[0-9]\+\([[:space:]]\+[0-9]\+\)$|\1|' file

Here I used a regular expression to replace +02 2 at the end of the line with just 2.

Important: it works with any number after the plus sign, but the plus sign itself must be present; without it, nothing is replaced.

Updated:

P.S. What if there were two timestamps in one row?

In that case it won't work, and you can use another approach that replaces each time carrying an offset with the same time without the offset:

sed -i 's|\([0-9]\+:[0-9]\+:[0-9]\+\)+[0-9]\+|\1|g' file

But the times must be in a format like 07:56:59+02.
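If the offset can also be negative or include minutes (for example -05 or +05:30, which the question does not show, so this is an assumption), the same idea can be extended. A minimal sketch, assuming a sed that accepts -E for extended regular expressions (GNU and BSD sed both do); the function name strip_tz_offset is made up for illustration:

```shell
# Hypothetical helper: strip the UTC offset that follows every HH:MM:SS on stdin.
# Assumes offsets look like +02, -05, +0530 or +05:30 and directly follow the seconds.
strip_tz_offset() {
    sed -E 's/([0-9]{2}:[0-9]{2}:[0-9]{2})[+-][0-9]{2}(:?[0-9]{2})?/\1/g'
}

# Demo on one tab-separated line with two differently formatted offsets.
printf '2016-04-25 07:56:59+02\t2016-04-25 07:56:59-05:30\t2\n' | strip_tz_offset
```

Because the pattern is anchored on the HH:MM:SS part, a + or - elsewhere in the line is left alone.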

Upvotes: 1

Kent

Reputation: 195039

Give this one-liner a try. I didn't test it, but it should work:

awk 'BEGIN{FS=OFS="\t"}{sub(/[+][0-9]+$/,"",$(NF-1))}7' file
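The one-liner above only touches the second-to-last field. For the case in the P.S., where more than one column holds a timestamp, a possible variant is to loop over all fields. This is a sketch, assuming the file really is tab-separated and every offset has the form +NN at the end of a field; the function name strip_offset_all_fields is made up for illustration.

```shell
# Hypothetical helper: remove a trailing +NN offset from every
# tab-separated field that ends in HH:MM:SS+NN.
strip_offset_all_fields() {
    awk 'BEGIN { FS = OFS = "\t" }
    {
        # Only fields shaped like a time with an offset are modified.
        for (i = 1; i <= NF; i++)
            if ($i ~ /[0-9]+:[0-9]+:[0-9]+[+][0-9]+$/)
                sub(/[+][0-9]+$/, "", $i)
        print
    }' "$@"
}

# Usage: strip_offset_all_fields file > file.new
```

Checking each field against the time pattern first means a field such as a phone number ending in +digits would not be changed.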

Upvotes: 2
