Reputation: 219
I have the following text:
30/01/2017 00:00:00 158
30/01/2017 00:30:00 158
30/01/2017 01:00:00 158
30/01/2017 01:30:00 158
30/01/2017 02:00:00 158
30/01/2017 02:30:00 158
30/01/2017 03:00:00 158
30/01/2017 03:30:00 158
30/01/2017 04:00:00 158
30/01/2017 04:30:00 158
30/01/2017 05:00:00 158
30/01/2017 05:30:00 158
30/01/2017 06:00:00 158
30/01/2017 06:30:00 157
30/01/2017 07:00:00 157
30/01/2017 07:30:00 157
30/01/2017 08:00:00 157
I want to use a regex to reorder the date into ISO format and convert the text to a .csv file.
I have already tested these commands:
perl -pe 's/(\s)([0-9]{2})\/([0-9]{2})\/([0-9]{4})\s([0-9]{2}:[0-9]{2}:[0-9]{2})(\s+)(.*)/$4-$3-$2_$5;$7;931;2/g' file.txt > output.csv
and
sed -E 's/(\s)([0-9]{2})\/([0-9]{2})\/([0-9]{4})\s([0-9]{2}:[0-9]{2}:[0-9]{2})(\s+)(.*)/\4-\3-\2_\5;\7;931;2/g' file.txt > output.csv
The expected result is:
2017-01-30_00:00:00;158;931;2
2017-01-30_00:30:00;158;931;2
2017-01-30_01:00:00;158;931;2
2017-01-30_01:30:00;158;931;2
2017-01-30_02:00:00;158;931;2
2017-01-30_02:30:00;158;931;2
2017-01-30_03:00:00;158;931;2
2017-01-30_03:30:00;158;931;2
2017-01-30_04:00:00;158;931;2
2017-01-30_04:30:00;158;931;2
2017-01-30_05:00:00;158;931;2
2017-01-30_05:30:00;158;931;2
2017-01-30_06:00:00;158;931;2
2017-01-30_06:30:00;157;931;2
2017-01-30_07:00:00;157;931;2
2017-01-30_07:30:00;157;931;2
2017-01-30_08:00:00;157;931;2
But the result is:
;931;21-30_00:00:00;158
;931;21-30_00:30:00;158
;931;21-30_01:00:00;158
;931;21-30_01:30:00;158
;931;21-30_02:00:00;158
;931;21-30_02:30:00;158
;931;21-30_03:00:00;158
;931;21-30_03:30:00;158
;931;21-30_04:00:00;158
;931;21-30_04:30:00;158
;931;21-30_05:00:00;158
;931;21-30_05:30:00;158
;931;21-30_06:00:00;158
;931;21-30_06:30:00;157
;931;21-30_07:00:00;157
;931;21-30_07:30:00;157
;931;21-30_08:00:00;157
Note that ;931;2 is at the beginning, but it should be at the end. It even ate part of 2017.
Why does it happen?
Upvotes: 1
Views: 94
Reputation: 58371
This might work for you (GNU sed):
sed -r 's/^.(..).(..).(....).(........)\s*(\S*).*/\3-\2-\1_\4;\5;931;2/' file
Upvotes: 0
Reputation: 126722
The problem is almost certainly that you are using Linux to process a file that originated on a Windows system and so has CR LF line endings. The .* at the end of your regex pattern matches the CR right after the last number on each line (but not the LF), so the CR is retained in $7 and inserted into the output just before ;931;2. When the terminal prints the CR it moves the cursor back to the start of the line, which makes ;931;2 appear at the beginning of the line, overwriting the characters that were there before.
One way to approach this is to strip the line ending with s/\R\z// instead of chomp. \R matches any of CR, LF, or CR LF at the end of the line, and so handles the line endings of any system.
Your regex is correct, but I would simply gather all the numeric fields from each record and use printf to reformat the output. That way there is no need to remove the line ending in the first place.
It would look like this:
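use strict;
use warnings 'all';

open my $fh, '<', 'data.txt' or die $!;

while ( <$fh> ) {
    # Collect every run of ASCII digits: day, month, year, hour, minute, second, value
    my @F = /\d+/ag;
    # Reorder to year-month-day and append the constant fields 931 and 2
    printf "%04d-%02d-%02d_%02d:%02d:%02d;%d;%d;%d\n",
        @F[2,1,0,3,4,5,6], 931, 2;
}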
Output:
2017-01-30_00:00:00;158;931;2
2017-01-30_00:30:00;158;931;2
2017-01-30_01:00:00;158;931;2
2017-01-30_01:30:00;158;931;2
2017-01-30_02:00:00;158;931;2
2017-01-30_02:30:00;158;931;2
2017-01-30_03:00:00;158;931;2
2017-01-30_03:30:00;158;931;2
2017-01-30_04:00:00;158;931;2
2017-01-30_04:30:00;158;931;2
2017-01-30_05:00:00;158;931;2
2017-01-30_05:30:00;158;931;2
2017-01-30_06:00:00;158;931;2
2017-01-30_06:30:00;157;931;2
2017-01-30_07:00:00;157;931;2
2017-01-30_07:30:00;157;931;2
2017-01-30_08:00:00;157;931;2
In a one-liner that would be
perl -ne '@F = /\d+/ag; printf "%04d-%02d-%02d_%02d:%02d:%02d;%d;%d;%d\n", @F[2,1,0,3,4,5,6], 931, 2;' myfile
Upvotes: 4