Reputation: 41
I want to replace thousands of lines like the one below, but I'm having a hard time making it work. I also have two variables, $date and $time, to use as a condition so that the replacement is not applied globally:
Example: <!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
To replace: <!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>NaN</v></row>
I tried with sed:
sed -i '<!-- 2020-07-06 16:45:00 WEST \/ 1594050300 --> <row><v>5.0000000000e+00<\/v><\/row>.*/<!-- 2020-07-06 16:45:00 WEST \/ 1594050300 --> <row><v>NaN<\/v><\/row>/' dump_teste.xml
sed: -e expression #1, char 1: unknown command: `<'
Also with awk:
awk '{gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1' tmp.txt
awk: cmd. line:1: {gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1: ^ syntax error
awk: cmd. line:1: {gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1: ^ syntax error
awk: cmd. line:1: {gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1: ^ syntax error
awk: cmd. line:1: {gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1: ^ syntax error
awk: cmd. line:1: {gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1: ^ unterminated string
awk: cmd. line:1: {gsub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1: ^ syntax error
or
awk '{sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1' tmp.txt
awk: cmd. line:1: {sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1: ^ syntax error
awk: cmd. line:1: {sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1: ^ syntax error
awk: cmd. line:1: {sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1: ^ syntax error
awk: cmd. line:1: {sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1: ^ syntax error
awk: cmd. line:1: {sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1: ^ unterminated string
awk: cmd. line:1: {sub(/<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>1.9933333333e+00</v></row>/,"<!-- 2020-07-08 12:00:00 WEST / 1594206000 --> <row><v>NaN</v></row>")}1
awk: cmd. line:1: ^ syntax error
Upvotes: 0
Views: 87
The command below replaces the number with NAN on every line of the file that falls within the time range, regardless of the order in which the lines appear.
Set the date, from and till variables, then run the command below:
while IFS= read -r in; do out="$(echo "$in" | awk '{print $2}')" && outtime="$(echo "$in" | awk '{print $3}')" && sed -i "/"$out" "$outtime"/ s/<v>.*<\/v>/<v>NAN<\/v>/" dumpteste.xml; done <<< "$(sort -k3 -k4 -k5 dumpteste.xml | awk -v date="$date" -v from="$from" -v till="$till" '$2 == date && $3 >= from && $3 <= till' | tac)"
Example of the above command:
cat dumpteste.xml #original file
<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 16:47:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 17:47:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 16:48:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 17:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-08-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
date=2020-07-06
from=16:45:00
till=17:45:00
Output
cat dumpteste.xml #after change
<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>NAN</v></row>
<!-- 2020-07-06 16:47:00 WEST / 1594050300 --> <row><v>NAN</v></row>
<!-- 2020-07-06 17:47:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>NAN</v></row>
<!-- 2020-07-06 16:48:00 WEST / 1594050300 --> <row><v>NAN</v></row>
<!-- 2020-07-06 17:45:00 WEST / 1594050300 --> <row><v>NAN</v></row>
<!-- 2020-08-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
For date 2020-07-06 with the time range 16:45:00-17:45:00, the lines with times 16:45, 16:47, 16:48 and 17:45 are changed. The line with time 16:45 but date 2020-08-06 is not changed because the date does not match.
If you also need a date range, define four variables: date, enddate, from and till. Then execute the command below:
date=2020-07-06
enddate=2020-08-06
from=16:45:00
till=17:45:00
while IFS= read -r in; do out="$(echo "$in" | awk '{print $2}')" && outtime="$(echo "$in" | awk '{print $3}')" && sed -i "/"$out" "$outtime"/ s/<v>.*<\/v>/<v>NAN<\/v>/" du*; done <<< "$(sort -k3 -k4 -k5 du* | awk -v date="$date" -v from="$from" -v till="$till" -v enddate="$enddate" '$2 >= date && $2 <= enddate && $3 >= from && $3 <= till' | tac)"
The above command changes the values on lines whose date and time fall within the given ranges. I hope this is enough.
Shorter version: 1) With a time range (edits the file in place, requires gawk)
date=2020-07-06 && from=16:45:00 && till=17:45:00 && gawk -i inplace -v date="$date" -v from="$from" -v till="$till" '$2 == date && $3 >= from && $3 <= till {gsub(/<v>[^<]*/, "<v>NaN")}1' dumpteste.xml
2) With both a date range and a time range (prints the result to standard output)
date=2020-07-06 && from=16:45:00 && till=17:45:00 && enddate=2020-08-06 && awk -v date="$date" -v from="$from" -v till="$till" -v enddate="$enddate" '$2 >= date && $2 <= enddate && $3 >= from && $3 <= till {gsub(/<v>[^<]*/, "<v>NaN")}1' dumpteste.xml
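To try the one-pass approach before touching the real file, here is a minimal sketch. The function name `nan_range` and the temporary sample file are made up for illustration. It assumes the date is in field 2 and the time in field 3, as in the sample lines above, and it prints to standard output instead of editing in place, so any POSIX awk should do.

```shell
#!/bin/sh
# nan_range DATE FROM TILL FILE   (hypothetical helper, not part of any tool)
# Print FILE with <v>...</v> replaced by <v>NaN</v> on lines whose
# date (field 2) equals DATE and whose time (field 3) lies in FROM..TILL.
# Plain string comparison works because HH:MM:SS sorts lexically.
nan_range() {
    awk -v date="$1" -v from="$2" -v till="$3" '
        $2 == date && $3 >= from && $3 <= till {
            sub(/<v>[^<]*<\/v>/, "<v>NaN</v>")
        }
        { print }
    ' "$4"
}

# Made-up sample data in the same layout as the dump.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-07-06 17:47:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
<!-- 2020-08-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
EOF

nan_range 2020-07-06 16:45:00 17:45:00 "$tmp"
rm -f "$tmp"
```

Only the first sample line should come out with `<v>NaN</v>`. Once the output looks right, redirect it to a new file and move that over the original.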
Upvotes: 0
The command you tried is missing the s (substitute) command, which is why it gives an error.
sed -i 's/<!-- 2020-07-06 16:45:00 WEST \/ 1594050300 --> <row><v>5.0000000000e+00<\/v><\/row>.*/<!-- 2020-07-06 16:45:00 WEST \/ 1594050300 --> <row><v>NaN<\/v><\/row>/g' dumpteste.xml
or
sed -i 's/<v>.*<\/v>/<v>NAN<\/v>/g' dumpteste.xml
You have two variables, $date and $time, and want to apply sed only to the lines that match them. Do the following:
sed "/"$date" "$time" .*<\/row>/ s/<v>.*<\/v>/<v>NAN<\/v>/g" dumpteste.xml
In the above command, if the line is
<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>
and the date and time variables are
date='2020-07-06' time='16:45:00'
then only the line containing that date and time will be edited by sed.
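The same idea can be sketched with `|` as the delimiter of the `s` command, so the slashes in `</v>` need no escaping. The helper name `nan_at` and the sample file are hypothetical, and the sketch prints the result instead of using `-i`, so nothing is overwritten while testing.

```shell
#!/bin/sh
# nan_at DATE TIME FILE   (hypothetical helper)
# Print FILE with the value set to NaN on lines containing "DATE TIME ".
# The /.../ address selects the lines; s|...|...| does the replacement.
nan_at() {
    sed "/$1 $2 /s|<v>[^<]*</v>|<v>NaN</v>|" "$3"
}

date='2020-07-06'
time='16:45:00'

# Made-up sample data in the same layout as the dump.
sample=$(mktemp)
printf '%s\n' \
    '<!-- 2020-07-06 16:45:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>' \
    '<!-- 2020-07-06 17:47:00 WEST / 1594050300 --> <row><v>5.0000000000e+00</v></row>' \
    > "$sample"

nan_at "$date" "$time" "$sample"
rm -f "$sample"
```

Using `[^<]*` instead of `.*` keeps the match inside a single `<v>` element.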
Did that solve your problem?
Upvotes: 2