Reputation: 1
I have been struggling to figure out how to 'unparse' lines in a log file (with two extra line delimiters, '@' and '|') so that all lines belonging to one timestamp end up on a single line.
Example:
2016-03-22 blah blah blah
|blah blah
|blah blah blah
@blah
|blah blah blah
2016-03-22 blah blah blah
|blah blah blah
@blah blah
@blah blah blah
|blah
Required Output
2016-03-22 blah blah blah |blah blah |blah blah blah @blah |blah blah blah
2016-03-22 blah blah blah |blah blah blah @blah blah @blah blah blah |blah
I thought I had this sussed simply by using xargs to put everything on one line and then using sed to add newlines before each 2016, but I discovered there is a limit on the number of characters per line, and the log file is so big that xargs was creating multiple lines.
Removing the line breaks before lines starting with | and @ would solve this, but I can't fathom how to do that either.
I've searched on here and found a few people posting similar questions but I can't interpret some of the solutions to fit in with my issue as I'm not familiar enough with sed/awk/xargs.
Would appreciate if anyone can offer some suggestions.
Thanks
Upvotes: 0
Views: 180
Reputation: 1
But how do I merge lines (only the words from the lines) when the same word exists in both files? All the words change automatically, and the files 1.txt and 2.txt change automatically too, as part of a package manager's script in a GNOME 2 environment. Here "link" stands for http://link
example INPUT:
1.txt contains the detected http links and versions of the packages:
link1/autotools-dev_20100122.1
link4/debhelper_8.0.0
link5/dreamchess_0.2.0
link5/dreamchess_0.2.0-2
link7/quilt_0.48
link7/quilt_0.48-7
link34/quilt-el_0.46.2
link34/quilt-el_0.46.2-1
2.txt contains the needed extensions of the packages:
autotools-dev_*.diff.gz
debhelper_*.diff.gz
debhelper_*.orig.tar.gz
libmxml-dev_*.diff.gz
libmxml-dev_*.dsc
libmxml-dev_*.orig.tar.gz
libsdl1.2-dev_*.diff.gz
libsdl1.2-dev_*.dsc
libsdl1.2-dev_*.orig.tar.gz
libsdl-image1.2-dev_*.diff.gz
libsdl-image1.2-dev_*.dsc
libsdl-image1.2-dev_*.orig.tar.gz
quilt_*.diff.gz
DESIRED OUTPUT to file 3.txt:
link1/autotools-dev_20100122.1.diff.gz
link4/debhelper_8.0.0.diff.gz
link4/debhelper_8.0.0.orig.tar.gz
libmxml-dev_*.diff.gz
libmxml-dev_*.dsc
libmxml-dev_*.orig.tar.gz
libsdl1.2-dev_*.diff.gz
libsdl1.2-dev_*.dsc
libsdl1.2-dev_*.orig.tar.gz
libsdl-image1.2-dev_*.diff.gz
libsdl-image1.2-dev_*.dsc
libsdl-image1.2-dev_*.orig.tar.gz
link7/quilt_0.48.diff.gz
link7/quilt_0.48-7.diff.gz
So I need a script that automatically detects the package names common to files 1.txt and 2.txt and writes the following to file 3.txt, combining the first two items on the line where the package name occurs:
the http link and version from file 1.txt
the extension from file 2.txt
the lines from file 2.txt whose package name does not occur in file 1.txt (unchanged)
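One possible reading of this requirement, as a rough awk sketch only. The function name merge_lists is made up for illustration, and the sketch assumes that the package name is the text between the last "/" and the first "_" in 1.txt, and the text before "_*" in 2.txt:

```shell
# merge_lists is a hypothetical helper: merge_lists 1.txt 2.txt
merge_lists() {
  awk '
    # First file: remember every "link/package_version" entry per package name.
    NR == FNR {
      file = $0
      sub(/.*\//, "", file)      # drop everything up to the last slash
      pkg = file
      sub(/_.*/, "", pkg)        # package name is the text before the first "_"
      n[pkg]++
      entry[pkg, n[pkg]] = $0
      next
    }
    # Second file: "package_*extension" patterns.
    {
      pkg = $0
      sub(/_\*.*/, "", pkg)      # text before "_*"
      ext = $0
      sub(/^[^*]*\*/, "", ext)   # text after the "*"
      if (pkg in n)
        for (i = 1; i <= n[pkg]; i++)
          print entry[pkg, i] ext
      else
        print                    # unknown package: keep the line unchanged
    }
  ' "$1" "$2"
}
```

It would be called as merge_lists 1.txt 2.txt > 3.txt. Note that the NR == FNR test assumes 1.txt is not empty.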
Upvotes: 0
Reputation: 19982
Replace the newlines with spaces, then add a newline at the end of the line and insert a newline before each subsequent 2016 (the \n in the replacement needs GNU sed):
echo '2016-03-22 blah blah blah
|blah blah
|blah blah blah
@blah
|blah blah blah
2016-03-22 blah blah blah
|blah blah blah
@blah blah
@blah blah blah
|blah ' | tr '\n' ' ' | sed -e 's/ *$/\n/' -e 's/ 2016-/\n2016-/g'
Upvotes: 0
Reputation: 58371
This might work for you (GNU sed):
sed ':a;N;/\n....-..-.. /!s/\n/ /;ta;P;D' file
Append the next line to the pattern space. If the text after the newline is not the start of a new record, replace the newline with a space and repeat, i.e. keep appending lines to the current one. If the appended line is the start of a new record, print the first line, delete it and start over with the new record.
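The same one-liner can be spelled out as a commented script, which may be easier to adapt. This is only a sketch: the function name join_records and the sample input are made up for illustration, and it still assumes GNU sed (where N at end of input prints the pattern space before exiting).

```shell
# join_records is a hypothetical helper name; assumes GNU sed.
join_records() {
  sed '
    :a
    # append the next input line to the pattern space
    N
    # unless the appended line starts with a date, turn the newline into a space
    /\n....-..-.. /!s/\n/ /
    # if that substitution happened, loop and append another line
    ta
    # otherwise print the completed record and delete it, keeping the new one
    P
    D
  ' "$@"
}

# Made-up sample input to try it on.
printf '%s\n' '2016-03-22 first' '|a' '@b' '2016-03-23 second' '|c' | join_records
```

With a file name as argument (join_records file) it reads the file instead of standard input.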
Upvotes: 0
Reputation: 80921
anubhava's answer works, but it buffers the whole of each joined line before printing it.
This version prints as it reads each input line.
awk '{printf "%s%s", /^[|@]/?OFS:(NR>1)?"\n":"", $0} END{print ""}'
/^[|@]/          match lines starting with @ or |
?OFS             if matched, lead with OFS (output field separator, space by default)
:                otherwise
(NR>1)           if we aren't on the first line
?"\n"            output a newline
:""              otherwise output an empty string (to avoid a blank line at the top of the output)
END{print ""}    make sure we end the last line with a newline
Upvotes: 0
Reputation: 784958
You can use this awk command:
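awk '/^[0-9]{4}(-[0-9]{2}){2}/ {   # a line starting with a date opens a new record
    if (p != "")
        print p                    # flush the previous record first
    p = $0
    next
}
{
    p = p OFS $0                   # continuation line: append it to the current record
}
END {
    print p                        # flush the last record
}' file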
Output:
2016-03-22 blah blah blah |blah blah |blah blah blah @blah |blah blah blah
2016-03-22 blah blah blah |blah blah blah @blah blah @blah blah blah |blah
Upvotes: 1