Reputation: 23
I want to join lines in a file as below.
Input
01EPH087362 SHHFHDH 3673
63737
Dhdhj
01EPH636363 DHHDH
3637737
Hshshhd
01EPH7373838 HDJJDJ
Output
01EPH087362 SHHFHDH 3673 63737 Dhdhj
01EPH636363 DHHDH 3637737Hshshhd
01EPH7373838 HDJJDJ
I want the output as above; basically every line should start with 01EPH.
I have tried awk and sed but had no luck. Please help if you know how to do this.
Upvotes: 2
Views: 1322
Reputation: 133760
@suyog: Could you please try the following and let me know if it helps?
awk '{printf("%s%s",($0 ~ /^01E/ && NR>1)?ORS:NR>1?FS:"",$0)} END{print ""}' Input_file
Output will be as follows.
01EPH087362 SHHFHDH 3673 63737 Dhdhj
01EPH636363 DHHDH 3637737 Hshshhd
01EPH7373838 HDJJDJ
Upvotes: 2
Reputation: 104092
Here is pure Bash (plus printf) to do this, just for giggles:
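while IFS= read -r line || [[ -n $line ]]; do
    if [[ "$line" =~ ^01EPH ]]; then
        # A new record: end the previous one (pad is empty the first time).
        printf "%s%s" "$pad" "$line"
        pad=$'\n'
    else
        # A continuation line: append it after a blank.
        printf " %s" "$line"
    fi
done <file
printf '\n'   # terminate the last record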
Here is a Perl slurp solution:
perl -0777 -ne 'while (/(^01EPH.*?)(?=^01EPH|\z)/gms) {($st=$1)=~s/\n/ /g; print "$st\n" }' file
In both cases, awk is probably better...
Upvotes: 0
Reputation: 20032
When you have a file with only \n line endings, you could use
sed 's/^01EPH/\r&/;$s/$/\r/' inputfile | tr -d "\n" | tr "\r" "\n"
The first part of the sed command inserts a \r before each 01EPH. The second part appends one at the end so that the last line will end with a linefeed too.
Now remove the original linefeeds and replace the marked ones with linefeeds.
It goes through the file 3 times, so any awk solution will be better for a large file, but I just wanted to show tr with sed.
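A variation on the marker idea, offered as a sketch rather than a drop-in replacement. Because `tr -d "\n"` removes the newlines without leaving a blank, continuation lines are glued directly onto the preceding text, and marking the first line as well yields an empty first line. The version below skips the first line when marking, turns newlines into blanks instead of deleting them, and uses the control character `\001` as the marker on the assumption that it never occurs in the data. The function name `join_marked` is made up for illustration.

```shell
# join_marked FILE
# Hypothetical helper illustrating the sed + tr marker technique.
join_marked() {
    marker=$(printf '\001')   # assumption: this byte never appears in the input
    {
        # 1. Put the marker before every 01EPH line except the first.
        # 2. Turn every newline into a blank, giving one long line.
        # 3. Turn each marker back into a newline.
        sed -e '1!s/^01EPH/'"$marker"'&/' "$1" | tr '\n' ' ' | tr '\001' '\n'
        echo                  # terminate the last line
    } | sed 's/ $//'          # drop the blank left before each newline
}

# Usage: join_marked inputfile
```

This still reads the data several times, so the remark about awk being the better choice for large files applies here too.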
Upvotes: 1
Reputation: 67567
Another awk:
$ awk 'NR>1 && /^01EPH/ {print ""}
                        {printf "%s", $0 OFS}
       END              {print ""}' file
01EPH087362 SHHFHDH 3673 63737 Dhdhj
01EPH636363 DHHDH 3637737 Hshshhd
01EPH7373838 HDJJDJ
Add a newline when the pattern matches (except on the first line) and at the end; otherwise append lines.
Upvotes: 2
Reputation: 247182
My take:
awk '
/^01EPH/ {printf "%s%s", nl, $0; nl = "\n"; next}
{printf " %s", $0}
END {print ""}
' file
Upvotes: 2
Reputation: 204548
$ awk '/^01EPH/{if (NR>1) print buf; buf=$0; next} {buf = buf OFS $0} END{print buf}' file
01EPH087362 SHHFHDH 3673 63737 Dhdhj
01EPH636363 DHHDH 3637737 Hshshhd
01EPH7373838 HDJJDJ
Upvotes: 2
Reputation: 16997
Input
$ cat f
01EPH087362 SHHFHDH 3673
63737
Dhdhj
01EPH636363 DHHDH
3637737
Hshshhd
01EPH7373838 HDJJDJ
Output
$ awk '(s=/^01EPH/) && NR>1{print ""}{printf("%s%s",(s?"":" "),$0)}END{print ""}' f
01EPH087362 SHHFHDH 3673 63737 Dhdhj
01EPH636363 DHHDH 3637737 Hshshhd
01EPH7373838 HDJJDJ
Upvotes: 1
Reputation: 5850
One-liner:
tr '\n' ' ' < file.txt | sed s/01EPH/\\n01EPH/g
tr '\n' ' ' < file.txt makes one string.
sed s/01EPH/\\n01EPH/g prefixes each 01EPH with a newline.
Upvotes: -2
Reputation: 755010
awk '/^01EPH/ { if (record != "") print record; record = ""; pad = "" }
{ record = record pad $0; pad = " " }
END { if (record != "") print record }'
If the line starts with 01EPH, print the saved information, if there is any, and empty the saved information and the padding.
On every line, add the pad and the new line to the saved information; set the pad to a blank.
At the end, print the saved record if there is anything in it.
This even miraculously preserves the double space between DHHDH and 3637737Hshshhd because there is a trailing blank on the line ending DHHDH.
Output:
01EPH087362 SHHFHDH 3673 63737 Dhdhj
01EPH636363 DHHDH 3637737 Hshshhd
01EPH7373838 HDJJDJ
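The same save-and-flush logic can be wrapped in a small shell function so that the prefix is not hard-coded. This is only a sketch: the function name `join_by_prefix` and the awk variable `prefix` are illustrative names, not part of the answer above, and the prefix is compared literally with `index()` rather than as a regular expression.

```shell
# join_by_prefix PREFIX [FILE...]
# Hypothetical helper: joins continuation lines onto the preceding line
# that starts with PREFIX, separating the pieces with one blank.
join_by_prefix() {
    prefix=$1
    shift
    awk -v prefix="$prefix" '
        # A line starting with the prefix begins a new record:
        # flush the saved record, if any, and reset the padding.
        index($0, prefix) == 1 {
            if (record != "") print record
            record = ""; pad = ""
        }
        # Every line is appended to the saved record.
        { record = record pad $0; pad = " " }
        # Flush the last record at end of input.
        END { if (record != "") print record }
    ' "$@"
}

# Usage: join_by_prefix 01EPH file
# With no file argument, awk reads standard input instead.
```

Passing the prefix with `-v` keeps shell quoting out of the awk program; note that awk interprets backslash escapes in `-v` values, so a prefix containing backslashes would need extra care.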
Upvotes: 1