Extracting the contents between two different strings using bash or perl

Question

I have scanned through the other posts on Stack Overflow for this but couldn't get my code to work, hence I am posting a new question.

Below is the content of file temp.

 
 2015-01-22T13:38:04ZXJzLXJlc3VsdHMtYWN0aW9uX18ilc3VsdHMtYWN0aW9uX18i

This file contains the base64-encoded contents of two files named test.txt and test1.txt. I want to extract the base64-encoded content of each file into separate files, test.txt and test1.txt respectively.

To achieve this, I have to remove the XML tags around the base64 contents. I am trying the commands below, but they do not work as expected.

sed -n '/test.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's@@@g'|perl -p -e 's@@@g' > test.txt

sed -n '/test1.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's@@@g'|perl -p -e 's@@@g' > test1.txt

The command below:

sed -n '/test.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's@@@g'|perl -p -e 's@@@g'

produces output:

 XJzLXJlc3VsdHMtYWN0aW9uX18i

lc3VsdHMtYWN0aW9uX18i

However, I expect the output to contain only the first line, XJzLXJlc3VsdHMtYWN0aW9uX18i. Where am I making a mistake?

When I run the command below, I get the expected output:

sed -n '/test1.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's@@@g'|perl -p -e 's@@@g'

It produces the string below:

lc3VsdHMtYWN0aW9uX18i

I can then easily route this to test1.txt file.

UPDATE

I have edited the question by updating the source file content. The source file doesn't contain any newline character. The current solution does not work in that case; I have tried it and it failed. wc -l temp must output 1.

OS: Solaris 10. Shell: bash.

user2607367 · Accepted Answer

/usr/xpg4/bin/sed works well here.

/usr/bin/sed does not work as expected if the file contains just one line.

The command below works for a file containing only a single line.

/usr/xpg4/bin/sed -n 's_$[^>]*$$.*$_\2_p' securebackup.xml 2>/dev/null

Without 2>/dev/null this sed command outputs the warning sed: Missing newline at end of file.

The reason is as follows:

The default Solaris sed ignores an unterminated last line so as not to break existing scripts, because the original Unix implementation required every line to be terminated by a newline.

GNU sed is more relaxed about this, and the POSIX implementation (/usr/xpg4/bin/sed) accepts such input but prints a warning.
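For illustration, here is a minimal sketch of the same idea on a single-line file with no trailing newline. It is not the exact command from this answer. The sample line is a hypothetical reconstruction: the attribute name `name` is an assumption, and only the closing tag `</dp:file>` and the two file names come from the question. The helper `extract_b64` and the `SED` variable are likewise made up for this example.

```shell
#!/bin/sh
# Hypothetical sample input: the attribute name "name" is an assumption;
# only </dp:file> and the two file names appear in the question.
sample='<dp:file name="test.txt">XJzLXJlc3VsdHMtYWN0aW9uX18i</dp:file><dp:file name="test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:file>'

# Work in a scratch directory and write the sample as ONE line
# with NO trailing newline, as described in the question's update.
work=$(mktemp -d)
printf '%s' "$sample" > "$work/temp.xml"

# On Solaris 10, select the XPG4 sed named in the answer:
#   SED=/usr/xpg4/bin/sed
SED=${SED:-sed}

# extract_b64 NAME_REGEX FILE
# Prints the text between the opening tag whose attribute value ends in
# NAME_REGEX and the next </dp:file>. [^<]* stops at the first "<", so
# the match cannot run on into the following element.
extract_b64() {
    # 2>/dev/null hides the "Missing newline" warning mentioned above.
    "$SED" -n "s_.*$1\"[^>]*>\\([^<]*\\)</dp:file>.*_\\1_p" "$2" 2>/dev/null
}

extract_b64 'test\.txt'  "$work/temp.xml" > "$work/test.txt"
extract_b64 'test1\.txt' "$work/temp.xml" > "$work/test1.txt"
```

The dot in each name is escaped so that `test\.txt` cannot match `test1.txt`. Matching `[^<]*` rather than `.*` for the payload is what keeps the first extraction from also swallowing the second file's content.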
