imre

Reputation: 489

csplit - what am I doing wrong?

I have this .xml file:

<docs>
<doc>
Some text
</doc>
<doc>
here some
</doc>
<doc>
text here
</doc>
</docs>

I am trying to use csplit in order to get only the text parts. This is what I came up with.

$ csplit docs.xml '%^<docs>%1' '/^<\/doc/1' '{*}'

Upvotes: 1

Views: 442

Answers (1)

Saddam Abu Ghaida

Reputation: 6749

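If the file is structured like the one you included, you can extract the content with grep -v "^<" x (where x is your file), which drops every line that starts with a tag. A more convenient approach, which also strips tags that share a line with text, is cat x | sed -e 's/<[^>]*>//g' | grep -v '^$'. To do it the csplit way, based on the comments below, you can do it like this: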

grep -Ev '<\?xml version="1\.0" \?>|</?docs>' docs.xml | csplit -q -z - '/<doc/' '{*}' --prefix=out-
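To try both approaches end to end, here is a minimal sketch that recreates the sample file from the question in a scratch directory. It assumes GNU csplit (the -z and --prefix options are GNU extensions), and the output names all-text.txt and piece-NN.txt are made up for illustration.

```shell
# Work in a throwaway directory so the out-* files don't clutter anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample file from the question.
cat > docs.xml <<'EOF'
<docs>
<doc>
Some text
</doc>
<doc>
here some
</doc>
<doc>
text here
</doc>
</docs>
EOF

# Approach 1: keep only the lines that do not start with a tag.
grep -v '^<' docs.xml > all-text.txt

# Approach 2: drop the <docs> wrapper lines, then start a new piece at
# every <doc> line. -z discards the empty piece before the first match.
grep -Ev '</?docs>' docs.xml | csplit -q -z --prefix=out- - '/<doc/' '{*}'

# Each out-NN piece still contains its <doc> and </doc> lines,
# so strip them per piece to get one text file per document.
for piece in out-*; do
    grep -v '^<' "$piece" > "piece-${piece#out-}.txt"
done
```

Note that csplit only cuts the file into pieces; it does not remove the tag lines itself, which is why the per-piece grep is still needed if you want the text alone.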

Upvotes: 1
