Darryl at NetHosted

Reputation: 303

Linux Text File Manipulation with sed/awk

I have a list in the following format:

77 Infinite Dust
4 Illusion Dust
12 Dream Shard
29 Star's Sorrow

I need to change this to:

77 <a href="http://www.wowhead.com/?search=Infinite Dust">Infinite Dust</a>
4 <a href="http://www.wowhead.com/?search=Illusion Dust">Illusion Dust</a>
12 <a href="http://www.wowhead.com/?search=Dream Shard">Dream Shard</a>
29 <a href="http://www.wowhead.com/?search=Star's Sorrow">Star's Sorrow</a>

I've managed to get this list into the right format, except that the numbers are missing, by using:

sed 's|^[0-9]*.|<a href="http://www.wowhead.com/?search=|g' filename | sed 's|$|">|g' | sed 's#<a[ \t][ \t]*href[ \t]*=[ \t]*".*search=\([^"]*\)">#&\1</a>#'

But I can't figure out how to keep the numbers at the start of each line. Any help appreciated, thanks!

Upvotes: 0

Views: 1281

Answers (5)

Alok Singhal

Reputation: 96191

If you had told us what you were ultimately trying to do in your last question, we would have told you a much easier way to do so.

As I said in my answer to your last question, you can have sed remember a part of the pattern, and refer to that part as \1, \2, etc.

You need to remember the number and the rest of the line separately, so the pattern is \([0-9]*\) \(.*\), which is basically zero or more digits, followed by a space, followed by any number of characters.

So your sed command becomes:

sed -e 's|\([0-9]*\) \(.*\)|\1 <a href="http://www.wowhead.com/?search=\2">\2</a>|'

That command does everything you want in one go.
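To try this without a file, the sample lines from the question can be piped in on standard input. The following is only a sketch: the function name linkify is hypothetical, and the leading ^ anchor is an added assumption so that the number is always taken from the start of the line.

```shell
# linkify: hypothetical helper name; reads the list on standard input.
linkify() {
    # \1 is the leading number, \2 is the rest of the line (the item name).
    # The ^ anchor is an assumption added here, not part of the command above.
    sed -e 's|^\([0-9]*\) \(.*\)|\1 <a href="http://www.wowhead.com/?search=\2">\2</a>|'
}

# Feed two of the sample lines from the question through the function.
printf '%s\n' '77 Infinite Dust' "29 Star's Sorrow" | linkify
```

To process a file instead, redirect it into the function, e.g. linkify < filename.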

Upvotes: 2

Jerry Coffin

Reputation: 490398

With awk it would be something like:

{  
   rest = substr($0, length($1)+2, length($0));
   printf("%d <a href=\"http://www.wowhead.com/?search=%s\">%s</a>\n", $1, rest, rest); 
}
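The block above is only the body of an awk program. One way it might be run from the shell is sketched below; the function name linkify_awk is hypothetical, and the sketch assumes each line starts with the number followed by exactly one space.

```shell
# linkify_awk: hypothetical helper name; reads the named files, or stdin if none.
linkify_awk() {
    awk '{
        # Everything after the first field and its single separating space
        rest = substr($0, length($1) + 2)
        printf("%d <a href=\"http://www.wowhead.com/?search=%s\">%s</a>\n", $1, rest, rest)
    }' "$@"
}

# Try it on one of the sample lines from the question.
printf '%s\n' '4 Illusion Dust' | linkify_awk
```

Using substr on the whole line, rather than rejoining fields, keeps the item name exactly as it was typed, including any internal spacing.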

Upvotes: 0

ghostdog74

Reputation: 342669

awk '
{
    # Join fields 2..NF with single spaces to rebuild the item name
    s = $2
    for (i = 3; i <= NF; i++) s = s " " $i
    # Fixed format string, so a % in the data is never treated as a directive
    printf "%s <a href=\"http://www.wowhead.com/?search=%s\">%s</a>\n", $1, s, s
} ' file

output

$ ./shell.sh
77 <a href="http://www.wowhead.com/?search=Infinite Dust">Infinite Dust</a>
4 <a href="http://www.wowhead.com/?search=Illusion Dust">Illusion Dust</a>
12 <a href="http://www.wowhead.com/?search=Dream Shard">Dream Shard</a>
29 <a href="http://www.wowhead.com/?search=Star's Sorrow">Star's Sorrow</a>

Upvotes: 1

sienkiew

Reputation: 473

In sed, you can use the & character to place the matched pattern in the replacement text. For example:

echo xyz | sed 's/^xyz/abc &/'

would output

abc xyz

So in your example,

sed 's|^[0-9]*.|& <a href ....
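As a rough illustration of where & could fit, here is one possible adaptation of the pipeline from the question. It is a sketch, not necessarily what was intended above: the function name keep_number is hypothetical, and the third sed stage is a simplified version of the asker's that assumes a single space between <a and href.

```shell
# keep_number: hypothetical helper name; reads the list on standard input.
keep_number() {
    # Stage 1: & re-inserts the matched number and separator before the tag.
    sed 's|^[0-9]*.|&<a href="http://www.wowhead.com/?search=|' |
        # Stage 2: close the href attribute and the opening tag.
        sed 's|$|">|' |
        # Stage 3: copy the search term after the tag and close the anchor.
        sed 's#<a href=".*search=\([^"]*\)">#&\1</a>#'
}

printf '%s\n' '77 Infinite Dust' | keep_number
```

Because the pattern ^[0-9]*. already consumes the character after the digits, & carries the separating space with it, so no extra space is added in the replacement here.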

Upvotes: 0

Steve B.

Reputation: 57325

You can do this with sed by capturing the parts of the line in groups. In sed, the groups A and B in (A)--(B) are referred to as \1 and \2, with the added wrinkle that the "()" need to be escaped, e.g.:

sed 's/\([0-9]*\)\ \(.*\)$/\1 -- \2/g' testfile

captures the digits before the space as group 1 and everything after it as group 2. You can then place groups 1 and 2 wherever you like in the output, e.g. by changing the sed replacement to something like

 \1 <a href.....\2">\2</a>
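If escaping the parentheses is a nuisance, extended regular expressions are an alternative. The sketch below assumes a sed that accepts the -E option (GNU and BSD sed both do); the function name linkify_ere is hypothetical, and # is used as the delimiter so that it cannot be confused with the | alternation operator.

```shell
# linkify_ere: hypothetical helper name; reads the list on standard input.
linkify_ere() {
    # With -E, plain ( ) create groups and + means "one or more".
    sed -E 's#^([0-9]+) (.*)$#\1 <a href="http://www.wowhead.com/?search=\2">\2</a>#'
}

printf '%s\n' '12 Dream Shard' | linkify_ere
```

Requiring at least one digit with + means a line that does not start with a number is passed through unchanged rather than being wrapped in a link.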

Upvotes: 3
