RNs_Ghost

Reputation: 1777

Using sed to copy lines and delete characters from the duplicates

I have a file that looks like this:

@"Afghanistan.png",
@"Albania.png",
@"Algeria.png",
@"American_Samoa.png",

I want it to look like this:

@"Afghanistan.png",
@"Afghanistan",
@"Albania.png",
@"Albania",
@"Algeria.png",
@"Algeria",
@"American_Samoa.png",
@"American_Samoa",

I thought I could use sed to do this but I can't figure out how to store something in a buffer and then modify it.

Am I even using the right tool?

Thanks

Upvotes: 18

Views: 23528

Answers (5)

vi-user

Reputation: 1

Alternatively, you can combine both versions and duplicate only the lines that match the required pattern:

sed -e '/^@".*\.png",/{p;s/\.png//;}' input
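To illustrate what the address buys you, here is a small sketch. The second input line is a made-up example (it is not from the question): it contains ".png" but is not in the @"...", form, so the guarded command should leave it alone.

```shell
# Hypothetical input: one line from the question plus an unrelated line
# that merely mentions a .png file.
sample=$(printf '%s\n' '@"Algeria.png",' 'see logo.png for details')

# The address /^@".*\.png",/ selects only the quoted entries; the block
# { p; s/\.png//; } prints the original and then strips the extension.
guarded=$(printf '%s\n' "$sample" | sed -e '/^@".*\.png",/{p;s/\.png//;}')

printf '%s\n' "$guarded"
```

The unrelated line is printed once and unchanged, while the matching line is followed by its copy without the extension.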

Upvotes: 0

courtlandj

Reputation: 445

I prefer this over Carles Sala's and glenn jackman's answers:

sed '/\.png/p;s/\.png//'

You could say it's just personal preference.

Upvotes: 10

Carles Sala

Reputation: 2109

glenn jackman's answer works, but it also duplicates lines that do not match the expression.

This one, instead, duplicates only the lines that match the expression:

sed -n 'p; s/\.png//p'

Here, -n means "print nothing unless explicitly told to", and the p flag in s/\.png//p prints the line only if the substitution succeeded.
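A quick way to see the difference is to run both commands on input that contains a non-matching line. This is only a sketch; the comment line in the sample is a hypothetical addition, not part of the original file.

```shell
# Hypothetical input: one line from the question plus one line without ".png".
sample=$(printf '%s\n' '@"Albania.png",' '// a comment line')

# Without -n: p prints every line, and the automatic print at the end of
# the cycle prints it again, whether or not the substitution matched.
doubled=$(printf '%s\n' "$sample" | sed 'p; s/\.png//')

# With -n: automatic printing is off, so the second copy appears only
# when the substitution succeeds (the trailing p flag).
selective=$(printf '%s\n' "$sample" | sed -n 'p; s/\.png//p')

printf 'without -n:\n%s\n\nwith -n:\n%s\n' "$doubled" "$selective"
```

In the first result the comment line shows up twice; in the second it shows up once.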

Upvotes: 18

brandizzi

Reputation: 27050

That is pretty easy to do with sed, and you do not even need the hold space (sed's auxiliary buffer). Given the input file below:

$ cat input 
@"Afghanistan.png",
@"Albania.png",
@"Algeria.png",
@"American_Samoa.png",

you should use this command:

sed 's/@"\([^.]*\)\.png",/&\
@"\1",/' input 

The result:

$ sed 's/@"\([^.]*\)\.png",/&\
@"\1",/' input 
@"Afghanistan.png",
@"Afghanistan",
@"Albania.png",
@"Albania",
@"Algeria.png",
@"Algeria",
@"American_Samoa.png",
@"American_Samoa",

This command is just a substitution (s///). The pattern matches @" followed by any run of non-period characters ([^.]*) and then .png",. The non-period characters are wrapped in the group delimiters \( and \), so the replacement can refer back to what they matched. This is the regular expression to be replaced:

@"\([^.]*\)\.png",

Next comes the replacement part of the command. The & inserts everything that was matched by @"\([^.]*\)\.png",. If it were the only element of the replacement, the output would be unchanged. However, the & is followed by a newline, written as a backslash \ followed by an actual line break, and on the new line we add the string @", then the content of the first group (\1), and then the string ",.
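If the roles of & and \1 are not clear, this small sketch prints each of them separately for one line of the input. The whole=[...] and group=[...] labels are only there for illustration.

```shell
line='@"Albania.png",'

# & expands to the entire text matched by the pattern.
whole=$(printf '%s\n' "$line" | sed 's/@"\([^.]*\)\.png",/whole=[&]/')

# \1 expands to what the first \( ... \) group matched.
group=$(printf '%s\n' "$line" | sed 's/@"\([^.]*\)\.png",/group=[\1]/')

printf '%s\n%s\n' "$whole" "$group"
```

The first line shows the full match including the quotes, extension and comma; the second shows only the bare name.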

This is just a brief explanation of the command. Hope this helps. Also, note that some versions of sed (such as GNU sed) let you write \n in the replacement to represent a newline, which gives a more concise and readable command:

sed 's/@"\([^.]*\)\.png",/&\n@"\1",/' input 

Upvotes: 14

glenn jackman

Reputation: 246847

You don't have to get tricky with regular expressions and replacement strings: use sed's p command to print the line intact, then modify the line and let it print implicitly:

sed 'p; s/\.png//'
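For example, feeding it the first two lines of the question's file on standard input (a minimal sketch; the variable name result is just for illustration):

```shell
# p prints the untouched line; the substitution then edits the pattern
# space, which sed prints automatically at the end of the cycle.
result=$(printf '%s\n' '@"Afghanistan.png",' '@"Albania.png",' | sed 'p; s/\.png//')

printf '%s\n' "$result"
```

Each input line comes out twice: once as is, and once with .png removed.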

Upvotes: 19
