
Tags: regex, bash, awk, sed

carab1n3r

Reputation: 13

Delete a string between 2 delimiters on possibly different lines

Background: I have a config file that stores values in variations of this format:

(below is an example using fictitious data)

'names': { "john", "jeff", "stewie", "amy", "emily" }

Some formatting details:

- there is never a space between 'names' and :
- there may-or-may-not be a space between "{" and "john"
- there is always a space between members of the list ("john" always has a ", " before "jeff")
- there may-or-may-not be a space between "emily" and "}"
- the elements in this list may be delineated by line instead of by space.

For example, this is also acceptable:
```
'names': { "john",
           "jeff",
           "stewie",
           "amy",
           "emily"
         }
```

So is this:

    'names': { "john", "jeff", "stewie",
               "amy", "emily" }

The functionality I'm trying to create: I'd like to delete "amy" from the list named 'names'.

I've been trying to create this behavior using sed, but I'm open to using bash, awk, cut or some combination of these.

If the elements of the list were on a single line, this would be easy:

    /bin/sed -i "/names/ s/ ${element}//" "$f"

(where $element contains "amy" and $f contains the file I'm editing)

But the multi-line possibility throws me.

Thoughts?

Upvotes: 1

Answers (3)

John1024

Reputation: 113834

Let's consider this input file that contains all three cases:

    $ cat file
    'names': { "john", "jeff", "stewie", "amy", "emily" }
    'names': { "john",
               "jeff",
               "stewie",
               "amy",
               "emily"
             }
    'names': { "john", "jeff", "stewie",
                   "amy", "emily" }

Now, let's apply this sed command to remove amy:

    $ sed '/names/{:a;/}/!{N;b a}; s/"amy",[[:space:]]*//}' file
    'names': { "john", "jeff", "stewie", "emily" }
    'names': { "john",
               "jeff",
               "stewie",
               "emily"
             }
    'names': { "john", "jeff", "stewie",
                   "emily" }

How it works

- `/names/`

  Any time a line contains names, we begin our commands. Other lines are passed through unchanged.

- `:a; /}/! {N;b a}`

  Once we have our line containing names, we read in additional lines until we get one that contains the closing brace. This gets the complete names assignment in at once even if it is spread over multiple lines.

  In more detail, `:a` is a label. `/}/!` is a condition: if the pattern space does not contain `}`, then the statements `N; b a` are executed. `N` reads in the next line and appends it to the pattern space. `b a` jumps (branches) back to the label `a`. Thus, this continues until the complete assignment, from names to `}`, is in the pattern space.

- `s/"amy",[[:space:]]*//}`

  With the complete names assignment in sed's pattern space, we look for `"amy",` and any whitespace that follows it, and remove both. The final `}` closes the command group that was opened after `/names/`.
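
For readers who find the one-liner dense, here is the same program written one command per line, as a sketch. The sample input is the made-up multi-line list from the question, and the shell variable names `input` and `result` are only for illustration.

```shell
# Sample input: the multi-line form of the list (fictitious data).
input=$(cat <<'EOF'
'names': { "john",
           "jeff",
           "stewie",
           "amy",
           "emily"
         }
EOF
)

# The same sed program as the one-liner, spread over several lines.
result=$(printf '%s\n' "$input" | sed '
/names/ {
    # Label "a": the top of the line-gathering loop.
    :a
    # No closing brace yet: append the next line and loop again.
    /}/! {
        N
        b a
    }
    # The whole list is now in the pattern space: drop "amy", and trailing whitespace.
    s/"amy",[[:space:]]*//
}')

printf '%s\n' "$result"
```

Because `[[:space:]]` also matches the newline that `N` embedded in the pattern space, the substitution removes the whole `"amy",` line and the remaining entries keep their indentation.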

Removing amy even when she is last in the list

The above solution assumes that a comma follows the name amy. Suppose, however, that amy could be the last name in the list, as in the following file:

    $ cat file
    'names': { "john", "jeff", "stewie", "emily", "amy" }
    'names': { "john",
               "jeff",
               "stewie",
               "emily",
               "amy"
             }
    'names': { "john", "jeff", "stewie",
                   "emily", "amy"}

To handle this situation, we need to add one more substitute command:

    $ sed '/names/{:a;/}/!{N;b a}; s/"amy",[[:space:]]*//; s/,[[:space:]]*"amy"//}' file
    'names': { "john", "jeff", "stewie", "emily" }
    'names': { "john",
               "jeff",
               "stewie",
               "emily"
             }
    'names': { "john", "jeff", "stewie",
                   "emily"}
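
Since the question keeps the name in a shell variable, the two substitutions could be wrapped in a small function. This is only a sketch: `remove_name` is a hypothetical helper, it takes the bare name (without quotes) and reads the config from standard input, and it assumes the name contains no characters that are special in a sed regex (such as `/`, `.`, `*`, `[` or `\`).

```shell
# Hypothetical helper: remove one quoted element from the names list,
# whether it sits in the middle of the list or at the end.
# Usage: remove_name NAME < config
remove_name() {
    name=$1
    sed '/names/ {
        :a
        /}/! {
            N
            b a
        }
        s/"'"$name"'",[[:space:]]*//
        s/,[[:space:]]*"'"$name"'"//
    }'
}

# amy is the last element, on a continuation line.
last=$(remove_name amy <<'EOF'
'names': { "john", "jeff", "stewie",
               "emily", "amy"}
EOF
)

# amy is in the middle of a single-line list.
middle=$(remove_name amy <<'EOF'
'names': { "john", "jeff", "stewie", "amy", "emily" }
EOF
)

printf '%s\n%s\n' "$last" "$middle"
```

To edit a file in place as in the question, GNU sed's `-i` option and a file argument could be added to the `sed` call inside the function.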

Upvotes: 2

FlyingGuy

Reputation: 333

Why not just use bash string handling routines, copied directly from:

http://tldp.org/LDP/abs/html/string-manipulation.html

    stringZ=abcABC123ABCabc
    echo ${stringZ/abc/xyz}

result = xyzABC123ABCabc

In your case

    stringZ="'names': { \"john\", \"jeff\", \"stewie\", \"amy\", \"emily\" }"

    echo "${stringZ/\"amy\", /}"

returns 'names': { "john", "jeff", "stewie", "emily" }
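
The substitution above works on a single-line string. One possible way to stretch the same idea to the multi-line layouts from the question is to hold the whole file in a variable and let bash's `extglob` option match the whitespace after the name. Treat this as a sketch under assumptions: it needs bash (not plain sh), `content`, `pattern` and `result` are illustrative names, it deletes the first `"amy",` anywhere in the text rather than only inside the names list, and it does not cover amy being the last element.

```shell
shopt -s extglob   # bash-only: enables the *(...) pattern used below

# Stand-in for the file contents; with a real file this could be content=$(<"$f").
content=$(cat <<'EOF'
'names': { "john", "jeff", "stewie",
               "amy", "emily" }
EOF
)

# "amy", followed by any run of whitespace, newlines included.
pattern='"amy",*([[:space:]])'

# Delete the first match; $pattern is left unquoted so it is treated as a pattern.
result=${content/$pattern/}
printf '%s\n' "$result"
```

Writing the result back to the file is left out here; the sed answers handle that more directly.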

Upvotes: 0

repzero

Reputation: 8412

Using sed as follows:

    sed -r ':loop;$!{N;b loop};s/(.names.: ?\{[^}]*)"amy",? *([^}]*\})/\1\2/g' my-file

Upvotes: 0
