cxou

Reputation: 121

How do you delete every line but those starting with one of several specific words? bash script

I am looking for a command that will allow me to strip everything from a variable but lines starting with one of several specific words.

I have looked at many sed commands but simply cannot solve the issue. I either get a ! in place of the first character of the wanted word, or just the last character deleted from the other (in this case single) line.

The following is merely an example of what I have tried - meaning that I do not seek alternatives to the method itself. Only how to clean the variable from everything but the given lines!

distro_raw="lsb_release -si"
distro=`echo $distro_raw | sed -r '/ubuntu/!d'`

I have tried other ways (not using -r but rather '/s//g' as well). I am merely trying to exemplify what I want with code. It is obviously wrong but may make the problem more clear.

EDIT:

A clearer example:

server_file=`cat /etc/apt/sources.list`
server=`echo $server_file | sed ${what_to_write_before}deb${what_to_write_after}`

which should then delete everything but the lines starting with "deb" and store the result in the variable server. What I do not know is what to surround the word "deb" with, so that the command only returns the lines starting with "deb".

Example input:

# deb cdrom:[Ubuntu-Server 14.10 _Utopic Unicorn_ - Release amd64 (20141022.2)]/ utopic main restricted

# deb cdrom:[Ubuntu-Server 14.10 _Utopic Unicorn_ - Release amd64 (20141022.2)]/ utopic main restricted

# See http://help.ubuntu.com/community/UpgradeNotes for how to upgrade to
# newer versions of the distribution.
deb http://dk.archive.ubuntu.com/ubuntu/ vivid main restricted
deb-src http://dk.archive.ubuntu.com/ubuntu/ vivid main restricted
## Major bug fix updates produced after the final release of the
## distribution.
deb http://dk.archive.ubuntu.com/ubuntu/ vivid-updates main restricted
deb-src http://dk.archive.ubuntu.com/ubuntu/ vivid-updates main restricted

The wanted output is

deb http://dk.archive.ubuntu.com/ubuntu/ vivid main restricted
deb-src http://dk.archive.ubuntu.com/ubuntu/ vivid main restricted
deb http://dk.archive.ubuntu.com/ubuntu/ vivid-updates main restricted
deb-src http://dk.archive.ubuntu.com/ubuntu/ vivid-updates main restricted

Upvotes: 0

Views: 83

Answers (3)

josifoski

Reputation: 1726

Going by the title of the question:

sed -n '/^one\|^two\|^three/p' file

will keep only the lines starting with one, two, or three.

sed '/^one\|^two\|^three/d' file

will delete the lines starting with those words.
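
The \| alternation above is a GNU sed extension to basic regular expressions. As a minimal sketch, the same filter can be written with extended regular expressions (-E). The function name keep_lines_starting_with and the sample words are only illustrative. The trailing ([[:space:]]|$) is an added assumption that a "word" must be followed by whitespace or the end of the line, so that "oneself" does not count as starting with "one".

```shell
# Hypothetical helper: keep only stdin lines whose first word is one of the arguments.
# The body runs in a subshell, so changing IFS here does not leak to the caller.
keep_lines_starting_with() (
  IFS='|'   # "$*" joins the arguments with the first character of IFS
  sed -n -E "/^($*)([[:space:]]|\$)/p"
)

printf '%s\n' 'one fish' 'oneself' '# two' 'two fish' 'three' |
  keep_lines_starting_with one two three
```

The arguments are pasted into the regex as they are, so this sketch only suits plain words without regex metacharacters.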

Upvotes: 1

Charles Duffy

Reputation: 295443

Don't capture the file's contents and set up a pipeline -- just point grep (or sed, or awk, or any other text processing tool that can filter with a regex) directly against your file, using ^ to anchor your regex to the front of the line:

result=$(grep -E '^deb' </etc/apt/sources.list)

Now, if you had more words than just "deb", an alternation would be appropriate:

result=$(grep -E '^(deb|foo|bar)' </etc/apt/sources.list)
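
One detail that may explain the odd results in the question: an unquoted expansion such as echo $server_file undergoes word splitting, so the newlines collapse into spaces and sed sees a single line. The sketch below shows the difference using a made-up sample in place of /etc/apt/sources.list. The sample text, the example.invalid URLs, and the variable names sample and collapsed are assumptions for illustration, as is the trailing [[:space:]] that keeps "debug" from matching "deb".

```shell
# Made-up sample standing in for /etc/apt/sources.list.
sample='# deb cdrom:[example]/ utopic main
deb http://example.invalid/ubuntu/ vivid main
deb-src http://example.invalid/ubuntu/ vivid main
debug this line should not match'

# Keep lines whose first word is exactly deb or deb-src.
result=$(printf '%s\n' "$sample" | grep -E '^(deb|deb-src)[[:space:]]')

# Quoted expansion: the line breaks survive, one entry per line.
printf '%s\n' "$result"

# Unquoted expansion: word splitting joins everything into a single line.
collapsed=$(echo $result)
printf '%s\n' "$collapsed"
```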

That said, if you want all non-comment content from the file, I wouldn't do it that way at all: Just filter out the comments and blank lines (including lines which are blank after removing comments):

sed -e 's/#.*//' </etc/apt/sources.list | grep -E -v '^[[:space:]]*$'
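
A sketch of comment stripping run against a made-up sample rather than the real file. The function name strip_comments and the sample lines are assumptions for illustration. The grep pattern is anchored at both ends (^[[:space:]]*$) so that only empty or whitespace-only lines are dropped.

```shell
# Hypothetical helper: strip comments, then drop lines that are empty or whitespace-only.
strip_comments() {
  sed -e 's/#.*//' | grep -E -v '^[[:space:]]*$'
}

printf '%s\n' \
  '# full-line comment' \
  '' \
  'deb http://example.invalid/ubuntu/ vivid main   # trailing comment' \
  '   # indented comment' \
  'deb-src http://example.invalid/ubuntu/ vivid main' |
  strip_comments
```

Only the two deb and deb-src lines should remain. Whitespace that preceded a trailing comment is left in place.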

Finally, for your amusement, here's an approach in pure bash that really does extract only server names, rather than putting whole lines into a variable named server, and filters them for uniqueness:

# Collect server URLs into an associative array
declare -A servers=( )
while read -r; do
  line=${REPLY%%#*}
  [[ $line ]] || continue
  read -r type url repos <<<"$line"
  echo "Found a line of type $type with url $url for repos $repos" >&2
  servers["$url"]=$repos
done </etc/apt/sources.list

# Iterate over the servers we found:
for server in "${!servers[@]}"; do
  echo "$server"
done
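
The two steps that do the real work in the loop above are the ${...%%#*} expansion and the three-variable read. A small sketch of both on a single made-up line follows. The sample line and its URL are assumptions, and a here-document is used instead of <<< only so the snippet also runs in a plain POSIX shell.

```shell
# Made-up line in sources.list format with a trailing comment.
line='deb http://example.invalid/ubuntu/ vivid main restricted   # note'

# Remove the longest suffix that starts with '#', i.e. everything from the first '#'.
line=${line%%#*}

# read puts the first word in type, the second in url, and the remainder in repos.
read -r type url repos <<EOF
$line
EOF

printf 'type=%s url=%s repos=%s\n' "$type" "$url" "$repos"
```

With the default IFS, read also trims the whitespace left behind by the removed comment.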

Upvotes: 5

mproffitt

Reputation: 2527

Using the example you supplied in your question, this could be as simple as

sed '/ubuntu/!d' /etc/apt/sources.list

Example output:

deb http://gb.archive.ubuntu.com/ubuntu/ trusty main restricted
deb-src http://gb.archive.ubuntu.com/ubuntu/ trusty main restricted
deb http://gb.archive.ubuntu.com/ubuntu/ trusty-updates main restricted
deb-src http://gb.archive.ubuntu.com/ubuntu/ trusty-updates main restricted
deb http://gb.archive.ubuntu.com/ubuntu/ trusty universe
deb-src http://gb.archive.ubuntu.com/ubuntu/ trusty universe
deb http://gb.archive.ubuntu.com/ubuntu/ trusty-updates universe
deb-src http://gb.archive.ubuntu.com/ubuntu/ trusty-updates universe
...
# deb-src http://extras.ubuntu.com/ubuntu trusty main

Upvotes: 0
