Michael Mol

Reputation: 538

Remove suffix of a delimited multiline string

While writing a script that handles filenames safely, including names that contain newlines, I came across a difficult test case.

Given the input

a.b.c
.d.staging

where this input represents a single filename, I want to strip the .staging suffix. I would normally use something akin to | rev | cut -d. -f2- | rev for this, but this fails:

echo -ne "a.b.c\\n.d.staging" | rev | cut -d. -f2- | rev

yields

a.b
.d

Besides dropping the c component along with the staging suffix, the output also ends with a lone newline that Markdown is hiding.

The best solution I've come up with so far is to use sed -e ':a' -e 'N' -e '$!ba' -e 's/\(.*\)\..*/\1/', which appears to work:

echo -ne "a.b.c\\n.d.staging" | sed -e ':a' -e 'N' -e '$!ba' -e 's/\(.*\)\..*/\1/'

yields

a.b.c
.d

which is the correct output.

This seems inelegant, since it forces sed to slurp the whole input just to cope with newlines, which sed isn't well suited to.

Is there a more elegant solution? Ideally a POSIX-compatible one.

Upvotes: 1

Views: 207

Answers (2)

anubhava

Reputation: 785721

In Bash, you can use parameter expansion:

$> s=$'a.b.c\n.d.staging'

$> echo "$s"
a.b.c
.d.staging

$> echo "${s%.staging}"
a.b.c
.d

Without Bash, you can use awk with an empty RS (paragraph mode), so the whole input is read as one record as long as it contains no blank lines:

printf "%b" 'a.b.c\n.d.staging' | awk -v RS= '{sub(/\.[^.]+$/, "")} 1'

a.b.c
.d
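One caveat worth checking before relying on the awk approach: in paragraph mode, blank lines act as record separators. The sketch below uses a made-up filename (an assumption for illustration, not from the question) that contains an empty line, to show how such a name is treated.

```shell
#!/bin/sh
# Hypothetical test name: "a", an empty line, then ".b.staging".
# With RS set to the empty string, awk splits records on blank lines,
# so this input becomes two records instead of one.
out=$(printf 'a\n\n.b.staging' | awk -v RS= '{sub(/\.[^.]+$/, "")} 1')
printf '%s\n' "$out"
```

The suffix is still removed, but the empty line inside the name is not preserved, so this approach is only safe when names cannot contain consecutive newlines.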

Upvotes: 2

chepner

Reputation: 532003

If you have the name in a variable, the newline is not an issue: parameter expansion works on the whole string rather than line by line.

$ fname=$'a.b.c\n.d.staging'
$ echo "$fname"
a.b.c
.d.staging
$ echo "${fname%.*}"
a.b.c
.d
$
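Since the question asks for something POSIX-compatible, here is a minimal sketch of the same idea for plain `sh`. The `${var%pattern}` expansion is part of POSIX shell, while the `$'...'` quoting used above is not available in every `sh`, so the sketch embeds a literal newline instead. The function name `strip_last_suffix` is a hypothetical helper introduced only for illustration.

```shell
#!/bin/sh
# Sketch only: strip_last_suffix is a hypothetical helper name.
strip_last_suffix() {
    # ${1%.*} removes the shortest trailing match of ".*", i.e. the last
    # dot and everything after it; a newline is just another character here.
    printf '%s' "${1%.*}"
}

# A literal newline inside single quotes avoids the $'...' quoting form.
fname='a.b.c
.d.staging'

result=$(strip_last_suffix "$fname")
printf '%s\n' "$result"
```

Note that command substitution strips trailing newlines, so if a stripped name could itself end in a newline, applying `${fname%.*}` directly to the variable is safer than capturing the function's output.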

Upvotes: 4
