ricardo

Reputation: 8425

remove initial period and text after final period in string

I have a regex edge case that I am unable to solve. I need a regex that removes the leading period (if it exists) and the last period together with the text following it (if it exists) from a string.

That is, given a vector:

x <- c("abc.txt", "abc.com.plist", ".abc.com")

I'd like to get the output:

[1] "abc"     "abc.com" "abc"

The first two cases are already solved; I got help with them in this related question. The third case, with the leading period, is not.

I am sure it is trivial, but I'm not making the connection.

Upvotes: 2

Views: 1562

Answers (1)

Tim Pietzcker

Reputation: 336158

This regex does what you want:

^\.+|\.[^.]*$

Replace its matches with the empty string.

In R:

gsub("^\\.+|\\.[^.]*$", "", subject, perl = TRUE)

Explanation:

^      # Anchor the match to the start of the string
\.+    # and match one or more dots
|      # OR
\.     # Match a dot
[^.]*  # plus any characters except dots
$      # anchored to the end of the string.
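
As a quick check, here is a minimal sketch that applies the pattern to the vector from the question. The wrapper name `strip_dots` is hypothetical (it is not part of base R); it only wraps the `gsub()` call shown above.

```r
# Hypothetical helper wrapping the gsub() call from the answer.
strip_dots <- function(s) {
  # Remove any leading dots, or the final dot plus everything after it.
  gsub("^\\.+|\\.[^.]*$", "", s, perl = TRUE)
}

x <- c("abc.txt", "abc.com.plist", ".abc.com")
result <- strip_dots(x)
print(result)
# [1] "abc"     "abc.com" "abc"
```

Note that `^\.+` strips every leading dot, not just one, so an input such as `"..abc.com"` would also come back as `"abc"`. If only a single leading dot should be removed, `^\.` in place of `^\.+` would be the variant to try.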

Upvotes: 4
