Reputation: 2800
I acknowledge this has been asked in different ways in the past. However, I get lost with gsub.
I have this dataframe:
df <- structure(list(Real = c(7.76, 5.55, 4.8, 4.68, 7.43, 4.59), Predicted = c(7.36,
5.28, 5.12, 4.47, 7.48, 4.69), PdivR = c(0.95, 0.95, 1.07, 0.96,
1.01, 1.02), Regression = c("`TLC`~`7_A`.152534", "`TLC`~`7_A`.158324",
"`TLC`~`7_A`.611461", "`TLC`~`7_A`.627267", "`TLC`~`7_A`.674564",
"`TLC`~`7_A`.675169")), row.names = c(NA, 6L), class = "data.frame")
Which can be displayed in this way:
head(df)
  Real Predicted PdivR         Regression
1 7.76      7.36  0.95 `TLC`~`7_A`.152534
2 5.55      5.28  0.95 `TLC`~`7_A`.158324
3 4.80      5.12  1.07 `TLC`~`7_A`.611461
4 4.68      4.47  0.96 `TLC`~`7_A`.627267
5 7.43      7.48  1.01 `TLC`~`7_A`.674564
6 4.59      4.69  1.02 `TLC`~`7_A`.675169
In the Regression column, I would like to remove the dot (.) and the digits to its right, and also the backtick symbol (`), so that only TLC ~ 7_A remains.
Note that the number of digits after the dot varies along the column, but the pattern is always the same.
How could I do it with gsub?
Upvotes: 0
Views: 897
Reputation: 886998
We can use sub to match the . (\\. - escaped, as it is a metacharacter that matches any character) followed by one or more digits (\\d+) up to the end ($) of the string and replace the match with blank (""), then wrap that call in gsub to match the backtick ("`") and remove it:
df$Regression <- gsub("`", "", sub("\\.\\d+$", '', df$Regression))
df$Regression
[1] "TLC~7_A" "TLC~7_A" "TLC~7_A" "TLC~7_A" "TLC~7_A" "TLC~7_A"
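Since the question asks specifically about gsub, the two steps can also be folded into one call by using alternation (|) in the pattern. This is only a sketch of an alternative, not a replacement for the code above; the vector x is a hypothetical stand-in for df$Regression, and its last two entries are made-up values added to show that a different number of digits after the dot is handled the same way.

```r
# Hypothetical stand-in for df$Regression: two values from the question
# plus two made-up values with a different number of trailing digits.
x <- c("`TLC`~`7_A`.152534", "`TLC`~`7_A`.675169",
       "`TLC`~`7_A`.42", "`TLC`~`7_A`.123456789")

# One gsub call: remove every backtick OR a ".digits" suffix
# anchored at the end of the string.
cleaned <- gsub("`|\\.\\d+$", "", x)
cleaned
# [1] "TLC~7_A" "TLC~7_A" "TLC~7_A" "TLC~7_A"
```

Because the numeric part is anchored with $, a dot elsewhere in the string would be left alone; only the trailing suffix and the backticks are removed.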
Upvotes: 1