Reputation: 2861
Below is an example of a string I have ingested into R:
General\\Contingency\\Import\\Import_Manual\\New\\ADC170001A13_Loc.txt
I am trying to isolate the 'ADC170001A13' part. I have tried substring and also gsub to remove everything except that part of the string, but I get the error below:
Error in gsub(clean, "", TextLOCfiles) :
invalid regular expression '\\Fs01 \DepartmentFolders$\General\Contingency\Import\Import_Manual\New\', reason 'Trailing backslash'
In addition: Warning message:
In gsub(clean, "", TextLOCfiles) :
argument 'pattern' has length > 1 and only the first element will be used
Upvotes: 2
Views: 245
Reputation: 269586
Try this:
library( tools )
basename( file_path_sans_ext( TextLOCfiles ) )
or without loading any packages:
sub( "\\.[^.]*$", "", basename( TextLOCfiles ) )
These solutions do not require that you know the file name or extension and also work if there is no extension.
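Note that on the sample path both calls return the file name without its extension, "ADC170001A13_Loc", so the "_Loc" suffix still has to be removed. A minimal sketch of that extra step follows. It assumes the suffix is always the literal "_Loc", and it converts backslashes to forward slashes first, because basename() only treats a backslash as a separator on Windows. The variable names path, stem and id are illustrative.

```r
# Sample path from the question; in R source, "\\" denotes one literal backslash.
TextLOCfiles <- "General\\Contingency\\Import\\Import_Manual\\New\\ADC170001A13_Loc.txt"

# Assumption: normalise separators so basename() behaves the same on every platform.
path <- chartr("\\", "/", TextLOCfiles)

# File name without directory and without extension.
stem <- tools::file_path_sans_ext(basename(path))

# Assumption: the trailing tag is always the literal "_Loc".
id <- sub("_Loc$", "", stem)
print(id)
```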
Upvotes: 2
Reputation: 59110
You can capture the needed part with gsub and parentheses:
> gsub(".*\\\\(\\w+)_.*", "\\1", TextLOCfiles)
[1] "ADC170001A13"
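In this pattern, ".*\\\\" consumes everything up to the last backslash, "(\\w+)" captures the word characters that follow, and "_.*" matches the final underscore and the rest of the name. Since the pattern matches the whole string once, sub() works as well as gsub(). A small sketch on a vector, where the second path is made up for illustration:

```r
paths <- c(
  "General\\Contingency\\Import\\Import_Manual\\New\\ADC170001A13_Loc.txt",
  "General\\Contingency\\Import\\Import_Manual\\New\\XYZ000002B07_Loc.txt"  # hypothetical
)

# "\\1" replaces each whole string with the contents of the capture group.
ids <- sub(".*\\\\(\\w+)_.*", "\\1", paths)
print(ids)
```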
Upvotes: 2
Reputation: 545588
The easiest solution is to use regmatches:
> rxmatch = regexpr('(?<=\\\\)\\w+(?=_Loc\\.)', TextLOCfiles, perl = TRUE)
> regmatches(TextLOCfiles , rxmatch)
[1] "ADC170001A13"
perl = TRUE is required in order to get the zero-width assertions (the lookbehind and lookahead), as mentioned by Simon in the comments.
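One thing to keep in mind is that regmatches() drops elements that did not match, so its result can be shorter than the input vector. A sketch of one way to keep the result aligned with the input, assuming NA is an acceptable placeholder for non-matching entries (the second path is invented for illustration):

```r
paths <- c(
  "New\\ADC170001A13_Loc.txt",
  "New\\readme.txt"  # hypothetical entry that does not match
)

# regexpr() returns -1 for elements without a match.
m <- regexpr("(?<=\\\\)\\w+(?=_Loc\\.)", paths, perl = TRUE)

# Fill only the positions that matched; the rest stay NA.
ids <- rep(NA_character_, length(paths))
ids[m != -1] <- regmatches(paths, m)
print(ids)
```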
Upvotes: 2
Reputation: 6534
This looks like a file path. If this is true then you can simply use basename() as follows:
sub("\\.txt$", "", basename(TextLOCfiles))
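As for the error in the question: gsub() treats its pattern as a regular expression, so a path that ends in a backslash is rejected with "Trailing backslash", and the pattern must be a single string rather than a vector. If the prefix really is known in advance, a minimal sketch is to match it literally with fixed = TRUE. The prefix value below is an assumption made up from the sample path, not the one used in the question.

```r
x <- "General\\Contingency\\Import\\Import_Manual\\New\\ADC170001A13_Loc.txt"

# Hypothetical single prefix; with fixed = TRUE the backslashes are not regex escapes.
prefix <- "General\\Contingency\\Import\\Import_Manual\\New\\"

rest <- sub(prefix, "", x, fixed = TRUE)        # strip the directory part
id <- sub("_Loc.txt", "", rest, fixed = TRUE)   # strip the literal suffix
print(id)
```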
Upvotes: 2