How do I keep only Math1 from a file name like HS18_Math1.pdf? Sometimes the name is already just Math1.pdf.
Here are some examples of file names:
HS18_Dbs1.pdf //keep Dbs1.pdf
FS19_Dbs2.pdf //keep Dbs2.pdf
FS19_Math2.pdf //keep Math2.pdf
FS19_OO2.pdf //keep OO2.pdf
FS19_An1I.pdf //keep An1I.pdf
I do not have any prior experience with regex.
Thanks in advance to anyone who wants to help me.
Upvotes: 0
Views: 1071
Reputation: 1996
If your data is always in the same format as your examples, you can use Substring and take everything from the index of the underscore plus 1 onward.
Here is an example:
string originalName = "FS19_Dbs2.pdf";
string newName = originalName.Substring(originalName.IndexOf("_") + 1);
The variable newName above now holds the file name you were asking for ("Dbs2.pdf").
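A minimal sketch of the same idea applied to the sample names from the question, including one without a prefix. The class and method names (PrefixDemo, StripPrefix) are made up for illustration. It relies on IndexOf returning -1 when there is no underscore, so the substring starts at index 0 and a name like Math1.pdf comes back unchanged.

```csharp
using System;

public static class PrefixDemo
{
    // Hypothetical helper: returns everything after the first underscore.
    // IndexOf returns -1 when no underscore exists, so Substring(0)
    // hands back the whole name unchanged.
    public static string StripPrefix(string fileName)
    {
        return fileName.Substring(fileName.IndexOf('_') + 1);
    }

    public static void Main()
    {
        string[] names = { "HS18_Dbs1.pdf", "FS19_Math2.pdf", "FS19_An1I.pdf", "Math1.pdf" };
        foreach (string name in names)
        {
            Console.WriteLine(name + " -> " + StripPrefix(name));
        }
    }
}
```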
Edit: For a regex solution that gives the same result as the substring example above, you can use the pattern below. It matches everything after the last underscore in the string (for names with a single underscore, that is the same text the substring example returns).
Regex pattern:
[^_]*$
Example:
using System.Text.RegularExpressions;
Regex regexTest = new Regex(@"[^_]*$");
string originalName = "FS19_Dbs2.pdf";
var match = regexTest.Match(originalName);
string newName = match.Value;
// newName contains "Dbs2.pdf".
The variable newName above again holds the file name you were asking for.
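The two approaches differ only when a name contains more than one underscore. The sketch below illustrates that; the class and method names and the file name A_B_C.pdf are made up for illustration and do not come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnderscoreDemo
{
    // Substring approach: cuts after the FIRST underscore.
    public static string AfterFirst(string name)
    {
        return name.Substring(name.IndexOf('_') + 1);
    }

    // Regex approach: [^_]*$ matches the run of non-underscore
    // characters at the end, i.e. the text after the LAST underscore.
    public static string AfterLast(string name)
    {
        return Regex.Match(name, @"[^_]*$").Value;
    }

    public static void Main()
    {
        // "A_B_C.pdf" is a hypothetical name with two underscores.
        string[] names = { "FS19_Dbs2.pdf", "Math1.pdf", "A_B_C.pdf" };
        foreach (string name in names)
        {
            Console.WriteLine(name + ": " + AfterFirst(name) + " / " + AfterLast(name));
        }
    }
}
```

For the file names listed in the question, both methods return the same value, so the choice mostly comes down to readability.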
Upvotes: 0
Reputation: 351
You can use Substring to get the result you want; you do not necessarily need regex for this scenario. The code below gives you the required string with or without the extension, depending on the keepExtension flag.
public static void Main(string[] args)
{
    string s = "S18_Dbs1.pdf";
    string result;
    bool keepExtension = true;
    if (keepExtension)
        // everything after the underscore, e.g. "Dbs1.pdf"
        result = s.Substring(s.IndexOf('_') + 1);
    else
        // only the part between the underscore and the first dot, e.g. "Dbs1"
        result = s.Substring(s.IndexOf('_') + 1, s.IndexOf('.') - s.IndexOf('_') - 1);
    Console.WriteLine(result);
}
If you really want to solve it with regex only (again, I do not see a need for it in this scenario and would not recommend it):
// this is the equivalent regex if you want to print the name with the extension
var r = new Regex(@"(?<=_).*");
Console.WriteLine(r.Match(s));
// this is the equivalent regex if you want to print the name without the extension
var r1 = new Regex(@"(?<=_).*(?=\.)");
Console.WriteLine(r1.Match(s));
The (?<=_) is a positive lookbehind; it requires an underscore right before the match but keeps the _ out of the match itself.
The .* matches the string of characters after the underscore.
The (?=\.) is a positive lookahead; it makes the match end just before a . without including the dot.
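One thing to watch with the lookbehind patterns: a name without an underscore, such as Math1.pdf, produces no match at all. The sketch below shows one way to detect that. The class, field, and method names are hypothetical, and returning null for "no match" is just one possible design choice.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Same patterns as above; only the wrapper names are hypothetical.
    public static readonly Regex WithExtension = new Regex(@"(?<=_).*");
    public static readonly Regex WithoutExtension = new Regex(@"(?<=_).*(?=\.)");

    // Returns the matched text, or null when the pattern does not match
    // (for example, when the name contains no underscore).
    public static string Extract(Regex pattern, string fileName)
    {
        Match m = pattern.Match(fileName);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        string[] names = { "FS19_OO2.pdf", "Math1.pdf" };
        foreach (string name in names)
        {
            string full = Extract(WithExtension, name) ?? "(no match)";
            string bare = Extract(WithoutExtension, name) ?? "(no match)";
            Console.WriteLine(name + ": " + full + " / " + bare);
        }
    }
}
```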
I recommend reading the regex documentation before you start experimenting with it. Even before that, check whether your scenario can be solved without regex, since that keeps your code easier for others to understand, among other benefits.
Upvotes: 0
Reputation: 46
This is fairly simple for regex to handle:
string newStr = Regex.Replace(yourInputStr, @"[a-zA-Z0-9]+_", String.Empty);
Here are some helpful resources: https://regexone.com/ https://www.dotnetperls.com/regex https://regex101.com/
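A small sketch of the Regex.Replace approach run over a few of the sample names. The class and method names are hypothetical, and the pattern is written without a backslash before the underscore, since the underscore is not a regex metacharacter. When the pattern finds nothing to replace, Regex.Replace returns the input unchanged, so a name like Math1.pdf passes through as-is.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceDemo
{
    // Hypothetical helper: removes a run of letters/digits
    // followed by an underscore, wherever it occurs.
    public static string StripPrefix(string fileName)
    {
        return Regex.Replace(fileName, @"[a-zA-Z0-9]+_", String.Empty);
    }

    public static void Main()
    {
        string[] names = { "HS18_Math1.pdf", "FS19_An1I.pdf", "Math1.pdf" };
        foreach (string name in names)
        {
            Console.WriteLine(name + " -> " + StripPrefix(name));
        }
    }
}
```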
Upvotes: 1