Reputation:
I have a C# application in which I'm using a Regex to run an expect against a Unix response. I currently have this:
//will pick up :
// What is your name?:
// [root@localhost ~]#
// [root@localhost ~]$
// Do you want to continue [y/N]
// Do you want to continue [Y/n]
const string Command_Prompt_Only = @"[$#]|\[.*@(.*?)\][$%#]";
const string Command_Question_Only = @".*\?:|.*\[y/N\]/g";
const string Command_Prompt_Question = Command_Question_Only + "|" + Command_Prompt_Only;
This works when I test it at www.regexpal.com, but I think it needs some optimization: at times it seems to slow way down when I use Command_Prompt_Question.
var promptRegex = new Regex(Command_Prompt_Question);
var output = _shellStream.Expect(promptRegex, timeOut);
I should mention that I'm using SSH.NET to talk to these Linux servers, but I don't think it's an SSH.NET issue, because it's fast when I use Command_Prompt_Only.
Does anyone see any issues with the const string I'm using? Is there a better way to do it?
My project is open source if you feel like you want to go play with it.
https://github.com/gavin1970/Linux-Commander
Code in question: https://github.com/gavin1970/Linux-Commander/blob/master/Linux-Commander/common/Ssh.cs
It's called Linux Commander, and I'm attempting to build a virtual Linux console with Ansible support.
Upvotes: 0
Views: 86
Reputation: 155055
Try this:
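using System.Text.RegularExpressions;

class Foo
{
    const string Command_Prompt_Only = @"[$#]|\[.*@(.*?)\][$%#]";
    // No trailing "/g": that is JavaScript flag syntax and would be matched literally in .NET.
    const string Command_Question_Only = @".*\?:|.*\[y/N\]";
    // Each half of the alternation is wrapped in its own non-capturing group.
    const string Command_Prompt_Question = "(?:" + Command_Question_Only + ")|(?:" + Command_Prompt_Only + ")";

    // Build the regex once and reuse it instead of constructing it on every call.
    private static readonly Regex _promptRegex = new Regex( Command_Prompt_Question, RegexOptions.Compiled );

    // A method cannot share the name of its enclosing class, so this is not called Foo().
    public void WaitForPrompt()
    {
        // ...
        // _shellStream and timeOut come from your existing code.
        var output = _shellStream.Expect( _promptRegex, timeOut );
    }
}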
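To sanity-check the combined pattern outside of the SSH session, something like the sketch below could be run against the sample lines from the question. The class name `PromptPatternCheck` and the method `IsPrompt` are made up for illustration and are not part of Linux Commander or SSH.NET. One thing it shows: because .NET matching is case-sensitive by default, `\[y/N\]` as written does not pick up the `[Y/n]` sample.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for trying the patterns locally; not part of the asker's project.
public static class PromptPatternCheck
{
    const string CommandPromptOnly = @"[$#]|\[.*@(.*?)\][$%#]";
    const string CommandQuestionOnly = @".*\?:|.*\[y/N\]";
    const string CommandPromptQuestion =
        "(?:" + CommandQuestionOnly + ")|(?:" + CommandPromptOnly + ")";

    // Compiled once and shared, as in the class above.
    static readonly Regex PromptRegex =
        new Regex(CommandPromptQuestion, RegexOptions.Compiled);

    // True when the line contains something the combined pattern treats as a prompt.
    public static bool IsPrompt(string line) => PromptRegex.IsMatch(line);

    // Prints one line per sample showing whether it matched.
    public static void PrintResults()
    {
        string[] samples =
        {
            "What is your name?:",
            "[root@localhost ~]#",
            "[root@localhost ~]$",
            "Do you want to continue [y/N]",
            "Do you want to continue [Y/n]",
        };
        foreach (var sample in samples)
        {
            Console.WriteLine($"{IsPrompt(sample)} : {sample}");
        }
    }
}
```

If the `[Y/n]` form needs to match as well, one option is to pass RegexOptions.IgnoreCase alongside RegexOptions.Compiled, assuming case-insensitive matching of the rest of the pattern is acceptable.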
Upvotes: -1
Reputation: 31596
Does anyone see any issues with the const string I'm using?
Yes, there is too much backtracking in those patterns.
If one knows that there is at least one item, specifying a * (zero or more) can cause the parser to look over many zero type assertions. It's better to prefer the + (one or more) quantifier, which can shave a lot of time off of exploring dead ends in backtracking.
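As a rough illustration of what the two quantifiers accept (this is not a benchmark), the sketch below compares a `*` version and a `+` version of a simplified, anchored prompt pattern. The class and method names are hypothetical, and the patterns are cut down from the one in the question purely for the demonstration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo: same shape of pattern, differing only in * versus +.
public static class QuantifierDemo
{
    // Zero or more characters on each side of the "@".
    public static bool MatchesStar(string input) =>
        Regex.IsMatch(input, @"^\[.*@.*\]$");

    // At least one character on each side of the "@".
    public static bool MatchesPlus(string input) =>
        Regex.IsMatch(input, @"^\[.+@.+\]$");
}
```

With these, `QuantifierDemo.MatchesStar("[@]")` is true because both `.*` parts may match nothing, while `QuantifierDemo.MatchesPlus("[@]")` is false. Both accept "[root@localhost]". Ruling out the empty cases up front gives the engine fewer alternatives to try.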
This is interesting: \[.*@(.*?)\]
Why not use the negated set ([^ ]) pattern instead, such as this change:
\[[^@]+@[^\]]+\]
This says: anchor off of a literal "[", then find one or more characters that are not a literal "@" ([^@]+), then the "@" itself, and then find one or more characters that are not a literal "]" ([^\]]+), followed by the closing "]".
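A minimal sketch of how the negated-set version might be tried out, assuming the second set is written with its closing bracket as `[^\]]+` and that the trailing `[$%#]` from the question's pattern is kept. The names `NegatedClassDemo` and `FindPrompt` are hypothetical. Note that this version drops the capture group around the host part that the original pattern had.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo of the negated-set prompt pattern.
public static class NegatedClassDemo
{
    // "[" + non-"@" chars + "@" + non-"]" chars + "]" + one of $, % or #.
    static readonly Regex Prompt =
        new Regex(@"\[[^@]+@[^\]]+\][$%#]", RegexOptions.Compiled);

    // Returns the first prompt found in the text, or an empty string if there is none.
    public static string FindPrompt(string text)
    {
        Match match = Prompt.Match(text);
        return match.Success ? match.Value : string.Empty;
    }
}
```

For example, `NegatedClassDemo.FindPrompt("Last login: Mon\n[root@localhost ~]# ")` returns "[root@localhost ~]#", and text without a bracketed prompt returns an empty string.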
Upvotes: 1