Reputation: 4919
I'm trying to parse some JS files (ExtJS) and find all dependencies that are used by the class defined in each file.
A sample JS file looks like this:
Ext.define('Pandora.controller.Station', {
    extend: 'Ext.app.Controller',
    refs: [{
        ref: 'stationsList',
        selector: 'stationslist'
    }],
    stores: ['Stations', 'RecentSongs'],
    ...
What I want to get is Ext.app.Controller.
With my code I'm able to get all lines that contain extend:
public void ReadAndFilter(string path)
{
    using (var reader = new StreamReader(path))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Contains("extend"))
            {
                listBox2.Items.Add(line);
            }
        }
    }
}
But this also returns comments and other unnecessary things. My idea was to use a regex to find only the class names.
My problem is that sometimes a line has spaces before and after extend.
Here are some samples that can be found in js files:
extend : 'Ext.AbstractPlugin',
extend: 'Ext.util.Observable',
@extends Sch.feature.AbstractTimeSpan
extend : "Sch.feature.AbstractTimeSpan",
extend : "Sch.plugin.Lines",
extend : "Sch.util.DragTracker",
Running RegEx on this should return:
Ext.AbstractPlugin
Ext.util.Observable
Sch.feature.AbstractTimeSpan
Sch.plugin.Lines
Sch.util.DragTracker
Here is my attempt: extend[ ]*:[ ]*['"][a-zA-Z.]*['"]
I've tested it, but I only want to get the part between the single or double quotes. Can the quotes also be validated, so that we exclude strings that open with a single quote and close with a double quote?
Regular expressions may not be the fastest option, but I have no idea how else I could do this.
Any advice is welcome.
Upvotes: 1
Views: 1967
Reputation: 71598
You can simply use a capture group; you wrap the required part between parentheses:
extend[ ]*:[ ]*['"]([a-zA-Z.]*)['"]
You then access the captured text through .Groups[1].Value
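As a minimal sketch of how that capture group might be used from C#, the helper below collects every captured class name from a block of text. The names ExtendFinder and FindParents are hypothetical and exist only for this illustration; the pattern is the one shown above, written as a verbatim string so the double quote can be escaped as "".

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class ExtendFinder
{
    // Group 1 captures the class name between the quotes.
    private static readonly Regex Pattern =
        new Regex(@"extend[ ]*:[ ]*['""]([a-zA-Z.]*)['""]");

    // Returns every captured class name found in the given text.
    public static List<string> FindParents(string source)
    {
        var names = new List<string>();
        foreach (Match match in Pattern.Matches(source))
        {
            names.Add(match.Groups[1].Value);
        }
        return names;
    }
}
```

A doc comment such as @extends Sch.feature.AbstractTimeSpan is skipped by this pattern, because no colon follows extend there.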
EDIT: As requested, this version also requires the closing quote to be the same character as the opening one:
extend *: *('|")(?<inside>[a-zA-Z.]*)\1
With this one, you can access the captured group with .Groups["inside"].Value
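The named-group variant could be wrapped roughly as follows. This is only a sketch: QuotedExtend and GetParent are made-up names, and returning null for "no match" is an assumption about what the caller wants.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class QuotedExtend
{
    // \1 refers back to the quote captured by ('|"), so the closing
    // quote must be the same character as the opening one.
    private static readonly Regex Pattern =
        new Regex(@"extend *: *('|"")(?<inside>[a-zA-Z.]*)\1");

    // Returns the class name, or null when the line does not match
    // (including lines whose quotes are mismatched).
    public static string GetParent(string line)
    {
        Match match = Pattern.Match(line);
        return match.Success ? match.Groups["inside"].Value : null;
    }
}
```

With this sketch, a line such as extend: 'Ext.app.Controller', yields Ext.app.Controller, while a line that opens with a single quote and closes with a double quote yields null.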
Upvotes: 4
Reputation: 5628
extend\s*:\s*("|')(.*?)\1
\1 is a reference to whatever is captured by the parentheses in ("|'), so it will force the quotes to match up correctly.
In this case, the matched part (that you want) winds up in Groups[2].Value
Also, simply a stylistic suggestion: don't use [ ]* for matching spaces; the character-class brackets look confusingly empty. A simple \s* is easier to read and clearer to understand (note that \s also matches tabs and other whitespace, not only spaces).
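To see how the backreference and \s* behave in practice, here is a small sketch. The names BackreferenceDemo and Capture are hypothetical. It also contrasts a greedy (.*) with a lazy (.*?): on a line that contains more than one quoted string, the greedy form runs on to the last matching quote, whereas the lazy form stops at the first.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of any library.
public static class BackreferenceDemo
{
    // Greedy: (.*) extends to the LAST quote that matches group 1.
    public static readonly Regex Greedy =
        new Regex(@"extend\s*:\s*(""|')(.*)\1");

    // Lazy: (.*?) stops at the FIRST quote that matches group 1.
    public static readonly Regex Lazy =
        new Regex(@"extend\s*:\s*(""|')(.*?)\1");

    // Returns the text of group 2, or null when the line does not match.
    public static string Capture(Regex pattern, string line)
    {
        Match match = pattern.Match(line);
        return match.Success ? match.Groups[2].Value : null;
    }
}
```

For the line extend: 'A.B', alias: 'widget.x' the lazy pattern captures A.B, while the greedy one captures everything up to the final single quote. Because \s matches tabs as well, a tab-separated extend line is also accepted.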
Upvotes: 4
Reputation: 15875
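You are only missing a capture group. Note the parentheses around [a-zA-Z.]*:
extend([ ]*):[ ]*['"]([a-zA-Z.]*)['"]
To implement this, try the following (the class name ends up in the second group, because the first group captures the spaces after extend):
var result = from Match match in Regex.Matches(line, @"extend([ ]*):[ ]*['""]([a-zA-Z.]*)['""]")
             select match.Groups[2].Value;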
Upvotes: 2