Reputation: 57
Below is a code in C# for rss reader, why this code is bad? this class generates a list of the 5 most recent posts, sorted by title. What do you use to analyze code in C#?
static Story[] Parse(string content)
{
var items = new List<string>();
int start = 0;
while (true)
{
var nextItemStart = content.IndexOf("<item>", start);
var nextItemEnd = content.IndexOf("</item>", nextItemStart);
if (nextItemStart < 0 || nextItemEnd < 0) break;
String nextItem = content.Substring(nextItemStart, nextItemEnd + 7 - nextItemStart);
items.Add(nextItem);
start = nextItemEnd;
}
var stories = new List<Story>();
for (byte i = 0; i < items.Count; i++)
{
stories.Add(new Story()
{
title = Regex.Match(items[i], "(?<=<title>).*(?=</title>)").Value,
link = Regex.Match(items[i], "(?<=<link>).*(?=</link>)").Value,
date = Regex.Match(items[i], "(?<=<pubDate>).*(?=</pubdate>)").Value
});
}
return stories.ToArray();
}
Upvotes: 1
Views: 258
Reputation: 20157
It is bad because it's using string parsing when there are excellent classes in the framework for parsing XML. Even better, there are classes to deal with RSS feeds.
ETA:
Sorry to not have answered your second question earlier. There are a great number of tools to analyze correctness and quality of C# code. There's probably a huge list compiled somewhere, but here's a few I use on a daily basis to help ensure quality code:
Upvotes: 3
Reputation: 38217
For starters, it uses byte
as a indexer instead of int
(what if items
has more items in it than a byte
can represent?). It doesn't use idiomatic C# (see user1645569's response). It also unnecessarily uses var
instead of specific data types (that's more stylistic though, but for me I prefer not to, hence by my metric it's not ideal (and you've given no other metric)).
Let me clarify what I am saying about "unnecessarily using var
": var
in and of itself is not bad, and I am not suggesting that. I am (mostly) suggesting the usage here isn't very consistent. For example, explicitly declaring start
as an int
, but then declaring nextItemEnd
as var
(which will deduce to int
) and assigning nextItemEnd
to start
seems (to me) like a weird mixture between wanting to automatically deduce a variable's type and explicitly declaring it. I think it's good that var
was not used in start
's declaration (because then it's not exactly clear if the intent is an integer or a floating point number), but I (personally) don't think it helps any to declare nextItemStart
and nextItemEnd
as var
. I tend to prefer to use var
for more complex/longer data types (similar to how I use auto
in C++ for iterators, but not for "simpler" data types).
Upvotes: 1
Reputation: 23319
simply because you are reinventing a xml parser
, use Linq to xml
instead, it is very simple and clean.I am sure that you can do all the above in three lines of code if you use Linq to XML
, your code uses a lot of magic numbers (ex: 7-n ..), the thing that make it unstable and non generic
Upvotes: 1
Reputation: 43503
I think the worst thing for the code is the performance issue. You should parse the xml string into a XDocument(or similar structure) instead of parse it again and again using regex.
Upvotes: 1
Reputation: 15909
You shouldn't parse XML with string functions and regular expressions. XML can get very complicated and be formatted many ways that a real XML parser like XmlReader can handle, but will break your simple string parsing code.
Basically: don't try and reinvent the wheel (an xml parser), especially when you don't realize how complicated that wheel actually is.
Upvotes: 1