Reputation: 454
I have a question about how to compare two strings.
Here is the code:
string stringA = "This is a test item";
string stringB = "item test a is This";
Obviously, stringB contains every word from stringA, but in a different order.
My desired result should be TRUE.
My question is, what should I do? I have tried to use the .Contains() method, but the result is FALSE.
Thanks everyone.
UPDATES
Thanks, everyone, for the kind replies.
Here is my clarification:
I am actually building a database search function using LINQ and EF.
Assume an item is named "This is a test item".
If the user enters "test a is this", I would like the function to be smart enough to find the item mentioned above.
Any suggestions?
ANOTHER UPDATE
Thanks again for all your help.
I do like Peter Ritchie's, codesparkle's, Dave's and EdFred's suggestions.
Upvotes: 1
Views: 992
Reputation: 15805
Building on Peter Ritchie's excellent suggestion, using Array.Sort() instead of List<T>.Sort(), without the duplication but packed into a neat extension method:
public static bool ContainsSameWordsAs(this string first, string second)
{
    return first.GetSortedWords().SequenceEqual(second.GetSortedWords());
    // if upper and lower case words should be seen as identical, use:
    // StringComparer.OrdinalIgnoreCase as a second argument to SequenceEqual
}
private static IEnumerable<string> GetSortedWords(this string source)
{
    var result = source.Split().ToArray();
    Array.Sort(result);
    return result;
}
Usage
string stringA = "This is a test item";
string stringB = "item test a is This";
string stringC = "Not the Same is This";
bool result = stringA.ContainsSameWordsAs(stringB); // true
bool different = stringA.ContainsSameWordsAs(stringC); // false
Edit: It's hard to understand why you accepted an answer that does not comply with the requirements stated in your question. If you really want the strings "This is a test item" and "test a is this" to match, you'd need to use something a bit more involved, such as:
public static bool ContainsSameWordsAs(this string first, string second)
{
    var ignoreCase = StringComparer.OrdinalIgnoreCase;
    return first.Split().Any(word => second.Split().Contains(word, ignoreCase));
}
You may want to come up with a better algorithm though, as this one is extremely loose: a single word shared by both strings is enough to count as a match. But it does meet your requirements as stated in the question.
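One possible tightening, offered only as a sketch: require every word of the search text to occur in the item name, ignoring case and order. The class and method names (WordSearch, ContainsAllWordsOf) are hypothetical, and treating an empty query as "no match" is an assumption, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordSearch
{
    // Hypothetical helper: true when every word of `query` occurs in `text`,
    // ignoring case and word order.
    public static bool ContainsAllWordsOf(this string text, string query)
    {
        var separators = new[] { ' ' };

        // Put the words of the text into a case-insensitive set for fast lookups.
        var textWords = new HashSet<string>(
            text.Split(separators, StringSplitOptions.RemoveEmptyEntries),
            StringComparer.OrdinalIgnoreCase);

        var queryWords = query.Split(separators, StringSplitOptions.RemoveEmptyEntries);

        // Assumption: an empty query matches nothing.
        return queryWords.Length > 0 && queryWords.All(w => textWords.Contains(w));
    }
}
```

With this, "This is a test item".ContainsAllWordsOf("test a is this") is true even though the query has fewer words, while a query containing any word missing from the text is rejected.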
Upvotes: 1
Reputation: 35881
Use String.Split to split each string on spaces, load the resulting arrays into lists, sort the lists, then compare them. For example:
var x = new List<string>(stringA.Split(' '));
x.Sort();
var y = new List<string>(stringB.Split(' '));
y.Sort();
bool areEqual = x.SequenceEqual(y);
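UPDATE: If you want a case-insensitive comparison:
var x = new List<string>(stringA.Split(' '));
x.Sort(StringComparer.OrdinalIgnoreCase); // sort with the same comparer used for the comparison
var y = new List<string>(stringB.Split(' '));
y.Sort(StringComparer.OrdinalIgnoreCase);
bool areEqual = x.SequenceEqual(y, StringComparer.OrdinalIgnoreCase);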
But, if you're looking for something that will be executed in SQL Server, then you'll likely need something else.
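For the database case, one option is to keep the query composable and add one Where clause per search word. The sketch below is an assumption-laden illustration: Item, ItemSearch and WhereNameContainsAllWords are hypothetical names, and it relies on the LINQ provider being able to translate string.Contains into a SQL pattern match, which Entity Framework generally does. It matches substrings rather than whole words, and case sensitivity depends on the database collation (in memory it is case-sensitive).

```csharp
using System;
using System.Linq;

// Hypothetical entity with only the property the search needs.
public class Item
{
    public string Name { get; set; }
}

public static class ItemSearch
{
    // Hypothetical helper: adds one Where clause per search word, so an item
    // is returned only when its name contains every word.
    public static IQueryable<Item> WhereNameContainsAllWords(
        this IQueryable<Item> items, string input)
    {
        var words = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (var word in words)
        {
            var current = word; // local copy so each lambda captures its own value
            items = items.Where(i => i.Name.Contains(current));
        }

        return items;
    }
}
```

Against an EF context this might be called as context.Items.WhereNameContainsAllWords(userInput), where context.Items is assumed to be an IQueryable<Item>; nothing is sent to the database until the result is enumerated.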
Upvotes: 6
Reputation: 1789
bool match = true;
string[] stringBSplit = stringB.Split(' ');
foreach (string aString in stringA.Split(' '))
{
    if (!stringBSplit.Contains(aString))
    {
        match = false;
        break;
    }
}
Upvotes: 0
Reputation: 1780
You can split the strings on white space and compare the two resulting collections. You can use set operations from LINQ:
If the words from the first string are in the collection words1, and the words from the second string are in words2, you can perform the following operation (each Except must come back empty, meaning neither collection has a word the other lacks):
if (!words1.Except(words2).Any() && !words2.Except(words1).Any()) -> your sentences are 'equal'
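The same set comparison can also be written with HashSet<T>.SetEquals. This is a minimal sketch, not the answer's own code: WordSets and HaveSameWordSet are hypothetical names, and ignoring case is an assumption. Because it works on sets, repeated words are ignored ("a a b" equals "a b").

```csharp
using System;
using System.Collections.Generic;

public static class WordSets
{
    // Hypothetical helper: true when both strings contain the same set of
    // words, regardless of order, case, or repetition.
    public static bool HaveSameWordSet(string first, string second)
    {
        var separators = new[] { ' ' };

        var words1 = new HashSet<string>(
            first.Split(separators, StringSplitOptions.RemoveEmptyEntries),
            StringComparer.OrdinalIgnoreCase);

        // SetEquals compares using the comparer words1 was created with.
        return words1.SetEquals(
            second.Split(separators, StringSplitOptions.RemoveEmptyEntries));
    }
}
```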
Upvotes: 0
Reputation: 8103
Modeled after Bernard's explanation.
A lot of people left out a key part: you need to convert the strings with .ToLower() before you make the comparison.
EDIT: This is what you need. I made it more readable with LINQ.
public static bool Compare(string wordOne, string wordTwo)
{
    //split into words
    var wordsOne = wordOne.ToLower().Split(' ').ToList();
    var wordsTwo = wordTwo.ToLower().Split(' ').ToList();
    if (wordsOne.Count() != wordsTwo.Count()) {
        return false;
    }
    //sort alphabetically
    wordsOne.Sort((x,y) => string.Compare(x, y));
    wordsTwo.Sort((x,y) => string.Compare(x, y));
    //compare
    for (int i = 0; i < wordsOne.Count(); i++) {
        if(wordsOne[i] != wordsTwo[i])
            return false;
    }
    return true;
}
Upvotes: 2
Reputation: 711
I would split the string into tokens and test that all tokens in stringA exist in the token list for stringB. Something like:
var stringBTokens = stringB.Split(' ');
foreach(string token in stringA.Split(' '))
{
    if(stringBTokens.Contains(token) == false) return false;
}
return true;
There may be some weird regular expression that can do this, but this is a fairly straightforward test. If you want to get fancy you could use the LINQ Any method as well, like this:
var stringBTokens = stringB.Split(' ');
// no token of stringA may be missing from stringB's token list
return !stringA.Split(' ').Any(token => !stringBTokens.Contains(token));
This is basically doing the same thing; I just find the latter a bit more elegant. I hope there are no errors: I'm on my MacBook Pro and don't have anything .NET related (or Mono, etc.) installed to verify that this works.
Update
Based on your clarification, I would have a look at http://en.wikipedia.org/wiki/Inverted_index
This sounds like what you are trying to achieve. I have created these before to do rapid text searches in a database, and they work very effectively.
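To make the idea concrete, here is a minimal in-memory sketch of an inverted index: a map from each word to the IDs of the items whose names contain it. All names (InvertedIndex, Add, Search) are hypothetical, tokenizing on single spaces and ignoring case are assumptions, and a real database-backed version would keep the word-to-item mapping in a table instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory inverted index: word -> IDs of items containing it.
public class InvertedIndex
{
    private readonly Dictionary<string, HashSet<int>> _postings =
        new Dictionary<string, HashSet<int>>(StringComparer.OrdinalIgnoreCase);

    private static string[] Tokenize(string text)
    {
        return text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Registers every word of the item's name under the item's ID.
    public void Add(int itemId, string name)
    {
        foreach (var word in Tokenize(name))
        {
            HashSet<int> ids;
            if (!_postings.TryGetValue(word, out ids))
            {
                ids = new HashSet<int>();
                _postings[word] = ids;
            }
            ids.Add(itemId);
        }
    }

    // Returns the IDs of items whose names contain every word of the query.
    public IEnumerable<int> Search(string query)
    {
        HashSet<int> result = null;
        foreach (var word in Tokenize(query))
        {
            HashSet<int> ids;
            if (!_postings.TryGetValue(word, out ids))
                return Enumerable.Empty<int>(); // unknown word: nothing can match

            if (result == null)
                result = new HashSet<int>(ids); // copy so the index is not mutated
            else
                result.IntersectWith(ids);
        }

        if (result == null)
            return Enumerable.Empty<int>(); // empty query
        return result;
    }
}
```

After index.Add(1, "This is a test item"), a call to index.Search("test a is this") yields item 1, because each query word is looked up once and the ID sets are intersected rather than every item name being scanned.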
Upvotes: 2
Reputation: 23786
stringA.OrderBy(c => c).SequenceEqual(stringB.OrderBy(c => c));
Edit: Oops. Wrong approach (that compares sorted characters, not words). That'll teach me to answer too fast.
I believe this should work:
stringA.Split(' ').OrderBy(w => w).SequenceEqual(stringB.Split(' ').OrderBy(w => w));
Upvotes: 1
Reputation: 8937
I would do a 2 stage comparison:
1. Split the strings on ' ' using the .Split(' ') method and ensure they have the same number of elements (.Length property).
2. Convert the newly created arrays (the split strings) to sets and take the union of (A set-difference B) and (B set-difference A).
Then, iff test 1 passes and the set produced in test 2 is empty, the two strings contain the same words, as described above.
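The two tests could be sketched as follows. The union of the two set differences is the symmetric difference, which HashSet<T>.SymmetricExceptWith computes directly. TwoStageComparison and SameWords are hypothetical names, and the comparison here is case-sensitive. Note that strings with differently repeated words (for example "a a b" and "a b b") pass both tests.

```csharp
using System;
using System.Collections.Generic;

public static class TwoStageComparison
{
    // Hypothetical helper implementing the two tests described above.
    public static bool SameWords(string a, string b)
    {
        var wordsA = a.Split(' ');
        var wordsB = b.Split(' ');

        // Test 1: same number of words.
        if (wordsA.Length != wordsB.Length)
            return false;

        // Test 2: (A \ B) union (B \ A) must be empty.
        var difference = new HashSet<string>(wordsA);
        difference.SymmetricExceptWith(wordsB);
        return difference.Count == 0;
    }
}
```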
Michael G.
Upvotes: 1
Reputation: 7961
Try this approach:
Upvotes: 4