user7033723

How to find string with punctuation at the end

I'm trying to understand why my duplicate check fails for some strings. I have a text document whose content is updated with a new string each time, and before each insert I check whether the same string already exists. For example, if the document content is:

hello world
hello, world
hello, world.
.hello, world

the check finds the newly added string if it already exists in the file, as long as it is "hello world" or "hello, world". I use a simple condition that notifies me if the string already exists (there are no limitations or other conditions on the last character of the string):

List<string> wordsTyped = new List<string>();

if (wordsTyped.Contains(newStr))
{
    string[] allLines = File.ReadAllLines(path);
}

But it doesn't notify me when the string in the document has a punctuation mark at the end or at the beginning. For example, if "hello, world." already exists and the new insert is the same "hello, world." or the similar ",hello, world", the check does not find it and reports it as not existing.

If there is no way around this and I am forced to remove the last special symbol in the string, it would also be good to know how to do that with a regex for specific symbols (dot, comma, hash, and apostrophe) while keeping everything else.

Upvotes: 0

Views: 913

Answers (1)

Thomas Ayoub

Reputation: 29441

You might want to use a HashSet to store the strings you already have, since lookups are much faster than with a List. Then remove all the characters you don't want from the string:

static String beautify(String ugly)
{
    return String.Join("", ugly.Where(c => Char.IsLetter(c)));
}

Here I took the liberty of keeping only the characters that are letters (note that this drops spaces and digits as well as punctuation); you can, of course, adapt it to fit your needs. Then use this little program:

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    // Normalized strings seen so far; HashSet gives fast Contains checks.
    static HashSet<String> lines = new HashSet<String>();
    static List<String> input = new List<String>()
    {
        "hello world","hello, world","hello, world.",".hello, world",
    };

    static void Main(String[] args)
    {
        initList(input);
        var tests = new List<String>() {
            "h,e.l!l:o. w----orl.d.",// True
            "h,e.l!l:o. w----ol.d.",// False
        };

        foreach(var test in tests)
        {
            Console.WriteLine($"The string \"{test}\" is {(lines.Contains(beautify(test)) ? "already" : "not" )} here");
        }

        Console.ReadLine();
    }

    // Store the normalized form of every existing string.
    static void initList(List<String> input)
    {
        foreach(String s in input)
            lines.Add(beautify(s));
    }

    // Keep only letters, so punctuation and spaces are ignored when comparing.
    static String beautify(String ugly)
    {
        return String.Join("", ugly.Where(c => Char.IsLetter(c)));
    }
}

Which will output:

The string "h,e.l!l:o. w----orl.d." is already here
The string "h,e.l!l:o. w----ol.d." is not here


You can use a HashSet like so (note that Contains matches only exact strings):

lines
Count = 4
    [0]: "hello world"
    [1]: "hello, world"
    [2]: "hello, world."
    [3]: ".hello, world"
lines.Contains("hello, world.")
true
lines.Contains("hello, world..")
false
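
For the narrower case raised in the question (dropping only a leading or trailing dot, comma, hash, or apostrophe and keeping everything else), here is a minimal sketch. The names PunctuationTrimmer, TrimEnds and TrimEndsRegex are hypothetical and chosen only for illustration. It assumes that only the ends of the string should be cleaned, so punctuation inside the string, such as the comma in "hello, world", is left alone.

```csharp
using System;
using System.Text.RegularExpressions;

static class PunctuationTrimmer
{
    // The four characters named in the question: dot, comma, hash, apostrophe.
    private static readonly char[] Unwanted = { '.', ',', '#', '\'' };

    // Matches a run of those characters at the start or at the end of the string.
    private static readonly Regex Edges = new Regex(@"^[.,#']+|[.,#']+$");

    // Plain string.Trim with an explicit character set; no regex needed.
    public static string TrimEnds(string s) => s.Trim(Unwanted);

    // Regex version, since the question asked for one; same result as TrimEnds.
    public static string TrimEndsRegex(string s) => Edges.Replace(s, "");
}

class Demo
{
    static void Main()
    {
        foreach (var s in new[] { "hello, world.", ".hello, world", ",hello, world" })
        {
            // Each of the three inputs prints: hello, world
            Console.WriteLine(PunctuationTrimmer.TrimEnds(s));
        }
    }
}
```

If you apply the same trimming both when adding a string to the HashSet and when checking a new one, "hello, world.", ".hello, world" and ",hello, world" are all treated as the same entry, while "hello world" (without the comma) stays distinct.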

Upvotes: 1
