user1916528

How to use regex in C# to remove all unwanted characters from a string?

I tried a solution offered in the "Regex Code Review" question, but I can't get it to work, even though it covers exactly the scenario I'm dealing with. I'm not familiar with regex yet, but I would like to use it to strip the characters "CN=" from a string and then remove everything from the first comma onward. For example,

CN=Joseph Rod,OU=LaptopUser,OU=Users,DC=Company,DC=local

becomes

Joseph Rod

Code:

protected void Page_Load(object sender, EventArgs e)
    {
        DataTable dt = new DataTable();

        dt.Columns.AddRange(new DataColumn[5]
        {
            new DataColumn("givenName", typeof (string)),
            new DataColumn("sn", typeof (string)),
            new DataColumn("mail", typeof (string)),
            new DataColumn("department", typeof (string)),
            new DataColumn("manager", typeof (string))
        });

        using (var context = new PrincipalContext(ContextType.Domain, null))
        {
            using (var group = GroupPrincipal.FindByIdentity(context, "Users"))
            {
                var users = group.GetMembers(true);
                foreach (UserPrincipal user in users)
                {
                    DirectoryEntry de = user.GetUnderlyingObject() as DirectoryEntry;
                    dt.Rows.Add
                    (
                        Convert.ToString(de.Properties["givenName"].Value),
                        Convert.ToString(de.Properties["sn"].Value),
                        Convert.ToString(de.Properties["mail"].Value),
                        Convert.ToString(de.Properties["department"].Value),
                        Regex.Replace((Convert.ToString(de.Properties["manager"].Value)), @"CN=([^,]*),", "$1")
                    );
                }
                rgAdUsrs.DataSource = dt;
                rgAdUsrs.DataBind();
            }
        }
    }

However, my code only removes "CN=" and the first comma. I need everything from the first comma onward to be removed as well.

Results of above code:

Joseph RodOU=LaptopUser,OU=Users,DC=Company,DC=local

How can I modify the regex to remove the characters to the right of the comma as well?

Upvotes: 1

Views: 497

Answers (2)

wp78de

To remove the rest of the line as well, extend the pattern so that it also matches everything after the first comma:

CN=([^,]*),.*$

and then replace with $1.
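
As a minimal sketch of how that pattern could be applied with Regex.Replace (the class and method names here are hypothetical and not part of the question's code; in the question, the pattern would simply replace the one passed to Regex.Replace for the "manager" column):

```csharp
using System;
using System.Text.RegularExpressions;

public static class DnRegexExample
{
    // Hypothetical helper name, introduced only for illustration.
    public static string ExtractCn(string dn)
    {
        // "CN=", then a captured run of non-comma characters,
        // then the first comma and everything up to the end of the line.
        // The whole match is replaced by the captured group ($1).
        return Regex.Replace(dn, @"CN=([^,]*),.*$", "$1");
    }

    public static void Main()
    {
        Console.WriteLine(ExtractCn("CN=Joseph Rod,OU=LaptopUser,OU=Users,DC=Company,DC=local"));
        // prints: Joseph Rod
    }
}
```

If the input contains no comma, the pattern does not match and the string is returned unchanged.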

However, as already mentioned, you do not really need regex to achieve this. The following takes the substring between the first = and the first , in the string:

int start = input.IndexOf('=') + 1; // index just past the first '='
Console.WriteLine(input.Substring(start, input.IndexOf(',') - start));

Upvotes: 2

maccettura

If the string always starts with "CN=" you can get the data very easily with string.Substring():

string input = "CN=Joseph Rod,OU=LaptopUser,OU=Users,DC=Company,DC=local";

// take the substring starting at index 3 (just past "CN=") and ending before the first comma
Console.WriteLine(input.Substring(3, input.IndexOf(',') - 3));

//Output: "Joseph Rod"

If the "CN=" component can appear anywhere in the string, but the string always follows the same comma-separated pattern, you can use Split() and some LINQ:

// FirstOrDefault requires: using System.Linq;
string input = "OU=LaptopUser,CN=Joseph Rod,OU=Users,DC=Company,DC=local";
string[] splitInput = input.Split(',');
Console.WriteLine(splitInput.FirstOrDefault(x => x.StartsWith("CN="))?.Substring(3));

//Output: "Joseph Rod"

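This assumes reasonable input, of course. The Substring() version throws an ArgumentOutOfRangeException if the string contains no comma, and the Split() version prints an empty line if no component starts with "CN=".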
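
If the input cannot be trusted (for example, an empty "manager" property), a more defensive variant of the Split() approach might look like the sketch below. The class and method names are hypothetical, and it still assumes that the name itself contains no escaped commas:

```csharp
using System;
using System.Linq;

public static class DnParseExample
{
    // Hypothetical helper: returns the value of the first "CN=" component,
    // or null when the input is empty or has no such component.
    public static string GetCommonName(string dn)
    {
        if (string.IsNullOrEmpty(dn))
        {
            return null;
        }

        // Split into components, trim stray whitespace, and pick the first CN.
        string cn = dn.Split(',')
                      .Select(part => part.Trim())
                      .FirstOrDefault(part => part.StartsWith("CN=", StringComparison.OrdinalIgnoreCase));

        // "CN=" is three characters long, so the value starts at index 3.
        return cn?.Substring(3);
    }
}
```

The caller can then decide what to display when the result is null, instead of handling an exception.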

Upvotes: 1
