ghaith alserhan

Reputation: 49

Remove "ال" from every word in an Arabic string

I'm trying to remove "ال" from every word in an Arabic string that starts with it.

I tried the code below, but it only removes "ال" from the first word:

input      : الغيث الغيث الغيث
output     : غيث الغيث الغيث
what I need: غيث غيث غيث

string[] prefixes = { "ال", "اَلْ", "الْ", "اَل" };

foreach (string prefix in prefixes)
{
    if (text.StartsWith(prefix))
    {
        text = text.Substring(prefix.Length);
        break;
    }
}
Upvotes: 2

Views: 92

Answers (2)

SaSkY

Reputation: 1086

Try this regex:

\b\u0627(?:\u0644\u0652?|\u064e\u0644\u0652?)

See regex demo.

And this is the C# code that does what you want:

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string input = @"الغيث الغيث الغيث الغيث

اَلغيث اَلغيث اَلغيث اَلغيث

اَلْغيث اَلْغيث اَلْغيث اَلْغيث

الْغيث الْغيث الْغيث الْغيث
";

      string pattern = @"\b\u0627(?:\u0644\u0652?|\u064e\u0644\u0652?)";
      string replacement = "";
      string result = Regex.Replace(input, pattern, replacement);
      
      Console.WriteLine("Original String: {0}", input);
      Console.WriteLine("\n\n-----------------\n\n");
      Console.WriteLine("Result String: {0}", result);
   }
}

See C# code demo.
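
For readers who find the pattern hard to parse, here is a rough sketch of the same regex written out with comments, using RegexOptions.IgnorePatternWhitespace. The class name AnnotatedPatternSketch and the helper StripArticle are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnnotatedPatternSketch
{
    // Same pattern as in the answer, split across lines.
    // IgnorePatternWhitespace ignores unescaped whitespace and allows # comments.
    private static readonly Regex DefiniteArticle = new Regex(@"
        \b                         # word boundary, so only the start of a word matches
        \u0627                     # alef
        (?:
            \u0644\u0652?          # lam, optionally followed by sukun
          |
            \u064e\u0644\u0652?    # fatha, then lam, optionally followed by sukun
        )",
        RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: removes the matched prefix from every word in the text.
    public static string StripArticle(string text)
    {
        return DefiniteArticle.Replace(text, "");
    }
}
```

Note that this strips the prefix from any word that starts with it, including words where "ال" happens to be part of the word itself, so it may need a word list or extra rules depending on your data.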

Upvotes: 2

Dmitrii Bychenko

Reputation: 186728

If you want to work with whole words rather than just replace every occurrence, you can use a regular expression anchored at word boundaries, e.g.

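using System.Linq;
using System.Text.RegularExpressions;

...

string input = "الغيث الغيث الغيث";
string[] prefixes = { "ال", "اَلْ", "الْ", "اَل" };

// Longest prefixes first: alternation takes the first alternative that matches,
// so a short prefix listed before its longer variant would leave the sukun behind.
string alternatives = string.Join("|", prefixes.OrderByDescending(p => p.Length).Select(Regex.Escape));

// \b - word boundary - we are looking for prefixes only
string output = Regex.Replace(input, @$"\b(?:{alternatives})", "");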

Let's have a look:

Console.Write(string.Join(Environment.NewLine, input, output));

Output:

الغيث الغيث الغيث
غيث غيث غيث
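
If you would rather stay close to the loop from the question, a regex-free variant might look like the sketch below. It assumes words are separated by single spaces, and the names PrefixStripperSketch, StripWord and StripAll are hypothetical. The four prefixes are the same ones as in the question, written as \u escapes so the diacritics are unambiguous.

```csharp
using System;
using System.Linq;

public static class PrefixStripperSketch
{
    // The four prefixes from the question, sorted longest first so that
    // alef-lam-sukun is tried before plain alef-lam.
    private static readonly string[] Prefixes =
        new[]
        {
            "\u0627\u0644",             // alef, lam
            "\u0627\u064E\u0644\u0652", // alef, fatha, lam, sukun
            "\u0627\u0644\u0652",       // alef, lam, sukun
            "\u0627\u064E\u0644"        // alef, fatha, lam
        }
        .OrderByDescending(p => p.Length)
        .ToArray();

    // Strips at most one prefix from a single word.
    public static string StripWord(string word)
    {
        foreach (string prefix in Prefixes)
        {
            // Ordinal comparison checks code units exactly, with no culture rules involved.
            if (word.StartsWith(prefix, StringComparison.Ordinal))
            {
                return word.Substring(prefix.Length);
            }
        }
        return word;
    }

    // Assumption: words are separated by single spaces.
    public static string StripAll(string text)
    {
        return string.Join(" ", text.Split(' ').Select(StripWord));
    }
}
```

Calling PrefixStripperSketch.StripAll(input) applies the question's StartsWith check to each word separately instead of to the whole string, which is the step the original loop was missing.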

Upvotes: 3
