KevinBui

Reputation: 1099

Find and replace emoticons with the corresponding smiley image

I have a problem with an exercise. The input is a set of sentences (string[] sentences). The requirement is to find each emoticon (e.g. :D) in every sentence, replace it with the corresponding smiley image, and then export the sentences to an .html file.

The text file that maps emoticons to smiley images has the following structure:

[imagename] tab [emoticon1] space [emoticon2] space [emoticon3]

smile.gif    :) :-) :=) (smile)
sadsmile.gif :( :-( :=( (sad)
laugh.gif    :D :-D (laugh)
...

The first issue is which C# data structure to use to store the emoticons and their smiley images.

I'm happy :). How are you? -> I'm happy <img src="smile"> How are you?

The second issue is how to write the code that searches for and replaces the emoticons.

The last issue is that, because the export file is HTML, the text must be HTML-encoded, perhaps with HttpUtility.HtmlEncode(...). But the resulting sentence contains <img ...> tags, so I think this is tied to the second issue...

Please help me solve these problems. Thanks so much!

Upvotes: 1

Views: 2649

Answers (2)

Thomas Levesque

Reputation: 292455

First, you need to load the smiley "mappings" into a dictionary:

Dictionary<string, string> LoadSmileys(string fileName)
{
    var smileys = new Dictionary<string, string>();
    using (var reader = new StreamReader(fileName))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // The image name is followed by a tab, but the emoticons are separated by spaces
            string[] parts = line.Split(new[] { '\t', ' ' }, StringSplitOptions.RemoveEmptyEntries);
            for (int i = 1; i < parts.Length; i++)
            {
                smileys[parts[i]] = parts[0];
            }
        }
    }
    return smileys;
}

Then, just loop over the keys and replace each occurrence of the key with the corresponding image. To avoid the problem mentioned in your comment to Carra's answer, just replace the longest keys first:

StringBuilder tmp = new StringBuilder(originalText);
foreach (var key in smileys.Keys.OrderByDescending(s => s.Length))
{
    tmp.Replace(key, GetImageLink(smileys[key]));
}

Note the use of a StringBuilder, to avoid creating many instances of String.

It's obviously not the most efficient approach, but at least it's simple... you can always try to optimize it later if it turns out to be a performance bottleneck.
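As a rough illustration of why the longest keys go first, here is a minimal sketch. GetImageLink is never defined in this thread, so the version below is a hypothetical helper, and the `:))` entry and bigsmile.gif are made-up data that do not appear in the question's file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

static class SmileyDemo
{
    // Hypothetical helper: the answer calls GetImageLink without defining it.
    public static string GetImageLink(string imageName)
    {
        return "<img src=\"" + imageName + "\" alt=\"\" />";
    }

    // Same loop as above, wrapped in a method so it can be called directly.
    public static string ReplaceSmileys(string originalText, Dictionary<string, string> smileys)
    {
        var tmp = new StringBuilder(originalText);
        foreach (var key in smileys.Keys.OrderByDescending(s => s.Length))
        {
            tmp.Replace(key, GetImageLink(smileys[key]));
        }
        return tmp.ToString();
    }

    static void Main()
    {
        var smileys = new Dictionary<string, string>
        {
            { ":)", "smile.gif" },
            { ":))", "bigsmile.gif" }, // hypothetical entry, only here to show the overlap
        };

        // ":))" is replaced before ":)", so it is not turned into a smile plus a stray ")".
        Console.WriteLine(ReplaceSmileys("Hello :)) and :)", smileys));
    }
}
```

If the shorter key were replaced first, `:))` would become the smile.gif image followed by a leftover `)`.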


UPDATE

OK, so there is still a problem if some of your smileys include reserved HTML characters like '<' or '>'... If you encode the text to HTML before replacing the smileys, these characters will be replaced with &lt; or &gt;, so the smileys won't be recognized. On the other hand, if you encode the text after replacing the smileys with <img> tags, the tags will be encoded as well.
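To make the two failure modes concrete, here is a small sketch. It uses System.Net.WebUtility.HtmlEncode rather than HttpUtility.HtmlEncode only so that it runs without a reference to System.Web; the `>:(` emoticon and angry.gif are hypothetical examples, not entries from the question's file.

```csharp
using System;
using System.Net;

static class EncodingOrderDemo
{
    static void Main()
    {
        // Hypothetical emoticon containing a reserved HTML character.
        string smiley = ">:(";

        // Encoding first changes the emoticon, so a later Replace(">:(", ...) finds nothing.
        Console.WriteLine(WebUtility.HtmlEncode("Grr " + smiley));
        // prints: Grr &gt;:(

        // Encoding last escapes the inserted tag, so the browser shows it as literal text.
        Console.WriteLine(WebUtility.HtmlEncode("Grr <img src=\"angry.gif\">"));
        // prints: Grr &lt;img src=&quot;angry.gif&quot;&gt;
    }
}
```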

Here's what you could do:

  • assign a unique identifier to each smiley, something unlikely to appear in the original text, like a GUID
  • replace each occurrence of each smiley by the corresponding identifier (again, starting with the longest smiley)
  • encode the resulting text to HTML
  • replace each occurrence of each smiley identifier by the appropriate <img> tag

    var mapping = LoadSmileys(@"D:\tmp\smileys.txt");
    var smileys = mapping.Keys.OrderByDescending(s => s.Length)
                         .ToArray();
    
    // Assign an ID like "{93e8b75a-6837-43f8-95ec-801ed59bc167}" to each smiley
    var ids = smileys.Select(key => Guid.NewGuid().ToString("B"))
                     .ToArray();
    
    string text = File.ReadAllText(@"D:\tmp\test_smileys.txt");
    
    // Replace each smiley with its id
    StringBuilder tmp = new StringBuilder(text);
    for (int i = 0; i < smileys.Length; i++)
    {
        tmp.Replace(smileys[i], ids[i]);
    }
    
    // Encode the text to HTML
    text = HttpUtility.HtmlEncode(tmp.ToString());
    
    // Replace each id with the appropriate <img> tag
    tmp = new StringBuilder(text);
    for (int i = 0; i < smileys.Length; i++)
    {
        string image = mapping[smileys[i]];
        tmp.Replace(ids[i], GetImageLink(image));
    }
    
    text = tmp.ToString();
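
Since the question starts from a string[] of sentences and ends with an .html file, the same steps could be wrapped per sentence, roughly as sketched below. This is only one way to arrange it: GetImageLink, the output file name sentences.html and the sample mapping are assumptions for the sake of the example, and WebUtility.HtmlEncode stands in for HttpUtility.HtmlEncode so the sketch does not need System.Web.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;

static class SmileyExport
{
    // Hypothetical helper, as GetImageLink is not defined in this thread.
    static string GetImageLink(string image)
    {
        return "<img src=\"" + WebUtility.HtmlEncode(image) + "\" alt=\"\" />";
    }

    // Placeholder technique: smiley -> id, encode, id -> <img> tag.
    public static string ToHtml(string sentence, Dictionary<string, string> mapping)
    {
        var smileys = mapping.Keys.OrderByDescending(s => s.Length).ToArray();
        var ids = smileys.Select(k => Guid.NewGuid().ToString("B")).ToArray();

        var tmp = new StringBuilder(sentence);
        for (int i = 0; i < smileys.Length; i++)
            tmp.Replace(smileys[i], ids[i]);

        tmp = new StringBuilder(WebUtility.HtmlEncode(tmp.ToString()));
        for (int i = 0; i < smileys.Length; i++)
            tmp.Replace(ids[i], GetImageLink(mapping[smileys[i]]));

        return tmp.ToString();
    }

    static void Main()
    {
        // Sample data taken from the question; a real run would use LoadSmileys.
        var mapping = new Dictionary<string, string>
        {
            { ":)", "smile.gif" },
            { ":(", "sadsmile.gif" },
        };
        string[] sentences = { "I'm happy :). How are you?", "1 < 2 :(" };

        var html = new StringBuilder("<html><body>\n");
        foreach (string sentence in sentences)
            html.Append("<p>").Append(ToHtml(sentence, mapping)).Append("</p>\n");
        html.Append("</body></html>\n");

        File.WriteAllText("sentences.html", html.ToString()); // hypothetical output path
    }
}
```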
    

Upvotes: 1

Carra

Reputation: 17964

You can use a simple string.Replace here.

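for (int i = 0; i < sentences.Length; i++)
{
    foreach (var kvp in dict)
    {
        // A foreach iteration variable is read-only, so write the result back through the index
        sentences[i] = sentences[i].Replace(kvp.Key, GetImageLink(kvp.Value));
    }
}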

To create the HTML, you're better off using the native .NET classes like HtmlTextWriter or XmlWriter.
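
One possible way to follow the XmlWriter suggestion is sketched below. It swaps the string.Replace loop for Regex.Split with a capturing group, so that each sentence is cut into plain-text pieces and emoticons; the writer then escapes the text pieces itself and emits real img elements. The class name, the p wrapper around each sentence and the sample data are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml;

static class XmlWriterSketch
{
    public static string ToHtml(IEnumerable<string> sentences, Dictionary<string, string> dict)
    {
        // One alternation of all emoticons, longest first, each escaped for regex use.
        // Assumes dict is not empty.
        string pattern = "(" + string.Join("|",
            dict.Keys.OrderByDescending(k => k.Length).Select(k => Regex.Escape(k))) + ")";

        var settings = new XmlWriterSettings { OmitXmlDeclaration = true, Indent = false };
        var output = new StringWriter();
        using (var writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("html");
            writer.WriteStartElement("body");
            foreach (string sentence in sentences)
            {
                writer.WriteStartElement("p");
                // Because the pattern is a capturing group, Split keeps the emoticons in the result.
                foreach (string part in Regex.Split(sentence, pattern))
                {
                    string image;
                    if (dict.TryGetValue(part, out image))
                    {
                        writer.WriteStartElement("img");
                        writer.WriteAttributeString("src", image);
                        writer.WriteEndElement();
                    }
                    else
                    {
                        writer.WriteString(part); // the writer escapes <, > and &
                    }
                }
                writer.WriteEndElement(); // p
            }
            writer.WriteEndElement(); // body
            writer.WriteEndElement(); // html
        }
        return output.ToString();
    }

    static void Main()
    {
        var dict = new Dictionary<string, string> { { ":)", "smile.gif" }, { ":(", "sadsmile.gif" } };
        Console.WriteLine(ToHtml(new[] { "I'm happy :). How are you?", "1 < 2 :(" }, dict));
    }
}
```

With this arrangement the text is never encoded as a whole string, so the question of whether to encode before or after inserting the tags does not come up.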

Upvotes: 0
