Reputation: 21580
Question
How do I convert the string "Européen" to the RTF-formatted string "Europ\'e9en"?
[TestMethod]
public void Convert_A_Word_To_Rtf()
{
// Arrange
string word = "Européen";
string expected = @"Europ\'e9en"; // verbatim string: in a regular literal, \' is just an apostrophe
string actual = string.Empty;
// Act
// actual = ... // How?
// Assert
Assert.AreEqual(expected, actual);
}
What I have found so far
RichTextBox
RichTextBox can be used for certain things. Example:
RichTextBox richTextBox = new RichTextBox();
richTextBox.Text = "Européen";
string rtfFormattedString = richTextBox.Rtf;
But then rtfFormattedString turns out to be the entire RTF-formatted document, not just the string "Europ\'e9en".
Stackoverflow
I've also found a bunch of other resources on the web, but nothing quite solved my problem.
Answer
Had to add Trim() to remove the preceding space in result. Other than that, Brad Christie's solution seems to work.
I'll run with this solution for now, even though I have a bad gut feeling about it, since we have to Substring and Trim the heck out of RichTextBox to get an RTF-formatted string.
Test case:
[TestMethod]
public void Test_To_Verify_Brad_Christies_Stackoverflow_Answer()
{
Assert.AreEqual(@"Europ\'e9en", "Européen".ConvertToRtf());
Assert.AreEqual(@"d\'e9finitif", "définitif".ConvertToRtf());
Assert.AreEqual(@"\'e0", "à".ConvertToRtf());
Assert.AreEqual(@"H\'e4user", "Häuser".ConvertToRtf());
Assert.AreEqual(@"T\'fcren", "Türen".ConvertToRtf());
Assert.AreEqual(@"B\'f6den", "Böden".ConvertToRtf());
}
Logic as an extension method:
public static class StringExtensions
{
public static string ConvertToRtf(this string value)
{
RichTextBox richTextBox = new RichTextBox();
richTextBox.Text = value;
int offset = richTextBox.Rtf.IndexOf(@"\f0\fs17") + 8; // offset = 118;
int len = richTextBox.Rtf.LastIndexOf(@"\par") - offset;
string result = richTextBox.Rtf.Substring(offset, len).Trim();
return result;
}
}
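The offset search above is tied to the literal `\f0\fs17`, and `\fsN` is the font size in half-points, so the match depends on the control's default font. A minimal sketch of a less brittle extraction, assuming the RTF still has the shape `...\f0\fs<digits> <body>\par...`; the `RtfBodySketch` class and `ExtractBody` name are hypothetical and not part of any library:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RtfBodySketch
{
    // Matches "\f0\fs<digits>" plus one optional trailing space,
    // so the font size no longer has to be exactly 17.
    private static readonly Regex FontRun = new Regex(@"\\f0\\fs\d+ ?");

    // Hypothetical helper: returns the text between the first font run
    // and the last \par of an RTF document string.
    public static string ExtractBody(string rtf)
    {
        Match match = FontRun.Match(rtf);
        int end = rtf.LastIndexOf(@"\par", StringComparison.Ordinal);
        if (!match.Success || end < match.Index + match.Length)
        {
            throw new FormatException("RTF does not have the assumed shape.");
        }

        int start = match.Index + match.Length;
        return rtf.Substring(start, end - start).Trim();
    }
}
```

With this, the extension method could pass `richTextBox.Rtf` to `ExtractBody` instead of computing the offset by hand. It is still a string hack on the control's output, so the bad gut feeling remains justified.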
Upvotes: 16
Views: 47174
Reputation: 1
private static string ConvertToRtf(string text)
{
// Match every character outside the 7-bit ASCII range
// (verbatim string, so the regex engine rather than the C# compiler interprets \x00 and \x7F)
string pattern = @"[^\x00-\x7F]";
// Escape each match: \uN? for characters above 255, \'xx (lowercase hex) otherwise
return Regex.Replace(text, pattern, m => m.Value[0] > 255
? @"\u" + ((int)m.Value[0]).ToString() + "?"
: @"\'" + ((int)m.Value[0]).ToString("x2"));
}
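A variation on the same idea without Regex, as a minimal sketch rather than a definitive implementation. It assumes the surrounding document declares `\ansicpg1252`, where U+00A0 to U+00FF have the same byte values as in Unicode, so those characters can be written as `\'xx`. Everything else outside ASCII goes through `\uN?`, which in RTF takes a signed 16-bit decimal. The `RtfEscapeSketch` class and `EscapeForRtf` name are hypothetical:

```csharp
using System.Globalization;
using System.Text;

public static class RtfEscapeSketch
{
    // Hypothetical helper: escapes plain text for the body of an RTF
    // document that is assumed to declare \ansicpg1252.
    public static string EscapeForRtf(string value)
    {
        var sb = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            if (c == '\\' || c == '{' || c == '}')
            {
                // RTF syntax characters need a leading backslash
                sb.Append('\\').Append(c);
            }
            else if (c <= 0x7f)
            {
                // Plain ASCII passes through (line breaks are NOT turned into \par here)
                sb.Append(c);
            }
            else if (c >= 0xa0 && c <= 0xff)
            {
                // Same byte value in Windows-1252, so \'xx is safe
                sb.Append(@"\'").Append(((int)c).ToString("x2", CultureInfo.InvariantCulture));
            }
            else
            {
                // \uN expects a signed 16-bit value; '?' is the fallback character
                short code = unchecked((short)c);
                sb.Append(@"\u").Append(code.ToString(CultureInfo.InvariantCulture)).Append('?');
            }
        }
        return sb.ToString();
    }
}
```

For the question's input, `RtfEscapeSketch.EscapeForRtf("Européen")` yields `Europ\'e9en`. Characters in the 0x80 to 0x9F range are deliberately sent through `\u`, because their Unicode values do not match their Windows-1252 bytes.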
Upvotes: 0
Reputation: 9756
Here's an improved version of @Vladislav Zalesak's answer:
public static string ConvertToRtf(string text)
{
// using default template from wiki
StringBuilder sb = new StringBuilder(@"{\rtf1\ansi\ansicpg1250\deff0{\fonttbl\f0\fswiss Helvetica;}\f0\pard ");
foreach (char character in text)
{
if (character <= 0x7f)
{
// escaping rtf characters
switch (character)
{
case '\\':
case '{':
case '}':
sb.Append('\\');
break;
case '\r':
sb.Append("\\par");
break;
}
sb.Append(character);
}
// converting non-ASCII characters; RTF's \uN expects a signed 16-bit decimal value
else
{
sb.Append("\\u" + unchecked((short)character) + "?");
}
}
sb.Append("}");
return sb.ToString();
}
Upvotes: 2
Reputation: 713
This is how I went:
private string ConvertString2RTF(string input)
{
//first take care of special RTF chars
StringBuilder backslashed = new StringBuilder(input);
backslashed.Replace(@"\", @"\\");
backslashed.Replace(@"{", @"\{");
backslashed.Replace(@"}", @"\}");
//then convert the string char by char
StringBuilder sb = new StringBuilder();
foreach (char character in backslashed.ToString())
{
if (character <= 0x7f)
sb.Append(character);
else
// RTF's \uN expects a signed 16-bit decimal value
sb.Append("\\u" + unchecked((short)character) + "?");
}
return sb.ToString();
}
I think using a RichTextBox is 1) overkill, and 2) I don't like RichTextBox after spending days trying to make it work with an RTF document created in Word.
Upvotes: 5
Reputation: 13556
I found a nice solution that actually uses the RichTextBox itself to do the conversion:
private static string FormatAsRTF(string DirtyText)
{
System.Windows.Forms.RichTextBox rtf = new System.Windows.Forms.RichTextBox();
rtf.Text = DirtyText;
return rtf.Rtf;
}
http://www.baltimoreconsulting.com/blog/development/easily-convert-a-string-to-rtf-in-net/
Upvotes: 5
Reputation: 400
I know it has been a while, hope this helps..
This code is working for me after trying every conversion code I could get my hands on (titleText and contentText are plain text entered in a regular TextBox):
var rtb = new RichTextBox();
rtb.AppendText(titleText);
rtb.AppendText(Environment.NewLine);
rtb.AppendText(contentText);
rtb.Refresh();
rtb.Rtf now holds the RTF text.
The following code saves the RTF text to a file, which you can open, edit, and then load back into a RichTextBox:
rtb.SaveFile(path, RichTextBoxStreamType.RichText);
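If the RTF body was produced without a RichTextBox, a file can also be written directly. A minimal sketch, assuming the body is already escaped (7-bit ASCII only) and using a small header modelled on the template quoted elsewhere in this thread; the `RtfFileSketch` class and `SaveAsRtf` name are hypothetical:

```csharp
using System.IO;
using System.Text;

public static class RtfFileSketch
{
    // Hypothetical helper: wraps an already-escaped RTF body in a minimal
    // document and writes it to disk. The header is an assumption, not
    // what RichTextBox itself emits.
    public static void SaveAsRtf(string path, string escapedBody)
    {
        string rtf = @"{\rtf1\ansi\ansicpg1252\deff0{\fonttbl{\f0\fswiss Helvetica;}}\f0\pard "
                     + escapedBody + @"\par}";

        // Escaped RTF contains only ASCII characters, so ASCII encoding is sufficient
        File.WriteAllText(path, rtf, Encoding.ASCII);
    }
}
```

A file written this way should load back with `rtb.LoadFile(path, RichTextBoxStreamType.RichText)`, though that round trip is not verified here.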
Upvotes: 1
Reputation: 1722
Not the most elegant method, but a simple and fast one:
public static string PlainTextToRtf(string plainText)
{
if (string.IsNullOrEmpty(plainText))
return "";
string escapedPlainText = plainText.Replace(@"\", @"\\").Replace("{", @"\{").Replace("}", @"\}");
escapedPlainText = EncodeCharacters(escapedPlainText);
string rtf = @"{\rtf1\ansi\ansicpg1250\deff0{\fonttbl\f0\fswiss Helvetica;}\f0\pard ";
rtf += escapedPlainText.Replace(Environment.NewLine, "\\par\r\n ");
rtf += " }";
return rtf;
}
Method for encoding the (Polish) characters:
private static string EncodeCharacters(string text)
{
if (string.IsNullOrEmpty(text))
return "";
return text
.Replace("ą", @"\'b9")
.Replace("ć", @"\'e6")
.Replace("ę", @"\'ea")
.Replace("ł", @"\'b3")
.Replace("ń", @"\'f1")
.Replace("ó", @"\'f3")
.Replace("ś", @"\'9c")
.Replace("ź", @"\'9f")
.Replace("ż", @"\'bf")
.Replace("Ą", @"\'a5")
.Replace("Ć", @"\'c6")
.Replace("Ę", @"\'ca")
.Replace("Ł", @"\'a3")
.Replace("Ń", @"\'d1")
.Replace("Ó", @"\'d3")
.Replace("Ś", @"\'8c")
.Replace("Ź", @"\'8f")
.Replace("Ż", @"\'af");
}
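The chain of Replace calls scans the string once per character. A minimal sketch of a single-pass, table-driven variant using the same Windows-1250 codes as above; the `PolishRtfSketch` class name is hypothetical:

```csharp
using System.Collections.Generic;
using System.Text;

public static class PolishRtfSketch
{
    // Same Windows-1250 byte codes as in the Replace chain above
    private static readonly Dictionary<char, string> Codes = new Dictionary<char, string>
    {
        { 'ą', @"\'b9" }, { 'ć', @"\'e6" }, { 'ę', @"\'ea" },
        { 'ł', @"\'b3" }, { 'ń', @"\'f1" }, { 'ó', @"\'f3" },
        { 'ś', @"\'9c" }, { 'ź', @"\'9f" }, { 'ż', @"\'bf" },
        { 'Ą', @"\'a5" }, { 'Ć', @"\'c6" }, { 'Ę', @"\'ca" },
        { 'Ł', @"\'a3" }, { 'Ń', @"\'d1" }, { 'Ó', @"\'d3" },
        { 'Ś', @"\'8c" }, { 'Ź', @"\'8f" }, { 'Ż', @"\'af" }
    };

    // Walks the text once and swaps each mapped character for its RTF escape
    public static string EncodeCharacters(string text)
    {
        if (string.IsNullOrEmpty(text))
            return "";

        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            string code;
            if (Codes.TryGetValue(c, out code))
                sb.Append(code);
            else
                sb.Append(c); // unmapped characters are copied unchanged
        }
        return sb.ToString();
    }
}
```

As with the original, characters outside the table pass through untouched, so other non-ASCII input would still need separate handling.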
Upvotes: 0
Reputation: 101614
Doesn't RichTextBox always have the same header/footer? You could just read the content based on its offset location and continue using it to parse. (I think? Please correct me if I'm wrong.)
There are libraries available, but I've never had good luck with them personally (though I've always found another method before fully exhausting the possibilities). In addition, most of the better ones usually involve a nominal fee.
EDIT
Kind of a hack, but this should get you through what you need to get through (I hope):
RichTextBox rich = new RichTextBox();
Console.Write(rich.Rtf);
String[] words = { "Européen", "Apple", "Carrot", "Touché", "Résumé", "A Européen eating an apple while writing his Résumé, Touché!" };
foreach (String word in words)
{
rich.Text = word;
Int32 offset = rich.Rtf.IndexOf(@"\f0\fs17") + 8;
Int32 len = rich.Rtf.LastIndexOf(@"\par") - offset;
Console.WriteLine("{0,-15} : {1}", word, rich.Rtf.Substring(offset, len).Trim());
}
EDIT 2
The breakdown of the RTF control codes is as follows:
\par specifies the end of a paragraph.
Hopefully that clears some things up. ;-)
Upvotes: 9
Reputation: 25834
Below is an ugly example of converting a string to an RTF string:
class Program
{
static RichTextBox generalRTF = new RichTextBox();
static void Main()
{
string foo = @"Européen";
string output = ToRtf(foo);
Trace.WriteLine(output);
}
private static string ToRtf(string foo)
{
string bar = string.Format("!!@@!!{0}!!@@!!", foo);
generalRTF.Text = bar;
int pos1 = generalRTF.Rtf.IndexOf("!!@@!!");
int pos2 = generalRTF.Rtf.LastIndexOf("!!@@!!");
if (pos1 != -1 && pos2 != -1 && pos2 > pos1 + "!!@@!!".Length)
{
pos1 += "!!@@!!".Length;
return generalRTF.Rtf.Substring(pos1, pos2 - pos1);
}
throw new Exception("Not sure how this happened...");
}
}
Upvotes: 1