Reputation: 19793
Take a string such as:
In C#: How do I add "Quotes" around string in a comma delimited list of strings?
and convert it to:
in-c-how-do-i-add-quotes-around-string-in-a-comma-delimited-list-of-strings
Requirements:
ToSeoFriendly("hello world hello world", 14)
returns "hello-world"
On a separate note, should there be a minimum length?
Upvotes: 17
Views: 10710
Reputation: 369
To do this we need to:
I wanted a function that generates the entire string and also accepts an optional maximum length. This was the result:
using System.Globalization;
using System.Text;

public static class StringHelper
{
    /// <summary>
    /// Creates a URL and SEO friendly slug
    /// </summary>
    /// <param name="text">Text to slugify</param>
    /// <param name="maxLength">Max length of slug</param>
    /// <returns>URL and SEO friendly string</returns>
    public static string UrlFriendly(string text, int maxLength = 0)
    {
        // Return empty value if text is null
        if (text == null) return "";
        var normalizedString = text
            // Make lowercase
            .ToLowerInvariant()
            // Normalize the text
            .Normalize(NormalizationForm.FormD);
        var stringBuilder = new StringBuilder();
        var stringLength = normalizedString.Length;
        var prevdash = false;
        var trueLength = 0;
        char c;
        for (int i = 0; i < stringLength; i++)
        {
            c = normalizedString[i];
            switch (CharUnicodeInfo.GetUnicodeCategory(c))
            {
                // Keep letters and digits; remap a non-ASCII
                // character to an ASCII equivalent
                case UnicodeCategory.LowercaseLetter:
                case UnicodeCategory.UppercaseLetter:
                case UnicodeCategory.DecimalDigitNumber:
                    if (c < 128)
                        stringBuilder.Append(c);
                    else
                        stringBuilder.Append(ConstHelper.RemapInternationalCharToAscii(c));
                    prevdash = false;
                    trueLength = stringBuilder.Length;
                    break;
                // Replace separators and punctuation with a hyphen, but only if the previous character wasn't one
                case UnicodeCategory.SpaceSeparator:
                case UnicodeCategory.ConnectorPunctuation:
                case UnicodeCategory.DashPunctuation:
                case UnicodeCategory.OtherPunctuation:
                case UnicodeCategory.MathSymbol:
                    if (!prevdash)
                    {
                        stringBuilder.Append('-');
                        prevdash = true;
                        trueLength = stringBuilder.Length;
                    }
                    break;
            }
            // If we are at max length, stop parsing
            if (maxLength > 0 && trueLength >= maxLength)
                break;
        }
        // Trim excess hyphens
        var result = stringBuilder.ToString().Trim('-');
        // Remove any excess characters to meet the maxLength criteria
        return maxLength <= 0 || result.Length <= maxLength ? result : result.Substring(0, maxLength);
    }
}
This helper remaps some international characters to readable ASCII equivalents:
public static class ConstHelper
{
    /// <summary>
    /// Remaps international characters to ASCII compatible ones
    /// based on: https://meta.stackexchange.com/questions/7435/non-us-ascii-characters-dropped-from-full-profile-url/7696#7696
    /// </summary>
    /// <param name="c">Character to remap</param>
    /// <returns>Remapped character</returns>
    public static string RemapInternationalCharToAscii(char c)
    {
        string s = c.ToString().ToLowerInvariant();
        if ("àåáâäãåą".Contains(s))
        {
            return "a";
        }
        else if ("èéêëę".Contains(s))
        {
            return "e";
        }
        else if ("ìíîïı".Contains(s))
        {
            return "i";
        }
        else if ("òóôõöøőð".Contains(s))
        {
            return "o";
        }
        else if ("ùúûüŭů".Contains(s))
        {
            return "u";
        }
        else if ("çćčĉ".Contains(s))
        {
            return "c";
        }
        else if ("żźž".Contains(s))
        {
            return "z";
        }
        else if ("śşšŝ".Contains(s))
        {
            return "s";
        }
        else if ("ñń".Contains(s))
        {
            return "n";
        }
        else if ("ýÿ".Contains(s))
        {
            return "y";
        }
        else if ("ğĝ".Contains(s))
        {
            return "g";
        }
        else if (c == 'ř')
        {
            return "r";
        }
        else if (c == 'ł')
        {
            return "l";
        }
        else if (c == 'đ')
        {
            return "d";
        }
        else if (c == 'ß')
        {
            return "ss";
        }
        else if (c == 'þ')
        {
            return "th";
        }
        else if (c == 'ĥ')
        {
            return "h";
        }
        else if (c == 'ĵ')
        {
            return "j";
        }
        else
        {
            return "";
        }
    }
}
The function would work something like this:
const string text = "ICH MUß EINIGE CRÈME BRÛLÉE HABEN";
Console.WriteLine(StringHelper.UrlFriendly(text));
// Output:
// ich-muss-einige-creme-brulee-haben
This question has already been answered many times here, but not a single answer was optimized. You can find the entire source code on GitHub, with some samples, and you can read more on Johan Boström's blog. The code is compatible with .NET 4.5+ and .NET Core.
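For readers who want to try the same idea outside .NET, here is a minimal Python sketch of the approach described above: lowercase, decompose with NFD, keep ASCII letters and digits, remap a few characters that do not decompose, and collapse separators into single hyphens. The name `url_friendly` and the `REMAP` table are hypothetical, and the table covers only a small subset of the C# helper, so treat this as an illustration rather than a port.

```python
import unicodedata

# Hypothetical remap table for characters that NFD does not decompose.
# Only a small subset of the C# helper's mappings is reproduced here.
REMAP = {"ß": "ss", "þ": "th", "ø": "o", "ð": "o", "ł": "l", "đ": "d", "ı": "i"}

# Unicode categories that are turned into a hyphen (same set as the C# switch).
HYPHEN_CATEGORIES = {"Zs", "Pc", "Pd", "Po", "Sm"}


def url_friendly(text, max_length=0):
    """Build a lowercase ASCII slug; max_length <= 0 means no limit."""
    if text is None:
        return ""
    normalized = unicodedata.normalize("NFD", text.lower())
    parts = []
    length = 0
    prev_dash = False
    for ch in normalized:
        category = unicodedata.category(ch)
        if category in ("Ll", "Lu", "Nd"):
            # ASCII characters are kept; others are remapped or dropped.
            piece = ch if ord(ch) < 128 else REMAP.get(ch, "")
            if piece:
                parts.append(piece)
                length += len(piece)
                prev_dash = False
        elif category in HYPHEN_CATEGORIES and not prev_dash:
            parts.append("-")
            length += 1
            prev_dash = True
        # Combining accents (category Mn) match neither branch and are dropped.
        if max_length > 0 and length >= max_length:
            break
    result = "".join(parts).strip("-")
    if max_length > 0 and len(result) > max_length:
        result = result[:max_length]
    return result


print(url_friendly("ICH MUß EINIGE CRÈME BRÛLÉE HABEN"))
```

As in the C# version, the length limit here cuts at a character position, not at a word boundary.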
Upvotes: 1
Reputation: 22445
This is close to how Stack Overflow generates slugs:
public static string GenerateSlug(string title)
{
    string slug = title.ToLower();
    if (slug.Length > 81)
        slug = slug.Substring(0, 81);
    slug = Regex.Replace(slug, @"[^a-z0-9\-_\./\\ ]+", "");
    slug = Regex.Replace(slug, @"[^a-z0-9]+", "-");
    // Guard against an empty slug before checking for a trailing hyphen
    if (slug.Length > 0 && slug[slug.Length - 1] == '-')
        slug = slug.Remove(slug.Length - 1, 1);
    return slug;
}
Upvotes: 1
Reputation: 156138
In Python (if Django is installed, even if you are using another framework):
from django.template.defaultfilters import slugify
slugify('In C#: How do I add "Quotes" around string in a comma delimited list of strings?')
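If installing Django only for this is not an option, something broadly similar can be sketched with the standard library alone. The function name `slugify_ascii` is hypothetical, and this is an approximation written for illustration, not Django's actual implementation.

```python
import re
import unicodedata


def slugify_ascii(value):
    """Rough standard-library stand-in for a slugify filter."""
    # Decompose accented characters, then drop anything that is not ASCII.
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove everything except word characters, whitespace, and hyphens.
    value = re.sub(r"[^\w\s-]", "", value.lower())
    # Collapse runs of whitespace and hyphens, then trim the ends.
    return re.sub(r"[-\s]+", "-", value).strip("-_")


print(slugify_ascii('In C#: How do I add "Quotes" around string in a comma delimited list of strings?'))
```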
Upvotes: 0
Reputation: 14973
Another season, another reason, for choosing Ruby :)
def seo_friendly(str)
  str.strip.downcase.gsub(/\W+/, '-')
end
That's all.
Upvotes: 0
Reputation:
Solution in shell:
echo 'In C#: How do I add "Quotes" around string in a comma delimited list of strings?' | \
tr A-Z a-z | \
sed 's/[^a-z0-9]\+/-/g;s/^\(.\{1,20\}\).*/\1/'
Upvotes: 1
Reputation:
Solution in Perl:
my $input = 'In C#: How do I add "Quotes" around string in a comma delimited list of strings?';
my $length = 20;
$input =~ s/[^a-z0-9]+/-/gi;
$input =~ s/^(.{1,$length}).*/\L$1/;
print "$input\n";
done.
Upvotes: 1
Reputation: 154513
A better version:
function Slugify($string)
{
    return strtolower(trim(preg_replace(array('~[^0-9a-z]~i', '~-+~'), '-', $string), '-'));
}
Upvotes: 2
Reputation: 144927
A slightly cleaner way of doing this in PHP at least is:
function CleanForUrl($urlPart, $maxLength = null) {
    $url = strtolower(preg_replace(array('/[^a-z0-9\- ]/i', '/[ \-]+/'), array('', '-'), trim($urlPart)));
    if ($maxLength) $url = substr($url, 0, $maxLength);
    return $url;
}
Might as well do the trim() at the start so there is less to process later, and the full replacement is done within the preg_replace() call.
Thanks to cg for coming up with most of this: What is the best way to clean a string for placement in a URL, like the question name on SO?
Upvotes: 0
Reputation: 19793
Here is my solution in C#
private string ToSeoFriendly(string title, int maxLength) {
    var match = Regex.Match(title.ToLower(), "[\\w]+");
    StringBuilder result = new StringBuilder("");
    bool maxLengthHit = false;
    while (match.Success && !maxLengthHit) {
        if (result.Length + match.Value.Length <= maxLength) {
            result.Append(match.Value + "-");
        } else {
            maxLengthHit = true;
            // Handle a situation where there is only one word and it is greater than the max length.
            if (result.Length == 0) result.Append(match.Value.Substring(0, maxLength));
        }
        match = match.NextMatch();
    }
    // Remove trailing '-' (the length check avoids an exception when no words matched)
    if (result.Length > 0 && result[result.Length - 1] == '-') result.Remove(result.Length - 1, 1);
    return result.ToString();
}
Upvotes: 10
Reputation: 75794
C#
public string toFriendly(string subject)
{
    subject = subject.Trim().ToLower();
    subject = Regex.Replace(subject, @"\s+", "-");
    subject = Regex.Replace(subject, @"[^A-Za-z0-9_-]", "");
    return subject;
}
Upvotes: 4
Reputation: 655129
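I would follow these steps:
1. Convert the string to lower case.
2. Replace unwanted characters with hyphens (the preg_replace() function call already prevents multiple hyphens).
3. Remove hyphens at the beginning and end.
4. If needed, trim from the last hyphen before position $maxlen to the end.
So, all together in a function (PHP):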
function generateUrlSlug($string, $maxlen=0)
{
    $string = trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($string)), '-');
    if ($maxlen && strlen($string) > $maxlen) {
        $string = substr($string, 0, $maxlen);
        $pos = strrpos($string, '-');
        if ($pos > 0) {
            $string = substr($string, 0, $pos);
        }
    }
    return $string;
}
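One detail of the PHP function above is that it backs up to the previous hyphen even when the cut lands exactly on a word boundary, which drops a word that would have fit. The Python sketch below follows the same logic but only backs up when the cut actually splits a word. The name `generate_url_slug` and the boundary check are assumptions made for illustration, not part of the original answer.

```python
import re


def generate_url_slug(text, max_len=0):
    """Lowercase, hyphenate, and truncate at a word boundary when possible."""
    # Replace every run of non-alphanumeric characters with one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    if max_len and len(slug) > max_len:
        # The cut splits a word unless the next character is a hyphen.
        cut_splits_word = slug[max_len] != "-"
        slug = slug[:max_len]
        if cut_splits_word:
            pos = slug.rfind("-")
            if pos > 0:
                slug = slug[:pos]
        slug = slug.rstrip("-")
    return slug


print(generate_url_slug("hello world hello world", 14))
```

With the question's example input and a limit of 14, this prints `hello-world`. A single word longer than the limit is cut mid-word, since there is no hyphen to fall back to.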
Upvotes: 7
Reputation: 93318
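Here's a solution for PHP:
function make_uri($input, $max_length) {
    if (function_exists('iconv')) {
        $input = @iconv('UTF-8', 'ASCII//TRANSLIT', $input);
    }
    $lower = strtolower($input);
    // Drop everything except lowercase letters, digits, and spaces
    $without_special = preg_replace('/[^a-z0-9 ]/', '', $lower);
    // Split on runs of spaces, skipping empty tokens
    $tokens = preg_split('/ +/', $without_special, -1, PREG_SPLIT_NO_EMPTY);
    $result = '';
    foreach ($tokens as $token) {
        // Stop before a whole word would push the slug past the limit
        if (strlen($result.'-'.$token) > $max_length+1) {
            break;
        }
        $result .= '-'.$token;
    }
    // Remove the leading hyphen
    return substr($result, 1);
}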
Usage:
echo make_uri('In C#: How do I add "Quotes" around string in a ...', 500);
Unless you need the URIs to be typable, they don't need to be short. But you should specify a maximum so that the URLs work well with proxies etc.
Upvotes: 2