Reputation: 11143
How to build a tagging system like SO?
I'm using a single textbox on my ASP.NET MVC website to submit "tags".
First, I tried splitting the tags on commas: "asp.net, c#, sql server". That works, but if the user forgets to separate the tags with commas, I can't split the string correctly.
"asp.net c# sql server": here sql server should be a single tag, not two ("sql" + "server").
Moreover, I can't really ask users to use the "-" character to separate the words of a tag ("sql-server"); they shouldn't have to care about that.
Can someone help?
Upvotes: 9
Views: 3601
Reputation: 10146
My advice:
First off, decide up front: do you want to allow tags with spaces in their names or not? Pick one or the other; don't try to create some crazy mish-mash with heuristic prediction about whether the user meant one thing or the other.
Either this:
sql server
always means 1 tag, or always means 2. Just choose right now what you want. One or the other. If you should choose to not allow spaces in tags (that means it's 2 tags), but you also want to allow users to separate tags with commas, e.g.:
sql,server
Then you could deal with the user entering a bunch of tags mixed, e.g.:
sql server,regular expressions,java c#
With code like this:
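// Uses a character class rather than a capture group: Regex.Split includes captured delimiters in its result.
string[] tags = Regex.Split(input, @"[,\s]+");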
Which will get you:
tags[0]: sql
tags[1]: server
tags[2]: regular
tags[3]: expressions
tags[4]: java
tags[5]: c#
Upvotes: 0
Reputation: 5308
There's one (easy) way I can think of to allow your users to include any character in a tag: let them enter only one tag at a time. You could have a textbox where the user types the tag (autocompletion for existing tags is a definite plus) and presses Enter or a button when finished, and the entered tags then appear below the textbox in a list of applied tags. Each applied tag needs its own button so the user can remove it.
WordPress has a similar tagging mechanism when you create posts, but it allows multiple tags to be entered at once by simply stating which character delimits tags. Asking for a delimiter is not a big deal, but if you don't want to mandate a particular delimiter, you'll simply have to restrict the user to entering a single tag at a time.
Another Idea (edit)
I just read this today: Tokenizing Control
Upvotes: 9
Reputation: 839194
StackOverflow has exactly the same problem with users incorrectly entering tags, for example if you had entered 'string manipulation' instead of 'string-manipulation'. You've just changed the tag separator from space to comma.
The fundamental problem is still the same, so it is no surprise that the solution is also the same:
StackOverflow proves that this model can work well. An automated solution for correcting user errors will sometimes make errors itself because of the ambiguity you pointed out yourself. This will frustrate people who are doing it correctly only to be foiled by the software "fixing" their tags for them.
Upvotes: 6
Reputation: 15134
You might try a statistical spelling correction kind of approach: if there are a bunch of things already tagged "sql server" it could make an educated guess. Of course, it would get it wrong sometimes.
Upvotes: 2
Reputation: 32954
Either you match the input against existing tags, which lets you have tags with spaces (assuming you search for the longer tags first, so you find 'sql server' before you search for 'sql'). You could make this more robust by only allowing existing tags to be used, and having a separate mechanism for creating new tags. That way users could easily create tags with spaces, as anything entered in the new-tag box would be a single tag like 'sql server 2005'.
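A rough sketch of the "longer tags first" matching, assuming the known tags are held in a lowercase set. The class name `KnownTagMatcher`, the method `Match` and the `maxWords` limit are made up for illustration. Commas are still honoured as explicit separators, and within each comma-separated chunk the longest run of words that is a known tag wins:

```csharp
using System;
using System.Collections.Generic;

public static class KnownTagMatcher
{
    // Hypothetical helper: splits input into tags, preferring the longest
    // run of words that matches an already-known tag.
    // Assumes knownTags contains lowercase entries.
    public static List<string> Match(string input, ISet<string> knownTags, int maxWords = 3)
    {
        var tags = new List<string>();

        // A comma is always an explicit separator, so handle each chunk on its own.
        foreach (string chunk in input.ToLowerInvariant().Split(','))
        {
            string[] words = chunk.Split(
                new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);

            int i = 0;
            while (i < words.Length)
            {
                int taken = 1; // fall back to a single word if nothing longer matches

                // Try the longest candidate first, e.g. "sql server" before "sql".
                for (int len = Math.Min(maxWords, words.Length - i); len > 1; len--)
                {
                    string candidate = string.Join(" ", words, i, len);
                    if (knownTags.Contains(candidate))
                    {
                        taken = len;
                        break;
                    }
                }

                tags.Add(string.Join(" ", words, i, taken));
                i += taken;
            }
        }

        return tags;
    }
}
```

With known tags "asp.net", "c#", "sql" and "sql server", the input "asp.net c# sql server" comes back as three tags: asp.net, c# and sql server. Unknown words are kept here as single-word tags; if only existing tags are allowed, they would be rejected instead.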
EDIT:
Alternatively you could have some special syntax in the tags for creating new ones:
'sql,asp.net,[NEWTAG]sql server,c#' would use existing tags 'sql','asp.net','c#' and would create a new tag 'sql server'
/EDIT
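The marker syntax could be parsed along these lines. This is only a sketch: the `[NEWTAG]` prefix comes from the example, while the class name `NewTagSyntax` and the `Parse` signature are invented for illustration, and it assumes tags are comma-separated:

```csharp
using System;
using System.Collections.Generic;

public static class NewTagSyntax
{
    private const string Marker = "[NEWTAG]"; // prefix taken from the example

    // Hypothetical helper: splits comma-separated input into tags that should
    // already exist and tags that the user wants to create.
    public static void Parse(string input, out List<string> existing, out List<string> created)
    {
        existing = new List<string>();
        created = new List<string>();

        foreach (string raw in input.Split(','))
        {
            string item = raw.Trim();
            if (item.Length == 0)
            {
                continue; // skip empty entries such as "a,,b"
            }

            if (item.StartsWith(Marker, StringComparison.OrdinalIgnoreCase))
            {
                // Everything after the marker is one new tag, spaces included.
                string name = item.Substring(Marker.Length).Trim();
                if (name.Length > 0)
                {
                    created.Add(name);
                }
            }
            else
            {
                existing.Add(item);
            }
        }
    }
}
```

For 'sql,asp.net,[NEWTAG]sql server,c#' this puts 'sql', 'asp.net' and 'c#' in `existing` and 'sql server' in `created`. Checking that the "existing" tags really exist is left to the caller.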
Or you split on spaces and don't allow tags with spaces.
In your example, how do you tell the difference between 'sql server' (1 tag) and 'sql' 'server' (2 tags)?
If you look at SO, the tags are all space-separated, so one tag is sql-server.
As long as tags are suggested to users as they type them, I don't think this will be a problem.
Upvotes: 5