Jesper

Reputation: 999

Remove many instances of the same character (C#)

When people post text on my website, they sometimes insert a long line of hyphens, asterisks or full stops, like this:

*********************************************************************

That will unfortunately destroy the layout on some result pages, and it's kind of pointless (to me).

How should I handle this? Maybe a regex that will reduce more than X repeats of the same character to only 5. If so, then how...

Regex regex = new Regex("[\\*\\._-]{5,}");
string goodstring = regex.Replace(badstring, "-----");

But what if a user thought it was fun to write aaaaaaaaaaaaaaaaaaaaaaaaaa, then my regex would fail.

The question is: how do you think I should handle this problem? And if you think I should handle it with a regex, how do I write one that removes unnecessary repeats of any character (and not just *.-_ like my own regex here)?

Upvotes: 1

Views: 301

Answers (2)

Mark Byers

Reputation: 837966

To answer your regular expression question:

how do I write a regex that would remove unnecessary repeats of any character

You can use a backreference to detect the same character entered more than once, for example:

Regex regex = new Regex(@"(.)\1{4,}");
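As a rough sketch of how that pattern might be applied, the following collapses every run of five or more identical characters down to exactly five. The class name RepeatCollapser and the method name Collapse are made up for illustration, and the target length of five is simply taken from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RepeatCollapser
{
    // (.) captures any single character (except a newline);
    // \1{4,} requires four or more further copies of that same character,
    // so the whole pattern matches runs of five or more identical characters.
    private static readonly Regex RepeatedRun = new Regex(@"(.)\1{4,}");

    // Hypothetical helper: shortens each matched run to five copies
    // of the character that was repeated.
    public static string Collapse(string input)
    {
        return RepeatedRun.Replace(input, m => new string(m.Value[0], 5));
    }
}
```

Under these assumptions, RepeatCollapser.Collapse("Hi ********** there aaaaaaaa") would give "Hi ***** there aaaaa", while a run of four or fewer characters is left alone.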

However the main point of your question seems to be this:

That will unfortunately destroy the layout on some result pages, and it's kind of pointless (to me). How should I handle this?

You should use stylesheets to specify what happens when the text doesn't fit into its container. For example, you can set the overflow property so that overflowing content is hidden or scrollable, rather than relying on the default behaviour, in which overflowing content can overlap other elements on the page.

Upvotes: 6

Random Dev

Reputation: 52270

How to handle this is somewhat up to you and your client; we can only advise, not decide. Provided you guard against injection, I would just print whatever the user wrote, but chop or break it so that it cannot destroy your layout. If you filter out repeats of a single character (which someone entered to break your layout), they will simply write "abababababababababa" next time, and you are back where you started.
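One possible way to "break" the text, sketched here under assumptions of my own: insert a space into any unbroken run of non-whitespace characters longer than some limit, so the browser always has a place to wrap. The names LongRunBreaker, BreakLongRuns and maxRun are hypothetical, and a zero-width space or a CSS wrapping rule could be used instead of a visible space.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LongRunBreaker
{
    // Hypothetical helper: after every maxRun consecutive non-whitespace
    // characters that are followed by yet another non-whitespace character,
    // insert a space so the text can wrap at that point.
    public static string BreakLongRuns(string input, int maxRun)
    {
        if (maxRun < 1) throw new ArgumentOutOfRangeException(nameof(maxRun));

        // \S{n} matches n non-whitespace characters; the lookahead (?=\S)
        // makes sure the run actually continues, so nothing is appended
        // to words that already end at the limit.
        string pattern = @"\S{" + maxRun + @"}(?=\S)";
        return Regex.Replace(input, pattern, m => m.Value + " ");
    }
}
```

With a limit of 5, BreakLongRuns("abababababab", 5) would give "ababa babab ab", and "short words" would pass through unchanged. This handles "ababab..." input as well as a single repeated character, because it does not care which characters make up the run.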

Upvotes: 2
