I love Regular Expressions

I have a shirt from ThinkGeek that has a regular expression on it. I wear it to work sometimes.

I'm making my web game, and one of the requirements I came up with is to allow HTML formatting tags, but not all HTML. This is a pretty reasonable request.

It's easy to get rid of all HTML tags using a simple regular expression in Java. It's

<[/]?.*?>

Just do this:

String regex = "<[/]?.*?>";
// Strings are immutable, so replaceAll returns a new String; keep the result.
String stripped = myHtmlFilledString.replaceAll(regex, "");


This is elaborate. I actually wrote an app so I could upload screenshots. The Java source code is here.

What I needed to do, for now, is replace all HTML except those basic HTML tags that do formatting... essentially to bold, italicize and underline. You can think of it methodically, easily. "Go through, if it's an HTML tag but it's not bold, italic, or underline, erase it." But I'm lazy. Here's the shot of my test application replacing all HTML in an input string. The input string is on top, the result is in the middle, and the regular expression used is on the bottom. The button simply performs that replacement.
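For comparison, the methodical version quoted above ("go through, if it's an HTML tag but it's not bold, italic, or underline, erase it") could look roughly like the sketch below. This is not the code from my test application. The class name `MethodicalTagFilter`, the `filter` method, and the tag-matching pattern are all made up for illustration, and it assumes Java 9 or later for `Set.of`. It also assumes that an allowed tag should be written back without its attributes.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original test application.
final class MethodicalTagFilter {
    // Tag names to keep: bold, italic, underline.
    private static final Set<String> ALLOWED = Set.of("b", "i", "u");

    // Group 1 is the optional closing slash, group 2 is the tag name.
    private static final Pattern TAG =
            Pattern.compile("<(/?)\\s*([A-Za-z][A-Za-z0-9]*)[^>]*>");

    static String filter(String html) {
        Matcher m = TAG.matcher(html);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the plain text that sits between the previous tag and this one.
            out.append(html, last, m.start());
            String name = m.group(2).toLowerCase();
            if (ALLOWED.contains(name)) {
                // Re-emit the allowed tag in a bare form, dropping any attributes.
                out.append('<').append(m.group(1)).append(name).append('>');
            }
            last = m.end();
        }
        // Copy whatever text follows the last tag.
        out.append(html, last, html.length());
        return out.toString();
    }
}
```

Calling `MethodicalTagFilter.filter("<p>Hi <b>there</b> <img src=\"x.png\"><i>you</i></p>")` returns `Hi <b>there</b> <i>you</i>`. It is more code than a single regex, which is exactly why I went looking for the lazy way.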



Here is the regex I found to replace "not allowed" HTML tags... those that aren't bold, underline and italics (for now).

<[/]?[^/uUbBiI].*?>

I basically went through hell to get that though. It's fun but man it's a mind bender to think in terms of regular expressions. Here's the screenshot of that regex.
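One caveat worth showing in code: the character class only looks at the first letter of the tag name, so the regex also leaves alone any other tag that starts with b, i, or u, such as `<img>`, `<br>`, or `<ul>`. The sketch below runs the regex from this post next to a stricter variant that uses a negative lookahead. The stricter pattern is an assumption added here for illustration and is not the one in the screenshots. It also removes allowed tags that carry attributes. The class name `TagWhitelistDemo` and the `strip` helper are hypothetical.

```java
// Hypothetical demo class comparing the post's regex with a stricter variant.
final class TagWhitelistDemo {
    // The regex from this post: skips any tag whose name starts with b, i, or u.
    static final String ORIGINAL = "<[/]?[^/uUbBiI].*?>";

    // Assumed stricter variant (not from the post): the lookahead refuses to
    // match only when the whole tag is exactly b, i, or u, opening or closing.
    static final String STRICTER = "(?i)<(?!/?[bui]\\s*>)[^>]*>";

    static String strip(String html, String regex) {
        // replaceAll returns a new String with every match removed.
        return html.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String input = "<p>Hi <b>there</b> <img src=\"x.png\"><i>you</i></p>";
        // prints: Hi <b>there</b> <img src="x.png"><i>you</i>
        System.out.println(strip(input, ORIGINAL));
        // prints: Hi <b>there</b> <i>you</i>
        System.out.println(strip(input, STRICTER));
    }
}
```

For a game where players type the text, the stricter variant is probably the safer default, since the first-letter check would let an `<img>` or `<iframe>` through.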



That is all.
