MorkaLork Development

Interesting stuff I've picked up over the years...


2009-06-10 13:38:55 | 4646 views | bbcode safety html converting preg_replace

What is BBCode?

BBCode stands for Bulletin Board Code and is a way to make your forum, guestbook or similar safer when you want users to be able to format their text. The problem with letting people use normal HTML tags is that it opens the door to possible hacks (using JavaScript) or layout destruction (using CSS). With BBCode you decide what a user can and can't do.

Normally in a forum, a user has no need to run JavaScript or change the layout. All they need is perhaps bold text to emphasize a word or italic text to mark a quote. With BBCode you, as admin (or whatever), get total control over what a user can do, and how it's done.

This article will show how to implement the easiest BBCode (bold, italic, underlined and header text), and will also cover more complex BBCode, such as letting a user enter a URL, an image, or even smileys.

Of course, before giving the user too much freedom, ask yourself what the visitors of your site need. Too much BBCode may have unwanted effects, as users start entering images just because they can and put smileys everywhere. If it's just a guestbook, perhaps bold, italic and underline will do.

How is it done?

Well, it's usually done with a mix of PHP and JavaScript. This tutorial will show you the PHP part, since that's where the actual conversion is done. The JavaScript part is normally used to create buttons that enter the BBCode for the user (push the image button and the marked text is magically wrapped in image tags), but this is purely graphical and provides no help when it comes to actually working with the BBCode. Many users today are aware of BBCode's existence and can enter the [b] themselves. If you want a GUI (graphical user interface) for your users, look in the JavaScript section of morkalork for more information.

In this example we will pretend that you have a guestbook and you give your visitors the ability to bolden text with BBCode. What's the first step? Well, the user enters the text, say:

Hello World!
My name is maffelu and I love this page!
For the horde!

Well, hopefully your guestbook works and the visitor seems to be an idiot =D.
When this is entered it is most likely saved in a database somewhere. Now what if they want to enter the text with some effects? Perhaps "Hello World!" is meant to be a title, and the user wants to bolden their name and really point out that they love this page. And what if "For the horde!" is a quote? Well, maybe they'd want to enter it like this:

[title]Hello world![/title]
My name is [b]maffelu[/b] and I [u]love[/u] this page!
[i]For the horde![/i]

And then they'd expect it to look like this:

Hello world!

My name is maffelu and I love this page!
For the horde!

This is BBCode and its effects. It gives your users a totally different experience on your page. It feels more like typing in some sort of Office package.

So what happens behind the curtain?
Well, this is what we'll look into in the next chapter.



The Code

What you need to know before starting to work with BBCode is PHP =() and a little bit about regular expressions.
Really, what we do is convert [b] to <b> and [i] to <i>. This can all be done with str_replace, you might think. Yes, it can. However, how do you convert [img=] to <img src="">? That's not as easy. So, I'll show you two ways: one if you just want the typical formats such as bold and italic, and one if you want to get more complex.

The simple way

You could just replace [b] with <b> using str_replace. This function would do just that:

function simpleBBcode($text)
{
    //This is what we're looking for
    $search = array("[b]", "[/b]", "[i]", "[/i]", "[u]", "[/u]", "[title]", "[/title]");
    //This is what we replace it with
    $replace = array("<b>", "</b>", "<i>", "</i>", "<u>", "</u>", "<h2>", "</h2>");
    //We use str_replace to make it easy
    $text = str_replace($search, $replace, $text);
    //We return the formatted text
    return $text;
}

Using this function like this:

$text = '[title]Hello world![/title]<br />';
$text .= 'My name is [b]maffelu[/b] and I [u]love[/u] this page!<br />';
$text .= '[i]For the horde![/i]<br />';

$text = simpleBBcode($text);
echo $text;

This will now output properly formatted text in HTML.
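
One thing worth seeing before moving on: str_replace converts every tag on its own, so an unclosed [b] still becomes a <b> that is never closed. A minimal sketch of that, where the function name strReplaceBold is made up just for this illustration:

```php
<?php
// Hypothetical helper for this illustration: the same idea as simpleBBcode,
// reduced to the bold tag only.
function strReplaceBold($text)
{
    $search  = array("[b]", "[/b]");
    $replace = array("<b>", "</b>");
    return str_replace($search, $replace, $text);
}

// A complete pair converts as expected.
echo strReplaceBold('[b]maffelu[/b]'), "\n";

// An unclosed tag is converted too, leaving a <b> that is never closed,
// so the rest of the page after it could end up bold.
echo strReplaceBold('I [b]love this page!'), "\n";
```

For a small guestbook this may be acceptable, but it is the main reason to look at the regexp approach.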



The proper way

The real way is to use preg_replace instead of str_replace. Using regexp we can capture any text between [b] and [/b], which makes the script more logical since it will only convert a complete [b]text[/b] pair, not an unclosed [b]text.

A real bbcode function could look like this:

function bbcode($text)
{
    //This is what we're looking for
    $search = array(
        '/\[b\](.*?)\[\/b\]/is', //Bold text
        '/\[i\](.*?)\[\/i\]/is', //Italic text
        '/\[u\](.*?)\[\/u\]/is', //Underlined text
        '/\[title\](.*?)\[\/title\]/is' //Header text
    );

    //This is what we replace it with
    $replace = array(
        '<strong>$1</strong>', //Bold text
        '<em>$1</em>', //Italic text
        '<u>$1</u>', //Underlined text
        '<h2>$1</h2>' //Header text
    );

    //We use preg_replace since we're dealing with regexp
    $text = preg_replace($search, $replace, $text);

    //We return the formatted text
    return $text;
}

This is more logical and its use will become more clear as we move onto more complex grounds.
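
To see the difference in practice, here is a small sketch of the regexp approach on a few inputs. The function name regexBBcodeDemo and the two patterns are assumptions for this illustration (a typical way to write such patterns), not necessarily the exact ones used on this site:

```php
<?php
// Hypothetical demo function, limited to bold and italic.
function regexBBcodeDemo($text)
{
    // Each pattern needs an opening AND a closing tag to match.
    $search = array(
        '/\[b\](.*?)\[\/b\]/is',
        '/\[i\](.*?)\[\/i\]/is',
    );
    $replace = array(
        '<strong>$1</strong>',
        '<em>$1</em>',
    );
    return preg_replace($search, $replace, $text);
}

// A complete pair is converted.
echo regexBBcodeDemo('My name is [b]maffelu[/b]'), "\n";

// An unclosed tag does not match, so it is left exactly as typed.
echo regexBBcodeDemo('I [b]love this page!'), "\n";

// Nested tags work because the patterns are applied one after another.
echo regexBBcodeDemo('[b][i]For the horde![/i][/b]'), "\n";
```

preg_replace accepts arrays for both the patterns and the replacements and applies them in order, which is what makes the nested case come out right.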


The right way
Can handle all sorts of bbcode->html conversion


More complex

How it works

What we do with preg_replace is that we search for a tag and an end-tag and then we replace it like this: [starttag]Whatever is in-between[endtag]
So, how do we find everything in between? With our regexp!
This is how the regexp for finding bold formatting looks:
'/\[b\](.*?)\[\/b\]/is'

I'm not gonna go too deep on regexp; look into it elsewhere for more information. However, the important thing with regexp is that we can use (.*?), which captures everything between the start tag and the end tag. As we can see in the b-tag example above, we look for (.*?), which means that it doesn't matter what exists between the tags: the regular expression collects it all.
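
The question mark in (.*?) matters more than it looks. Without it the group is greedy and runs all the way to the last closing tag in the text. A short sketch to compare the two, using a made-up sample string:

```php
<?php
// Sample input with two separate bold sections.
$text = '[b]one[/b] and [b]two[/b]';

// Lazy (.*?) stops at the first closing tag, so each pair is matched on its own.
preg_match_all('/\[b\](.*?)\[\/b\]/s', $text, $lazy);

// Greedy (.*) runs to the last closing tag and swallows everything in between.
preg_match_all('/\[b\](.*)\[\/b\]/s', $text, $greedy);

print_r($lazy[1]);   // two captures: "one" and "two"
print_r($greedy[1]); // one capture: "one[/b] and [b]two"
```

With the greedy version the whole middle of the sentence would end up bold, which is almost never what the user meant.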

This might clear up how to trap a link such as this: [url=http:///][/url], since it has to be converted to <a href=""></a>. This could not be done with str_replace, since it gives you no way to trap the URL and the link text. But with a regular expression we could do it like this:
'/\[url=(.*?)\](.*?)\[\/url\]/is'
This would catch two things: the URL and the link text. We could use them like this:
'<a href="$1" target="_blank">$2</a>'
As we can see, the collected values can be referred to as $1, $2 etc., depending on how many groups have been collected.
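
Here is a small sketch of the two captures in action. The pattern is an assumption about how such a rule is typically written, and http://example.com is only a placeholder address:

```php
<?php
// Assumed pattern: group 1 is the URL after "url=", group 2 is the link text.
$pattern     = '/\[url=(.*?)\](.*?)\[\/url\]/is';
$replacement = '<a href="$1" target="_blank">$2</a>';

// Placeholder input for the illustration.
$text = 'Visit [url=http://example.com]my site[/url]!';

// $1 and $2 in the replacement are filled in from the two groups.
$html = preg_replace($pattern, $replacement, $text);
echo $html, "\n";
```

The order of the groups in the pattern decides the numbering, so swapping $1 and $2 in the replacement would put the link text in the href.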

In the next chapter I'll show example code on a good, normal BBCode generator.



Full code

All right, here is a short, proper BBCode function that will handle most BBCode you might want. It will also give you an idea of how to add more BBCode tags:

function bbcode($text)
{
    //This is what we're looking for
    $simple_search = array(
        '/\[url=(.*?)\](.*?)\[\/url\]/is',
        '/\[color=(.*?)\](.*?)\[\/color\]/is',
        '/\[size=(.*?)\](.*?)\[\/size\]/is',
        '/\[img\](.*?)\[\/img\]/is'
    );

    //This is what we replace it with
    $simple_replace = array(
        '<a href="$1" target="_blank">$2</a>',
        '<span style="color:$1">$2</span>',
        '<span style="font-size:$1px">$2</span>',
        '<img src="$1" alt="image$1" border="0">'
    );

    //We use preg_replace since we're dealing with regexp
    $text = preg_replace($simple_search, $simple_replace, $text);

    //We return the formatted text
    return $text;
}

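Here is a table to show the tags, their HTML counterparts and what you get:

BBCode                       | HTML                                             | Result
-----------------------------+--------------------------------------------------+----------------
[b]bold text[/b]             | <b>bold text</b>                                 | bold text
[i]italic text[/i]           | <i>italic text</i>                               | italic text
[u]underlined text[/u]       | <u>underlined text</u>                           | underlined text
[url=]Google[/url]           | <a href="">Google</a>                            | Google
[color=#F00]Red [/color]text | <span style="color: #F00;">Red </span>text       | Red text
[size=10]Small[/size] text   | <span style="font-size: 10px;">Small</span> text | Small text
[title]A title[/title]       | <h2>A title</h2>                                 | A title
[img]Smiley6.gif[/img]       | <img src="Smiley6.gif">                          | image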
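
Since the whole point of BBCode is safety, one more step is worth sketching: converting tags does not by itself stop a user from typing raw HTML, so the text should be escaped before the tags are converted. The function name safeBBcode and the stricter URL pattern below are assumptions for this illustration, not part of the code above:

```php
<?php
// Hypothetical wrapper: escape raw HTML first, then convert a couple of tags.
function safeBBcode($text)
{
    // Turn <, >, & and quotes into entities so typed HTML is shown, not run.
    $text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    $search = array(
        '/\[b\](.*?)\[\/b\]/is',
        // Assumed rule: only accept URLs that start with http:// or https://
        // and contain no whitespace or closing bracket.
        '/\[url=(https?:\/\/[^\s\]]+)\](.*?)\[\/url\]/is',
    );
    $replace = array(
        '<strong>$1</strong>',
        '<a href="$1" target="_blank">$2</a>',
    );

    return preg_replace($search, $replace, $text);
}

// Raw HTML is escaped, BBCode is still converted.
echo safeBBcode('<script>alert(1)</script>[b]hi[/b]'), "\n";

// A javascript: link does not match the URL rule and is left as plain text.
echo safeBBcode('[url=javascript:alert(1)]click[/url]'), "\n";
```

The order matters: escaping after the conversion would also escape the <strong> and <a> tags that were just generated.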

Hopefully this has been helpful, please leave a message if you have any questions...
