I am trying to standardise on a single text markup language.
Over the years I have used many.
- Standard Format Markers for Scripture (\v, \c etc)
- html
- xml
- reStructuredText
- Various Wiki dialects
- Textile
- Almost Free Text
There are others available, such as Markdown. Wikipedia has a list.
To standardise, I want to be able to use this one format for my blog, comments on the blog, meeting notes and web site content management. I am not very interested in wysiwyg for this, as it is structure that I want to mark, not format (and I want to be able to use this on my phone, off-line on my laptop, on my Sun Ray and from any web browser).
My needs are:
- I want to be able to enforce well-formed xhtml. For me that means no embedded html and automatic closing of all tags (important in weblog comments).
- I also want to be able to type this fast. I am not too worried about how the text format itself looks, as I will be transforming it into xhtml for presentation, and I want to do that from anywhere.
- I need a standalone conversion so that I can type into an editor, convert it to xhtml (or html) and paste it into tools that only accept html.
- For my ease of hacking there should be a working and supported Python implementation.
- I want to be able to set html class and id for markup so that I can format with css easily. Important for CMS.
- I want to be sure nobody can embed any nasty javascript etc into it. (important for weblog comments and CMS). Not allowing any embedded html is the easiest way.
- I want to be able to type this into my Nokia notepad using my wonderful nokia bluetooth keyboard. Handy for seminar and meeting notes.
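The "no embedded html" requirement above can be sketched in Python with nothing but the standard library. This is only an illustration, assuming the escaping is applied to untrusted text before any markup converter sees it; escape_untrusted is a made-up name and not part of pytextile or any Markdown implementation.

```python
import html


def escape_untrusted(text: str) -> str:
    """Neutralise raw HTML in user-supplied markup (hypothetical helper)."""
    # html.escape turns &, < and > into entities, so tags become plain text.
    # quote=False leaves " alone, because Textile links use "text":url.
    return html.escape(text, quote=False)


comment = 'Nice post! <script>alert("x")</script>'
safe = escape_untrusted(comment)
print(safe)
```

One caveat: some markups give > a meaning of their own (Markdown block quotes, for example), so in practice a converter's own safe mode is probably the better place for this step.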
So far my thoughts are:
- reStructuredText. I find the headings a pain: I have to remember which arbitrary character was used for each level and type the correct number of them. It has some support for DocBook, which is nice, if probably only in a theoretical way for me.
- Markdown. I don't like having to type numbers for enumerated lists. I can't see how to set class and/or id for elements without using html. It is too supportive of embedded html.
- Textile. I like the pytextile implementation. The markup is very brief. It should be easy to block html and pytextile already supports a safe mode which may be enough.
- Wiki formats. There are too many of them, and it is generally difficult to add class and id to markup. Not suited to standalone use.
So it looks like Textile at present, with Markdown probably second and reStructuredText third.
This research does not include real use of implementations of all of these. reStructuredText and Textile are the only ones I have used both on websites and with standalone Python tools. If you use the blog search you can find some earlier writings by me (try searching for wiki or textile or restructured text).
Anything better? Any other thoughts?

# select count(*) from document where type_id = 42 and creation_date > '2004-03-01';
count
-------
1,748
That's how many pytextile blog entries (a business blog) I have typed in over the past 17 months; another seven hundred or so by a colleague.
# select sum(length(content)) from content where creation_date > '2004-03-01';
sum
----------
43,091,842
Hmm, that's a lot of typing, and no doubt no small amount of cutting and pasting, which leads me to why I am going to use Markdown for a while.
I generally like pytextile; it's capable. My particular use of pytextile (and perhaps pytextile itself, I've not looked into this yet) is somewhat brain dead in that it really messes up included HTML (some, but not all). Let's say I paste in a complex table or something: it leaves the html as is, except that it inserts a ton of br elements, one on each line break.
I am, however, about to take a detour and use Markdown for a while. It's particularly good about not adding br in fixed-width pasted text where it does not belong.
I'd recommend the latest pymarkdown... it adds the ability to do footnotes (that was what was holding me back, I'm a big user of them) and attributes in a simple way.
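To make the br problem concrete, here is a toy Python illustration. It is not pytextile's actual code, just the simplest possible line-break rule (naive_breaks is a made-up name), applied to a pasted table.

```python
def naive_breaks(text: str) -> str:
    """Turn every newline into <br />, even inside pasted HTML (toy rule)."""
    return text.replace("\n", "<br />\n")


pasted = "<table>\n<tr><td>a</td></tr>\n</table>"
broken = naive_breaks(pasted)
# Each line break inside the table now carries a stray <br />.
print(broken)
```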
http://mikewatkins.net/categories/technical/pymarkdown-09.html
Currently I store converted html; in the future I will retain the markup (markdown, pytextile, rest or whatever) as is and cache a rendered page on disk. That is pretty trivial, a lot faster to serve up, and lets me do updates more easily instead of wading through html after the item is published.
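A minimal sketch of that keep-the-markup, cache-the-rendering idea, in Python. Everything named here is an assumption for illustration: render_cached, toy_render and the cache layout are invented, and toy_render merely stands in for a real converter such as pytextile or Markdown.

```python
import hashlib
import tempfile
from pathlib import Path


def render_cached(source: str, render, cache_dir: Path) -> str:
    """Return rendered html for source, reusing a cached file if present."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key the cache on the markup itself, so editing an entry produces
    # a new key and the stale rendering is simply never read again.
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    cached = cache_dir / f"{key}.html"
    if cached.exists():
        return cached.read_text(encoding="utf-8")
    rendered = render(source)
    cached.write_text(rendered, encoding="utf-8")
    return rendered


def toy_render(source: str) -> str:
    # Hypothetical stand-in for a real markup converter.
    return "".join(f"<p>{para}</p>" for para in source.split("\n\n"))


calls = []


def counting_render(source: str) -> str:
    # Records each real conversion so the cache hit can be observed.
    calls.append(source)
    return toy_render(source)


cache = Path(tempfile.mkdtemp())
first = render_cached("Hello\n\nWorld", counting_render, cache)
second = render_cached("Hello\n\nWorld", counting_render, cache)
print(first, len(calls))
```

If more than one markup is in play, the formatter name would need to be part of the key as well.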
Posted by: Mike Watkins | Thursday, October 13, 2005 at 12:44 AM
And... I must admit, I like looking at Markdown formatted text. I could include the text with zero processing in a mail out, for example, or feed it into a text indexer without having to worry about h2. / p. / bq. / @code@ / !someimageurl.png! / "links to things":http://www.someurltoimportant.com/ which probably ought to be cleaned up if the pytextile 'source' is going to be used for other things.
Markdown's format is just so much cleaner, often no work required to repurpose text.
True, ReST has the benefit of docutils and repurposing text can be done easily, but it surely has an obtuse set of formatting commands. I'm undecided here... I have implemented Formatters() for all of them and will see which fits my day-to-day needs best, but I suspect Markdown or pytextile will remain the tool of choice.
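One way such interchangeable formatters could be arranged in Python is a small registry keyed by markup name. This is a guess at the general shape, not the Formatters() code mentioned above; the registry, the decorator and the two toy formatters are all invented for illustration, and real entries would wrap pytextile, Markdown or docutils instead.

```python
import html
from typing import Callable, Dict

# Hypothetical registry mapping a markup name to its converter.
FORMATTERS: Dict[str, Callable[[str], str]] = {}


def formatter(name: str):
    """Decorator that registers a converter under the given markup name."""
    def register(func: Callable[[str], str]) -> Callable[[str], str]:
        FORMATTERS[name] = func
        return func
    return register


@formatter("plain")
def plain_to_html(text: str) -> str:
    # Toy converter: blank-line separated paragraphs, html escaped.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)


@formatter("pre")
def pre_to_html(text: str) -> str:
    # Toy converter: everything as one preformatted block.
    return f"<pre>{html.escape(text)}</pre>"


def render(markup: str, text: str) -> str:
    """Look up the converter for markup and apply it to text."""
    try:
        convert = FORMATTERS[markup]
    except KeyError:
        raise ValueError(f"no formatter registered for {markup!r}") from None
    return convert(text)


print(render("plain", "a < b\n\nc"))
```

Storing the markup name alongside each entry would then be enough to switch formats per document.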
Posted by: Mike Watkins | Thursday, October 13, 2005 at 12:51 AM
I'm really surprised DocBook isn't in the list. It's certainly more heavyweight than anything you have already mentioned, but it has the benefit of having structures for most anything you'd expect to want to represent in a technology focused blog, and it easily converts into XHTML (or plain ole HTML) using the DocBook-XSL stylesheets. I know Norman Walsh (DocBook's primary author) uses it, along with some pretty slick RDF tools and some other scripting, to manage his entire site.
Personally, I'm still giving Plone a shot, so for now, I'm using ReST for my site (but DocBook for all my other documentation and most of my other markup needs).
Posted by: William McVey | Monday, October 17, 2005 at 02:54 AM
William,
The reason for not using DocBook is that I am not the only author involved. The websites need to allow updates by several people who will not be familiar with html, xml etc., so DocBook would be far too intimidating.
Dave
Posted by: DaveW | Monday, October 17, 2005 at 09:53 AM
"python-markdown" is now modular, so you can extend it with your own features.
Posted by: karl | Sunday, November 06, 2005 at 03:19 AM