ncyoung.com

Update: This part of my website is an archive of old content. Links may be broken, etc.

Nathan's Markup Language (NML)

Introduction


Even the most optimistic of souls could characterize XML's satisfaction of the array of business needs for which it was conceived as lackluster at best. I myself feel comfortable characterizing it as a complete failure. This afternoon I sat down and created its replacement, and I'm glad I finally took the time to do so, because the whole technical community will benefit.

In the spirit of many recently named "vanity algorithms" I've named my new markup language after myself. I think this language will be enshrouded in history's chronicles and I as its creator deserve to have my name go down with it. This is in stark contrast to the obscure XSL hacks named after mailing list contributors, or worse yet the CSS hacks people named after themselves (and here I have to say WAKE UP!! do you realize you're basically naming a freaking BROWSER BUG after yourself???)

[ahem]

Realities addressed



XML has failed on the following counts, all of which NML addresses:
  • validation
  • namespaces
  • escaping
  • internationalization


version



Since I have so carefully thought through all the possible ramifications of the sections below, I'm starting the version numbering of this spec directly at 1.1.

Update: People have pointed out some minor problems with the specification below. Although they have all been easily addressed so far, I'm a little less confident that this spec is final. Therefore I'm resetting the current version of the spec to .85.

tag delimiters



Curly brackets ("{" and "}") are used to wrap NML element tags. Curly brackets are obviously more cool and complex than square brackets ("square" brackets... need more be said???) and infinitely better than "<" and ">". Why? Because "<" and ">" are not matched parentheticals at all!!! They are the mathematical symbols for FREAKING GREATER THAN AND LESS THAN!!! Markup community please GET A CLUE!!!!

If your input has curly brackets in it you may escape them by replacing them with round brackets ("(" and ")").
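A minimal sketch of the bracket escaping rule, assuming it applies to every literal curly bracket in the input. The function name `escape_braces` is made up for illustration and is not part of the spec.

```python
def escape_braces(text: str) -> str:
    """Replace literal curly brackets with round brackets, per the rule above."""
    # str.maketrans builds a one-to-one character mapping for translate().
    return text.translate(str.maketrans({"{": "(", "}": ")"}))


print(escape_braces("set {a, b}"))  # set (a, b)
```

Note that the escape is one way: once escaped, a round bracket that was originally curly cannot be told apart from one that was always round.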

end tags



End tags are the same for all elements, namely "{/}". There is no rigorous reason to name end tags since NML (like XML) does not allow for tag overlapping. Any language "feature" that is included simply to ease debugging is clearly for wimps and anyone who suggests it should be scorned.

self closing tags


When you have an empty element like {person}{/} you can omit the first bracket of the end tag to save on typing, thus creating a self closing tag like this: {person}/}
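Because every end tag is the same "{/}", a parser only needs a stack to know which element is being closed. The following is a rough sketch under several assumptions: the names `TOKEN` and `parse` are hypothetical, element names are taken to be anything without brackets or slashes, attributes are not split out, and the nested-dictionary tree shape is just one convenient choice.

```python
import re

# Alternatives are tried in order: the self closing form "{name}/}" must come
# before the plain opening form "{name}", or it would never be recognized.
TOKEN = re.compile(
    r"\{(?P<selfclose>[^{}/]+)\}/\}"
    r"|(?P<end>\{/\})"
    r"|\{(?P<open>[^{}/]+)\}"
    r"|(?P<text>[^{}]+)"
)


def parse(source: str) -> dict:
    """Parse NML text into a tree of {"name": ..., "children": [...]} nodes."""
    root = {"name": None, "children": []}
    stack = [root]  # the innermost open element is always stack[-1]
    pos = 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        pos = match.end()
        if match.group("selfclose") is not None:
            stack[-1]["children"].append(
                {"name": match.group("selfclose"), "children": []}
            )
        elif match.group("open") is not None:
            node = {"name": match.group("open"), "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif match.group("text") is not None:
            stack[-1]["children"].append(match.group("text"))
        else:
            # The anonymous end tag closes whatever was opened most recently.
            if len(stack) == 1:
                raise ValueError("end tag with no open element")
            stack.pop()
    if len(stack) != 1:
        raise ValueError("unclosed element")
    return root


tree = parse("{team}{person}/}{name}Nathan{/}{/}")
```

In this example `tree["children"][0]` is the `team` element, holding an empty `person` element followed by a `name` element whose only child is the text "Nathan".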

ASCII escaping


NML supports the full ASCII character set. ASCII characters can either be represented directly by the character or encoded using the ASCII numerical code point preceded by a dollar sign. So for example an exclamation point can either be typed as "!" or encoded as $33. If you need to have a dollar sign in your input document you should escape it by replacing it with %. When using ASCII characters above 127, use the ASCII dot notation to indicate the offset from the standard 127 ASCII ending point. For example, to represent the ASCII character at 138, you'd use "$127.11".
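The arithmetic behind the dot notation can be sketched as follows. The function names are hypothetical, and the upper bound of 255 is an assumption, since the text above does not say where "ASCII characters above 127" stop.

```python
def encode_code_point(code: int) -> str:
    """Return the numeric NML escape for a single code point."""
    if not 0 <= code <= 255:  # assumed range; the spec gives no upper bound
        raise ValueError(f"code point out of range: {code}")
    if code <= 127:
        return f"${code}"
    # Above 127, write the offset from 127 after the dot.
    return f"$127.{code - 127}"


def escape_dollar(text: str) -> str:
    """Replace literal dollar signs with percent signs, as described above."""
    return text.replace("$", "%")


print(encode_code_point(33))   # $33
print(encode_code_point(138))  # $127.11
```

Both printed values match the examples in the paragraph above: 33 is the exclamation point, and 138 is 127 plus an offset of 11.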

Internationalization: the scribble element


The special element {scribble}/} is used when questions about NML support for high bit character sets come up. {scribble}/} will be replaced by random scribbling in the formatted output. By using this generously in some documents, "normal" users will be convinced your application can support fancy languages like Icelandic, Sanskrit, Japanese and French.

case sensitivity


Content in NML documents is not case sensitive and case may or may not be preserved in output. You can force upper case by preceding a character with a forward slash and lower case by preceding a character with a backslash. If you want to include a slash in your document without affecting the case of the following letter, precede the slash with a forward slash, since there is no such thing as an upper case slash.

Letters in element names are case sensitive, with the exceptions of p, q, h and m, which are not case sensitive.
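One way to read the element name rule is to fold only the four exempt letters before comparing names. This is a sketch of that reading; `normalize_name` and `same_element` are invented names, not anything the spec defines.

```python
# The four letters that are exempt from case sensitivity in element names.
CASE_INSENSITIVE_LETTERS = set("pqhm")


def normalize_name(name: str) -> str:
    """Lower-case only p, q, h and m; every other letter keeps its case."""
    return "".join(
        ch.lower() if ch.lower() in CASE_INSENSITIVE_LETTERS else ch
        for ch in name
    )


def same_element(a: str, b: str) -> bool:
    """Compare two element names under the partial case sensitivity rule."""
    return normalize_name(a) == normalize_name(b)


print(same_element("Person", "person"))  # True: only the p differs
print(same_element("Name", "name"))      # False: n is case sensitive
```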

other escaping requirements


~ is a reserved character in NML and should always be preceded by the word "home".

validation



Anyone who has used DTDs, schemas, RELAX NG or Schematron can tell you that validation for XML has utterly failed. In fact the whole idea of strong typing is questionable to start with. You should know what kind of data you have and you should communicate that directly to your users. Validation does not replace communication and in fact it is a crutch for weak business processes.

Experience with powerful and practical programming languages like Perl and JavaScript further reinforces the fact that strong typing in general wastes programmer hours.

Update: People have been whining about the lack of support for validation in NML, so I'm adding the following validation support.

NML supports the most useful functionality of validation, while minimizing unnecessary complexity in the parsing layer and placing the burdens of validation where they belong, on the content author.

The mechanism for this is the optional special attribute "is-valid". This boolean attribute can hold the values "yes" and "no". If the element contents are valid, this attribute should be set to "yes". A "no" value is equivalent to leaving the attribute out. Blank values are ambiguous.

If every node in the document is valid, the "is-valid" attribute should be removed from all of them and replaced with a document level {is-valid}/} element.
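A parser's entire validation layer could look something like the sketch below. It assumes attributes arrive as a plain dictionary, models the "ambiguous" blank value as `None`, and rejects anything other than "yes", "no" or blank. Those three choices and the name `is_valid` are mine, not the spec's.

```python
from typing import Optional


def is_valid(attributes: dict) -> Optional[bool]:
    """Interpret the optional "is-valid" attribute of one element."""
    value = attributes.get("is-valid")
    if value is None or value == "no":
        # Leaving the attribute out is equivalent to "no".
        return False
    if value == "yes":
        return True
    if value.strip() == "":
        # Blank values are ambiguous, so report neither True nor False.
        return None
    raise ValueError(f"unsupported is-valid value: {value!r}")


print(is_valid({"is-valid": "yes"}))  # True
print(is_valid({}))                   # False
print(is_valid({"is-valid": ""}))     # None
```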

namespaces


Document authors in NML are required to choose unique names for each element. This obviates the need for any namespacing mechanism, and I can't believe the creators of XML didn't think of it.

ordinals



Sometimes the document order may not reflect the true desired order of elements. The special attribute "ordinal" can be used to indicate true order of occurrence in these instances. For example, the element {people ordinal="3"}/} should be treated as the third item in the document.
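The spec does not say what happens to elements without an ordinal, so the sketch below makes an assumption: elements with an ordinal claim that 1-based position, and the rest fill the remaining positions in document order. Elements are modeled as hypothetical `(name, attributes)` tuples, and `apply_ordinals` is an invented name.

```python
def apply_ordinals(elements: list) -> list:
    """Reorder (name, attributes) tuples so that ordinals are honored."""
    slots = [None] * len(elements)
    unplaced = []
    for element in elements:
        ordinal = element[1].get("ordinal")
        if ordinal is None:
            unplaced.append(element)
            continue
        index = int(ordinal) - 1  # ordinals are treated as 1-based
        if not 0 <= index < len(slots) or slots[index] is not None:
            raise ValueError(f"unusable ordinal: {ordinal!r}")
        slots[index] = element
    # Fill the gaps with the remaining elements, keeping document order.
    remaining = iter(unplaced)
    return [slot if slot is not None else next(remaining) for slot in slots]


doc = [("people", {"ordinal": "3"}), ("intro", {}), ("summary", {})]
print([name for name, _ in apply_ordinals(doc)])
# ['intro', 'summary', 'people']
```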

output formatting


Formatting engines should support the special NML attribute "style-as". This allows document authors flexibility as to how their content will be formatted. For example, in the NML version of XHTML, {span style-as="div"} should be formatted as a div, while {b style-as="i"} should come out italic.
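For a formatting engine, honoring "style-as" amounts to one lookup with a fallback. A sketch, assuming attributes arrive as a dictionary and using the made-up name `effective_style`:

```python
def effective_style(name: str, attributes: dict) -> str:
    """Return the element name the formatter should actually style as."""
    # Fall back to the element's own name when "style-as" is absent.
    return attributes.get("style-as", name)


print(effective_style("span", {"style-as": "div"}))  # div
print(effective_style("b", {"style-as": "i"}))       # i
print(effective_style("p", {}))                      # p
```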

inclusion


NML's built-in inclusion mechanisms are simple and powerful. The inclusion element is simply an empty self closing NML tag like so: {}/}. The first time this tag is encountered, the parser should build a list of files in the document's directory and all subdirectories. The file list should be alphabetized, and the contents of the first file substituted for the {}/} tag. The next time the tag is encountered, the second file in the list is used and so on. For performance reasons, all files in the list should be loaded and parsed at the first include tag encountered.
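The inclusion procedure above can be sketched as a small stateful helper. Everything here is an assumption layered on the description: the class name `Includer`, alphabetizing by path relative to the document's directory, reading files as text, and skipping the parse step entirely.

```python
from pathlib import Path


class Includer:
    """Hands out file contents, one per include tag, in alphabetical order."""

    def __init__(self, document_path):
        self.root = Path(document_path).parent
        self._contents = None  # loaded lazily at the first include tag
        self._next = 0

    def next_include(self) -> str:
        if self._contents is None:
            # Walk the document's directory and all subdirectories.
            files = sorted(
                (p for p in self.root.rglob("*") if p.is_file()),
                key=lambda p: p.relative_to(self.root).as_posix(),
            )
            # "For performance reasons", load every file up front.
            self._contents = [p.read_text(errors="replace") for p in files]
        if self._next >= len(self._contents):
            raise LookupError("more include tags than files")
        text = self._contents[self._next]
        self._next += 1
        return text
```

Calling `Includer("docs/spec.nml").next_include()` (a hypothetical path) would return the contents of the alphabetically first file under `docs/`. Since the including document lives in its own directory, it appears in the list too and will eventually include itself.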

executable content


The % tag delimits executable content. Any text between % and % will be executed at parse time. Executable content can be written in any scripting language. The parser should run through all the interpreters installed on the client machine and try to execute the string with each of them, thus providing the greatest likelihood the commands will get executed at least once. Output from the command should be discarded or written to a numbered file in the user's temp directory.

An MS Outlook plug-in for this functionality is already available and runs immediately when the message is received.

security



Security problems are in general the problem of application developers and end users. Any user who can't secure their own machine should not be allowed to own a computer, and in fact should be taken out into the desert and forced to generate their own 256 bit MD5 keys using only an abacus with lifesavers for wheels while surrounded by ants. At night the ice weasels come.

performance



When coding parsers for NML, developers are encouraged to make them faster and simpler than the equivalent XML parsers. Of course there will be some developers who write poorly performing code but they will be sternly reprimanded by the creator of NML and (more importantly) censured by the vast NML user community.

killer apps



I'm working on translating my XML based csv replacement to use NML instead.

There are also rumours that the national polo league is going to use NML to represent information about national disasters.




Dated: 10/14/2005