Thought Experiment: XML integrated into a C-like language
Over the last few days I ran the following thought experiment: how could XML processing be made easier by integrating XML support into a Java/C#-like programming language?
I created the code snippet below to try out what such a language could look like. The syntax of this theoretical language:
- Adds two hybrid-base types called Node and NodeList to the language. Hybrid means that they are Objects like java.lang.String in Java, but have their own literals and operators.
- Node is similar to a DOM node, but uses the XPath data model (no DTD/doctype, no entities, no CDATA sections, everything normalized)
- Using the index operator ([]), an XPath expression can be evaluated on a Node; the result is a NodeList
- A node has the operators += (add as a child), + (create a node list of the two nodes), -= (remove node from children) and << (replace the node)
- A NodeList is a list of references to nodes. It has operators like +, += (append a node list) and << (replace all nodes)
- A normal XML node literal is enclosed in [[ ]] brackets. To avoid unnecessary escaping, you can use more than two brackets, e.g. [[[[ <element/> ]]]].
- A Perl-string-like XML node expression that allows the insertion of base types is enclosed in single brackets [ ]. This would be a simple node with content: [ <text>Blabla ${somevariable} $anothervariable</text> ]. Variables can be Nodes, NodeLists, Strings, numbers, and so on.
- You can cast any Node to a NodeList. A NodeList can be cast to a Node, but the cast throws an exception when the list has more than one member
- Nodes can be implicitly cast to Strings
- Strings can be implicitly cast to (text) nodes
- The keyword prefix defines an XML namespace prefix for use in XML node literals and XPath expressions. It can be used anywhere you can declare a const variable, and has the same scoping rules
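For comparison, here is a rough sketch of what the index operator, the prefix keyword and the NodeList-to-Node cast correspond to in plain Java with the standard javax.xml APIs. The class XmlSketch and its methods parse, select and single are hypothetical names made up for this illustration, and returning null for an empty list is an assumption, since the list above only defines the case of more than one member.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XmlSketch {

    // Roughly what a [[ ... ]] node literal would have to do at runtime.
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for prefixed names in XPath
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Roughly node[xpath]: evaluate an XPath expression against a context node.
    // The prefixes map stands in for the "prefix" keyword.
    public static NodeList select(Node context, String expression,
                                  Map<String, String> prefixes) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                String uri = prefixes.get(prefix);
                return uri != null ? uri : XMLConstants.NULL_NS_URI;
            }
            // Reverse lookups are not needed for XPath evaluation in this sketch.
            public String getPrefix(String uri) { return null; }
            public Iterator<String> getPrefixes(String uri) {
                return Collections.<String>emptyIterator();
            }
        });
        return (NodeList) xpath.evaluate(expression, context, XPathConstants.NODESET);
    }

    // Roughly the NodeList-to-Node cast: fails when the list has more than one member.
    public static Node single(NodeList list) {
        if (list.getLength() > 1) {
            throw new IllegalStateException("NodeList has " + list.getLength() + " members");
        }
        // Assumption: an empty list yields null; the post leaves this case open.
        return list.getLength() == 0 ? null : list.item(0);
    }

    public static void main(String[] args) throws Exception {
        // Unlike the hypothetical language, plain XML has to declare the prefix itself.
        String xml = "<mascotList xmlns:ageext='urn:mascot-age-extension'>"
                + "<mascot><name>Tux</name><ageext:age>8</ageext:age></mascot>"
                + "<mascot><name>Konqi</name><ageext:age>3</ageext:age></mascot>"
                + "</mascotList>";
        Document doc = parse(xml);
        Map<String, String> prefixes =
                Collections.singletonMap("ageext", "urn:mascot-age-extension");
        NodeList young = select(doc, "/mascotList/mascot[ageext:age < 4]/name", prefixes);
        System.out.println(single(young).getTextContent()); // prints Konqi
    }
}
```

Most of the code is namespace and factory plumbing, which is exactly the part the proposed syntax would hide behind [] and prefix.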
The example assumes that you are familiar with XPath. Don't expect the code to be really useful; it's just to get a feel for the syntax. I think I could get used to something like this...
    class Test {
        // namespace prefix, usable in node literals and XPath expressions in this class
        prefix ageext "urn:mascot-age-extension";

        static void main() {
            Node mascots = [[
                <mascotList>
                    <mascot>
                        <name>Tux</name>
                        <species>Penguin</species>
                        <project>Linux</project>
                        <ageext:age>8</ageext:age>
                    </mascot>
                    <mascot>
                        <name>Konqi</name>
                        <species>Dragon</species>
                        <project>KDE</project>
                        <ageext:age>3</ageext:age>
                    </mascot>
                </mascotList>
            ]];
            workWithMascots(mascots, 4);
        }

        static void workWithMascots(Node mascots, int minimumAge) {
            // replace every mascot younger than minimumAge with minimumAge (as a text node)
            mascots[/mascotList/mascot[ageext:age < $minimumAge]] << minimumAge;

            // add a summary child to each remaining mascot
            NodeList n = mascots[/mascotList/mascot];
            foreach Node i in n {
                Node summary = [ <summary>${i[name]} is a ${i[species]} and the mascot of ${i[project]}</summary> ];
                i += summary;
            }

            // print all mascots
            int num = 0;
            foreach Node i in n {
                num++;
                Console.println([Mascot Number $num: ${i[summary]}]);
            }
        }
    }
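To give a feel for what the += and << operators in the example would save, here is a minimal sketch of their closest counterparts in today's Java DOM API. The class NodeOps and its methods appendElement, replaceWithText and parse are hypothetical helpers invented for this comparison; they are not part of any library, and they assume the target node already belongs to a document and has a parent.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NodeOps {

    // Roughly "parent += [ <name>text</name> ]": build an element and append it.
    public static Element appendElement(Node parent, String name, String text) {
        Element child = parent.getOwnerDocument().createElement(name);
        child.setTextContent(text);
        parent.appendChild(child);
        return child;
    }

    // Roughly "target << text": swap the node for a text node inside its parent.
    public static void replaceWithText(Node target, String text) {
        Node replacement = target.getOwnerDocument().createTextNode(text);
        target.getParentNode().replaceChild(replacement, target);
    }

    // Stand-in for a node literal: parse a string into a DOM document.
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<mascot><name>Tux</name><species>Penguin</species></mascot>");
        Element mascot = doc.getDocumentElement();

        // i += summary
        appendElement(mascot, "summary", "Tux is a Penguin");
        System.out.println(mascot.getLastChild().getTextContent()); // prints Tux is a Penguin

        // <name>Tux</name> << "Tux"
        replaceWithText(mascot.getFirstChild(), "Tux");
        System.out.println(mascot.getFirstChild().getNodeType() == Node.TEXT_NODE); // prints true
    }
}
```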