How to Convert JSON to XML using ANTLR

In the previous article I showed how to convert JSON to XML using XSLT 2.0 capabilities.

The problem with implementing parsers in XSLT is the conversion from a flat structure to a tree structure. XSLT was simply not designed for that kind of conversion. For example, the JSON to XML transformation uses an XML pipeline of three modes (mode1, mode2, mode3) to build a tree structure from the sequence of tokens generated by a regexp in mode0.
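To make the "flat structure to tree structure" step concrete, here is a minimal Java sketch of the same folding done with an explicit stack, which is the kind of mutable state XSLT lacks. The class name TokenTree, the node labels and the token list are illustrative assumptions, not code from the previous article; the sketch also does not check that a closing bracket matches the kind of bracket it closes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class TokenTree {
    /** A tree node: a label plus ordered children. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) { this.label = label; }

        /** Renders the subtree as label(child child ...). */
        @Override public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder(label).append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(' ');
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    /** Folds a flat token sequence into a tree; brackets open and close nested nodes. */
    static Node build(List<String> tokens) {
        Deque<Node> open = new ArrayDeque<>();   // stack of currently open containers
        open.push(new Node("root"));
        for (String t : tokens) {
            switch (t) {
                case "{": case "[": {
                    Node n = new Node(t.equals("{") ? "object" : "array");
                    open.peek().children.add(n);
                    open.push(n);
                    break;
                }
                case "}": case "]":
                    if (open.size() == 1) throw new IllegalArgumentException("unbalanced " + t);
                    open.pop();
                    break;
                case ",": case ":":
                    break;                       // separators carry no structure in this sketch
                default:
                    open.peek().children.add(new Node(t));
            }
        }
        if (open.size() != 1) throw new IllegalArgumentException("unclosed bracket");
        return open.pop();
    }
}
```

Each push and pop here corresponds to work that the XSLT version has to spread over several passes, because a stylesheet cannot keep such a stack between tokens.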

Converting more complex languages, such as CSS to XML (aka css2xml.xslt), would require even more modes (7, actually; I have a prototype). This leads to fairly complex code and obvious speed and memory issues. The conclusion is that XSLT is not well suited to implementing parsers.

On the other hand, implementing parsers by hand in "regular" programming languages such as C# and Java is complicated as well. The only viable choice is to use a parser generator such as GNU bison, yacc, etc.

Speaking of parser generators, I find ANTLR the best one to use: it supports multiple target languages, has a GUI and a clear organization. The only problem with ANTLR is its weird syntax: lots of cryptic symbols and no English keywords. But that is tolerable, considering how many sample grammars there are on the ANTLR website.

Another "problem" with ANTLR is that it generates an AST (abstract syntax tree), and there is no built-in way to serialize it as XML and then use XSLT to convert the "AST as XML" into the required schema. But that is possible to fix.

So, as a starting point, I took the sample JSON grammar from the Xerial project and enhanced it with parameterized ANTLR rewrite rules, XML_ELEMENT["element-name"] and XML_ATTRIBUTE["attribute-name"]:

json
    : value -> ^(XML_ELEMENT["json"] value)
    ;
 
object
    : '{' (element (',' element)*)? '}'
      -> ^(XML_ELEMENT["object"] element*)
    ;
    
element
    : String ':' value
      -> ^(XML_ELEMENT["element"] ^(XML_ATTRIBUTE["name"] String) value)
    ;    
    
array
    : '[' (value (',' value)*)? ']'
      -> ^(XML_ELEMENT["array"] value*)
    ;
 
    
value
    : String -> ^(XML_ELEMENT["string"] String)
    | Integer -> ^(XML_ELEMENT["integer"] Integer)
    | Double -> ^(XML_ELEMENT["double"] Double)
    | Boolean -> ^(XML_ELEMENT["boolean"] Boolean)
    | object   
    | array   
    | NULL
    ;

Next, I wrote a special AST walker that generates SAX events from the parameters of the XML_ELEMENT and XML_ATTRIBUTE rewrite rules above.
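A rough idea of what such a walker can look like is sketched below. It is not the code from the attached archive: AstNode is a hypothetical stand-in for the ANTLR tree type (in the ANTLR 3 runtime the equivalent information would come from the tree's getType(), getText() and getChild(i)), AstToSax is an invented name, and the sketch assumes the token text has already been stripped of JSON quotes.

```java
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

/** Hypothetical stand-in for an ANTLR AST node: a kind, its text, and children. */
final class AstNode {
    enum Kind { XML_ELEMENT, XML_ATTRIBUTE, TEXT }

    final Kind kind;
    final String text;
    final List<AstNode> children = new ArrayList<>();

    AstNode(Kind kind, String text, AstNode... kids) {
        this.kind = kind;
        this.text = text;
        for (AstNode k : kids) children.add(k);
    }
}

final class AstToSax {
    /** Walks the tree depth-first and fires SAX events on the handler. */
    static void walk(AstNode node, ContentHandler out) throws SAXException {
        switch (node.kind) {
            case XML_ELEMENT: {
                // Collect XML_ATTRIBUTE children first: SAX wants them in startElement.
                AttributesImpl attrs = new AttributesImpl();
                for (AstNode c : node.children) {
                    if (c.kind == AstNode.Kind.XML_ATTRIBUTE) {
                        StringBuilder value = new StringBuilder();
                        for (AstNode t : c.children) value.append(t.text);
                        attrs.addAttribute("", c.text, c.text, "CDATA", value.toString());
                    }
                }
                out.startElement("", node.text, node.text, attrs);
                for (AstNode c : node.children) {
                    if (c.kind != AstNode.Kind.XML_ATTRIBUTE) walk(c, out);
                }
                out.endElement("", node.text, node.text);
                break;
            }
            case TEXT: {
                char[] chars = node.text.toCharArray();
                out.characters(chars, 0, chars.length);
                break;
            }
            default:
                throw new SAXException("XML_ATTRIBUTE is only valid directly under XML_ELEMENT");
        }
    }

    /** Serializes the tree to an XML string through the JDK identity transformer. */
    static String toXml(AstNode root) {
        try {
            SAXTransformerFactory f = (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler h = f.newTransformerHandler();
            h.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter w = new StringWriter();
            h.setResult(new StreamResult(w));
            h.startDocument();
            walk(root, h);
            h.endDocument();
            return w.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Feeding it the tree that the element rule would produce for "firstName": "John" should give an element named "element" with a name attribute and a nested string element.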

The result is an AST-to-SAX adapter that makes it possible to load JSON through the common SAX API. Serializing those SAX events as XML therefore performs the JSON to XML conversion:

TransformerFactory.newInstance().newTransformer().transform(
       new SAXSource(new JsonSaxParser(), new InputSource(jsonFile.toString())),
       new StreamResult(xmlFile));
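For readers wondering how a non-XML parser can sit behind SAXSource, the sketch below shows one possible shape, using only the JDK. It is an assumption-laden toy, not the JsonSaxParser from the archive: ScalarJsonSaxReader is an invented class that extends XMLFilterImpl to inherit the XMLReader plumbing, and it only understands a single JSON integer or an unescaped JSON string where the real parser would run the ANTLR-generated lexer and parser.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

/** Toy XMLReader: reads one JSON scalar and reports it as SAX events. */
final class ScalarJsonSaxReader extends XMLFilterImpl {
    @Override
    public void parse(InputSource input) throws SAXException, IOException {
        Reader in = input.getCharacterStream();
        if (in == null) throw new SAXException("this sketch only reads character streams");
        StringBuilder sb = new StringBuilder();
        for (int c; (c = in.read()) != -1; ) sb.append((char) c);
        String text = sb.toString().trim();

        // Decide the element name; a real implementation would walk the AST here.
        String element;
        if (text.matches("-?\\d+")) {
            element = "integer";
        } else if (text.length() >= 2 && text.startsWith("\"") && text.endsWith("\"")) {
            element = "string";
            text = text.substring(1, text.length() - 1);   // escapes are not handled
        } else {
            throw new SAXException("unsupported input: " + text);
        }

        ContentHandler out = getContentHandler();          // set by the transformer
        out.startDocument();
        out.startElement("", element, element, new AttributesImpl());
        out.characters(text.toCharArray(), 0, text.length());
        out.endElement("", element, element);
        out.endDocument();
    }

    /** Runs the identity transform over the toy reader and returns the XML text. */
    static String convert(String json) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter w = new StringWriter();
            t.transform(
                new SAXSource(new ScalarJsonSaxReader(), new InputSource(new StringReader(json))),
                new StreamResult(w));
            return w.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The identity transformer only needs the reader to accept a ContentHandler and to fire events from parse(), which is why overriding that single method is enough in this sketch.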

Sample JSON input:

{
     "firstName": "John",
     "lastName": "Smith",
     "age": 25,
     "address": {
         "streetAddress": "21 2nd Street",
         "city": "New York",
         "state": "NY",
         "postalCode": "10021"
     },
     "phoneNumber": [
         { "type": "home", "number": "212 555-1234" },
         { "type": "fax", "number": "646 555-4567" }
     ]
}

Sample XML output:

<?xml version="1.0" encoding="UTF-8"?>
<object>
    <element name="firstName">
        <string>John</string>
    </element>
    <element name="lastName">
        <string>Smith</string>
    </element>
    <element name="age">
        <integer>25</integer>
    </element>
    <element name="address">
        <object>
            <element name="streetAddress">
                <string>21 2nd Street</string>
            </element>
            <element name="city">
                <string>New York</string>
            </element>
            <element name="state">
                <string>NY</string>
            </element>
            <element name="postalCode">
                <string>10021</string>
            </element>
        </object>
    </element>
    <element name="phoneNumber">
        <array>
            <object>
                <element name="type">
                    <string>home</string>
                </element>
                <element name="number">
                    <string>212 555-1234</string>
                </element>
            </object>
            <object>
                <element name="type">
                    <string>fax</string>
                </element>
                <element name="number">
                    <string>646 555-4567</string>
                </element>
            </object>
        </array>
    </element>
</object>

Generally, this "to XML" conversion approach can be applied to any format that ANTLR is able to parse: RTF2XML, MIF2XML, CSS2XML, ObjC2XML.

Attachment: json2xml2.zip (14.83 KB)

Comments

Compiled jar

Could you provide a compiled jar?

Thanks,
Mel
