XML (Extensible Markup Language) stores data in user-defined, nested elements and is widely used to exchange structured data between APIs and applications. JSON (JavaScript Object Notation) is another very common format for transferring data between APIs and web applications. It is built from arrays and key-value objects, which map naturally onto Python lists and dictionaries, and it is often chosen as a lighter-weight alternative to XML.
Python has modules for parsing both JSON and XML data. The standard library offers no direct function to convert XML to JSON, so we will use a dictionary as an intermediary: parse the XML into a dictionary, then serialize that dictionary as JSON. We will learn how to perform this conversion in this article.
Using the xmltodict and json modules to convert XML data to JSON
xmltodict is a third-party module (installable with pip install xmltodict) that reads XML data and parses it into nested dictionary and list structures. Its xmltodict.parse() function performs this conversion.
We can then use the json module to serialize this data as JSON. The json.dumps() function converts the given object to a JSON string (json.dump(), without the s, writes to a file object instead).
See the following code.
```python
import json

import xmltodict

xml_string = """
<emp>
    <id>102</id>
    <name>Mark</name>
    <dept>Accounts</dept>
    <sal>75000</sal>
</emp>
"""

# Parse the XML text into a nested dictionary.
d = xmltodict.parse(xml_string)

# Serialize the dictionary as a JSON string.
json_str = json.dumps(d)
print(json_str)
```
Output:

{"emp": {"id": "102", "name": "Mark", "dept": "Accounts", "sal": "75000"}}
In the above example,
- We created a variable that stores the XML string.
- The xmltodict.parse() function parses this string into a dictionary-type object.
- We pass this object to the json.dumps() function, which returns a JSON-formatted string.
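The json module has two similarly named functions, and the difference matters here: json.dumps() returns a string, while json.dump() writes to a file-like object. A minimal sketch of both, assuming a hand-written dictionary shaped like the parsed employee data and an in-memory io.StringIO buffer standing in for a real file (both are illustrative choices, not part of the example above):

```python
import io
import json

# Hand-written stand-in for the dictionary produced by the XML parser.
record = {"emp": {"id": "102", "name": "Mark", "dept": "Accounts", "sal": "75000"}}

# json.dumps() returns the JSON text as a str; indent=2 pretty-prints it.
json_text = json.dumps(record, indent=2)

# json.dump() writes the JSON text to a file-like object and returns None.
buffer = io.StringIO()  # stands in for a file opened in text write mode
result = json.dump(record, buffer, indent=2)

print(json_text)
```

Both calls accept the same formatting options, so the text written to the buffer matches the string returned by json.dumps().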
Using the xml.etree.ElementTree and json modules to convert XML data to JSON
The xml.etree.ElementTree module from the standard library parses XML data into a tree of elements. We can use it to write a function that converts the parsed tree to a dictionary, which we can then serialize as JSON using the json module.
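Before looking at a full converter, it may help to see what the parsed tree looks like. The sketch below, using a shortened sample document chosen for illustration, reads the root element and builds a flat dictionary from its direct children:

```python
from xml.etree import ElementTree as ET

xml_string = """
<emp>
    <id>102</id>
    <name>Mark</name>
</emp>
"""

# ET.fromstring() parses the text and returns the root Element.
root = ET.fromstring(xml_string)

# Iterating over an Element yields its direct children in document order;
# each child exposes its tag name and its text content.
fields = {child.tag: child.text for child in root}

print(root.tag)
print(fields)
```

This flat approach only covers one level of nesting with unique tag names, which is why a recursive function is needed for the general case.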
See the code below.
```python
import json
from collections import defaultdict
from xml.etree import ElementTree as ET


def xmldict(t):
    # Start with an empty dict if the element has attributes, else None.
    # Note: the attribute values themselves are not copied into the result.
    d = {t.tag: {} if t.attrib else None}
    children = list(t)
    if children:
        # Convert each child recursively and group the results by tag name.
        dd = defaultdict(list)
        for dc in map(xmldict, children):
            for k, v in dc.items():
                dd[k].append(v)
        # A tag that occurs once maps to one value, a repeated tag to a list.
        d = {t.tag: {k: v[0] if len(v) == 1 else v for k, v in dd.items()}}
    if t.text:
        text = t.text.strip()
        if children or t.attrib:
            # Keep non-empty text next to the children under the '#text' key.
            if text:
                d[t.tag]['#text'] = text
        else:
            # Leaf element: the value is simply its text.
            d[t.tag] = text
    return d


xml_string = """
<emp>
    <id>102</id>
    <name>Mark</name>
    <dept>Accounts</dept>
    <sal>75000</sal>
</emp>
"""

xml_data = ET.XML(xml_string)
d = xmldict(xml_data)
json_str = json.dumps(d)
print(json_str)
```
Output:

{"emp": {"id": "102", "name": "Mark", "dept": "Accounts", "sal": "75000"}}
Now, let us understand what we implemented in the above code.
- We parse the XML string using the ET.XML() function, which returns the root element.
- We pass this element to a function called xmldict().
- The function walks the tree recursively and returns a dictionary.
- This dictionary is then passed to the json.dumps() function, which returns the final JSON string.
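One detail of xmldict() worth seeing in isolation is how repeated child tags are handled: values are collected per tag in a defaultdict(list), and a tag that occurs only once is unwrapped to a single value. A simplified sketch of just that grouping step, assuming a hypothetical helper named group_children() and made-up sample data (it looks at one level of children only and ignores attributes):

```python
from collections import defaultdict
from xml.etree import ElementTree as ET


def group_children(element):
    """Map each child tag to its text, or to a list of texts if the tag repeats.

    Hypothetical helper for illustration; it does not recurse.
    """
    grouped = defaultdict(list)
    for child in element:
        grouped[child.tag].append((child.text or "").strip())
    # Unwrap single-item lists so unique tags map to a plain string.
    return {tag: vals[0] if len(vals) == 1 else vals for tag, vals in grouped.items()}


root = ET.fromstring(
    "<emp><name>Mark</name><skill>SQL</skill><skill>Excel</skill></emp>"
)
print(group_children(root))
```

The consequence of this design is that the same tag can map to a string in one document and to a list in another, so code consuming the JSON may need to handle both shapes.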
Using the xmljson and json modules to convert XML data to JSON
xmljson is a simple third-party library for converting XML data in Python. It provides different classes, each of which maps XML to dictionaries using a different convention (for example BadgerFish, Parker, or Yahoo).
We can use the xmljson.Yahoo() constructor to create a converter object. The xml.etree.ElementTree.fromstring() function parses an XML string and returns the root element, and we pass that element to the object's data() method. This returns a dictionary with the required data.
Then, as we did earlier, we convert this dictionary to JSON using the json.dumps() function.
For example,
```python
import json
from xml.etree import ElementTree as ET

import xmljson

xml_string = """
<emp>
    <id>102</id>
    <name>Mark</name>
    <dept>Accounts</dept>
    <sal>75000</sal>
</emp>
"""

# Create a converter that follows the Yahoo convention and returns plain dicts.
P = xmljson.Yahoo(dict_type=dict)

# Parse the XML into an element tree, then convert the root element to a dict.
d = P.data(ET.fromstring(xml_string))
json_str = json.dumps(d)
print(json_str)
```
Output:

{"emp": {"id": "102", "name": "Mark", "dept": "Accounts", "sal": "75000"}}
That’s all about how to convert XML to JSON in Python.