Skip to content

Python XML and JSON

XML (Extensible Markup Language) and JSON (JavaScript Object Notation) are two of the most popular data exchange formats. They are widely used in Web APIs, configuration files, and data storage. Python provides powerful built-in libraries for parsing and generating data in both formats.

JSON (JavaScript Object Notation)

JSON is a lightweight data exchange format based on a subset of JavaScript. Due to its characteristics of being easy to read, write, and parse, JSON has become the de facto standard for modern Web APIs.

Python's json module can easily convert between Python objects and JSON strings.

Conversion from Python to JSON types:

PythonJSON
dictobject
list, tuplearray
strstring
int, floatnumber
True / Falsetrue / false
Nonenull

Core Functions

  • json.dumps(obj): Serializes a Python object (such as a dictionary or list) into a JSON-formatted string.
  • json.loads(s): Deserializes a JSON-formatted string into a Python object.

Example:

python
import json

# A Python dictionary
python_data = {
    "name": "John Doe",
    "age": 30,
    "isStudent": False,
    "courses": [
        {"title": "History", "credits": 3},
        {"title": "Math", "credits": 4}
    ]
}

# 1. Convert Python dictionary to JSON string
# indent=4 parameter makes the output JSON formatted and more readable
json_string = json.dumps(python_data, indent=4)
print("--- JSON String ---")
print(json_string)

# 2. Convert JSON string back to Python dictionary
retrieved_data = json.loads(json_string)
print("\n--- Retrieved Python Dict ---")
print(retrieved_data)
print(f"Name: {retrieved_data['name']}")

XML (eXtensible Markup Language)

XML is a markup language designed for transmitting and storing data. Similar to HTML, it also uses tags, but XML's tags are not predefined; you need to define your own tags.

Python's standard library provides the xml.etree.ElementTree module for processing XML data.

Parsing XML

Suppose we have the following data.xml file:

xml
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
    </country>
</data>

Example Code:

python
import xml.etree.ElementTree as ET

# Parse XML file
tree = ET.parse('data.xml')
# Get root element
root = tree.getroot()

# Traverse XML tree
print("--- XML Data ---")
for country in root.findall('country'):
    rank = country.find('rank').text
    name = country.get('name') # Get attribute
    print(f"Name: {name}, Rank: {rank}")

Creating XML

You can also use ElementTree to build an XML tree and write it to a file.

python
import xml.etree.ElementTree as ET

# Create root element
root = ET.Element("users")

# Create child element
user1 = ET.SubElement(root, "user", id="1")
name1 = ET.SubElement(user1, "name")
name1.text = "Alice"
email1 = ET.SubElement(user1, "email")
email1.text = "alice@example.com"

# ... create more users ...

# Create an ElementTree object and write to file
tree = ET.ElementTree(root)
# pretty-printing is not straightforward with ElementTree
# but this will write the file
tree.write("users.xml", encoding='utf-8', xml_declaration=True)

JSON vs. XML

FeatureJSONXML
ReadabilityVery goodGood, but more verbose
SyntaxMore concise, less redundantStrict, requires closing tags
Parsing SpeedUsually fasterUsually slower
Data TypesSupports numbers, strings, booleans, arrays, objectsAll data treated as strings
EcosystemDominant in modern Web APIsStill widely used in enterprise systems, configuration files

Overall, for new applications, especially those related to the Web, JSON is usually the better choice. However, knowing how to process XML remains a valuable skill.

Content is for learning and research only.