Python XML and JSON
XML (Extensible Markup Language) and JSON (JavaScript Object Notation) are two of the most popular data exchange formats. They are widely used in Web APIs, configuration files, and data storage. Python provides powerful built-in libraries for parsing and generating data in both formats.
JSON (JavaScript Object Notation)
JSON is a lightweight data exchange format based on a subset of JavaScript. Due to its characteristics of being easy to read, write, and parse, JSON has become the de facto standard for modern Web APIs.
Python's json module can easily convert between Python objects and JSON strings.
Conversion from Python to JSON types:
| Python | JSON |
|---|---|
dict | object |
list, tuple | array |
str | string |
int, float | number |
True / False | true / false |
None | null |
Core Functions
json.dumps(obj): Serializes a Python object (such as a dictionary or list) into a JSON-formatted string.json.loads(s): Deserializes a JSON-formatted string into a Python object.
Example:
import json
# A Python dictionary
python_data = {
"name": "John Doe",
"age": 30,
"isStudent": False,
"courses": [
{"title": "History", "credits": 3},
{"title": "Math", "credits": 4}
]
}
# 1. Convert Python dictionary to JSON string
# indent=4 parameter makes the output JSON formatted and more readable
json_string = json.dumps(python_data, indent=4)
print("--- JSON String ---")
print(json_string)
# 2. Convert JSON string back to Python dictionary
retrieved_data = json.loads(json_string)
print("\n--- Retrieved Python Dict ---")
print(retrieved_data)
print(f"Name: {retrieved_data['name']}")XML (eXtensible Markup Language)
XML is a markup language designed for transmitting and storing data. Similar to HTML, it also uses tags, but XML's tags are not predefined; you need to define your own tags.
Python's standard library provides the xml.etree.ElementTree module for processing XML data.
Parsing XML
Suppose we have the following data.xml file:
<data>
<country name="Liechtenstein">
<rank>1</rank>
<year>2008</year>
<gdppc>141100</gdppc>
</country>
<country name="Singapore">
<rank>4</rank>
<year>2011</year>
<gdppc>59900</gdppc>
</country>
</data>Example Code:
import xml.etree.ElementTree as ET
# Parse XML file
tree = ET.parse('data.xml')
# Get root element
root = tree.getroot()
# Traverse XML tree
print("--- XML Data ---")
for country in root.findall('country'):
rank = country.find('rank').text
name = country.get('name') # Get attribute
print(f"Name: {name}, Rank: {rank}")Creating XML
You can also use ElementTree to build an XML tree and write it to a file.
import xml.etree.ElementTree as ET
# Create root element
root = ET.Element("users")
# Create child element
user1 = ET.SubElement(root, "user", id="1")
name1 = ET.SubElement(user1, "name")
name1.text = "Alice"
email1 = ET.SubElement(user1, "email")
email1.text = "alice@example.com"
# ... create more users ...
# Create an ElementTree object and write to file
tree = ET.ElementTree(root)
# pretty-printing is not straightforward with ElementTree
# but this will write the file
tree.write("users.xml", encoding='utf-8', xml_declaration=True)JSON vs. XML
| Feature | JSON | XML |
|---|---|---|
| Readability | Very good | Good, but more verbose |
| Syntax | More concise, less redundant | Strict, requires closing tags |
| Parsing Speed | Usually faster | Usually slower |
| Data Types | Supports numbers, strings, booleans, arrays, objects | All data treated as strings |
| Ecosystem | Dominant in modern Web APIs | Still widely used in enterprise systems, configuration files |
Overall, for new applications, especially those related to the Web, JSON is usually the better choice. However, knowing how to process XML remains a valuable skill.