Ruby XML Tutorial

XML (eXtensible Markup Language) is a markup language used for storing and transporting data. Ruby can parse, generate, query, and transform XML documents through libraries such as REXML and Nokogiri.

📋 Chapter Contents

  • XML Basic Concepts
  • XML Processing Libraries in Ruby
  • Parsing XML Documents
  • Generating XML Documents
  • XPath Queries
  • XSLT Transformations
  • Practical Application Examples

🔧 XML Processing Libraries

Ruby provides multiple XML processing libraries:

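REXML (Bundled with Ruby)

ruby
# Ships with standard Ruby installations; since Ruby 3.0 it is a bundled gem,
# so Bundler-managed projects may need `gem 'rexml'` in the Gemfile
require 'rexml/document'

Nokogiri (Third-Party Gem)

ruby
# Requires installation first: gem install nokogiri
require 'nokogiri'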
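
Because Nokogiri is an optional gem, a script can fall back to REXML when it is not installed. The following is a minimal sketch of that idea; `root_name` is a hypothetical helper introduced only for illustration.

```ruby
require 'rexml/document'

# Nokogiri is optional: load it when available, otherwise rely on REXML.
begin
  require 'nokogiri'
rescue LoadError
  # Nokogiri is not installed; REXML will be used instead.
end

# Hypothetical helper: returns the name of the root element using
# whichever library is available.
def root_name(xml)
  if defined?(Nokogiri)
    Nokogiri::XML(xml).root.name
  else
    REXML::Document.new(xml).root.name
  end
end

puts root_name('<bookstore><book id="1"/></bookstore>')
```

Both branches return the same root element name for well-formed input, so the rest of a script can stay independent of the library choice.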

📖 XML Basic Concepts

What is XML?

XML is a markup language used to describe the structure and content of data:

xml
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book id="1">
    <title>Ruby Programming Guide</title>
    <author>John Smith</author>
    <price currency="CNY">89.00</price>
    <category>Programming</category>
  </book>
  <book id="2">
    <title>Web Development Practice</title>
    <author>Jane Doe</author>
    <price currency="CNY">99.00</price>
    <category>Web Development</category>
  </book>
</bookstore>

🔍 Using REXML to Parse XML

Basic Parsing

ruby
require 'rexml/document'

# XML string
xml_string = <<-XML
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book id="1">
    <title>Ruby Programming Guide</title>
    <author>John Smith</author>
    <price>89.00</price>
  </book>
</bookstore>
XML

# Create document object
doc = REXML::Document.new(xml_string)

# Get root element
root = doc.root
puts "Root element: #{root.name}"

# Iterate child elements
root.elements.each('book') do |book|
  puts "Book ID: #{book.attributes['id']}"
  puts "Title: #{book.elements['title'].text}"
  puts "Author: #{book.elements['author'].text}"
  puts "Price: #{book.elements['price'].text}"
  puts "---"
end
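
The parsing code above assumes well-formed input in which every expected element is present. The sketch below shows one way to guard against both problems; `parse_xml` is a hypothetical helper name, not part of REXML.

```ruby
require 'rexml/document'

# Hypothetical helper: returns a parsed document, or nil when the
# input is not well-formed XML.
def parse_xml(source)
  REXML::Document.new(source)
rescue REXML::ParseException => e
  warn "Invalid XML: #{e.message.lines.first&.strip}"
  nil
end

doc = parse_xml('<bookstore><book id="1"><title>Ruby Programming Guide</title></book></bookstore>')
book = doc.root.elements['book']

# elements[...] returns nil for a missing child, so use safe navigation
title = book.elements['title']&.text
price = book.elements['price']&.text || 'unknown'

puts "Title: #{title}"
puts "Price: #{price}"

# Mismatched tags raise REXML::ParseException, which parse_xml turns into nil
broken = parse_xml('<bookstore><book></bookstore>')
puts "Broken input parsed? #{!broken.nil?}"
```

Without the `&.` guard, calling `text` on a missing element raises `NoMethodError`, which is a common failure when input files vary in structure.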

Reading XML from File

ruby
require 'rexml/document'

# Read from file; the block form closes the file automatically,
# even if parsing raises an exception
doc = File.open("books.xml") { |file| REXML::Document.new(file) }

# Process document...
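
To try file-based parsing without preparing `books.xml` first, the sketch below writes a small temporary file and reads it back. The temporary file stands in for a real file on disk and is an assumption made only so the example runs anywhere.

```ruby
require 'rexml/document'
require 'tempfile'

# Tempfile.create removes the file when the block finishes and returns
# the value of the block.
title = Tempfile.create(['books', '.xml']) do |tmp|
  tmp.write('<bookstore><book id="1"><title>Ruby Programming Guide</title></book></bookstore>')
  tmp.flush

  # The document is fully built before File.open closes the file
  doc = File.open(tmp.path) { |file| REXML::Document.new(file) }
  doc.root.elements['book/title'].text
end

puts title
```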

🏗️ Generating XML Documents

Using REXML to Generate XML

ruby
require 'rexml/document'

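# Create new document with an XML declaration
doc = REXML::Document.new
doc << REXML::XMLDecl.new('1.0', 'UTF-8')

# Attach the stylesheet as a processing instruction; adding it as an element
# would make it the root, and adding 'bookstore' afterwards would raise an error
doc << REXML::Instruction.new('xml-stylesheet', 'type="text/xsl" href="books.xsl"')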

# Create root element
root = doc.add_element('bookstore')

# Add books
book1 = root.add_element('book', {'id' => '1'})
book1.add_element('title').add_text('Ruby Programming Guide')
book1.add_element('author').add_text('John Smith')
book1.add_element('price', {'currency' => 'CNY'}).add_text('89.00')

book2 = root.add_element('book', {'id' => '2'})
book2.add_element('title').add_text('Web Development Practice')
book2.add_element('author').add_text('Jane Doe')
book2.add_element('price', {'currency' => 'CNY'}).add_text('99.00')

# Output XML
formatter = REXML::Formatters::Pretty.new
formatter.compact = true
formatter.write(doc, $stdout)
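
Text added with `add_text` may contain characters such as `&` or `<` that have special meaning in XML. The sketch below, using a made-up book title, illustrates that REXML escapes them on output and restores them on parsing, and that the output can go to a String instead of `$stdout`.

```ruby
require 'rexml/document'

doc = REXML::Document.new
doc << REXML::XMLDecl.new('1.0', 'UTF-8')

root = doc.add_element('bookstore')
book = root.add_element('book', { 'id' => '3' })
# The title is illustrative; it contains characters that must be escaped
book.add_element('title').add_text('Ruby & Rails <Quick Start>')

# Any object that responds to << can receive the output
output = String.new
doc.write(output)
puts output

# Parsing the output gives back the original, unescaped text
reparsed = REXML::Document.new(output)
puts reparsed.root.elements['book/title'].text
```

Building the document through the API instead of string concatenation is what makes this escaping automatic.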

🔎 XPath Queries

XPath is a query language for selecting nodes (elements, attributes, and text) in an XML document using path expressions.

Basic XPath Syntax

ruby
require 'rexml/document'

xml_string = <<-XML
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book id="1" category="programming">
    <title>Ruby Programming Guide</title>
    <author>John Smith</author>
    <price>89.00</price>
  </book>
  <book id="2" category="web">
    <title>Web Development Practice</title>
    <author>Jane Doe</author>
    <price>99.00</price>
  </book>
</bookstore>
XML

doc = REXML::Document.new(xml_string)

# Find all book titles
titles = REXML::XPath.match(doc, "//title")
titles.each { |title| puts title.text }

# Find book with specific ID
book = REXML::XPath.first(doc, "//book[@id='1']")
puts "Found book: #{book.elements['title'].text}"

# Find books with price greater than 90
expensive_books = REXML::XPath.match(doc, "//book[price > 90]")
expensive_books.each do |book|
  puts "Expensive book: #{book.elements['title'].text}"
end

# Find programming category books
programming_books = REXML::XPath.match(doc, "//book[@category='programming']")
programming_books.each do |book|
  puts "Programming book: #{book.elements['title'].text}"
end

Common XPath Expressions

ruby
# Select root element
root = REXML::XPath.first(doc, "/bookstore")

# Select all book elements
books = REXML::XPath.match(doc, "//book")

# Select first book element
first_book = REXML::XPath.first(doc, "//book[1]")

# Select last book element
last_book = REXML::XPath.first(doc, "//book[last()]")

# Select book elements with id attribute
books_with_id = REXML::XPath.match(doc, "//book[@id]")

# Select elements containing specific text
ruby_books = REXML::XPath.match(doc, "//book[contains(title, 'Ruby')]")
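
XPath results are ordinary REXML nodes, so they can be converted into plain Ruby data for further processing. The following sketch collects each book into a hash and sums the prices; the hash keys are chosen for illustration only.

```ruby
require 'rexml/document'

xml = <<~XML
  <bookstore>
    <book id="1" category="programming">
      <title>Ruby Programming Guide</title>
      <price>89.00</price>
    </book>
    <book id="2" category="web">
      <title>Web Development Practice</title>
      <price>99.00</price>
    </book>
  </bookstore>
XML

doc = REXML::Document.new(xml)

# Convert each matched element into a hash of typed values
books = REXML::XPath.match(doc, '//book').map do |book|
  {
    id: book.attributes['id'].to_i,
    category: book.attributes['category'],
    title: book.elements['title'].text,
    price: book.elements['price'].text.to_f
  }
end

total = books.sum { |b| b[:price] }
puts "Total price: #{total}"

# A path can continue past a predicate to select a child directly
web_title = REXML::XPath.first(doc, "//book[@category='web']/title").text
puts "Web book: #{web_title}"
```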

🔄 XSLT Transformations

XSLT (eXtensible Stylesheet Language Transformations) is used to transform XML documents into other formats.

Installing libxslt

bash
# Ubuntu/Debian
sudo apt-get install libxslt1-dev

# macOS
brew install libxslt

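# Install Ruby gem (recent Nokogiri releases bundle libxml2 and libxslt on
# common platforms, so the system packages above are mainly needed when
# compiling the gem from source)
gem install nokogiri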

XSLT Transformation Example

ruby
require 'nokogiri'

# XML data
xml_string = <<-XML
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book id="1">
    <title>Ruby Programming Guide</title>
    <author>John Smith</author>
    <price>89.00</price>
  </book>
  <book id="2">
    <title>Web Development Practice</title>
    <author>Jane Doe</author>
    <price>99.00</price>
  </book>
</bookstore>
XML

# XSLT stylesheet
xslt_string = <<-XSLT
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <head><title>Bookstore Catalog</title></head>
      <body>
        <h1>Bookstore Catalog</h1>
        <table border="1">
          <tr>
            <th>ID</th>
            <th>Title</th>
            <th>Author</th>
            <th>Price</th>
          </tr>
          <xsl:for-each select="bookstore/book">
            <tr>
              <td><xsl:value-of select="@id"/></td>
              <td><xsl:value-of select="title"/></td>
              <td><xsl:value-of select="author"/></td>
              <td><xsl:value-of select="price"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
XSLT

# Perform transformation
xml_doc = Nokogiri::XML(xml_string)
xslt_doc = Nokogiri::XSLT(xslt_string)
result = xslt_doc.transform(xml_doc)

puts result.to_s

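⚡ Using Nokogiri

Nokogiri is the most widely used XML/HTML processing library in Ruby. It wraps the libxml2 C library, which generally makes it faster than the pure-Ruby REXML, and it adds CSS selectors, HTML parsing, and XSLT support.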

Installing Nokogiri

bash
gem install nokogiri

Basic Usage

ruby
require 'nokogiri'

# Parse XML
xml_string = <<-XML
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book id="1">
    <title>Ruby Programming Guide</title>
    <author>John Smith</author>
    <price>89.00</price>
  </book>
</bookstore>
XML

doc = Nokogiri::XML(xml_string)

# Using CSS selectors
doc.css('book').each do |book|
  puts "ID: #{book['id']}"
  puts "Title: #{book.at_css('title').text}"
  puts "Author: #{book.at_css('author').text}"
  puts "Price: #{book.at_css('price').text}"
end

# Using XPath
doc.xpath('//book').each do |book|
  puts "Book: #{book.at_xpath('title').text}"
end

Generating XML with Nokogiri

ruby
require 'nokogiri'

# Create XML document
builder = Nokogiri::XML::Builder.new do |xml|
  xml.bookstore do
    xml.book(id: '1') do
      xml.title 'Ruby Programming Guide'
      xml.author 'John Smith'
      xml.price '89.00', currency: 'CNY'
    end
    xml.book(id: '2') do
      xml.title 'Web Development Practice'
      xml.author 'Jane Doe'
      xml.price '99.00', currency: 'CNY'
    end
  end
end

puts builder.to_xml
