Ruby Strings

Strings are among the most commonly used data types in programming; they store and manipulate text. Ruby strings are flexible and come with a rich set of built-in methods for text processing. This chapter covers how to create, manipulate, and process strings in Ruby.

🎯 String Basics

String Definition

Ruby supports multiple ways to define strings, each with specific uses:

# Single-quoted strings (no interpolation; only \\ and \' are processed as escapes)
name = 'Zhang San'
message = 'Hello\nWorld'  # \n won't be parsed as newline
puts message  # Output: Hello\nWorld

# Double-quoted strings (parse escape characters and interpolation)
name = "Li Si"
age = 25
message = "Hello, #{name}! You are #{age} years old."
greeting = "Hello\nWorld"  # \n will be parsed as newline
puts greeting
# Output:
# Hello
# World

# Multi-line strings (heredoc syntax)
poem = <<~TEXT
  Quiet Night Thought
  Before my bed, moonlight,
  Perhaps frost on the ground.
  I lift my head and see the moon,
  I lower my head and think of home.
TEXT

puts poem
# Output:
# Quiet Night Thought
# Before my bed, moonlight,
# Perhaps frost on the ground.
# I lift my head and see the moon,
# I lower my head and think of home.

# Use %q and %Q to define strings
single_quoted = %q(This is a 'single-quoted' string)
double_quoted = %Q(This is a "double-quoted" string, containing #{name})

puts single_quoted  # This is a 'single-quoted' string
puts double_quoted  # This is a "double-quoted" string, containing Li Si
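
Two heredoc details are easy to miss. The following is a small supplementary sketch (the variable names are illustrative and not part of the examples above): the squiggly heredoc `<<~` strips the indentation of the least-indented line while keeping relative indentation, and a single-quoted delimiter turns interpolation and escape processing off.

```ruby
lang = "Ruby"

# <<~ removes the common leading whitespace but keeps relative indentation.
nested = <<~TEXT
  outer
    inner
TEXT

# A single-quoted delimiter disables interpolation and escape processing.
raw = <<~'TEXT'
  Hello, #{lang}\n
TEXT

p nested   # "outer\n  inner\n"
puts raw   # Hello, #{lang}\n
```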

String Encoding

Since Ruby 1.9, every string carries its own encoding, and since Ruby 2.0 string literals default to UTF-8:

# View string encoding
str = "Hello"
puts str.encoding  # UTF-8

# Relabel the encoding (force_encoding changes the tag, not the bytes)
str_with_encoding = "Hello".force_encoding("ASCII")
puts str_with_encoding.encoding  # US-ASCII

# Convert encoding
utf8_str = "Hello".encode("UTF-8")
puts utf8_str
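
The difference between `encode` and `force_encoding` only shows up with non-ASCII text. A minimal sketch, assuming the sample word "café" (written with a `\u` escape so the literal is UTF-8 regardless of the source file):

```ruby
utf8 = "caf\u00e9"            # "café" in UTF-8: the é occupies two bytes
p utf8.length                 # 4
p utf8.bytesize               # 5

# encode transcodes: the bytes change so that the text stays the same.
latin1 = utf8.encode("ISO-8859-1")
p latin1.bytes                # [99, 97, 102, 233]

# force_encoding only relabels: the bytes stay, the interpretation changes.
relabeled = utf8.dup.force_encoding("ISO-8859-1")
p relabeled.bytes.last(2)     # [195, 169]
p relabeled.length            # 5 (each byte now counts as one character)
```

In short, use `encode` when the text should be converted, and `force_encoding` only when the bytes are already correct but carry the wrong label.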

🔤 String Operations

String Concatenation

# Use + for concatenation
first_name = "Zhang"
last_name = "San"
full_name = first_name + last_name
puts full_name  # ZhangSan

# Use << for concatenation (modifies original string)
greeting = "Hello, "
greeting << "World!"
puts greeting  # Hello, World!

# Use concat method
message = "Hello".concat(" ", "World")
puts message  # Hello World

# Use interpolation (recommended)
name = "Ruby"
version = "3.0"
info = "#{name} #{version}"
puts info  # Ruby 3.0

String Repetition and Length

# Repeat string
line = "-" * 20
puts line  # --------------------

# Get string length
str = "Hello, World!"
puts str.length    # 13
puts str.size      # 13
puts str.bytesize  # 13 (byte length)

# Check if empty
empty_str = ""
puts empty_str.empty?  # true
puts " ".empty?        # false
puts " ".strip.empty?  # true

String Comparison

# Basic comparison
puts "hello" == "hello"  # true
puts "hello" == "Hello"  # false (case-sensitive)

# Case-insensitive comparison
puts "hello".casecmp("Hello")  # 0 (equal)
puts "a".casecmp("B")          # -1 (less than)
puts "c".casecmp("B")          # 1 (greater than)

# Containment, prefix, and suffix checks
puts "Hello, World!".include?("World")  # true
puts "Hello, World!".start_with?("Hello")  # true
puts "Hello, World!".end_with?("World!")   # true
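
A few related comparison methods, shown as a short supplementary sketch (the sample words are arbitrary): `casecmp?` returns a boolean instead of -1/0/1, `<=>` drives sorting, and `equal?` tests object identity rather than content.

```ruby
puts "hello".casecmp?("Hello")   # true (Ruby 2.4+)
puts "apple" <=> "banana"        # -1

# Default sorting is byte-wise, so uppercase letters sort before lowercase ones.
words = %w[pear apple Fig]
p words.sort                     # ["Fig", "apple", "pear"]
p words.sort_by(&:downcase)      # ["apple", "Fig", "pear"]

# == compares content, equal? compares object identity.
original = "ruby"
copy = original.dup
puts original == copy            # true
puts original.equal?(copy)       # false
```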

🔍 String Search and Replace

Find Substrings

text = "Hello, Ruby World! Welcome to Ruby programming."

# Find position
puts text.index("Ruby")      # 7
puts text.rindex("Ruby")     # 30 (search from right)
puts text.index("Python")    # nil (not found)

# Find using regular expression
puts text.index(/R\w+/)      # 7 (match word starting with R)

# Get characters and substrings by index
puts text[0]          # H
puts text[7]          # R
puts text[-1]         # .
puts text[7, 4]       # Ruby
puts text[7..10]      # Ruby
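
`index` and `rindex` return a single position. When every match is needed, `scan` and `match` are the usual tools. A minimal sketch; `all_indexes` is a hypothetical helper introduced only for this illustration:

```ruby
text = "Hello, Ruby World! Welcome to Ruby programming."

# scan returns every match as an array.
p text.scan(/\b[A-Z]\w*/)   # ["Hello", "Ruby", "World", "Welcome", "Ruby"]

# A regular expression inside [] returns the first matching substring.
p text[/W\w+/]              # "World"

# match gives access to (named) capture groups.
m = text.match(/(?<greeting>\w+), (?<language>\w+)/)
p m[:greeting]              # "Hello"
p m[:language]              # "Ruby"

# Hypothetical helper: collect the start position of every match.
def all_indexes(text, pattern)
  positions = []
  text.scan(pattern) { positions << Regexp.last_match.begin(0) }
  positions
end

p all_indexes(text, /Ruby/) # [7, 30]
```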

Replace Substrings

text = "Hello, Ruby World! Welcome to Ruby programming."

# Replace first match
new_text = text.sub("Ruby", "Python")
puts new_text  # Hello, Python World! Welcome to Ruby programming.

# Replace all matches
new_text = text.gsub("Ruby", "Python")
puts new_text  # Hello, Python World! Welcome to Python programming.

# Use regular expression replacement
new_text = text.gsub(/R\w+/, "Python")
puts new_text  # Hello, Python World! Welcome to Python programming.

# Use block for complex replacement
new_text = text.gsub(/\b\w+\b/) { |word| word.upcase }
puts new_text  # HELLO, RUBY WORLD! WELCOME TO RUBY PROGRAMMING.
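
Three more replacement forms are worth knowing. The sketch below uses arbitrary sample strings: backreferences in the replacement string, a hash as the replacement table, and the in-place variants `sub!`/`gsub!`, which return nil when nothing was changed.

```ruby
# Backreferences: \1, \2 (or \k<name>) refer to capture groups.
p "John Smith".sub(/(\w+) (\w+)/, '\2, \1')   # "Smith, John"
p "2023-12-25".sub(/(?<y>\d+)-(?<m>\d+)-(?<d>\d+)/, '\k<d>/\k<m>/\k<y>')  # "25/12/2023"

# A hash maps each matched text to its replacement.
p "hello".gsub(/[el]/, "e" => "3", "l" => "1")  # "h311o"

# Bang variants modify the receiver and return nil if there was no match.
s = +"hello"
p s.sub!("x", "y")    # nil
p s.gsub!("l", "L")   # "heLLo"
p s                   # "heLLo"
```

Because the bang variants can return nil, avoid chaining further calls directly onto their result.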

🔄 String Transformation

Case Conversion

text = "Hello, World!"

# Case conversion
puts text.upcase      # HELLO, WORLD!
puts text.downcase    # hello, world!
puts text.capitalize  # Hello, world!
puts text.swapcase    # hELLO, wORLD!

# Capitalize first letter
title = "hello world"
puts title.capitalize  # Hello world

# Capitalize first letter of each word
sentence = "hello world ruby programming"
puts sentence.split.map(&:capitalize).join(" ")  # Hello World Ruby Programming
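
Note that `capitalize` also lowercases the rest of the word, which can damage acronyms. A minimal sketch of an alternative; `titleize_preserving` is a hypothetical name used only here, and it upcases the first character while leaving the remaining characters untouched:

```ruby
puts "hello WORLD".capitalize   # Hello world

# Hypothetical helper: upcase only the first character of each word.
def titleize_preserving(sentence)
  sentence.split.map { |word| word[0].upcase + word[1..-1] }.join(" ")
end

phrase = "learn iOS and HTML"
puts phrase.split.map(&:capitalize).join(" ")  # Learn Ios And Html
puts titleize_preserving(phrase)               # Learn IOS And HTML
```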

Remove Whitespace

text = "  Hello, World!  "

# Remove whitespace from both sides
puts text.strip      # Hello, World!
puts text.lstrip     # Hello, World!  
puts text.rstrip     #   Hello, World!

# Collapse runs of whitespace into single spaces
compact_text = "  Hello   World  !  "
puts compact_text.gsub(/\s+/, " ").strip  # Hello World !
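
For comparison, here is a short sketch of the related operations: removing whitespace entirely, squeezing repeated spaces, and trimming line endings. The sample strings are arbitrary.

```ruby
text = "  Hello   World  !  "

p text.gsub(/\s+/, "")   # "HelloWorld!"  (remove every whitespace character)
p text.delete(" ")       # "HelloWorld!"  (remove spaces only)
p text.squeeze(" ")      # " Hello World ! "  (collapse repeated spaces, keep the edges)

# chomp removes a trailing line ending, including Windows-style \r\n.
p "line\r\n".chomp       # "line"
```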

Split and Join

# Split string
sentence = "apple,banana,orange,grape"
fruits = sentence.split(",")
puts fruits.inspect  # ["apple", "banana", "orange", "grape"]

# Split using regular expression
text = "apple  banana\torange\ngrape"
words = text.split(/\s+/)
puts words.inspect  # ["apple", "banana", "orange", "grape"]

# Join array into string
puts fruits.join(", ")  # apple, banana, orange, grape
puts fruits.join(" | ") # apple | banana | orange | grape
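
`split` has a few behaviors that tend to surprise, illustrated below with arbitrary sample strings: trailing empty fields are dropped unless a negative limit is given, and a positive limit caps the number of pieces.

```ruby
p "a,b,,".split(",")                # ["a", "b"]  (trailing empty fields dropped)
p "a,b,,".split(",", -1)            # ["a", "b", "", ""]

# A limit of 2 splits only at the first separator.
p "key=value=more".split("=", 2)    # ["key", "value=more"]

# partition returns [before, separator, after] for the first occurrence.
p "key=value=more".partition("=")   # ["key", "=", "value=more"]

p "abc".chars                       # ["a", "b", "c"]
p "one\ntwo\n".lines(chomp: true)   # ["one", "two"]
```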

🧮 String Formatting

sprintf and Formatting

name = "Zhang San"
age = 25
score = 95.5

# Use sprintf for formatting
formatted = sprintf("Name: %s, Age: %d, Score: %.1f", name, age, score)
puts formatted  # Name: Zhang San, Age: 25, Score: 95.5

# Use % operator
formatted = "Name: %s, Age: %d, Score: %.1f" % [name, age, score]
puts formatted  # Name: Zhang San, Age: 25, Score: 95.5

# Number formatting
puts "%04d" % 42        # 0042 (zero-padded)
puts "%.2f" % 3.14159   # 3.14 (keep 2 decimal places)
puts "%10s" % "Ruby"    # "      Ruby" (right-aligned)
puts "%-10s" % "Ruby"   # "Ruby      " (left-aligned)
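
Format strings can also refer to values by name, and padding can be done without a format string at all. A brief sketch with arbitrary sample values:

```ruby
# Named references: %<name>s applies a format, %{name} inserts the value as-is.
puts format("%<name>s scored %<score>.1f", name: "Zhang San", score: 95.5)
# Zhang San scored 95.5
puts format("%{name} is %{age}", name: "Zhang San", age: 25)
# Zhang San is 25

# More numeric flags
puts "%+d" % 5             # +5
puts "%x" % 255            # ff
puts "%08.3f" % 3.14159    # 0003.142

# Padding with String methods
puts "Ruby".rjust(10, ".")   # ......Ruby
puts "Ruby".ljust(10, ".")   # Ruby......
puts "Ruby".center(10, "*")  # ***Ruby***
```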

String Interpolation

name = "Ruby"
version = "3.0"
release_date = "2020-12-25"

# Basic interpolation
info = "#{name} #{version} released on #{release_date}"
puts info  # Ruby 3.0 released on 2020-12-25

# Expression interpolation
x, y = 10, 20
result = "#{x} + #{y} = #{x + y}"
puts result  # 10 + 20 = 30

# Control interpolation format
price = 123.456
formatted_price = "Price: #{"%.2f" % price}"
puts formatted_price  # Price: 123.46
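
Interpolation calls `to_s` on whatever the expression returns, so custom objects can control how they appear. A minimal sketch; the `Point` class is a hypothetical example, not something defined elsewhere in this chapter:

```ruby
# Hypothetical class used only to illustrate to_s in interpolation.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end

  def to_s
    "(#{@x}, #{@y})"
  end
end

puts "Position: #{Point.new(1, 2)}"   # Position: (1, 2)
puts "Value: [#{nil}]"                # Value: []  (nil.to_s is an empty string)
puts 'No interpolation: #{1 + 1}'     # No interpolation: #{1 + 1}
```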

🔧 String Utility Methods

String Validation

# Check string content
# "123".numeric? is not built in; it is defined as a custom method below
puts "123".match?(/\A\d+\z/)  # true (use regular expression)

# Custom validation methods
class String
  def numeric?
    !!Float(self)
  rescue ArgumentError, TypeError
    false
  end
  
  def alphabetic?
    match?(/\A[a-zA-Z]+\z/)
  end
  
  def alphanumeric?
    match?(/\A[a-zA-Z0-9]+\z/)
  end
end

puts "123".numeric?        # true
puts "abc".alphabetic?     # true
puts "abc123".alphanumeric? # true
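
When a regular expression validates a whole string, the choice of anchors matters: `^` and `$` match at line boundaries, while `\A` and `\z` match only at the start and end of the string. The sketch below shows the difference; `integer_string?` is a hypothetical helper, and `Integer(..., exception: false)` requires Ruby 2.6 or later.

```ruby
input = "123\nnot a number"
puts input.match?(/^\d+$/)     # true  (the first line alone satisfies the pattern)
puts input.match?(/\A\d+\z/)   # false (the whole string must be digits)

# Integer() guesses the base from a prefix unless a base is given.
p Integer("010")       # 8  (leading zero means octal)
p Integer("010", 10)   # 10
p Float("1e3")         # 1000.0 (scientific notation is accepted)

# Hypothetical helper: true if the string parses as a base-10 integer.
def integer_string?(str)
  !Integer(str, 10, exception: false).nil?
end

puts integer_string?("42")      # true
puts integer_string?("12abc")   # false
puts integer_string?("")        # false
```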

String Processing Tips

# Safe navigation (avoid NoMethodError on nil)
name = nil
puts name&.upcase       # prints an empty line (the expression returns nil, no error is raised)
puts name&.length       # prints an empty line

# Handle nil values
def safe_upcase(str)
  str&.upcase || ""
end

puts safe_upcase("hello")  # HELLO
puts safe_upcase(nil)      # ""

# Truncate string
def truncate(str, length = 20, omission = "...")
  return str if str.length <= length
  str[0, length - omission.length] + omission
end

long_text = "This is a very long text that needs to be truncated"
puts truncate(long_text, 15)  # This is a ve...

🎯 String Practice Examples

Text Processing Tool

class TextProcessor
  def initialize(text)
    @text = text
  end
  
  # Count words
  def word_count
    @text.split(/\s+/).length
  end
  
  # Count characters (excluding spaces)
  def char_count(exclude_spaces = true)
    if exclude_spaces
      @text.gsub(/\s/, "").length
    else
      @text.length
    end
  end
  
  # Find most common word
  def most_common_word
    words = @text.downcase.gsub(/[^\w\s]/, "").split(/\s+/)
    word_count = Hash.new(0)
    words.each { |word| word_count[word] += 1 }
    word_count.max_by { |word, count| count }&.first
  end
  
  # Replace sensitive words
  def censor_words(bad_words, replacement = "*")
    result = @text
    bad_words.each do |word|
      pattern = Regexp.escape(word)
      result = result.gsub(/#{pattern}/i) do |match|
        replacement * match.length
      end
    end
    result
  end
  
  # Generate summary
  def summarize(max_words = 10)
    words = @text.split(/\s+/)
    if words.length <= max_words
      @text
    else
      words[0, max_words].join(" ") + "..."
    end
  end
end

# Use text processor
text = "Ruby is a dynamic, open-source programming language focused on simplicity and efficiency. Ruby has elegant syntax that is easy to read and write."
processor = TextProcessor.new(text)

puts "Word count: #{processor.word_count}"  # Word count: 23
puts "Character count: #{processor.char_count}"  # Character count: 123
puts "Summary: #{processor.summarize(10)}"
# Summary: Ruby is a dynamic, open-source programming language focused on simplicity...
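
Two possible refinements, sketched here as standalone methods rather than as changes to the class above (both method names are hypothetical): counting words with `Enumerable#tally` (Ruby 2.7+), and censoring only whole words so that a listed word embedded in a longer word is left alone.

```ruby
# Count word frequencies; tally builds the { word => count } hash directly.
def word_frequencies(text)
  text.downcase.scan(/[[:alpha:]']+/).tally
end

# Replace only whole-word, case-insensitive matches of the listed words.
def censor_whole_words(text, bad_words, replacement = "*")
  return text if bad_words.empty?

  alternatives = bad_words.map { |word| Regexp.escape(word) }.join("|")
  text.gsub(/\b(?:#{alternatives})\b/i) { |match| replacement * match.length }
end

p word_frequencies("The cat and the hat.")
# {"the"=>2, "cat"=>1, "and"=>1, "hat"=>1}

puts censor_whole_words("Darn it, DARN!", ["darn"])       # **** it, ****!
puts censor_whole_words("The darning needle", ["darn"])   # The darning needle
```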

String Validator

class StringValidator
  def self.email?(str)
    pattern = /\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i
    str.match?(pattern)
  end
  
  def self.phone?(str)
    pattern = /\A1[3-9]\d{9}\z/
    str.match?(pattern)
  end
  
  def self.url?(str)
    pattern = /\Ahttps?:\/\/[\w\-]+(\.[\w\-]+)+[\/#?]?.*\z/
    str.match?(pattern)
  end
  
  def self.strong_password?(str)
    # At least 8 characters, contains uppercase, lowercase, digits, and special characters
    return false if str.length < 8
    has_upper = str.match?(/[A-Z]/)
    has_lower = str.match?(/[a-z]/)
    has_digit = str.match?(/\d/)
    has_special = str.match?(/[!@#$%^&*(),.?":{}|<>]/)
    
    has_upper && has_lower && has_digit && has_special
  end
  
  def self.id_card?(str)
    # Simplified ID card validation (18 digits, last may be X)
    pattern = /\A\d{17}[\dXx]\z/
    str.match?(pattern)
  end
end

# Use validator
puts StringValidator.email?("user@example.com")           # true
puts StringValidator.email?("invalid.email")             # false
puts StringValidator.phone?("13812345678")               # true
puts StringValidator.url?("https://www.example.com")     # true
puts StringValidator.strong_password?("Password123!")    # true
puts StringValidator.id_card?("110101199001011234") # true
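
Hand-written patterns for URLs and e-mail addresses are hard to get right. As an alternative sketch, the standard library's `uri` can do the parsing; the method names `http_url?` and `plausible_email?` are hypothetical, and `URI::MailTo::EMAIL_REGEXP` only checks the shape of an address, not whether it exists.

```ruby
require "uri"

# Hypothetical helper: true for http/https URLs that have a host.
def http_url?(str)
  uri = URI.parse(str)
  uri.is_a?(URI::HTTP) && !uri.host.nil? && !uri.host.empty?
rescue URI::InvalidURIError
  false
end

# Hypothetical helper: shape check using the pattern shipped with uri.
def plausible_email?(str)
  str.match?(URI::MailTo::EMAIL_REGEXP)
end

puts http_url?("https://www.example.com/path?q=1")  # true
puts http_url?("ftp://example.com")                 # false
puts http_url?("not a url")                         # false
puts plausible_email?("user@example.com")           # true
puts plausible_email?("invalid.email")              # false
```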

String Formatting Tool

class StringFormatter
  # Convert camelCase to snake_case
  def self.camel_to_snake(str)
    str.gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')
       .gsub(/([a-z\d])([A-Z])/, '\1_\2')
       .downcase
  end
  
  # Convert snake_case to camelCase
  def self.snake_to_camel(str)
    str.split('_').map.with_index { |word, i| 
      i == 0 ? word : word.capitalize 
    }.join
  end
  
  # Format phone number
  def self.format_phone(phone)
    phone.gsub(/\D/, "")  # Remove non-digit characters
         .gsub(/(\d{3})(\d{4})(\d{4})/, '\1-\2-\3')
  end
  
  # Format amount
  def self.format_currency(amount, currency = "$")
    "#{currency}#{'%.2f' % amount.to_f}"
  end
  
  # Format date
  def self.format_date(date_str, format = :us)
    case format
    when :cn
      date_str.gsub(/(\d{4})-(\d{2})-(\d{2})/, '\1 year \2 month \3 day')
    when :us
      date_str.gsub(/(\d{4})-(\d{2})-(\d{2})/, '\2/\3/\1')
    else
      date_str
    end
  end
end

# Use formatting tool
puts StringFormatter.camel_to_snake("userName")        # user_name
puts StringFormatter.camel_to_snake("XMLHttpRequest")  # xml_http_request
puts StringFormatter.snake_to_camel("user_name")       # userName
puts StringFormatter.format_phone("13812345678")       # 138-1234-5678
puts StringFormatter.format_currency(1234.5)           # $1234.50
puts StringFormatter.format_date("2023-12-25", :cn)    # 2023 year 12 month 25 day
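
The `format_currency` method above does not insert thousands separators. One way to add them is a lookahead-based `gsub` on the integer part; the sketch below uses a hypothetical standalone method name and assumes a comma separator and two decimal places.

```ruby
# Hypothetical helper: format an amount with thousands separators.
def format_currency_with_commas(amount, currency = "$")
  whole, fraction = format("%.2f", amount).split(".")
  # Insert a comma after every digit that is followed by a multiple of three digits.
  whole = whole.gsub(/(\d)(?=(\d{3})+\z)/, '\1,')
  "#{currency}#{whole}.#{fraction}"
end

puts format_currency_with_commas(1234.5)        # $1,234.50
puts format_currency_with_commas(1234567.891)   # $1,234,567.89
puts format_currency_with_commas(999)           # $999.00
```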

📊 String Performance Optimization

String Building Optimization

# Inefficient way (creates new object each concatenation)
def inefficient_build(strings)
  result = ""
  strings.each { |str| result += str }
  result
end

# Efficient way (use << to modify original object)
def efficient_build(strings)
  result = ""
  strings.each { |str| result << str }
  result
end

# Most efficient way (use array join)
def most_efficient_build(strings)
  strings.join("")
end

# Use StringIO for large string operations
require 'stringio'

def build_with_stringio(parts)
  io = StringIO.new
  parts.each { |part| io << part }
  io.string
end
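
The reason `+=` is slower is that it builds a new string each time, while `<<` appends to the existing object. The sketch below makes that visible through `object_id` and adds a rough timing comparison; `elapsed` is a hypothetical helper, and the timings vary by machine and Ruby version, so no expected numbers are shown.

```ruby
# << keeps the same object, += rebinds the variable to a new one.
buffer = +"a"
before = buffer.object_id
buffer << "b"
puts buffer.object_id == before   # true
buffer += "c"
puts buffer.object_id == before   # false

# Hypothetical helper: seconds taken by the block, using a monotonic clock.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

parts = Array.new(5_000) { |i| "item#{i}," }

plus_time   = elapsed { s = +""; parts.each { |part| s += part } }
append_time = elapsed { s = +""; parts.each { |part| s << part } }
join_time   = elapsed { parts.join }

printf("+=: %.4fs  <<: %.4fs  join: %.4fs\n", plus_time, append_time, join_time)
```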

String Freezing

# Freeze string to prevent modification
CONSTANT_STRING = "This is a constant string".freeze

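# Freeze all string literals in a file with a magic comment
# (opt-in: it must appear at the top of the file; it is not the default in Ruby 3.0)
# frozen_string_literal: true
# str = "This will be automatically frozen"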

# Check if string is frozen
puts CONSTANT_STRING.frozen?  # true
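
Modifying a frozen string raises `FrozenError` (Ruby 2.5+). To get a mutable version, make a copy with `dup` or with unary plus. A minimal sketch using an illustrative constant name:

```ruby
GREETING = "Hello".freeze

begin
  GREETING << ", World"
rescue FrozenError => e
  puts "Cannot modify: #{e.class}"   # Cannot modify: FrozenError
end

# dup returns an unfrozen copy; the original is untouched.
copy = GREETING.dup
copy << ", World"
puts copy        # Hello, World
puts GREETING    # Hello

# Unary plus returns the string itself if it is mutable, or an unfrozen copy if it is frozen.
builder = +GREETING
puts builder.frozen?   # false
```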

🎯 String Best Practices

1. Choose Appropriate String Definition Method

# For simple text, use double quotes (support interpolation)
name = "Zhang San"
greeting = "Hello, #{name}!"

# For text containing double quotes or literal backslashes, use single quotes
sql = 'SELECT * FROM users WHERE name = "Zhang San"'

# For multi-line text, use heredoc
template = <<~HTML
  <div class="user-card">
    <h1>#{name}</h1>
    <p>Welcome to our service</p>
  </div>
HTML

2. Safely Process User Input

class SafeStringHandler
  # Escape HTML special characters
  def self.escape_html(str)
    str.gsub(/&/, "&amp;")
       .gsub(/</, "&lt;")
       .gsub(/>/, "&gt;")
       .gsub(/"/, "&quot;")
       .gsub(/'/, "&#39;")
  end
  
  # Safe string truncation
  def self.safe_truncate(str, length, omission = "...")
    return str if str.nil? || str.length <= length
    str[0, length - omission.length] + omission
  end
  
  # Clean string (remove control characters)
  def self.sanitize(str)
    str.gsub(/[[:cntrl:]]/, "")
  end
end

# Use safe processing
user_input = "<script>alert('XSS')</script>"
safe_output = SafeStringHandler.escape_html(user_input)
puts safe_output  # &lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;
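
The chained `gsub` version works only because `&` is replaced first; otherwise the `&` inside `&lt;` would be escaped again. A single pass with a replacement hash avoids that ordering concern. This is a sketch with hypothetical names (`HTML_ESCAPES`, `escape_html_once`); in real applications the standard library's `CGI.escapeHTML` or `ERB::Util.html_escape` is usually the safer choice.

```ruby
# Hypothetical lookup table: each special character maps to its entity.
HTML_ESCAPES = {
  "&" => "&amp;",
  "<" => "&lt;",
  ">" => "&gt;",
  '"' => "&quot;",
  "'" => "&#39;"
}.freeze

# Escape every special character in one pass over the string.
def escape_html_once(str)
  str.gsub(/[&<>"']/, HTML_ESCAPES)
end

puts escape_html_once("<b>Tom & Jerry</b>")
# &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
```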

3. String Encoding Handling

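require 'charlock_holmes'  # third-party gem, needed for handle_encoding (gem install charlock_holmes)

class EncodingHandler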
  # Convert to UTF-8
  def self.to_utf8(str)
    return str if str.encoding == Encoding::UTF_8
    str.encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
  end
  
  # Detect and handle encoding
  def self.handle_encoding(str)
    # Detect encoding
    detected_encoding = CharlockHolmes::EncodingDetector.detect(str)
    
    if detected_encoding && detected_encoding[:encoding] != 'UTF-8'
      str.force_encoding(detected_encoding[:encoding])
         .encode('UTF-8', invalid: :replace, undef: :replace)
    else
      str
    end
  rescue
    str.force_encoding('UTF-8')
  end
end
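
When no detection gem is available, the core methods `valid_encoding?` and `scrub` (Ruby 2.1+) cover the common case of cleaning up invalid bytes. A minimal sketch; `clean_utf8` is a hypothetical helper, and the `?` replacement character is an arbitrary choice.

```ruby
broken = "abc\xFFdef"            # \xFF is never valid in UTF-8
puts broken.valid_encoding?      # false
puts broken.scrub("?")           # abc?def

# Hypothetical helper: treat the bytes as UTF-8 and replace whatever is invalid.
def clean_utf8(str)
  str.dup.force_encoding(Encoding::UTF_8).scrub("?")
end

puts clean_utf8("abc\xFFdef".b)  # abc?def

# If the source encoding is known, relabel first and then transcode.
legacy = "caf\xE9".b             # the bytes of "café" in ISO-8859-1
text = legacy.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
puts text == "caf\u00e9"         # true
```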

📚 Next Steps

After mastering Ruby string operations, it is recommended to continue learning:

Continue your Ruby learning journey!