FormatForge

What is XML?

XML (Extensible Markup Language) is a markup language designed to store and transport data. It uses custom tags to define data structure and is both human-readable and machine-readable. XML is widely used in enterprise systems, web services, and configuration files.

XML Syntax

XML syntax is strict and well-defined. Here are the key rules:

  • Every element must be closed, with a closing tag or as a self-closing empty element
  • Tags are case-sensitive
  • Elements must be properly nested
  • Attribute values must be quoted
  • There must be exactly one root element
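
These rules are what a parser enforces when it reads a document. As a rough sketch, the Python snippet below feeds one sample per rule to the standard library's xml.etree.ElementTree parser. The helper name is_well_formed and the sample strings are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One sample per rule; each invalid sample breaks the rule named in its key.
samples = {
    "valid document": "<user><name>John</name></user>",
    "missing closing tag": "<user><name>John</user>",
    "case mismatch": "<user><Name>John</name></user>",
    "bad nesting": "<a><b></a></b>",
    "unquoted attribute": "<user id=123></user>",
    "two root elements": "<user></user><user></user>",
}

for label, xml_text in samples.items():
    print(f"{label}: {is_well_formed(xml_text)}")
```

Only the first sample should print True. Note that this checks well-formedness only; it says nothing about whether the document matches a schema.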

Basic XML Example

<?xml version="1.0" encoding="UTF-8"?>
<user id="123" active="true">
  <name>John Doe</name>
  <email>john@example.com</email>
  <age>30</age>
  <address>
    <city>New York</city>
    <country>USA</country>
  </address>
  <hobbies>
    <hobby>reading</hobby>
    <hobby>gaming</hobby>
    <hobby>hiking</hobby>
  </hobbies>
</user>
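
To show how an application might read this document, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The variable names are chosen for this illustration, and other languages and libraries offer equivalent calls.

```python
import xml.etree.ElementTree as ET

xml_text = """<?xml version="1.0" encoding="UTF-8"?>
<user id="123" active="true">
  <name>John Doe</name>
  <email>john@example.com</email>
  <age>30</age>
  <address>
    <city>New York</city>
    <country>USA</country>
  </address>
  <hobbies>
    <hobby>reading</hobby>
    <hobby>gaming</hobby>
    <hobby>hiking</hobby>
  </hobbies>
</user>"""

user = ET.fromstring(xml_text)

# Attributes and element text always arrive as strings; convert explicitly.
user_id = int(user.get("id"))
is_active = user.get("active") == "true"
age = int(user.findtext("age"))

# A path with "/" walks into nested elements.
city = user.findtext("address/city")

# findall returns every matching element, in document order.
hobbies = [hobby.text for hobby in user.findall("hobbies/hobby")]

print(user_id, is_active, age, city, hobbies)
```

XML itself carries no data types, which is why the sketch converts id and age by hand.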

XML Components

XML Declaration

The optional declaration at the beginning specifies version and encoding:

<?xml version="1.0" encoding="UTF-8"?>
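
One practical consequence: when a declaration is present, it has to be the very first thing in the document, with not even a blank line before it. The sketch below checks this with Python's standard xml.etree.ElementTree parser; the <note> element is a made-up sample.

```python
import xml.etree.ElementTree as ET

declaration = '<?xml version="1.0" encoding="UTF-8"?>'

# Declaration at the very start: accepted.
root = ET.fromstring(declaration + "<note>hi</note>")

# No declaration at all: also accepted, since it is optional.
bare = ET.fromstring("<note>hi</note>")

# Declaration preceded by a blank line: rejected by the parser.
try:
    ET.fromstring("\n" + declaration + "<note>hi</note>")
    leading_whitespace_ok = True
except ET.ParseError:
    leading_whitespace_ok = False

print(root.text, bare.text, leading_whitespace_ok)
```

This is a common stumbling block when XML is embedded in an indented string or template.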

Elements

Elements are the building blocks of XML. They have opening and closing tags:

<element>content</element>

<!-- Empty element (self-closing) -->
<empty />

<!-- Element with child elements -->
<parent>
  <child>value</child>
</parent>
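
As a small illustration, the Python sketch below (standard library only) parses the same element shapes. It suggests that the self-closing form and an explicit open/close pair are equivalent to the parser, and that child elements can be reached by iterating over the parent.

```python
import xml.etree.ElementTree as ET

# The self-closing form and the open/close pair describe the same empty element.
short_form = ET.fromstring("<empty />")
long_form = ET.fromstring("<empty></empty>")
print(short_form.text, long_form.text)

parent = ET.fromstring("<parent><child>value</child></parent>")

# Iterating an element yields its direct children.
for child in parent:
    print(child.tag, child.text)
```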

Attributes

Attributes provide additional information about elements:

<book isbn="978-0-123456-78-9" category="fiction">
  <title>The Great Story</title>
  <author>Jane Smith</author>
</book>
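
From the application's side, attributes and child elements are read through different calls. The following is a minimal Python sketch using xml.etree.ElementTree; the "language" attribute is hypothetical and deliberately absent from the sample, to show how a missing attribute behaves.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    '<book isbn="978-0-123456-78-9" category="fiction">'
    "<title>The Great Story</title>"
    "<author>Jane Smith</author>"
    "</book>"
)

# attrib is a plain dict of attribute names to string values.
print(book.attrib)

# get() returns None (or a chosen default) when the attribute is absent.
isbn = book.get("isbn")
language = book.get("language", "unknown")  # "language" is not in the sample

# Child elements are read separately from attributes.
title = book.findtext("title")
print(isbn, language, title)
```

A common rule of thumb is to use attributes for short metadata about an element and child elements for the data itself, but XML does not enforce either choice.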

Comments

<!-- This is a comment -->
<data>
  <!-- Comments can span
       multiple lines -->
  <value>123</value>
</data>

CDATA Sections

CDATA sections hold text that the parser treats as literal character data rather than markup, so characters such as < and & need no escaping:

<script>
  <![CDATA[
    function compare(a, b) {
      return a < b && a > 0;
    }
  ]]>
</script>
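
CDATA is a convenience for the author, not a different kind of data. The Python sketch below (standard library only, with a shortened version of the function body above) suggests that a CDATA section and the equivalent escaped text reach the application as the same string.

```python
import xml.etree.ElementTree as ET

with_cdata = ET.fromstring("<script><![CDATA[return a < b && a > 0;]]></script>")
with_entities = ET.fromstring("<script>return a &lt; b &amp;&amp; a &gt; 0;</script>")

# Both forms give the application the same text.
print(with_cdata.text)
print(with_cdata.text == with_entities.text)
```

The one sequence a CDATA section cannot contain is ]]>, because that is what ends it.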

Common XML Use Cases

1. Configuration Files

<!-- Maven pom.xml -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>my-app</artifactId>
  <version>1.0.0</version>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.13.2</version>
    </dependency>
  </dependencies>
</project>

2. SOAP Web Services

<?xml version="1.0"?>
<soap:Envelope
  xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetUser xmlns="http://example.com/api">
      <userId>123</userId>
    </GetUser>
  </soap:Body>
</soap:Envelope>
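
SOAP messages rely on namespaces, which changes how elements are looked up. As a hedged sketch, the Python snippet below reads userId from the message above with xml.etree.ElementTree. The prefix "api" is an assumption made for the lookup; only the namespace URIs have to match the document.

```python
import xml.etree.ElementTree as ET

envelope = ET.fromstring("""<?xml version="1.0"?>
<soap:Envelope
  xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetUser xmlns="http://example.com/api">
      <userId>123</userId>
    </GetUser>
  </soap:Body>
</soap:Envelope>""")

# Prefixes used in the search path; they only need to map to the right URIs.
ns = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "api": "http://example.com/api",  # prefix chosen for this sketch
}

# userId inherits the default namespace declared on GetUser.
user_id = envelope.findtext("soap:Body/api:GetUser/api:userId", namespaces=ns)

# ElementTree stores tags in {namespace-uri}local-name form.
print(envelope.tag)
print(user_id)
```

Searching for plain "userId" without a namespace would find nothing, because the element's full name includes the namespace URI.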

3. RSS Feeds

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>My Blog</title>
    <link>https://example.com</link>
    <item>
      <title>First Post</title>
      <link>https://example.com/post-1</link>
      <pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
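
A feed reader would typically walk the item elements and parse each pubDate. The following is a minimal Python sketch using only the standard library; the posts list and its dictionary keys are names invented for this example.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

feed = ET.fromstring("""<rss version="2.0">
  <channel>
    <title>My Blog</title>
    <link>https://example.com</link>
    <item>
      <title>First Post</title>
      <link>https://example.com/post-1</link>
      <pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>""")

posts = []
for item in feed.findall("channel/item"):
    posts.append({
        "title": item.findtext("title"),
        "link": item.findtext("link"),
        # pubDate uses the RFC 822 date format, which email.utils can parse.
        "published": parsedate_to_datetime(item.findtext("pubDate")),
    })

for post in posts:
    print(post["title"], post["published"].isoformat())
```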

XML Validation

XML documents can be validated against schemas to ensure correct structure:

DTD (Document Type Definition)

The older, simpler validation method:

<!DOCTYPE user [
  <!ELEMENT user (name, email)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
]>

XSD (XML Schema Definition)

The modern, more powerful validation method with data types:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="user">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="age" type="xs:integer"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
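
To make the two constraints in this schema concrete, the sketch below checks them by hand in Python. It is not XSD validation: Python's standard library has no XSD validator, so check_user is a hypothetical stand-in that mirrors only the element order (xs:sequence) and the integer type of age. A real project would hand the schema to a schema-aware library instead.

```python
import re
import xml.etree.ElementTree as ET


def check_user(xml_text: str) -> list[str]:
    """Hand-written stand-in for the schema above; NOT a real XSD validator."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "user":
        problems.append(f"root element is <{root.tag}>, expected <user>")
    # xs:sequence: exactly these children, in this order.
    child_tags = [child.tag for child in root]
    if child_tags != ["name", "age"]:
        problems.append(f"children are {child_tags}, expected ['name', 'age']")
    # xs:integer: an optional sign followed by decimal digits.
    age_text = root.findtext("age", default="").strip()
    if not re.fullmatch(r"[+-]?[0-9]+", age_text):
        problems.append(f"<age> is {age_text!r}, expected an integer")
    return problems


print(check_user("<user><name>John</name><age>30</age></user>"))
print(check_user("<user><age>thirty</age><name>John</name></user>"))
```

The first call should report no problems; the second should report both the wrong element order and the non-integer age.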

Common XML Mistakes

Unclosed tags

Wrong:   <name>John
Correct: <name>John</name>

Improper nesting

Wrong:   <a><b></a></b>
Correct: <a><b></b></a>

Unquoted attributes

Wrong:   <item id=123>
Correct: <item id="123">

Unescaped special characters

Wrong:   <text>5 < 10</text>
Correct: <text>5 &lt; 10</text>

Special Characters in XML

Character   Entity    Description
<           &lt;      Less than
>           &gt;      Greater than
&           &amp;     Ampersand
'           &apos;    Apostrophe
"           &quot;    Quotation mark

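When XML is generated by a program, escaping is usually left to a library rather than done by hand. As a minimal sketch, the Python snippet below uses xml.sax.saxutils and xml.etree.ElementTree from the standard library; the sample strings are made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = "5 < 10 & 10 > 5"

# escape() replaces &, < and > by default.
escaped = escape(raw)
print(escaped)

# Quotes only matter inside attribute values; pass them as extra entities.
quoted = escape('say "hi"', {'"': "&quot;", "'": "&apos;"})
print(quoted)

# unescape() reverses the default replacements.
print(unescape(escaped) == raw)

# ElementTree escapes text automatically when serializing.
element = ET.Element("text")
element.text = raw
serialized = ET.tostring(element, encoding="unicode")
print(serialized)
```

Building documents through an API like this avoids the unescaped-character mistake entirely, since the serializer inserts the entities.
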

Frequently Asked Questions

What does XML stand for?

XML stands for Extensible Markup Language. It is called 'extensible' because you can define your own tags and document structure, unlike HTML, which has a fixed set of predefined tags.

Is XML still used today?

Yes, XML is still widely used in enterprise systems, SOAP web services, configuration files (like Maven's pom.xml), document formats (DOCX, SVG), and data interchange between legacy systems.

What's the difference between XML and HTML?

HTML is for displaying data with predefined tags, while XML is for storing and transporting data with custom tags. XML is also stricter: every element must be closed and properly nested.

Should I use XML or JSON?

Use JSON for web APIs and simple data exchange. Use XML when you need document validation (XSD), mixed content, namespaces, or when integrating with enterprise/legacy systems that require it.

XML Tools