# XML from absolutely nothing

## XML elements

The basic unit of an XML document is an XML element.

A standard XML element consists of:

• a start tag – e.g. <my-element>;
• optional content – e.g. Some data;
• an end tag – e.g. </my-element>.

Put together, these form a complete element:

<my-element>Some data</my-element>


If an element has no content, it can be abbreviated into an empty element tag:

<my-element />
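As a rough illustration, Python's standard `xml.etree.ElementTree` module can build both forms. This is a minimal sketch; the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# An element with a start tag, text content, and an end tag.
with_content = ET.Element("my-element")
with_content.text = "Some data"
with_content_xml = ET.tostring(with_content, encoding="unicode")
print(with_content_xml)

# An element with no content is serialized as an empty element tag.
empty = ET.Element("my-element")
empty_xml = ET.tostring(empty, encoding="unicode")
print(empty_xml)
```

The first `print` shows the start tag, content, and end tag; the second shows the abbreviated empty element tag.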


To recap, there are three types of tag:

• a start tag – e.g. <a-name>;
• an end tag – e.g. </a-name>;
• an empty element tag – e.g. <a-name />.

A tag always starts with < and ends with >.

In a start tag and an empty element tag, the element name comes directly after the opening <. An end tag repeats the same name after </. The name is a case-sensitive string starting with a letter or underscore, followed by any combination of letters, digits, hyphens, underscores, and periods.
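The naming rule above can be sketched as a regular expression. This is a simplified check that assumes ASCII letters only; the full XML specification allows a wider range of characters, and the names `NAME_PATTERN` and `looks_like_element_name` are hypothetical helpers for this example.

```python
import re

# Simplified pattern for the rule described above: a letter or underscore,
# then any mix of letters, digits, hyphens, underscores, and periods.
# Assumption: ASCII letters only; the XML specification allows more characters.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def looks_like_element_name(name: str) -> bool:
    """Return True if name matches the simplified naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["my-element", "_private.name", "2nd-element", "-bad"]:
    print(candidate, looks_like_element_name(candidate))
```

The first two candidates pass the check; the last two fail because they start with a digit and a hyphen respectively.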

A start tag and an empty element tag can have attributes. These are name-value pairs:

<a-name an-attribute="my value" another-attribute="3">Some text</a-name>
<a-name an-attribute="my value" another-attribute="3" />
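To see how attributes look to a program, here is a small sketch using Python's standard `xml.etree.ElementTree` module. Note that attribute values arrive as strings, so the `3` in the example is the string `"3"`, not a number.

```python
import xml.etree.ElementTree as ET

element = ET.fromstring(
    '<a-name an-attribute="my value" another-attribute="3">Some text</a-name>'
)

# Attributes are exposed as a dict of name -> value; values are always strings.
print(element.attrib)
print(element.get("another-attribute"))
```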


The content of an element consists of zero or more items, where an item can be:

• Text
• An element

Elements contained in other elements are child elements, e.g.:

<a-parent>
<a-child>with text content</a-child>
<another-child>with more text</another-child>
</a-parent>


You can mix text items and element items in element content, like this:

<a-parent>
Some text
<a-child>with text content</a-child>
More text
<another-child>with more text</another-child>
Text continues
</a-parent>


but it is more common to have element content that is either one or more element items, or one single text item.
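Mixed content takes a little care when read from code. As a sketch with Python's standard `xml.etree.ElementTree` module: text before the first child is stored on the parent as `text`, and text following a child's end tag is stored on that child as `tail`. The line breaks from the example above are left out here so that the output is free of surrounding whitespace.

```python
import xml.etree.ElementTree as ET

mixed = ET.fromstring(
    "<a-parent>"
    "Some text"
    "<a-child>with text content</a-child>"
    "More text"
    "<another-child>with more text</another-child>"
    "Text continues"
    "</a-parent>"
)

# Text before the first child element belongs to the parent.
print(repr(mixed.text))

# Text after each child's end tag is stored on that child as its tail.
for child in mixed:
    print(child.tag, repr(child.text), repr(child.tail))
```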

## XML documents

A well-formed XML document has exactly one element at its top level. This is the root element.

Contents of some_example.xml
<a-root-element my-type="example">
<at-second-level>
<first-thing>Some text</first-thing>
<second-thing>More text</second-thing>
</at-second-level>
</a-root-element>


To take another example, this would be a well-formed XML document, because it has a single element at the root level:

<my-element>
Some text
</my-element>


But this would not, because it has two elements at the root level:

<my-element>
Some text
</my-element>
<another-element>
More text
</another-element>
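A parser will reject the second form. As a minimal sketch, Python's standard `xml.etree.ElementTree` module raises `ParseError` when it meets a second element at the root level; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

two_roots = (
    "<my-element>Some text</my-element>"
    "<another-element>More text</another-element>"
)

try:
    ET.fromstring(two_roots)
    parsed_ok = True
except ET.ParseError as error:
    # The parser stops at the second root-level element.
    parsed_ok = False
    print("Parser rejected the document:", error)
```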


An XML document may start with a special construction called the XML declaration (often loosely called the XML prolog), of this form:

<?xml version="1.0"?>


The default XML encoding is UTF-8, but you can specify another encoding in the XML prolog:

<?xml version="1.0" encoding="UTF-16"?>
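To see an encoding at work, the sketch below uses Python's standard `xml.etree.ElementTree` module to write a small document as UTF-16 and read it back. When asked for an encoding other than UTF-8 or ASCII, `ET.tostring` puts an encoding line at the top of its output; the exact quoting of that line may differ from the one shown above.

```python
import xml.etree.ElementTree as ET

element = ET.Element("my-element")
element.text = "Some text"

# Asking for a non-default encoding returns bytes in that encoding,
# starting with a line that names the encoding.
data = ET.tostring(element, encoding="utf-16")
print(data.decode("utf-16"))

# Parsing the raw bytes recovers the original text.
round_tripped = ET.fromstring(data)
print(round_tripped.text)
```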


For example, you can read some_example.xml in Python with the standard library's xml.etree.ElementTree module:


>>> import xml.etree.ElementTree as ET
>>> tree = ET.parse('some_example.xml')
>>> root = tree.getroot()
>>> print(root.tag)
a-root-element
>>> print(root.attrib)
{'my-type': 'example'}
>>> children = list(root)  # iterating over an element yields its child elements
>>> print(len(children))
1
>>> only_child = children[0]
>>> for child in only_child:
...     print(child.tag, child.text)
...
first-thing Some text
second-thing More text