###########################
XML from absolutely nothing
###########################

************
XML elements
************

The basic unit of an XML document is an `XML element `_.

A standard XML element consists of:

* a start *tag* |--| e.g. ``<my-element>``;
* optional content |--| e.g. ``Some data``;
* an end *tag* |--| e.g. ``</my-element>``.

::

    <my-element>Some data</my-element>

If an element has no content, it can be abbreviated into an *empty element
tag*::

    <my-element/>

.. testcode::
    :hide:

    from __future__ import print_function
    import xml.etree.ElementTree as ET
    print(ET.fromstring("<my-element/>").tag)

.. testoutput::
    :hide:

    my-element

To repeat then, there are three types of tag:

* a start tag |--| e.g. ``<my-element>``;
* an end tag |--| e.g. ``</my-element>``;
* an empty element tag |--| e.g. ``<my-element/>``.

A tag always starts with ``<`` and ends with ``>``.

In a start tag and an empty element tag, an *element name* must immediately
follow the opening ``<``. This is a case-sensitive string starting with a
letter or underscore, followed by any combination of letters, digits, hyphens,
underscores, and periods.

A start tag and an empty element tag can have *attributes*. These are
name-value pairs, written ``name="value"``::

    <my-element attr1="value1" attr2="value2">Some text</my-element>

The *content* of an element consists of zero or more items, where an item can
be:

* Text
* An element

Elements contained in other elements are *child elements*, e.g.::

    <my-element>
      <child-element>with text content</child-element>
      <another-child>with more text</another-child>
    </my-element>

You can mix text items and element items in element content, like this::

    <my-element>Some text
      <child-element>with text content</child-element>
      More text
      <another-child>with more text</another-child>
      Text continues
    </my-element>

but it is more common to have element content that is *either* one or more
element items, *or* one single text item.

*************
XML documents
*************

There is a single element at the root of a valid XML document. This is the
*root element*.

.. writefile::
    :language: xml
    :cwd: /working

    # file: some_example.xml
    <my-element attr1="value1">
      <only-child>
        <first-grandchild>Some text</first-grandchild>
        <second-grandchild>More text</second-grandchild>
      </only-child>
    </my-element>

To take another example, this would be a valid XML document, because it has a
single element at the root level::

    <my-element>Some text</my-element>

But this would not, because it has two elements at the root level::

    <my-element>Some text</my-element>
    <another-element>More text</another-element>

The XML document may start with a special construction called the *XML prolog*
of this form::

    <?xml version="1.0"?>

The default XML encoding is UTF-8, but you can specify another encoding in the
XML prolog, for example::

    <?xml version="1.0" encoding="ISO-8859-1"?>

***********
Reading XML
***********

For example, in Python:

.. workrun:: pycon

    >>> import xml.etree.ElementTree as ET
    >>> tree = ET.parse('some_example.xml')
    >>> root = tree.getroot()
    >>> print(root.tag)
    >>> print(root.attrib)
    >>> # Element.getchildren() was removed in Python 3.9; use list(element)
    >>> children = list(root)
    >>> print(len(children))
    >>> only_child = children[0]
    >>> for child in only_child:
    ...     print(child.tag, child.text)

.. include:: links_names.inc
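
The ideas from the earlier sections (attributes, mixed text and element
content, and the single root element) can also be checked from Python without
writing a file. What follows is a minimal sketch, not a definitive recipe. The
XML string and its element and attribute names are made up for illustration,
and ``describe`` and ``parses_as_document`` are hypothetical helper names; only
the ``xml.etree.ElementTree`` calls come from the standard library.

.. code-block:: python

    import xml.etree.ElementTree as ET

    # Illustrative XML; the element and attribute names are assumptions made
    # for this sketch, not part of any real format.
    XML_TEXT = (
        '<my-element attr1="value1">'
        'Some text'
        '<child-element>with text content</child-element>'
        'More text'
        '</my-element>'
    )


    def describe(element):
        """Return the tag, attributes, text items, and child tags of `element`."""
        # ElementTree stores text before the first child in ``.text`` and text
        # that follows a child in that child's ``.tail``.
        text_items = [element.text] + [child.tail for child in element]
        return {
            'tag': element.tag,
            'attributes': dict(element.attrib),
            'text_items': [text for text in text_items if text],
            'child_tags': [child.tag for child in element],
        }


    def parses_as_document(xml_text):
        """Return True if `xml_text` parses; two root elements make it fail."""
        try:
            ET.fromstring(xml_text)
        except ET.ParseError:
            return False
        return True


    summary = describe(ET.fromstring(XML_TEXT))
    print(summary['tag'])
    print(summary['attributes'])
    print(summary['text_items'])
    print(summary['child_tags'])

    # One root element parses; two elements at the root level do not.
    print(parses_as_document('<a>Some text</a>'))
    print(parses_as_document('<a>Some text</a><b>More text</b>'))

Splitting mixed content across ``.text`` and ``.tail`` is an ElementTree design
choice; other XML libraries may represent text items differently.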