Searching an HTML/XML document
Basic Searching
Let’s suppose you have the following document:
[versions.xml]
<root>
<ios>
<version>
<name>iOS 10</name>
</version>
<version>
<name>iOS 9</name>
</version>
<version>
<name>iOS 8</name>
</version>
</ios>
<macos>
<version>
<name>macOS 10.12</name>
</version>
<version>
<name>macOS 10.11</name>
</version>
</macos>
<tvos>
<version>
<name>tvOS 10.0</name>
</version>
</tvos>
</root>
Let’s further suppose that you want a list of all the versions in all the showns in this document.
let xml = try! String(contentsOfFile: "versions.xml", encoding: NSUTF8StringEncoding)
if let doc = XML(xml: xml, encoding: NSUTF8StringEncoding) {
for node in doc.xpath("//name") {
print(node.text!)
}
}
// "iOS 10"
// "iOS 9"
// "iOS 8"
// "macOS 10.12"
// "macOS 10.11"
// "tvOS 10.0"
The methods xpath
and css
actually return a XPathObject
,
which acts very much like an array, and contains matching nodes from the document.
let nodes = doc.xpath("//name")
print(nodes[0].text!)
// "iOS 10"
You can use any XPath or CSS query you like.
for node in doc.xpath("//ios//name") {
print(node.text!)
}
// "iOS 10"
// "iOS 9"
// "iOS 8"
Notably, you can even use CSS queries in an XML document!
for node in doc.css("ios name") {
print(node.text!)
}
// "iOS 10"
// "iOS 9"
// "iOS 8"
CSS queries are often the easiest and most succinct way to express what you’re looking for, so don’t be afraid to use them!
Single Results
If you know you’re going to get only a single result back, you can use the shortcuts at_xpath
and at_css
instead of having to access the first element of a XPathObject.
doc.css("tvos name").first // "tvOS 10.0"
doc.at_css("tvos name") // "tvOS 10.0"
Namespaces
[libraries.xml]
<root>
<host xmlns="https://github.com/">
<title>Kanna</title>
<title>Alamofire</title>
</host>
<host xmlns="https://bitbucket.org/">
<title>Hoge</title>
</host>
</root>
let xml = try! String(contentsOfFile: "libraries.xml", encoding: NSUTF8StringEncoding)
if let doc = XML(xml: xml, encoding: NSUTF8StringEncoding) {
for node in doc.xpath("//github:title", namespaces: ["github": "https://github.com/"]) {
print(node.text!)
}
}
// "Kanna"
// "Alamofire"
if let doc = XML(xml: xml, encoding: NSUTF8StringEncoding) {
for node in doc.xpath("//bitbucket:title", namespaces: ["bitbucket": "https://bitbucket.org/"]) {
print(node.text!)
}
}
// "Hoge"