PHP: SimpleXMLElement and (multiple) Namespaces
<?php
// collect libxml parse errors internally instead of emitting PHP warnings
libxml_use_internal_errors(true);
// your XML string / source
$content = file_get_contents('http://www.eslgfx.net/rss/news_default_www_de.xml');
// our root element ($content is XML data, not a URL, hence the final false)
$xml_root = new SimpleXMLElement($content, 0, false);
$namespaces = $xml_root->getNamespaces(true);
// if there aren't any namespaces, we use a default one
// with an empty prefix and no URL for the definition
if (!count($namespaces))
{
    $namespaces = array('' => null);
}
foreach ($namespaces as $prefix => $ns)
{
    // nothing to register for the "no namespace" placeholder
    if ($ns === null)
    {
        continue;
    }
    // the default namespace has an empty prefix, so register it as '__ns'
    $xml_root->registerXPathNamespace(empty($prefix) ? '__ns' : $prefix, $ns);
}
// prefix of the root element: 'rdf' for RDF feeds, otherwise none
$prefix = '';
if (isset($namespaces['rdf']))
{
    $prefix = 'rdf';
}
$list = getXMLChildList($xml_root, $namespaces, $prefix);
echo '<pre>';
var_dump($list);
echo '</pre>';
function getXMLChildList(SimpleXMLElement $xml, array $namespaces, $prefix = '', $path = '', $path_dir = '/')
{
    $list = array();
    // display name: prefix only if the element has one
    $dname = (!empty($prefix) ? $prefix.':' : '').$xml->getName();
    // XPath name: elements in the default namespace need the registered '__ns' prefix
    $name = (!empty($prefix) ? $prefix.':' : (isset($namespaces['']) ? '__ns:' : '')).$xml->getName();
    $path_dir = $path_dir.$name;
    $cpath = $path.'<'.$dname.'>';
    // only list elements that have text content of their own
    if ((string)$xml !== '')
    {
        $list[$path_dir] = $cpath;
    }
    // attributes of this element, per namespace
    foreach ($namespaces as $attr_prefix => $ns)
    {
        foreach ($xml->attributes($ns) as $a)
        {
            $aname = (!empty($attr_prefix) ? $attr_prefix.':' : '').$a->getName();
            $list[$path_dir.'/@'.$aname] = $path.'<'.$dname.'> @'.$aname;
        }
    }
    // recurse into the children of this element, per namespace
    foreach ($namespaces as $child_prefix => $ns)
    {
        foreach ($xml->children($ns) as $xml_child)
        {
            $x = getXMLChildList($xml_child, $namespaces, $child_prefix, $cpath.' ', $path_dir.'/');
            $list = array_merge($list, $x);
        }
    }
    return array_unique($list);
}
?>
I was working on parsing XML feeds for my CMS when I ran into a problem with namespaces. It took a lot of trial and error to get all child elements (and their paths) out of a document.
This is the basic code, extracted so you can use it right away.
What does it do? It's quite simple: the code parses an XML document with SimpleXMLElement, and getXMLChildList() returns an array of key/value pairs. The keys are paths you can pass to xpath() on a SimpleXMLElement to query for children. Each value is a string showing the same path in a tag-like style (for example, <rdf:RDF> <channel> @rdf:about), including all of the namespace prefixes.
array(19) {
["/rdf:RDF"]=>
string(9) "<rdf:RDF>"
["/rdf:RDF/channel"]=>
string(19) "<rdf:RDF> <channel>"
["/rdf:RDF/channel/@rdf:about"]=>
string(30) "<rdf:RDF> <channel> @rdf:about"
["/rdf:RDF/channel/title"]=>
string(27) "<rdf:RDF> <channel> <title>"
["/rdf:RDF/channel/description"]=>
string(33) "<rdf:RDF> <channel> <description>"
["/rdf:RDF/channel/link"]=>
string(26) "<rdf:RDF> <channel> <link>"
["/rdf:RDF/channel/items"]=>
string(27) "<rdf:RDF> <channel> <items>"
["/rdf:RDF/channel/items/rdf:Seq"]=>
string(37) "<rdf:RDF> <channel> <items> <rdf:Seq>"
["/rdf:RDF/channel/items/rdf:Seq/rdf:li/@rdf:resource"]=>
string(60) "<rdf:RDF> <channel> <items> <rdf:Seq> <rdf:li> @rdf:resource"
["/rdf:RDF/channel/dc:date"]=>
string(29) "<rdf:RDF> <channel> <dc:date>"
["/rdf:RDF/item"]=>
string(16) "<rdf:RDF> <item>"
["/rdf:RDF/item/@rdf:about"]=>
string(27) "<rdf:RDF> <item> @rdf:about"
["/rdf:RDF/item/title"]=>
string(24) "<rdf:RDF> <item> <title>"
["/rdf:RDF/item/link"]=>
string(23) "<rdf:RDF> <item> <link>"
["/rdf:RDF/item/description"]=>
string(30) "<rdf:RDF> <item> <description>"
["/rdf:RDF/item/dc:format"]=>
string(28) "<rdf:RDF> <item> <dc:format>"
["/rdf:RDF/item/dc:date"]=>
string(26) "<rdf:RDF> <item> <dc:date>"
["/rdf:RDF/item/dc:source"]=>
string(28) "<rdf:RDF> <item> <dc:source>"
["/rdf:RDF/item/dc:creator"]=>
string(29) "<rdf:RDF> <item> <dc:creator>"
}
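To show how such paths are used, here is a minimal sketch of querying a namespaced document with xpath(). The inline feed is made up for this example and is not the feed loaded above; the sketch assumes the default namespace is registered under the placeholder prefix __ns, as in the registration loop of the script. Keep in mind that XPath 1.0 matches an unprefixed name only against elements in no namespace, so elements from the default namespace, such as channel, are addressed here as __ns:channel.

```php
<?php
// Small RSS 1.0-style document, invented for this example.
$content = <<<XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.com/">
    <title>Example feed</title>
    <dc:date>2010-01-01</dc:date>
  </channel>
</rdf:RDF>
XML;

$xml = new SimpleXMLElement($content);

// Register every namespace used in the document.
// The default namespace has an empty prefix, so it gets the placeholder '__ns'.
foreach ($xml->getNamespaces(true) as $prefix => $ns) {
    $xml->registerXPathNamespace($prefix === '' ? '__ns' : $prefix, $ns);
}

// Element in the default namespace, element in the dc namespace, and an rdf attribute.
$titles = $xml->xpath('/rdf:RDF/__ns:channel/__ns:title');
$dates  = $xml->xpath('/rdf:RDF/__ns:channel/dc:date');
$about  = $xml->xpath('/rdf:RDF/__ns:channel/@rdf:about');

echo (string)$titles[0], "\n"; // Example feed
echo (string)$dates[0], "\n";  // 2010-01-01
echo (string)$about[0], "\n";  // http://example.com/
```

The prefixes used in a query only have to match what was passed to registerXPathNamespace(); they do not need to be the prefixes used in the document itself.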