Querying the Linked Data Cloud for African Countries (Quirk #1)
Posted in Uncategorized on June 14th, 2013 by admin
I’m currently embarking on an EDA project to build a graph of currently relevant data about the African continent, its countries and cities. There may be other projects that have done this already, but my goal is first to go through the process and discover resources along the way, hopefully bump into some challenges with the data that I can learn to get around, and then look to see whether people out there have better solutions.
The motivation for this work comes from the fact that MIT is placing a high priority on what it can do to have a positive impact in Africa. At the MIT Libraries, we hold a lot of research output, some of which applies specifically to issues and topics in African countries. The question is whether our research is actually being seen by the people in those areas who would benefit most from it. The first step is to build a reliable dataset representing the continent, its countries and cities, and holding metadata about them all.
Here’s one data quirk, and an example of the type of thing one has to deal with when using linked data in general.
The query below returns 99 countries in Africa, but there are currently only 54 to 56*. Examples of things I wouldn’t want back are entities that are no longer relevant to modern geopolitical questions, e.g.
{
"Country": { "type": "literal", "xml:lang": "en", "value": "Roman Empire" }
}
Here’s the actual query for the curious. I’m running it on FactForge, which integrates roughly eight LOD datasets, including DBpedia, GeoNames, NYT, and the CIA World Factbook.
PREFIX ff: <http://factforge.net/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX pext: <http://www.ontotext.com/proton/protonext#>
PREFIX ptop: <http://www.ontotext.com/proton/protontop#>

SELECT DISTINCT ?Country
WHERE {
  ?Coun ff:preferredLabel ?Country ;
        rdf:type pext:Country ;
        ptop:subRegionOf dbpedia:Africa .
  FILTER ( LANG(?Country) = "en" )
}
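For anyone who wants to run a query like this from a script rather than a web form, here is a minimal Python sketch using only the standard library. It assumes the endpoint follows the standard SPARQL protocol (query passed as a `query` parameter, results returned in the SPARQL JSON results format, which is the shape of the "Roman Empire" binding shown earlier). The `ENDPOINT` URL and the function names are my own placeholders, not something FactForge documents here, so adjust them to whatever endpoint you actually use.

```python
import json
import urllib.parse
import urllib.request

# Assumption: placeholder endpoint URL; substitute the real SPARQL endpoint.
ENDPOINT = "http://factforge.net/sparql"

QUERY = """
PREFIX ff: <http://factforge.net/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX pext: <http://www.ontotext.com/proton/protonext#>
PREFIX ptop: <http://www.ontotext.com/proton/protontop#>

SELECT DISTINCT ?Country
WHERE {
  ?Coun ff:preferredLabel ?Country ;
        rdf:type pext:Country ;
        ptop:subRegionOf dbpedia:Africa .
  FILTER ( LANG(?Country) = "en" )
}
"""


def build_request(endpoint, query):
    """Build a GET request that asks for SPARQL JSON results."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        endpoint + "?" + params,
        headers={"Accept": "application/sparql-results+json"},
    )


def extract_labels(results):
    """Pull the ?Country literal out of each binding in a parsed JSON result."""
    return [b["Country"]["value"] for b in results["results"]["bindings"]]


def fetch_labels(endpoint=ENDPOINT, query=QUERY):
    """Send the query over the network and return the country labels."""
    with urllib.request.urlopen(build_request(endpoint, query)) as resp:
        return extract_labels(json.load(resp))
```

Calling `fetch_labels()` would hit the network, so the request building and the result parsing are kept as separate functions that can be checked offline.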
* Easily distinguishing the current representation of the world from the historical one, and subsetting either, would require properties indicating this currently relevant status, e.g. (in shorthand) dbpedia:Roman_Empire dbpedia:current "false"^^xsd:boolean .
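To make the footnote concrete, here is a toy Python sketch of how such a flag would let you subset either way. Everything in it is illustrative: `dbpedia:current` is the hypothetical property from the footnote (it does not exist in DBpedia as far as I know), the triples are hand-written sample data rather than query output, and `countries` is just a helper name I made up.

```python
# Toy triples as (subject, predicate, object) tuples.
# Assumption: `dbpedia:current` is the hypothetical boolean property
# proposed in the footnote; the data below is illustrative only.
TRIPLES = [
    ("dbpedia:Kenya", "rdf:type", "pext:Country"),
    ("dbpedia:Kenya", "dbpedia:current", True),
    ("dbpedia:Roman_Empire", "rdf:type", "pext:Country"),
    ("dbpedia:Roman_Empire", "dbpedia:current", False),
]


def countries(triples, current):
    """Return subjects typed as pext:Country whose flag equals `current`."""
    typed = {s for s, p, o in triples
             if p == "rdf:type" and o == "pext:Country"}
    flagged = {s for s, p, o in triples
               if p == "dbpedia:current" and o is current}
    return sorted(typed & flagged)
```

With a property like this in the data, `countries(TRIPLES, True)` would keep only the present-day entries and `countries(TRIPLES, False)` only the historical ones, which is exactly the distinction the query above cannot make.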














