CORDIS - EU research results

SPARQL

A guide to CORDIS Linked Open Data

What is Linked Open Data?

Linked Open Data (LOD) is a combination of Linked Data and Open Data. Linked Data refers to machine-readable data shared on the Web, while Open Data allows for data to be used and distributed freely.

Linked Open Data is a method of accessing the decentralized web in a centralized way. It provides users with the means and services to discover the most relevant and accurate information. By combining the Linked Data design principles with machine-readable structured data, LOD can offer more useful information, interlinked with other data for further discovery.

The FAIR principles (Findable, Accessible, Interoperable, Reusable) and the 5-star deployment scheme for Open Data as described by Tim Berners-Lee ensure that data can be freely shared and distributed on the web.

As part of the Linked Open Data initiative, the Resource Description Framework (RDF) is the primary language and technology used to express and publish information about data and to interlink it on the Web. RDF structures data as subject-predicate-object triples.

EURIO Knowledge Graph

A knowledge graph represents real-world entities (e.g., projects, organisations, project results such as project deliverables) along with their relationships (e.g., an organisation’s participation in a project) and attributes (e.g., the start date of a project or the VAT number of an organisation) as an interconnected network comprising nodes and edges.

Knowledge graphs provide a structured, machine-readable representation of data, promoting integration, linking, and reuse of knowledge. The EURIO Knowledge Graph uses the knowledge graph representation paradigm to transform the CORDIS data into machine-readable interlinked data.

The data is published in the form of Resource Description Framework (RDF) triples, following the Linked Open Data principles. The meaning of the entities described is formally defined by the EUropean Research Information Ontology (EURIO). The resulting EURIO Knowledge Graph is a network of interconnected RDF triples that encode the original CORDIS data, and it can be queried using SPARQL, the standardized language for retrieving and manipulating data in RDF format.

The EURIO ontology

To improve the visibility, reusability, and accessibility of CORDIS content, and boost its semantic interoperability, the Publications Office of the European Union has developed the EUropean Research Information Ontology (EURIO). EURIO is a conceptual data model that draws on a network of existing ontologies (e.g., schema.org, DINGO, etc.) and reference data (e.g., the EuroSciVoc taxonomy, the NUTS code list, etc.). It provides the means to describe, among others, administrative information associated with research projects and their grants, such as start and end dates, total cost and funding received, information about the organisations and persons involved, as well as the produced project results, such as the list of authors, title and journal information of a publication.

EURIO uses the OWL 2 Web Ontology Language to formally define the meaning of the domain terms used to describe the CORDIS entities (e.g., projects, organisations, etc.), their attributes (e.g., title, acronym, legal name, etc.) and interrelations (e.g., the relation between a project and the participating organisations, etc.).

The EURIO ontology and its documentation can be accessed on the EU Vocabularies website.

Using SPARQL to query the EURIO Knowledge Graph

SPARQL is a standard query language for retrieving and manipulating data stored in RDF format. Its development and evolution are overseen by the SPARQL Working Group within W3C and it is fully documented and publicly available.

SPARQL queries are based around graph pattern matching, i.e., the matching of sets of triple patterns forming conjunctive (AND) or disjunctive (OR) conditions. Triple patterns are like RDF triples except that each of the subject, predicate and object may be a variable. A given SPARQL query graph pattern matches a subgraph of the queried RDF data when RDF terms from that subgraph may be substituted for the variables.
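
As a minimal sketch of a triple pattern, the query below puts a variable in all three positions, so it matches every triple in the queried data. It relies on no particular vocabulary; the variable names are arbitrary, and the LIMIT clause only keeps the output small.

```sparql
# Every position is a variable, so any triple in the data matches.
SELECT ?subject ?predicate ?object
WHERE {
  ?subject ?predicate ?object .
}
LIMIT 10
```

Replacing any of the three variables with a concrete term narrows the match, which is the approach taken in the examples that follow.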

The examples below demonstrate how SPARQL queries can be constructed to search and retrieve information from the EURIO Knowledge Graph.

Let us begin with a simple example:

Find all project titles

The PREFIX keyword associates a label with the IRI of the namespace where the entities are defined; in the running example, we use the terms “Project” and “title” which are defined in the EURIO ontology whose IRI is http://data.europa.eu/s66#.

The query consists of two parts:

  • the SELECT clause sets the variables to appear in the query results. In our example, ?project_title contains the title of each requested project (variable names are up to the user).
  • the WHERE clause provides the graph pattern to match against the EURIO Knowledge Graph. Our example consists of two conjunctive triple patterns, i.e., two patterns that must both be matched:
    • a triple pattern with the variable ?project, expressing that the requested project entities must belong to the class eurio:Project.
    • a triple pattern that starts with the variable ?project and binds the variable ?project_title to the titles of these projects.

We order the results by title with the ORDER BY clause, which must be put after the WHERE clause.

Because we expect the resulting list to be long, we use the LIMIT clause to show only 100 results. Also, for demonstration purposes, we skip the first 1000 results with the OFFSET clause.

The complete query is as follows:

PREFIX eurio: <http://data.europa.eu/s66#>
SELECT ?project_title
WHERE {
  ?project a eurio:Project .
  ?project eurio:title ?project_title .
}
ORDER BY ?project_title
LIMIT 100
OFFSET 1000
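
When paging with LIMIT and OFFSET, it can help to know how many results exist in total. The following sketch, which reuses only the eurio:Project class from the query above, counts the project entities instead of listing their titles; the variable name ?total_projects is our own choice.

```sparql
PREFIX eurio: <http://data.europa.eu/s66#>
# Count the project entities rather than listing their titles.
SELECT (COUNT(DISTINCT ?project) AS ?total_projects)
WHERE {
  ?project a eurio:Project .
}
```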

Using a project’s title, we could be interested in finding the following information:

Find the start and end dates of the project “taRgeted thErapy for adVanced colorEctal canceR paTients”

To this end, we specify these conditions in the WHERE clause:

  • We set the project title (eurio:title) value to be the string “taRgeted thErapy for adVanced colorEctal canceR paTients”.
  • The project’s start date is stored in the ?project_start_date variable.
  • The project’s end date is stored in the ?project_end_date variable.

The query is as follows:

PREFIX eurio: <http://data.europa.eu/s66#>
SELECT ?project_start_date ?project_end_date 
WHERE {
  ?project eurio:title 'taRgeted thErapy for adVanced colorEctal canceR paTients'.
  ?project eurio:startDate ?project_start_date .
  ?project eurio:endDate ?project_end_date .
}
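
Matching on the exact title only works when the string is known character for character. As a hedged alternative, the sketch below matches titles on a case-insensitive fragment using the standard SPARQL functions STR, LCASE and CONTAINS. The fragment 'colorectal cancer' and the LIMIT of 20 are illustrative choices, and such string filters may run noticeably slower than an exact match.

```sparql
PREFIX eurio: <http://data.europa.eu/s66#>
SELECT ?project_title ?project_start_date ?project_end_date
WHERE {
  ?project a eurio:Project .
  ?project eurio:title ?project_title .
  ?project eurio:startDate ?project_start_date .
  ?project eurio:endDate ?project_end_date .
  # Lower-case the title and keep rows containing the fragment.
  FILTER(CONTAINS(LCASE(STR(?project_title)), 'colorectal cancer'))
}
LIMIT 20
```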

We can delve deeper into the project from the previous query with the following one:

Find the participants in the project taRgeted thErapy for adVanced colorEctal canceR paTients and their role in the project

Considering the information we want returned, we specify these conditions in the WHERE clause:

  • We set the project title (eurio:title) value to be the string “taRgeted thErapy for adVanced colorEctal canceR paTients”.
  • The roles of the organisations involved in the project are stored in the ?organisation_role variable.
  • The label of the organisation role is stored in the ?role_label variable.
  • The ?organisation variable refers to the associated participants.
  • The ?organisation_name variable contains the legal names of the organisations.
  • Also, the output is sorted in descending (reverse alphabetical) order of the organisation name using ORDER BY DESC(?organisation_name).

The query is as follows:

PREFIX eurio: <http://data.europa.eu/s66#> 
SELECT ?organisation_name ?role_label
WHERE {
  ?project eurio:title 'taRgeted thErapy for adVanced colorEctal canceR paTients' .
  ?project eurio:hasInvolvedParty ?organisation_role .
  ?organisation_role eurio:roleLabel ?role_label .
  ?organisation_role eurio:isRoleOf ?organisation .
  ?organisation eurio:legalName ?organisation_name .
} 
ORDER BY DESC(?organisation_name)
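
The same graph pattern can also be aggregated. The sketch below, which uses only the properties from the query above, counts how many organisations hold each role in the project; the variable name ?total_organisations is our own choice.

```sparql
PREFIX eurio: <http://data.europa.eu/s66#>
SELECT ?role_label (COUNT(DISTINCT ?organisation) AS ?total_organisations)
WHERE {
  ?project eurio:title 'taRgeted thErapy for adVanced colorEctal canceR paTients' .
  ?project eurio:hasInvolvedParty ?organisation_role .
  ?organisation_role eurio:roleLabel ?role_label .
  ?organisation_role eurio:isRoleOf ?organisation .
}
# One result row per role label, most frequent role first.
GROUP BY ?role_label
ORDER BY DESC(?total_organisations)
```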

Since CORDIS contains research information, we could also exploit the EuroSciVoc taxonomy to filter our results. EuroSciVoc is a multilingual taxonomy that represents all the main fields of science in CORDIS and is used to classify the data. It contains more than 1000 categories in 6 languages (English, French, German, Italian, Polish and Spanish), and each category is enriched with relevant keywords extracted from the textual description of CORDIS projects.

The next two queries show how to combine the patterns introduced so far and use EuroSciVoc to restrict the results to the field of science “artificial intelligence”.

Retrieve the title and the start and end dates of all projects with participants from Greece in the area of artificial intelligence

Considering the information we want returned, we specify these conditions in the WHERE clause:

  • The projects must be categorised as “artificial intelligence” in EuroSciVoc. We find them through the label of the EuroSciVoc concept (?euroSciVoc_label), whose literal form we set to be “artificial intelligence”. We also add “@en” to make sure we find the English label of the concept of artificial intelligence.
  • To find the projects in Greece, we have to find the involved organisations (stored in the ?organisation variable) and retrieve their country by looking for the eurio:Country entity whose eurio:name is “Greece”.
  • The project’s title is stored in the ?project_title variable.
  • The project’s start date is stored in the ?project_start_date variable.
  • The project’s end date is stored in the ?project_end_date variable.

The resulting SPARQL query is as follows:

PREFIX eurio: <http://data.europa.eu/s66#>
PREFIX skos-xl: <http://www.w3.org/2008/05/skos-xl#>
SELECT DISTINCT ?project_title ?project_start_date ?project_end_date 
WHERE {
  ?project eurio:title ?project_title .
  ?project eurio:startDate ?project_start_date .
  ?project eurio:endDate ?project_end_date .
  ?project eurio:hasEuroSciVocClassification ?euroSciVoc.
  ?euroSciVoc skos-xl:prefLabel ?euroSciVoc_label .
  ?euroSciVoc_label skos-xl:literalForm 'artificial intelligence'@en.
  ?role eurio:isInvolvedIn ?project .
  ?role eurio:isRoleOf ?organisation .
  ?organisation eurio:hasSite ?site .
  ?site eurio:hasGeographicalLocation ?country .
  ?country a eurio:Country;
  eurio:name 'Greece' .
}

Retrieve the names of the participants and their role for all projects with participants from Greece in the area of artificial intelligence

Considering the information we want returned, the query is like the previous one, with the following conditions in the WHERE clause:

  • The projects must be categorised as “artificial intelligence” in EuroSciVoc. We find them through the label of the EuroSciVoc concept (?euroSciVoc_label), whose literal form we set to be “artificial intelligence”. We also add “@en” to make sure we find the English label of the concept of artificial intelligence.
  • To find the projects in Greece, we have to find the involved organisations (stored in the ?organisation variable) and retrieve their country by looking for the eurio:Country entity whose eurio:name is “Greece”.
  • The eurio:hasInvolvedParty of the ?project is stored in the ?organisation_role variable.
  • The eurio:roleLabel of the ?organisation_role is stored in the ?role variable.
  • The eurio:isRoleOf of the ?organisation_role is stored in the ?organisation variable.
  • The eurio:legalName of the ?organisation is stored in the ?organisation_name variable.

The query is as follows:

PREFIX eurio: <http://data.europa.eu/s66#>
PREFIX skos-xl: <http://www.w3.org/2008/05/skos-xl#>
SELECT ?project_title ?role ?organisation_name 
WHERE {
  ?project eurio:hasEuroSciVocClassification ?euroSciVoc.
  ?euroSciVoc skos-xl:prefLabel ?euroSciVoc_label .
  ?euroSciVoc_label skos-xl:literalForm 'artificial intelligence'@en .
  ?project eurio:title ?project_title .
  ?project eurio:hasInvolvedParty ?organisation_role .
  ?organisation_role eurio:roleLabel ?role .
  ?organisation_role eurio:isRoleOf ?organisation .
  ?organisation eurio:legalName ?organisation_name .
  ?organisation eurio:hasSite ?site .
  ?site eurio:hasGeographicalLocation ?country .
  ?country a eurio:Country;
  eurio:name 'Greece' .
}
ORDER BY DESC(?project_title)

While investigating the impact of the field of artificial intelligence in Greece, one should know that CORDIS also contains information about the scientific publications of projects, so we could extend the previous query:

Given an organisation, find the related projects in the field of artificial intelligence and their publications

Considering the information we want returned, we specify these conditions in the WHERE clause:

  • The projects must be categorised as “artificial intelligence” in EuroSciVoc. We find them through the label of the EuroSciVoc concept (?euroSciVoc_label), whose literal form we set to be “artificial intelligence”. We also add “@en” to make sure we find the English label of the concept of artificial intelligence.
  • In order to find the expected organisation, we retrieve the legal names of the organisations and store them in the variable ?organisation_name. Then, we specify the organisation with a filter: FILTER(str(?organisation_name) = 'ETHNIKO KENTRO EREVNAS KAI TECHNOLOGIKIS ANAPTYXIS').
  • The query stores all the projects that are related to the artificial intelligence field of science in the variable ?project.
  • Then, it finds all the associated organisation roles (the roles are stored in the ?role variable).
  • In parallel, it checks which of the organisation roles are associated with the organisation given as input in the ?organisation variable.
  • Based on these checks, it infers the projects that are associated with the input organisation, and it retrieves the project publications. The publications are stored in the ?result variable.
  • The query returns: (i) the title of the project the organisation is involved in (?project_title), (ii) the title of the publication (?publication), (iii) its author(s) (?authors), (iv) its DOI (?doi), and (v) its publisher (?publisher). The results are ordered by publication and project title using ORDER BY ?publication ?project_title.

The query is as follows:

PREFIX eurio: <http://data.europa.eu/s66#>
PREFIX skos-xl: <http://www.w3.org/2008/05/skos-xl#>
SELECT ?project_title ?publication ?authors ?doi ?publisher
WHERE {
  ?organisation eurio:legalName ?organisation_name .
  ?role eurio:isRoleOf ?organisation .
  ?role eurio:isInvolvedIn ?project .
  ?project eurio:hasEuroSciVocClassification ?euroSciVoc .
  ?project eurio:title ?project_title .
  ?project eurio:hasResult ?result .
  ?result a eurio:ProjectPublication .
  ?euroSciVoc skos-xl:prefLabel ?euroSciVoc_label .
  ?euroSciVoc_label skos-xl:literalForm 'artificial intelligence'@en .
  OPTIONAL { ?result eurio:title ?publication }
  OPTIONAL { ?result eurio:doi ?doi }
  OPTIONAL { ?result eurio:author ?authors }
  OPTIONAL { ?result eurio:publisher ?publisher }
  FILTER(str(?organisation_name) = 'ETHNIKO KENTRO EREVNAS KAI TECHNOLOGIKIS ANAPTYXIS')
}
ORDER BY ?publication ?project_title
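
The OPTIONAL blocks in the query above keep a publication in the results even when it lacks, for example, a DOI; the corresponding variable is simply left unbound. As a sketch built on the same properties, the query below combines OPTIONAL with the standard BOUND function to list only publications for which no DOI is recorded. The LIMIT of 100 is an illustrative choice.

```sparql
PREFIX eurio: <http://data.europa.eu/s66#>
SELECT ?project_title ?publication
WHERE {
  ?project eurio:title ?project_title .
  ?project eurio:hasResult ?result .
  ?result a eurio:ProjectPublication .
  ?result eurio:title ?publication .
  # ?doi stays unbound when the publication has no DOI.
  OPTIONAL { ?result eurio:doi ?doi }
  # Keep only the rows where no DOI was found.
  FILTER(!BOUND(?doi))
}
LIMIT 100
```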

Up to now, we have focused on artificial intelligence projects in Greece. To see how much this topic is researched at European level, we can use the following query:

List the top 10 countries by number of projects in the area of artificial intelligence

We specify the conditions in the WHERE clause:

  • The projects must be categorised as “artificial intelligence” in EuroSciVoc. We find them through the label of the EuroSciVoc concept (?euroSciVoc_label), whose literal form we set to be “artificial intelligence”. We also add “@en” to make sure we find the English label of the concept of artificial intelligence.
  • The query stores all the projects that are related to the field of science of artificial intelligence in the ?project variable.
  • Then, it finds all the organisation roles that are associated with the projects (the roles are stored in the ?role variable).
  • Next, it finds the organisations linked to the organisation roles and stores them in the ?organisation variable. Finally, the query finds the countries in which the organisations are located and stores their names in the ?country_name variable.
  • The results are sorted by number of projects in descending order with ORDER BY DESC (?total_projects_in_country), and LIMIT 10 shows the first ten countries.

The query is as follows:

PREFIX eurio: <http://data.europa.eu/s66#>
PREFIX skos-xl: <http://www.w3.org/2008/05/skos-xl#>
SELECT ?country_name (COUNT(DISTINCT ?project) AS ?total_projects_in_country)
WHERE {
  ?project eurio:hasEuroSciVocClassification ?euroSciVoc.
  ?euroSciVoc skos-xl:prefLabel ?euroSciVoc_label .
  ?euroSciVoc_label skos-xl:literalForm 'artificial intelligence'@en .
  ?role eurio:isInvolvedIn ?project .
  ?role eurio:isRoleOf ?organisation .
  ?organisation eurio:hasSite ?site .
  ?site eurio:hasGeographicalLocation ?country .
  ?country a eurio:Country .
  ?country eurio:name ?country_name .
} 
GROUP BY ?country_name
ORDER BY DESC (?total_projects_in_country)
LIMIT 10
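
Instead of keeping a fixed number of rows with LIMIT, groups can be filtered on their aggregate value with the HAVING clause. The sketch below reuses the pattern above and keeps only countries with at least 50 distinct projects; the threshold of 50 is an arbitrary value chosen for illustration.

```sparql
PREFIX eurio: <http://data.europa.eu/s66#>
PREFIX skos-xl: <http://www.w3.org/2008/05/skos-xl#>
SELECT ?country_name (COUNT(DISTINCT ?project) AS ?total_projects_in_country)
WHERE {
  ?project eurio:hasEuroSciVocClassification ?euroSciVoc .
  ?euroSciVoc skos-xl:prefLabel ?euroSciVoc_label .
  ?euroSciVoc_label skos-xl:literalForm 'artificial intelligence'@en .
  ?role eurio:isInvolvedIn ?project .
  ?role eurio:isRoleOf ?organisation .
  ?organisation eurio:hasSite ?site .
  ?site eurio:hasGeographicalLocation ?country .
  ?country a eurio:Country .
  ?country eurio:name ?country_name .
}
GROUP BY ?country_name
# Keep only the groups (countries) that reach the threshold.
HAVING (COUNT(DISTINCT ?project) >= 50)
ORDER BY DESC(?total_projects_in_country)
```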

SELECT, used in all the queries above, is one of the four query forms defined by SPARQL. Each form uses the solutions from pattern matching to form result sets or RDF graphs. These are:

  • SELECT, which, as presented above, returns all, or a subset of, the variables bound in a query pattern match.
  • CONSTRUCT, which returns an RDF graph constructed by substituting variables in a set of triple templates.
  • ASK, which returns a boolean indicating whether a query pattern matches or not.
  • DESCRIBE, which returns an RDF graph that describes the resources found.
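
As a brief sketch of a form other than SELECT, the ASK query below checks whether a project with the title used in the earlier examples exists. It returns only true or false, with no variable bindings.

```sparql
PREFIX eurio: <http://data.europa.eu/s66#>
# Returns true if at least one match for the pattern exists.
ASK {
  ?project a eurio:Project .
  ?project eurio:title 'taRgeted thErapy for adVanced colorEctal canceR paTients' .
}
```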

For further information on the use of the different query forms, as well as a comprehensive description of the overall features of the SPARQL query language, the official SPARQL 1.1 Query Language documentation should be consulted.

How to trigger federated queries?

In all the examples presented above, the queries were executed over the data contained in the EURIO Knowledge Graph. However, as more data providers publish their data as Linked Open Data and expose SPARQL query services (SPARQL endpoints), it becomes possible to query these distributed LOD datasets jointly.

To allow this, SPARQL provides the SERVICE extension. It directs a portion of a query to a particular SPARQL endpoint and combines the returned results with the results of the rest of the query.
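
The overall shape of a federated query can be sketched as follows. Everything inside the SERVICE block is evaluated by the remote endpoint, and its bindings are joined with the rest of the query. The endpoint and graph are the ones used in the CELLAR examples later in this guide; the variable names are our own.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?country ?code
WHERE {
  # This block is sent to the remote endpoint for evaluation.
  SERVICE <https://publications.europa.eu/webapi/rdf/sparql> {
    GRAPH <http://publications.europa.eu/resource/authority/country> {
      ?country skos:notation ?code .
    }
  }
  # Patterns placed here would be matched against the local data
  # and joined with the remote bindings on shared variables.
}
LIMIT 10
```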

The next query combines information from EURIO and the EU Knowledge Graph. The latter contains, among others, Kohesio projects, i.e., projects financed by the European Union under the Cohesion policy for the 2014-2020 programming period. Kohesio covers projects developed under the national and regional operational programmes that are co-financed by the European Regional Development Fund (ERDF), the Cohesion Fund (CF), and the European Social Fund (ESF), including, where relevant, the Youth Employment Initiative. Kohesio also includes the European Territorial Cooperation Programmes projects (also known as INTERREG).

This query allows the user to:

Identify in both CORDIS and KOHESIO the number of projects and their funding in which a given organisation participates

We specify the conditions in the WHERE clause:

  • We require the legal name (eurio:legalName) of the ?organisation to be 'UNIVERSIDAD CARLOS III DE MADRID'.
  • The query then finds all the organisation roles of this organisation (stored in the ?role variable), the CORDIS projects they are involved in (stored in ?project_cordis), and the grants they receive (stored in ?grant_amount).
  • From ?grant_amount, the query traverses the properties eurio:hasPaymentAmount and eurio:value to get to the funding amount, which we store in the ?funding_amount variable.
  • The query then asks the Kohesio endpoint, based on the name of the input organisation, for:
    1. the total cost of each project (stored in the ?budget variable)
    2. the EU contribution (stored in ?eu_contribution).
  • The query returns: (i) the total number of projects in CORDIS (?total_projects_cordis), (ii) the total funding in CORDIS (?total_funding_cordis), (iii) the total number of projects in Kohesio (?total_projects_kohesio), (iv) the total budget of the Kohesio projects (?total_budget_kohesio), and (v) the total EU contribution to the Kohesio projects (?total_eu_contribution_kohesio) for the given organisation. To obtain the totals, we use the SUM function, which adds up the values of the specified expression.

The query is as follows:

PREFIX eurio:<http://data.europa.eu/s66#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX kohesio: <https://linkedopendata.eu/prop/direct/>
SELECT (COUNT(DISTINCT ?project_cordis) AS ?total_projects_cordis) (SUM(?funding_amount) AS ?total_funding_cordis)
(COUNT(DISTINCT ?project_kohesio) AS ?total_projects_kohesio) (SUM(?budget) AS ?total_budget_kohesio) (SUM(?eu_contribution) AS ?total_eu_contribution_kohesio)
WHERE {
  ?organisation eurio:legalName 'UNIVERSIDAD CARLOS III DE MADRID'.
  ?role eurio:isRoleOf ?organisation ;
  eurio:isInvolvedIn ?project_cordis ;
  eurio:isRecipientOf ?grant_amount .
  ?grant_amount eurio:hasPaymentAmount ?monetary_amount . 
  ?monetary_amount eurio:value ?funding_amount .
  SERVICE <https://query.linkedopendata.eu/sparql> {
    ?project_kohesio kohesio:P841 'UNIVERSIDAD CARLOS III DE MADRID' ;
      kohesio:P474 ?budget ;
      kohesio:P835 ?eu_contribution .
  }
}

The next query is based on the notion of “small university”, which the Times Higher Education, a British magazine responsible for the annual Times Higher Education–QS World University Ranking, defines as any university with fewer than 5 000 students. Since this information is not present in CORDIS, we rely on Wikidata, a free, collaborative, multilingual, secondary knowledge base that collects structured data to support Wikipedia, Wikimedia Commons, the other wikis of the Wikimedia movement, and anyone in the world.

This query allows the user to:

Find all small universities and their countries, plus the title and ID of the projects in which they participate in the area of drug safety

We specify the conditions in the WHERE clause:

  • The projects must be categorised as “drug safety” in EuroSciVoc. We find them through the label of the EuroSciVoc concept (?euroSciVoc_label), whose literal form we set to be “drug safety”. We also add “@en” to make sure we find the English label of the concept of drug safety.
  • The query stores all the projects that are related to the field of science of drug safety in the variable ?project, their titles in ?project_title and their grant agreement IDs in ?id.
  • Then, it finds all the organisation roles that are associated with the projects (the roles are stored in the ?role variable).
  • The universities’ names are stored in the ?organisation_name variable, and their VAT numbers in the ?vat_number variable. We use the VAT number of universities to match their Wikidata counterparts (since it can be considered a uniquely identifying value).
  • Finally, the query finds the countries in which the organisations are located and stores their names in the ?country_name variable.
  • In addition to finding the universities, we use the Wikidata ?statement node, which carries both the number of students (?students) and the point in time at which this number was recorded (?time, a value node that holds the actual date in the ?date variable).
  • Since we need specific sets of information, we use two FILTER functions:
    1. The first keeps only universities with fewer than 5 000 students.
    2. The second uses the function YEAR on the ?date variable to transform it into a year, e.g., from “1 January 2022” to “2022”, which unifies how the date is represented. This filter is set to be from 2020 onwards to account for recent information on the student population.
  • The query returns: (i) the name of the university (?organisation_name), (ii) the title of the project it is involved in (?project_title), and (iii) its grant agreement ID (?id).

The query is as follows:

PREFIX eurio:<http://data.europa.eu/s66#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX skos-xl: <http://www.w3.org/2008/05/skos-xl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX pqv: <http://www.wikidata.org/prop/qualifier/value/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX wikibase: <http://wikiba.se/ontology#> 
SELECT DISTINCT ?organisation_name ?project_title ?id 
WHERE { 
  ?role eurio:isRoleOf ?organisation .
  ?role eurio:isInvolvedIn ?project .
  ?project eurio:title ?project_title .
  ?project eurio:identifier ?id .
  ?project eurio:hasEuroSciVocClassification ?euroSciVoc.
  ?euroSciVoc skos-xl:prefLabel ?euroSciVoc_label .
  ?euroSciVoc_label skos-xl:literalForm 'drug safety'@en .
  ?organisation eurio:legalName ?organisation_name .
  ?organisation eurio:vatNumber ?vat_number.
  ?organisation eurio:hasSite ?site .
  ?site eurio:hasGeographicalLocation ?country .
  ?country a eurio:Country .
  ?country eurio:name ?country_name .
  SERVICE <https://query.wikidata.org/sparql> {
    ?wikidata_org wdt:P3608 ?vat_number .
    ?wikidata_org p:P2196 ?statement.
    ?statement pqv:P585 ?time .
    ?statement ps:P2196 ?students .
    ?time wikibase:timeValue ?date .
    FILTER (YEAR(?date) >= 2020)
    FILTER(?students < 5000)
  }
}

The next federated query extends the search with CELLAR data. CELLAR is the common data repository of the Publications Office of the European Union. Digital publications and metadata are stored in and disseminated via CELLAR, for use by humans and machines. To serve users transparently, CELLAR stores multilingual publications and metadata, is open to all EU citizens, and provides machine-readable data.

As the official repository of the Publications Office of the European Union, CELLAR contains several authority tables. One of them, countries and territories, includes additional geo-political information that is missing from CORDIS.

The next three queries show you how to extract three pieces of information by matching countries and territories from CELLAR with CORDIS countries via the ISO 3166-1 alpha-2 codes. These pieces of information are:

  • Continents: the query uses the notion of continent to find all African participants in projects, together with the project id and title.
  • Geographical regions: the query uses the United Nations geoscheme, a regional grouping of countries and territories, to find all participants in projects, together with the project id and title, from the geographic region of Western Africa.
  • International partnerships: the query uses international partnerships such as the Southern Neighbourhood to find all participants in projects, together with the project id and title, belonging to this specific partnership, which includes ten partner countries: Algeria, Egypt, Israel, Jordan, Lebanon, Libya, Morocco, Palestine, Syria and Tunisia.

Find projects that have participants from Africa

We specify the conditions in the WHERE clause:

  • We start by querying the CELLAR database, specifying the graph http://publications.europa.eu/resource/authority/country to retrieve all countries, identified by their skos:notation. The query filters the ?cellar_code to ensure that it has the ISO 3166-1 alpha-2 datatype.
  • Within the SERVICE block, it checks for countries from Africa, using the ogcgs:sfWithin property to link specific countries to the URI of the African continent (http://publications.europa.eu/resource/authority/continent/AFRICA), which comes from the authority table for continents.
  • For each project, the query retrieves the project title (stored in ?project_title), its grant agreement identifier (stored in ?id), and the legal name of the associated organisation (stored in ?organisation_name). It also gathers information about the geographical location of the organisation’s site, which includes the country name (stored in ?country_name).
  • Furthermore, the query binds the string representation of the CELLAR code to ?cordis_code, which identifies the country in the organisation’s address from CORDIS.
  • The query returns: (i) the title of the project the organisation is involved in (?project_title), (ii) its grant agreement identifier (?id), (iii) the name of the organisation (?organisation_name), and (iv) the country of the organisation (?country_name). With ORDER BY ASC(?country_name), we sort the results alphabetically in ascending order of the country’s name.

The query is as follows:

PREFIX eurio:<http://data.europa.eu/s66#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX ogcgs: <http://www.opengis.net/ont/geosparql#>
SELECT ?project_title ?id ?organisation_name ?country_name
WHERE {
  SERVICE <https://publications.europa.eu/webapi/rdf/sparql> {
    GRAPH <http://publications.europa.eu/resource/authority/country> {
      ?s skos:notation ?cellar_code .
      FILTER(datatype(?cellar_code)=<http://publications.europa.eu/ontology/euvoc#ISO_3166_1_ALPHA_2>)
      ?s ogcgs:sfWithin <http://publications.europa.eu/resource/authority/continent/AFRICA> .
    }
  }
  BIND(STR(?cellar_code) as ?cordis_code)
  ?project eurio:title ?project_title .
  ?project eurio:identifier ?id .
  ?project eurio:hasInvolvedParty ?participant . 
  ?participant eurio:isRoleOf ?organisation . 
  ?organisation eurio:legalName ?organisation_name .
  ?organisation eurio:hasSite ?site .
  ?site eurio:hasAddress ?address .
  ?address eurio:addressCountry ?cordis_code .
  ?site eurio:hasGeographicalLocation ?country .
  ?country a eurio:Country.
  ?country eurio:name ?country_name .
}
ORDER BY ASC(?country_name)

Find projects that have participants from Western Africa

This query is the same as the previous one, except for the geographical filter in the SERVICE block, which checks for countries belonging to the geographical region of Western Africa. To do so, the query stores the name of the region in the ?unsd_geoscheme variable and finds it by applying two FILTER functions:

  • The first filter restricts the skos:notation values to those with datatype http://publications.europa.eu/ontology/euvoc#UNSD_GEOSCHEME.
  • The second filter keeps only values of the ?unsd_geoscheme variable that match the string "Western Africa" using the function CONTAINS, which checks if the variable contains the specified string.

For completeness, the full query is given below:

PREFIX eurio:<http://data.europa.eu/s66#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?project_title ?id ?organisation_name ?country_name
WHERE {
  SERVICE <https://publications.europa.eu/webapi/rdf/sparql> {
    GRAPH <http://publications.europa.eu/resource/authority/country>{
      ?s skos:notation ?cellar_code.
      FILTER(datatype(?cellar_code)=<http://publications.europa.eu/ontology/euvoc#ISO_3166_1_ALPHA_2>) 
      ?s skos:notation ?unsd_geoscheme .
      FILTER(datatype(?unsd_geoscheme)=<http://publications.europa.eu/ontology/euvoc#UNSD_GEOSCHEME>)
      FILTER(CONTAINS(STR(?unsd_geoscheme),'Western Africa'))
    }
  }
  BIND(STR(?cellar_code) as ?cordis_code)
  ?project eurio:title ?project_title .
  ?project eurio:identifier ?id .
  ?project eurio:hasInvolvedParty ?participant . 
  ?participant eurio:isRoleOf ?organisation . 
  ?organisation eurio:hasSite ?site .
  ?organisation eurio:legalName ?organisation_name .
  ?site eurio:hasGeographicalLocation ?country .
  ?country a eurio:Country .
  ?country eurio:name ?country_name .
  ?site eurio:hasAddress ?address .
  ?address eurio:addressCountry ?cordis_code .
}
ORDER BY ASC(?country_name)

Find projects that have participants from countries included in the European Neighbourhood policy (southern part)

This query is the same as the previous one, except for the geographical filter in the SERVICE block, which checks for countries belonging to the Southern Neighbourhood partnership. To do so, it follows the relation between the country and the Southern Neighbourhood via the property org:hasMembership, which leads to a membership whose org:organization is the URI of the partnership to which the country belongs, i.e., http://publications.europa.eu/resource/authority/corporate-body/ENP-SOUTH.

For completeness, the full query is given below:

PREFIX eurio:<http://data.europa.eu/s66#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX org: <http://www.w3.org/ns/org#> 
SELECT ?project_title ?id ?organisation_name ?country_name
WHERE {
  SERVICE <https://publications.europa.eu/webapi/rdf/sparql> {
    GRAPH <http://publications.europa.eu/resource/authority/country>{
      ?s skos:notation ?cellar_code.
      FILTER(datatype(?cellar_code)=<http://publications.europa.eu/ontology/euvoc#ISO_3166_1_ALPHA_2>) 
      ?s org:hasMembership ?international_partnership .
      ?international_partnership org:organization <http://publications.europa.eu/resource/authority/corporate-body/ENP-SOUTH> .
    }
  }
  BIND(STR(?cellar_code) as ?cordis_code)
  ?project eurio:title ?project_title .
  ?project eurio:identifier ?id .
  ?project eurio:hasInvolvedParty ?participant . 
  ?participant eurio:isRoleOf ?organisation . 
  ?organisation eurio:legalName ?organisation_name .
  ?organisation eurio:hasSite ?site .
  ?site eurio:hasGeographicalLocation ?country .
  ?country a eurio:Country .
  ?country eurio:name ?country_name .
  ?site eurio:hasAddress ?address .
  ?address eurio:addressCountry ?cordis_code .
}
ORDER BY ASC(?country_name)
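
When the set of countries is small and known in advance, the remote call can be avoided altogether by listing the country codes with the VALUES keyword. The sketch below does this for the ten Southern Neighbourhood partner countries named earlier, using their ISO 3166-1 alpha-2 codes. It assumes that eurio:addressCountry holds these codes as plain strings, as the BIND(STR(...)) step in the queries above suggests.

```sparql
PREFIX eurio:<http://data.europa.eu/s66#>
SELECT ?project_title ?id ?organisation_name ?country_name
WHERE {
  # Algeria, Egypt, Israel, Jordan, Lebanon, Libya, Morocco,
  # Palestine, Syria and Tunisia (ISO 3166-1 alpha-2 codes).
  VALUES ?cordis_code { 'DZ' 'EG' 'IL' 'JO' 'LB' 'LY' 'MA' 'PS' 'SY' 'TN' }
  ?project eurio:title ?project_title .
  ?project eurio:identifier ?id .
  ?project eurio:hasInvolvedParty ?participant .
  ?participant eurio:isRoleOf ?organisation .
  ?organisation eurio:legalName ?organisation_name .
  ?organisation eurio:hasSite ?site .
  ?site eurio:hasGeographicalLocation ?country .
  ?country a eurio:Country .
  ?country eurio:name ?country_name .
  ?site eurio:hasAddress ?address .
  ?address eurio:addressCountry ?cordis_code .
}
ORDER BY ASC(?country_name)
```

The trade-off is that a hard-coded list does not follow later changes to the CELLAR authority table.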

Federated queries must be used with caution to avoid excessive queries to remote SPARQL endpoints as well as inefficient query patterns. Both can severely impact query execution time, often leading to query timeouts and the inability to retrieve any result at all. In view of this, and since no guarantee can be provided on the stability, availability, and performance of external SPARQL endpoints, it is highly recommended to opt instead for local data dumps of the KGs (or their sub-graphs) of interest and rely on locally deployed federated queries.
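
When a remote endpoint has to be queried nonetheless, two precautions can be sketched: restricting the local pattern so that few values need to be matched remotely, and adding the SILENT keyword defined in SPARQL 1.1 Federated Query, so that a failure of the remote endpoint does not make the whole query fail. The example reuses the organisation name and the Wikidata VAT property (wdt:P3608) from the earlier queries. How many remote requests are actually issued depends on the query engine, so this is an illustration rather than a guarantee.

```sparql
PREFIX eurio: <http://data.europa.eu/s66#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?organisation_name ?wikidata_org
WHERE {
  # Restrict the local side to a single, named organisation.
  ?organisation eurio:legalName ?organisation_name ;
                eurio:vatNumber ?vat_number .
  FILTER(STR(?organisation_name) = 'UNIVERSIDAD CARLOS III DE MADRID')
  # With SILENT, an unreachable endpoint leaves ?wikidata_org
  # unbound instead of raising an error for the whole query.
  SERVICE SILENT <https://query.wikidata.org/sparql> {
    ?wikidata_org wdt:P3608 ?vat_number .
  }
}
LIMIT 50
```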

For a comprehensive overview of the features and specifications pertinent to SPARQL’s support of federated queries, the SPARQL 1.1 Federated Query documentation should be consulted.

Data dumps

The latest EURIO KG dump can be downloaded from the European data portal, where you can also find sub-graphs of the EURIO KG. Each sub-graph comprises a self-contained snapshot of the most relevant relations and attributes for one of the main EURIO KG entity types, allowing finer-grained access to the EURIO KG contents. The sub-graphs are published as distinct Named Graphs, i.e., as subsets of the EURIO KG graph, each with its own distinct label.
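
Once a dump has been loaded into a local RDF store, the Named Graphs it contains can be inspected with the GRAPH keyword. The following sketch lists each graph together with its number of triples. It assumes that the store keeps the sub-graphs as separate named graphs, and it is intended for a local store, since a full scan of this kind may be costly on a public endpoint.

```sparql
# List every named graph with the number of triples it holds.
SELECT ?graph (COUNT(*) AS ?triples)
WHERE {
  GRAPH ?graph { ?s ?p ?o }
}
GROUP BY ?graph
ORDER BY DESC(?triples)
```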