Linked Data can be published and accessed on the Web in a variety of ways. This document specifies the Triple Pattern Fragments interface, designed for low-cost Linked Data publishing and efficient client-side execution of several common types of queries. Each Triple Pattern Fragment contains those triples of a dataset that match a specific triple pattern, together with the estimated total number of matching triples, and hypermedia controls to find all other Triple Pattern Fragments of the dataset.

This specification was published by the Hydra W3C Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

To participate in the development of this specification, please join the Hydra W3C Community Group. If you have questions, want to suggest a feature, or raise an issue, please send a mail to the public-linked-data-fragments@w3.org mailing list.

Introduction

Interfaces to Linked Data

Linked Data [[LINKED-DATA]] consists of structured links between pieces of data. For example, we can link the concepts “Thomas” and “Nikola” with a “knows” relationship: Thomas knows Nikola. Yet this alone cannot be interpreted unambiguously. Which Thomas and Nikola are we talking about? And what does “knowing” mean in this context?

We use URLs to clearly and unambiguously identify each of those three pieces of information:

<http://dbpedia.org/resource/Thomas_Edison>
      <http://xmlns.com/foaf/0.1/knows>
              <http://dbpedia.org/resource/Nikola_Tesla>.

If you, or a machine, want to know which “Thomas”, “Nikola”, or “knows” is mentioned, you can simply follow their URLs. This is the essence of Linked Data, and the above snippet depicts a Linked Data triple in the RDF model [[RDF11-CONCEPTS]]. RDF data exists in various concrete syntaxes:

Just like HTML, RDF is simply a format for documents—publishers can choose freely how they divide data across different documents. Each server determines which interface it chooses to offer Linked Data. Some of them might publish a few large documents with millions of triples, others might give access to very specific, on-demand parts of a dataset. In order to communicate with each other, clients and servers can reuse agreed-upon interfaces, each of which comes with its own balance of advantages and disadvantages.

Aim, scope, and intended audience

The goal of a Triple Pattern Fragments server-side interface is to provide low-cost access to Linked Data, and enable efficient live querying over the dataset on the client side.

This document defines Triple Pattern Fragments, a Linked Data Fragments type, by specifying their representation and effect on the application state. This allows Linked Data to be published and consumed through a Triple Pattern Fragments interface.

This document is intended for people who want to implement a client or server of Triple Pattern Fragments, or for those who want to understand how such clients or servers work.

Document conventions

We write triples in this document in the Turtle RDF syntax [[!TURTLE]] and use the following namespace prefixes:

PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX hydra: <http://www.w3.org/ns/hydra/core#>
PREFIX void:  <http://rdfs.org/ns/void#>
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>

Document type

Overview

In essence, every client–server interface is characterized by two components:

These essential design decisions also determine Linked Data interfaces, through which a server publishes RDF triples from one or more datasets. Each Linked Data interface answers the above two questions differently.

The Triple Pattern Fragments interface, which is defined in this document, answers those questions as follows:

A Triple Pattern Fragment contains all triples from the interface's dataset that match the requested triple pattern, together with metadata and hypermedia controls. The remainder of this section precisely defines Triple Pattern Fragments and their components using the Linked Data Fragments conceptual framework.

The Triple Pattern Fragments interface consists solely of the definition of the Triple Pattern Fragment document type. No protocol-specific details are specified, since each Triple Pattern Fragment contains hypermedia controls that explain in-band how clients can engage in an interaction through a specific protocol. This means that, given one Triple Pattern Fragment, a client can consume the entire interface. As an example, the application of Triple Pattern Fragments to the HTTP protocol is detailed in the section “HTTP-based implementation”.

Definition

A Triple Pattern Fragment is a Linked Data Fragment [[!HYDRA-LDF]] of a dataset that consists of three parts:

data
A Triple Pattern Fragment MUST contain all triples of the dataset that match a given triple pattern “?subject ?predicate ?object.”.
A Triple Pattern Fragment MAY additionally contain other triples of the dataset.
metadata
A Triple Pattern Fragment MUST contain a triple with a void:triples predicate that expresses the estimated total number of matches for the triple pattern.
A Triple Pattern Fragment MAY contain additional metadata.
hypermedia controls
A Triple Pattern Fragment MUST contain hypermedia controls that allow retrieval of any other Triple Pattern Fragment of the same dataset. This MUST be provided as a form that allows clients to choose the subject, predicate, and object of the selector's triple pattern.
A Triple Pattern Fragment MAY contain additional hypermedia controls. In particular, the IRIs of the entities of the data, metadata, and control triples SHOULD be dereferenceable.

A Triple Pattern Fragment of a dataset is fully determined by its triple pattern selector. In its most abstract form, a triple pattern selector consists of three components subject, predicate, and object [[!RDF11-CONCEPTS]]. The subject MUST either be a variable or an IRI; the predicate MUST either be a variable or an IRI; the object MUST either be a variable, an IRI, or a literal. These components MUST NOT be blank nodes.
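The constraints on a selector's components can be checked mechanically. The following is a minimal sketch, not part of this specification; the classes IRI, Literal, Variable, and BlankNode and the function is_valid_selector are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str
    language: Optional[str] = None
    datatype: Optional[str] = None


@dataclass(frozen=True)
class Variable:
    name: str


@dataclass(frozen=True)
class BlankNode:
    label: str


def is_valid_selector(subject, predicate, obj) -> bool:
    """Return True if the three components form a valid triple pattern selector.

    Subject and predicate must be a variable or an IRI; the object may
    additionally be a literal. Blank nodes are rejected in every position.
    """
    return (
        isinstance(subject, (IRI, Variable))
        and isinstance(predicate, (IRI, Variable))
        and isinstance(obj, (IRI, Literal, Variable))
    )
```

For instance, a selector with a variable subject, the foaf:knows IRI as predicate, and a literal object passes the check, whereas a blank node in any position or a literal subject fails it.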

A Triple Pattern Fragment is considered empty if it does not contain any data triples (regardless of metadata and controls).

The above constraints define the Triple Pattern Fragment document type, and are detailed in the following sections “Data”, “Metadata”, and “Hypermedia controls”. Triple Pattern Fragments SHOULD be paged, as detailed in the section “Paging”.

Triple Pattern Fragments are not bound to a specific syntax, since the data, metadata, and controls can be represented in different ways. The server MUST, however, support at least one RDF-based representation. Servers MUST indicate the corresponding MIME type when responding with a Triple Pattern Fragment, so clients can correctly parse it.

For RDF syntaxes without multiple graph support (such as Turtle or N-Triples), the data, metadata, and controls MUST necessarily be serialized to the same graph. For RDF syntaxes with multiple graph support (such as JSON-LD, TriG or N-Quads), the data MUST be serialized to the default graph and the metadata and controls MUST be serialized to one or multiple non-default graphs.

Data

The data of a Triple Pattern Fragment is obtained by selecting all triples of a dataset that match the fragment's triple pattern selector. These triples SHOULD be ordered in some consistent way, such that Triple Pattern Fragments can be paged consistently. Triples SHOULD NOT contain blank nodes (if the original triples contain blank nodes, they SHOULD be skolemized).
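As an illustration, selecting and ordering the data of a fragment could be sketched as follows. This is a simplified sketch under stated assumptions: terms are plain strings, None stands for a variable, each variable is treated as distinct, and lexicographic sorting is just one possible consistent order. The name select_data is hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]
# None in a position stands for a variable (assumption of this sketch).
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def select_data(dataset: Iterable[Triple], pattern: Pattern) -> List[Triple]:
    """Return all triples matching the pattern, in a consistent order."""
    matches = [
        triple
        for triple in dataset
        # A position matches if the pattern has a variable there
        # or the constant equals the triple's term.
        if all(p is None or p == term for p, term in zip(pattern, triple))
    ]
    # Sorting gives a stable order, so that paging is consistent.
    return sorted(matches)
```

A server backed by an index would obtain the same result without scanning the whole dataset; the scan here only makes the matching rule explicit.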

If the RDF syntax supports multiple graphs, data triples MUST be serialized to the default graph.

Metadata

Each Triple Pattern Fragment, and each page of a Triple Pattern Fragment, MUST contain the estimated total number of triples that match the fragment's selector. This MUST be expressed as a triple with the following components:

subject
the IRI of the fragment
predicate
the void:triples predicate
object
an integer literal expressing the estimated total number of matching triples

The estimate MUST be a non-negative, finite, integer number with the following properties:

The metadata MAY additionally contain variations of the above triple. For instance, it is RECOMMENDED to add a triple with the same subject and object and the hydra:totalItems predicate.
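The count metadata could be produced along the following lines. This is only a sketch: the function name count_metadata is hypothetical, and the generated Turtle assumes that the void: and hydra: prefixes from the document conventions are declared elsewhere in the output.

```python
def count_metadata(fragment_iri: str, estimate: int) -> str:
    """Serialize the estimated match count as Turtle metadata triples.

    Emits the required void:triples triple and the recommended
    hydra:totalItems variation with the same subject and object.
    """
    # The estimate must be a non-negative, finite integer.
    if isinstance(estimate, bool) or not isinstance(estimate, int) or estimate < 0:
        raise ValueError("estimate must be a non-negative integer")
    return (
        f"<{fragment_iri}> void:triples {estimate};\n"
        f"    hydra:totalItems {estimate}."
    )
```

In Turtle, a bare integer such as 42 is shorthand for an xsd:integer literal, which is why no explicit datatype is written.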

If the RDF syntax supports multiple graphs, metadata triples MUST be serialized to a non-default graph. This non-default graph SHOULD be explicitly related to the Triple Pattern Fragment (for instance, using the foaf:primaryTopic predicate), so clients can interpret what resource this metadata belongs to. This graph MAY be the same as the graph containing the hypermedia controls.

Hypermedia controls

Each Triple Pattern Fragment, and each page of a Triple Pattern Fragment, MUST contain a hypermedia control that can generate the IRI of each other Triple Pattern Fragment of the same dataset.

This control MUST act as a function that accepts three parameters subject, predicate, and object, each of which can either be a variable, a constant IRI, or (in the case of object) a constant literal. It MUST then map these parameters to the IRI of the dataset's Triple Pattern Fragment whose selector has the given parameter values.

This control MUST be expressed as a form in the Hydra Core Vocabulary [[!HYDRA-CORE]] using triples with the following structure:

<http://example.org/example#dataset>
    void:subset <http://example.org/example?s=http%3A%2F%2Fexample.org%2Ftopic>;
    hydra:search [
        hydra:template "http://example.org/example{?s,p,o}";
        hydra:mapping  [ hydra:variable "s"; hydra:property rdf:subject ],
                       [ hydra:variable "p"; hydra:property rdf:predicate ],
                       [ hydra:variable "o"; hydra:property rdf:object ]
    ].

The above snippet assumes the dataset IRI is http://example.org/example#dataset, the fragment (or fragment page) IRI is http://example.org/example?s=http%3A%2F%2Fexample.org%2Ftopic, and the IRI template [[!RFC6570]] to retrieve Triple Pattern Fragments of the dataset is http://example.org/example{?s,p,o}. It furthermore assumes that the parameter names of subject, predicate, and object are s, p, and o, respectively. All of these MUST be adjusted to fit the configuration of a specific fragments server. Note that the form MUST be attached to the dataset, as the form filters the dataset and not the fragment, and the fragment MUST explicitly be listed as a subset of the dataset, in order to indicate the relation between the two.

This hypermedia control MUST be present because, by design, there is no fixed IRI format that servers of Triple Pattern Fragments need to follow. This means that clients of Triple Pattern Fragments MUST NOT need prior knowledge of a server, i.e., they MUST NOT assume a certain IRI pattern. Instead, clients MUST interpret the hypermedia control in each Triple Pattern Fragment in order to retrieve another fragment. Clients MUST NOT attempt to deconstruct IRIs of fragments, i.e., they MUST treat these as opaque identifiers.

The hypermedia control fulfills the hypermedia constraint that each representation should contain the controls towards next steps. As a result, clients can use Triple Pattern Fragments without any prior knowledge. This also means a server can freely choose the IRIs of its Triple Pattern Fragments, as well as the names of parameters (e.g., subject, predicate, object instead of s, p, o in the above snippet).

An equivalent hypermedia control for the above snippet could look as follows:

<form method="GET" action="http://example.org/example">
  <fieldset>
    <ul>
      <li><label for="subject">subject</label>     <input id="subject"   name="s" /></li>
      <li><label for="predicate">predicate</label> <input id="predicate" name="p" /></li>
      <li><label for="object">object</label>       <input id="object"    name="o" /></li>
    </ul>
  </fieldset>
  <p><input type="submit" /></p>
</form>
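To illustrate how a client might turn such a control into a fragment IRI, the following sketch expands a form-style query template such as http://example.org/example{?s,p,o} [[!RFC6570]]. It is not a general IRI template processor: it assumes the client has already read the base IRI and the parameter names from the hypermedia control, and the function name fragment_iri is hypothetical.

```python
from typing import Optional, Sequence
from urllib.parse import quote


def fragment_iri(
    base: str,
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
    names: Sequence[str] = ("s", "p", "o"),
) -> str:
    """Expand a form-style query template {?s,p,o} against the given base.

    `base` and `names` are assumed to come from the hypermedia control;
    they are not fixed by the interface. Parameters left as None are
    treated as undefined and omitted from the result.
    """
    pairs = [(name, value) for name, value in zip(names, (s, p, o)) if value is not None]
    if not pairs:
        return base
    # Form-style expansion percent-encodes everything except unreserved characters.
    query = "&".join(f"{name}={quote(value, safe='')}" for name, value in pairs)
    return f"{base}?{query}"
```

Called with the base http://example.org/example and the subject http://example.org/topic, this yields the fragment IRI used in the Hydra snippet earlier in this section. A server that names its parameters subject, predicate, and object would be handled by passing those names instead.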

Clients can obtain the IRI of a specific fragment through a hypermedia control in fragments' representations. However, since hypermedia controls generally accept strings as input, we need to specify how to convert IRIs, literals, and variables to strings for use as subject, predicate, and object parameter values. The interface MUST at least support the following options:

constant IRI
the text value of the IRI, e.g., http://example.org/bar
constant text literal
the text value of the literal, surrounded by double quotes ", e.g., "my text"
constant literal with language
the text value of the literal, surrounded by double quotes ", followed by @ and the lowercase language code [[!BCP47]], e.g., "my text"@en-gb
constant literal with type
the text value of the literal, surrounded by double quotes ", followed by ^^ and the text value of the IRI, e.g., "42"^^http://www.w3.org/2001/XMLSchema#integer
variable
either as the empty string, or as a string starting with a question mark, followed by one or more word characters, e.g., ?var.
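The conversions listed above could be sketched as follows. The function names are hypothetical, and the sketch deliberately leaves open how double quotes inside a literal's text are handled, since the list above does not address this.

```python
import re
from typing import Optional


def iri_param(iri: str) -> str:
    """A constant IRI is passed as its text value."""
    return iri


def literal_param(
    text: str, language: Optional[str] = None, datatype: Optional[str] = None
) -> str:
    """Convert a literal to its string form for use as a parameter value."""
    if language is not None and datatype is not None:
        raise ValueError("give either a language or a datatype, not both")
    quoted = f'"{text}"'
    if language is not None:
        # The language code is written in lowercase.
        return f"{quoted}@{language.lower()}"
    if datatype is not None:
        return f"{quoted}^^{datatype}"
    return quoted


def variable_param(name: Optional[str] = None) -> str:
    """A variable is the empty string, or '?' followed by word characters."""
    if name is None:
        return ""
    if not re.fullmatch(r"\w+", name):
        raise ValueError("variable names consist of one or more word characters")
    return f"?{name}"
```

The resulting strings are the values a client would enter into the hypermedia control; any percent-encoding needed to form the final IRI is applied by the control's expansion, not by these functions.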

The Hydra Core Vocabulary captures the above (see issue 30). Maybe it should not be repeated here.

We should talk about support for patterns such as { ?s ?p ?s }.

If the RDF syntax supports multiple graphs, control triples MUST be serialized to a non-default graph. This non-default graph SHOULD be explicitly related to the Triple Pattern Fragment (for instance, using the foaf:primaryTopic predicate), so clients can interpret what resource the controls are related to. This graph MAY be the same as the graph containing the metadata.

Paging

Triple Pattern Fragments SHOULD be paged to avoid overly large responses. A page of a Triple Pattern Fragment consists of the following three parts:

data / selector
The page MUST contain a subset of the data of the corresponding Triple Pattern Fragment. The data of a fragment SHOULD be distributed over pages of a given page size n by dividing an ordered list of matching triples into chunks of size n.
Each data triple SHOULD only occur on one page of any given fragment.
The page MAY additionally contain other triples of the dataset.
metadata
The page MUST be linked to its corresponding Triple Pattern Fragment using hydra:view.
The page SHOULD contain all metadata of the fragment. In particular, it MUST contain a triple with a void:triples predicate that expresses the estimated total number of matches for the fragment's triple pattern.
The page MAY contain additional metadata.
hypermedia controls
A page SHOULD contain all hypermedia controls of the fragment. In particular, it MUST contain those controls that allow retrieval of any Triple Pattern Fragment of the dataset.
If a previous page directly precedes the page, this page MUST link to it using hydra:previous. The previous page SHOULD NOT be empty.
If a next page directly follows the page, this page MUST link to it using hydra:next. The next page SHOULD NOT be empty.
The page MAY contain additional hypermedia controls.

A page is considered empty if it does not contain any data triples (regardless of metadata and controls).

Pages MAY be accessible by page number through an additional hypermedia control. In any case, clients MUST NOT attempt to deconstruct IRIs of pages, i.e., they MUST treat these as opaque identifiers.
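The chunking and the previous/next links can be sketched as follows. This is an illustration under assumptions: pages are numbered from 1, the matching triples are already in a consistent order, and page_iri is a hypothetical server-side function that maps a page number to the IRI the server chose for that page.

```python
from typing import Callable, Dict, Sequence, Tuple

Triple = Tuple[str, str, str]


def build_page(
    ordered_triples: Sequence[Triple],
    page_number: int,
    page_size: int,
    page_iri: Callable[[int], str],
) -> Dict[str, object]:
    """Build the data and paging links for one page of a fragment."""
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be positive")
    start = (page_number - 1) * page_size
    end = start + page_size
    total = len(ordered_triples)
    # Link to the previous page only if it exists and is not empty.
    has_previous = page_number > 1 and start - page_size < total
    # Link to the next page only if triples remain after this page.
    has_next = end < total
    return {
        "data": list(ordered_triples[start:end]),
        "hydra:previous": page_iri(page_number - 1) if has_previous else None,
        "hydra:next": page_iri(page_number + 1) if has_next else None,
    }
```

Because each triple falls into exactly one chunk, no data triple occurs on more than one page. The metadata and the search form of the fragment would be added to every page alongside these links.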

Triple Pattern Fragments servers

Definition

A server can make one or more datasets available as Triple Pattern Fragments. For a server to be called a Triple Pattern Fragments server of a given dataset, it MUST offer access to all possible Triple Pattern Fragments of that dataset. Triple Pattern Fragments MAY be accessible through one or more representations, at least one of which MUST be RDF-based [[!RDF11-CONCEPTS]].

Each Triple Pattern Fragment MUST follow the document type defined in the section “Document type” and MUST be accessible through a canonical IRI. All representations supported by the server MUST be accessible through this canonical IRI; individual representations SHOULD NOT have their own IRI. Support for representations must be consistent, that is, if a Triple Pattern Fragment is available in a certain representation, then all of the Triple Pattern Fragments MUST be available in that representation.

Triple Pattern Fragments servers MAY additionally offer access to other Linked Data Fragments of the dataset.

HTTP-based implementation

As an example, we consider how a Triple Pattern Fragments interface could be implemented on top of the HTTP protocol [[RFC7230]].

A server assigns IRIs to all possible (pages of) Triple Pattern Fragments of its datasets. How IRI assignment happens is implementation-dependent; this choice is communicated to clients through the hypermedia controls of fragments' representations.

An IRI of a fragment identifies the fragment resource, not a representation thereof. All representations are accessible through the fragment's canonical IRI using content negotiation [[RFC7230]]. No specific IRIs are necessary for individual representations.

In order to allow browser applications to access fragments, Cross-Origin Resource Sharing [[CORS]] has to be enabled on the server. To this end, Triple Pattern Fragments servers should emit the following header and value on all HTTP responses to requests for Triple Pattern Fragments, regardless of their status code:

Access-Control-Allow-Origin: *

As this example indicates, a Triple Pattern Fragments interface is fully defined by its document type (see the section “Document type”). No protocol-level agreements are necessary. In the case of HTTP, these are provided by the HTTP protocol itself.
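One way to emit the Access-Control-Allow-Origin header on every response, regardless of status code, is a small wrapper around the server application. The sketch below assumes a Python server that follows the WSGI calling convention; the name with_cors is hypothetical and the approach is only one of several possible.

```python
def with_cors(app):
    """Wrap a WSGI application so that every response allows any origin."""

    def wrapped(environ, start_response):
        def cors_start_response(status, headers, exc_info=None):
            # Drop any existing value so the header is emitted exactly once.
            headers = [
                (name, value)
                for name, value in headers
                if name.lower() != "access-control-allow-origin"
            ]
            headers.append(("Access-Control-Allow-Origin", "*"))
            return start_response(status, headers, exc_info)

        return app(environ, cors_start_response)

    return wrapped
```

Since the wrapper intercepts the response headers rather than the request handling, error responses such as 404 carry the header as well.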