Consuming Linked Data at the BBC

Yves Raimond, BBC R&D IRFS / @moustaki

Talk in two parts

  • Use of Linked Data at the BBC
  • Challenges around Linked Data consumption

Use of Linked Data at the BBC

Radio since 1922

TV since 1930

On the Web since 1994

Programme support

1000 to 1500 programmes per day across 70 channels

www.bbc.co.uk/programmes

The web site is the API

schema.org

Using Linked Data (external)

Using Linked Data (internal)

Towards a Linked Data Platform

World Cup 2010

Tagging articles

Dynamic aggregations

BBC London 2012

BBC Ontologies

The BBC Archive

http://worldservice.prototyping.bbc.co.uk

Challenges

Challenge 1: public endpoints

Mitigation

  • Caching
  • Local aggregations
  • Replication and syncing
  • Testing and monitoring

Quite a significant amount of work - can it be made generic?
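
One way to picture a generic caching layer is a wrapper that sits in front of any query function and serves the last good result when the public endpoint is unavailable. The sketch below is only an illustration under that assumption: `CachedEndpoint`, `run_query` and `ttl_seconds` are hypothetical names, not part of any BBC system or existing library.

```python
import time


class CachedEndpoint:
    """Time-based cache in front of an unreliable query function (hypothetical helper)."""

    def __init__(self, run_query, ttl_seconds=300, clock=time.monotonic):
        self.run_query = run_query      # callable taking a query string; assumed to raise on failure
        self.ttl_seconds = ttl_seconds  # how long a cached result counts as fresh
        self.clock = clock              # injectable clock, which keeps the cache testable
        self._cache = {}                # query string -> (timestamp, result)

    def query(self, sparql):
        now = self.clock()
        cached = self._cache.get(sparql)
        if cached is not None and now - cached[0] < self.ttl_seconds:
            return cached[1]            # fresh hit: the endpoint is not contacted
        try:
            result = self.run_query(sparql)
        except Exception:
            if cached is not None:
                return cached[1]        # endpoint failed: serve the stale copy
            raise                       # nothing cached, so the failure propagates
        self._cache[sparql] = (now, result)
        return result
```

Passing the query function in as an argument keeps the wrapper independent of any particular endpoint or HTTP client, which is roughly what "generic" would require.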

Challenge 2: searching and indexing

Mitigation

  • Lots of regexp and FILTERs...
  • Full text search extensions in SPARQL end-points
  • Document store indexing the results of SPARQL queries

4store + ElasticSearch
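
Indexing the results of SPARQL queries in a document store roughly amounts to folding result rows into one document per resource. The sketch below does this for the standard SPARQL JSON results format, assuming a query that binds `?s`, `?p` and `?o`. The function name and the sample data are made up for illustration, and the step of sending the documents to ElasticSearch is left out.

```python
from collections import defaultdict


def bindings_to_documents(results):
    """Group ?s ?p ?o rows from SPARQL JSON results into one document per subject."""
    docs = defaultdict(lambda: defaultdict(list))
    for row in results["results"]["bindings"]:
        subject = row["s"]["value"]
        predicate = row["p"]["value"]
        docs[subject][predicate].append(row["o"]["value"])
    # Convert to plain dicts, keeping the subject URI as the document id
    return [{"id": s, "fields": dict(fields)} for s, fields in docs.items()]


# Made-up sample in the SPARQL JSON results format
sample = {
    "head": {"vars": ["s", "p", "o"]},
    "results": {"bindings": [
        {"s": {"type": "uri", "value": "http://example.org/programme/1"},
         "p": {"type": "uri", "value": "http://purl.org/dc/terms/title"},
         "o": {"type": "literal", "value": "Example programme"}},
        {"s": {"type": "uri", "value": "http://example.org/programme/1"},
         "p": {"type": "uri", "value": "http://purl.org/dc/terms/subject"},
         "o": {"type": "uri", "value": "http://dbpedia.org/resource/Irun"}},
    ]},
}

documents = bindings_to_documents(sample)
```

Each resulting document can then be indexed under its `id`, so full-text search runs against the document store instead of regexp FILTERs on the triple store.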

Challenge 3: consuming Linked Data (hmm...)

Consuming Linked Data is (still) hard

rdflib.js

            
var kb = $rdf.graph();
var fetch = $rdf.fetcher(kb);
var FOAF = $rdf.Namespace("http://xmlns.com/foaf/0.1/");
var uri = 'http://bblfish.net/people/henry/card#me';
var person = $rdf.sym(uri);
var docURI = uri.slice(0, uri.indexOf('#'));
fetch.nowOrWhenFetched(docURI, undefined, function(ok, body) {
  var friends = kb.each(person, FOAF('knows'));
  console.log(friends[0].uri);
});

rdfstore.js

            
rdfstore.create(function(store) {
  store.execute('LOAD <http://musicontology.com/specification/index.ttl>', function() {
    store.execute('SELECT * WHERE { ?s ?p ?o }', function(success, results) {
      console.log(results);
    });
  });
});

What I ended up doing...

            
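var sparql = "http://my-rww-triple-store.org/sparql";
var load = "LOAD <http://dbpedialite.org/titles/Irun>";
// The original slide did not show the query; this one is only a placeholder
var query = "SELECT * WHERE { ?s ?p ?o } LIMIT 10";
// Encode both strings so that <, >, spaces and braces survive the form-encoded POST body
$.post(sparql, "output=json&query=" + encodeURIComponent(load), function (load_data) {
  $.post(sparql, "output=json&query=" + encodeURIComponent(query), function (data) {
    console.log(data['results']['bindings']);
  });
});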

EasyRDF


$foaf = new EasyRdf_Graph("http://njh.me/foaf.rdf");
$foaf->load();
$me = $foaf->primaryTopic();
echo "My name is: ".$me->get('foaf:name')."\n";

JSON-LD

            
{
  "@context": "http://json-ld.org/contexts/person.jsonld",
  "@id": "http://dbpedia.org/resource/John_Lennon",
  "name": "John Lennon",
  "born": "1940-10-09",
  "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
}
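
What makes this plain-looking JSON usable as Linked Data is the context, which maps each short key to a property IRI. The sketch below illustrates that idea only: it is not the JSON-LD expansion algorithm, the `CONTEXT` mappings are assumptions standing in for the remote context document, and `to_triples` is a hypothetical helper that does not distinguish IRIs from literal values.

```python
# Assumed term mappings, standing in for the remote context document
CONTEXT = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "born": "http://schema.org/birthDate",
    "spouse": "http://schema.org/spouse",
}


def to_triples(doc, context):
    """Turn a flat JSON-LD-like object into (subject, predicate, object) tuples."""
    subject = doc["@id"]
    triples = []
    for key, value in doc.items():
        if key.startswith("@"):
            continue  # keywords such as @id and @context are not properties
        if key in context:
            # Keys without a mapping are skipped in this sketch
            triples.append((subject, context[key], value))
    return triples


lennon = {
    "@context": "http://json-ld.org/contexts/person.jsonld",
    "@id": "http://dbpedia.org/resource/John_Lennon",
    "name": "John Lennon",
    "born": "1940-10-09",
    "spouse": "http://dbpedia.org/resource/Cynthia_Lennon",
}

triples = to_triples(lennon, CONTEXT)
```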
            
            
          

RDF API Editor's Draft

            
var people = data.getProjections("rdf:type", "foaf:Person");
            
          

Challenge 4: RDF data mining

  • A wide range of data is available in RDF
  • Few OSS tools available to do data mining over RDF data...
  • ... or machine learning
  • Or bridges to existing libraries (Weka, Mahout, scikit-learn)

rdfspace

            
from rdfspace.space import Space
space = Space('influencedby.nt', rank=50)
space.similarity('http://dbpedia.org/resource/JavaScript', 'http://dbpedia.org/resource/ECMAScript')
space.similarity('http://dbpedia.org/resource/Albert_Camus', 'http://dbpedia.org/resource/JavaScript')
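
To give a rough idea of what a similarity score between two resources can mean, the sketch below compares resources by the cosine of their outgoing-link vectors. It is a simplified stand-in and not rdfspace's implementation: the `rank=50` argument above suggests a reduced-rank space, and that reduction step is omitted here. The function names and the triples are made up for illustration.

```python
import math
from collections import defaultdict


def link_vectors(triples):
    """Map each subject to a sparse vector counting the objects it links to."""
    vectors = defaultdict(lambda: defaultdict(int))
    for s, _p, o in triples:
        vectors[s][o] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a resource with no links is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Made-up triples, loosely modelled on an "influenced by" dataset
INFLUENCED_BY = "http://example.org/influencedBy"
triples = [
    ("http://example.org/JavaScript", INFLUENCED_BY, "http://example.org/Scheme"),
    ("http://example.org/JavaScript", INFLUENCED_BY, "http://example.org/Self"),
    ("http://example.org/ECMAScript", INFLUENCED_BY, "http://example.org/Scheme"),
    ("http://example.org/ECMAScript", INFLUENCED_BY, "http://example.org/Self"),
    ("http://example.org/Albert_Camus", INFLUENCED_BY, "http://example.org/Kierkegaard"),
]

vectors = link_vectors(triples)
js_es = cosine(vectors["http://example.org/JavaScript"],
               vectors["http://example.org/ECMAScript"])
js_camus = cosine(vectors["http://example.org/JavaScript"],
                  vectors["http://example.org/Albert_Camus"])
```

In this toy data the two languages share all their influences and score close to 1, while the language and the author share none and score 0.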
            
          

Conclusion

  • We use lots of Linked Data at the BBC
  • But there are still lots of challenges to tackle to make its consumption easier, namely...
    • Generic caching/replication layers
    • Better support for search and indexing
    • Accessible libraries for dealing with RDF data
    • Better tools to learn from or mine RDF data

Thank you!

Photo credits:

  • http://www.flickr.com/photos/andyarmstrong/4402416306/
  • http://www.flickr.com/photos/nicecupoftea/8579975238/
  • http://www.flickr.com/photos/11561957@N06/5202870020/
  • http://www.flickr.com/photos/hubmedia/2141860216/
  • http://www.flickr.com/photos/allison_mcdonald/7604871594
  • http://www.flickr.com/photos/aayars/4072755936/