The BBC World Service Archive experiment

Yves Raimond, BBC R&D IRFS / @moustaki

The BBC Archive

Publishing our archive

The World Service archive

The missing metadata

Machine listening

Automated speech recognition

Automated transcripts

Automated tagging

Example results

Processing archives in the cloud


Algorithms and people

Data validation

Speaker segmentation

Crowd-sourcing speaker names


Propagating speaker names

Evaluating speaker identification

Refining our models

User activity

How good is the data?

  • Tags are a large and sparse space
  • When is a tag correct?
  • When is a programme tagged completely?
  • How do you measure crowdsourced data?

Who does the work?

Emerging shape of the archive

Visualising the archive

Semantic Web Challenge 2013 - First prize!


COMMA: ClOud Marketplace for Multimedia Analysis


Thank you!
