An entertaining article by Erik Blankinship et al., “Closed Caption, Open Source” (PDF), presents a little system he and his friends hacked together to scan caption text, link it to video segments, and allow people to reorder and remix those segments.
Every Web browser since the original Mosaic browser has a “view source” menu item that reveals these HTML source files, making it possible for people to copy and modify a Web page and use it as a template for their own page. Television, a far more ubiquitous medium, does not have an equivalent option allowing people to “look under the bonnet.”
(The article was written by MIT grads but published in a British journal, giving it a few localisms that are amusing, like the one just above, and others that are annoying, as with frequent references to captioning as “subtitling.”)
They came up with their own XML schema to mark up closed-caption text. (That makes it about the 14th such markup language for subtitle and/or caption text that I know of.) Their system infers timecode, which you can modify without destroying the original inference. They roughly synchronize the captions to the video frame; they should document that algorithm. (The researchers should also be aware of cases in which we deliberately and for good reason do not time the captions perfectly. I suppose it’s a bit late for that now.)
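The paper doesn't publish the schema, but the non-destructive timecode idea is easy to picture: store the machine-inferred timecode in one attribute and any human correction in another, so an edit never overwrites the inference. A minimal sketch, with element and attribute names entirely of my own invention:

```python
import xml.etree.ElementTree as ET

# Hypothetical caption markup: "inferred" holds the machine-derived
# timecode; an optional "override" lets a human retime a caption
# without destroying the original inference.
DOC = """
<captions>
  <caption inferred="00:01:02.5" override="00:01:03.0">Fire photon torpedoes!</caption>
  <caption inferred="00:01:05.0">( explosion )</caption>
</captions>
"""

def effective_time(cap):
    """Prefer the human override; fall back to the inferred timecode."""
    return cap.get("override") or cap.get("inferred")

root = ET.fromstring(DOC)
for cap in root:
    print(effective_time(cap), cap.text.strip())
```

Deleting the `override` attribute would revert the caption to its inferred timing, which is presumably the point of keeping both.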
You can thus re-edit the source video into your own remix using captions as a way of finding the scenes you want to reuse. The authors claim to “embed” the original captions in any video you export. Exactly what mechanism they use, and how they convert from Line 21 to QTtext or whatever format, is unstated.
Most fun of all, the researchers took this system to a con and let science-fiction geeks have at it, though people could not store or forward a copy of their remixes.
This part I don’t get:
It took a while for the participants to understand that only closed captions were used to organise the available video clips. For example, a few participants queried for “explosion” and expressed dissatisfaction when only dialogue about explosions was retrieved.
Explosions, even when they produce a visible result, are almost always captioned as such. I view that as an ambiguous case: it is not always clear that what you are seeing is an explosion, though sometimes it is obvious. Remixers might have to search for the morpheme explo- rather than “explosion,” but they’ll eventually get something. If the system tagged non-speech information as such, remixers could simply scroll through the sound effects. NSI is often lexically discernible, and it was always discernible on broadcast episodes of Star Trek: TNG, since the Caption Center captioned all of them and its punctuation is unambiguous, viz. ( explosion ).
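Both workarounds are trivial to implement against plain caption text. A sketch (the caption lines here are invented for illustration; the paper's own query mechanism is not documented):

```python
import re

# Invented caption lines for illustration.
captions = [
    "We heard an explosion on deck 5.",
    "( explosion )",
    "The core could explode at any moment.",
    "Set a course for home.",
]

# Stem match: "explo-" catches "explosion," "explode," "exploding," etc.
stem = re.compile(r"\bexplo", re.IGNORECASE)
hits = [c for c in captions if stem.search(c)]

# Non-speech information: the Caption Center sets sound effects off in
# parentheses, so NSI can be filtered lexically.
nsi = [c for c in captions if re.fullmatch(r"\(\s*[^)]+\s*\)", c)]
```

Here `hits` catches all three explosion-related lines, speech and sound effect alike, while `nsi` isolates the lone ( explosion ) caption.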
Those interested in copyright will enjoy the authors’ footnotes about trying to license even a single frame of The Lord of the Rings, which unambiguously would be fair use in the authors’ context. However, there is no question in my mind that these remixes are derivative works. That puts creators in a classic clash-of-rights scenario: Is the infringement of creating an unauthorized derivative work saved by the fair-use defense? Is that even possible?
The authors did take the next step, though: They linked the captions and video from a film adaptation back to the original book.
Our software program, Adaptation, displays linkages between the source material and the movie. It does this by displaying the book and the movie as two parallel timelines, and graphically connecting corresponding sections to show how the book may have been expanded or condensed in adaptation. Adaptation also calculates how many minutes per page are spent on adapted scenes, and contrasts the relative size of scenes in their respective media.
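The minutes-per-page figure the authors describe is simple arithmetic over the linked sections. A sketch with invented data (the actual section mapping and section names are not given in the paper):

```python
# Invented book-to-film mapping: each section links a page span in the
# book to a minute span in the film adaptation.
sections = [
    # (label, book pages (start, end), film minutes (start, end))
    ("Opening", (1, 20),  (0.0, 5.0)),
    ("Journey", (21, 80), (5.0, 45.0)),
    ("Finale",  (81, 95), (45.0, 88.0)),
]

def minutes_per_page(pages, minutes):
    """Screen time spent per page of source text for one section."""
    return (minutes[1] - minutes[0]) / (pages[1] - pages[0])

for label, pages, minutes in sections:
    print(f"{label}: {minutes_per_page(pages, minutes):.2f} min/page")
```

A section whose ratio is well above the book-wide average was expanded in adaptation; one well below it was condensed, which is exactly the contrast the parallel timelines visualize.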
Publish that software! It’s a great idea. (Hansard staff at legislatures could use it; see relevant article.)