In 2005, Telecommunications for the Deaf, Inc. (TDI) petitioned the FCC in the U.S. to update and extend its captioning requirements. Among many issues was a request to impose quality standards in captioning. A red flag to someone like me, shurely?! Not really. As part of a longer-term plan, I decided to just wait and read what everyone else had to say first (and also await an FCC decision, as yet unreleased).

The 1,663-strong list of interventions is available in an extremely large (4 MB) Web page that you should only load if you really know what you’re doing. (You can look at just the first hundred to start.) Each of the document links is actually a CGI call that will ultimately send a PDF to your browser with an unhelpful filename; I’m not even going to bother individually hyperlinking these entries to a page that is so manifestly anti-Web. (I dispute whether a page like this actually complies with 508.)

What did some of the major intervenors have to say?

Accessible Media Industry Coalition

AMIC is a small organization formed as a consequence of the Caption Quality Initiative conference, to which I was not invited, which I did not attend, but whose documents I host. WGBH isn’t a member, curiously.

AMIC’s submission is as nicely written as one would expect from Jeff Hutchins, and contains a statement of general captioning principles:

The 16-CARAT approach to captioning quality

Definition: “Captioning is the textual display of soundtrack information of visual media that allows a viewer to follow dialogue and action of a program simultaneously.” In order to achieve this goal successfully, captions should meet the following 16 criteria:


  • Programs are captioned from start to finish
  • Every sentence is conveyed


  • Each word is the correct word
  • Each word is correctly spelled
  • Each sentence is correctly punctuated


  • Captions display with adequate time to be read completely
  • Captions are not obscured by the visual content of the program
  • Every effort should be made to avoid obscuring important textual and visual information
  • Captions do not compete with other displayed text
  • Captions are an appropriate size


  • Viewers can tell who spoke the captioned words and when the speaker has changed
  • Viewers can tell how the words were said, e.g. shouted, whispered, sung
  • Other auditory cues, such as music and sound effects, are described in the captions


  • Words do not appear too early
  • Words do not appear too late
  • Captions are timed to accompany the audio as closely as possible


  1. Captioners should make their best effort to obtain the final copy of a program or information about its final version;
  2. Captioners should be sensitive to the tone of the soundtrack, noting all essential nonverbal information that is conveyed, such as sarcasm, silence, musical moods, and background sounds;
  3. Real-time systems that lag far behind the audio should not be used.

Pretty solid, if generic; the list reminds me of my own standard techniques in audio description.

Global Translation/TranslateTV

This company has widely used software that automatically translates English captions to Spanish. Maybe someday I’ll take a look at that, but I strongly dispute its effectiveness at any level. Even so, the company’s monitoring process has given it a vast database of real-world caption files.

  1. Complaints:

    [A]t the level of local and regional programming of live content, market forces are insufficient to assure quality closed captioning. TranslateTV’s logs of local station’s newscasts indicate that, on average, between 25% and 60% of all captioned sentences contain errors that substantially impede understandability. TranslateTV classifies these errors by type and severity. This high error rate is true at stations using both live captioning and [electronic newsroom captioning].

    The stations and caption providers do not systematically monitor caption quality. Instead, their chief feedback mechanism is consumer complaints. When a complaint is received, most stations and providers will respond individually to the viewer; however, there is no proactive monitoring system in place to quantify and analyze the overall rate of errors in captioning. The current feedback system is both reactive and narrow in its impact.

  2. Errors:

    Further, TranslateTV contends that the measures of quality used in the captioning industry are very misleading as indicators of the intelligibility and usefulness of the captioned text. The captioning industry uses a quality measuring system that counts the [total] number of… words in a caption, and calculates the percentage of words that are correct. This method doesn’t take into account the impact of the errors on understandability. […]

    Punctuation error (Harris County’s)
    Wrong-word error (“comfortable” should be “constable”)

    […] Although the passage is extremely difficult to understand, the captioning-industry standard would rate it as highly accurate, [with] a 94% accuracy because 16 of its 17 words are correct. However, the sentence in its entirety would be virtually impossible for a viewer to understand in the 1–2 seconds it is displayed. Even more importantly, because the critical phrase that is the subject of the story is rendered incorrectly (a “black female constable” = “black female comfortable”), the reader’s ability to understand the subsequent sentences is also impaired. Overall, the captioning industry would rate the passage 89% accurate because 6 of 67 words are wrong. However, the intelligibility and usefulness of this story to a viewer is actually very limited.
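The arithmetic in that excerpt can be sketched in a few lines of Python, contrasting the word-level measure TranslateTV criticizes with a sentence-level one. This is only an illustration: the function names are hypothetical, and the sentence-level measure is an assumption about how one might count "sentences that contain errors," not something defined in the filing. The only figures taken from the excerpt are 16 of 17 words correct and 6 of 67 words wrong.

```python
def word_accuracy(total_words, wrong_words):
    """Percentage of words rendered correctly (the word-count measure)."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * (total_words - wrong_words) / total_words


def sentence_error_rate(sentences_with_errors, total_sentences):
    """Percentage of sentences containing at least one error.

    Hypothetical alternative measure, not taken from the filing.
    """
    if total_sentences <= 0:
        raise ValueError("total_sentences must be positive")
    return 100.0 * sentences_with_errors / total_sentences


# Figures quoted in the excerpt.
single_sentence = word_accuracy(17, 1)  # 16 of 17 words correct
whole_passage = word_accuracy(67, 6)    # 6 of 67 words wrong

print(f"One sentence, by word count:  {single_sentence:.1f}%")
print(f"Whole passage, by word count: {whole_passage:.1f}%")
# The same single sentence, judged as a sentence: it contains an error.
print(f"One sentence, by sentence:    {sentence_error_rate(1, 1):.0f}% in error")
```

By word count the single sentence comes out at about 94.1%, matching the excerpt. Six wrong words out of 67 works out to about 91.0% rather than the 89% quoted; the excerpt does not say how that figure was derived. Either way, the sentence is 100% in error by the sentence-level yardstick, which is the gap TranslateTV is pointing at.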

Caption Colorado

Caption Colorado shares with other intervenors the false, counterfactual, and disprovable belief that captioning features that impinge on the psychology of reading – on the built-in and learned responses of the eyes and brain – are mere “style.” Is that like a “Flock of Seagulls” “haircut” or a really “bitchin’ ” “leather jacket”? (Most punctuation below rendered sic.)

“Placement”, “pop-on verses roll-up’’ and “type font” are all “style” issues that should not be considered in evaluating the “quality” of captioning. “Placement” in the context of the captioning window conflicting or covering up banners or other important information on the screen would be a “quality” issue. The cost difference associated with some “style” preferences can be substantial and not all types of productions require or warrant the highest cost style presentations.

Yet Caption Colorado seems to place appropriate emphasis on abiding by the orthographic conventions of the language in question:

Offline captions should have the appropriate punctuation to produce a highly legible reading experience while maintaining the meaning of what is said. End-of-sentence punctuation, commas and apostrophes are especially crucial. Dashes, em dashes, quotation marks [obviously a favourite of Caption Colorado’s] and ellipses should be dealt with as style issues, not punctuation. It is important to note that one can be “economical” with punctuation, especially considering our 32-character space limitations.

For example you may not need a comma for readability in a case where the “journalistically correct” way of writing would be to include a comma. Punctuation… should be used following generally accepted written English style guidelines such as “Chicago Manual of Style”, etc. Misused end of sentence punctuation or apostrophes (as these can change the meaning of the text) would be calculated as one error per occurrence. Commas and semicolons can be somewhat judgmental and will differ from what may be “correct”….


If non-technical quality standards were set, as an example, at the “average” or “mean” quality levels existing in the marketplace today, approximately half of the current captioners in the United States would be eliminated from captioning and overall capacity for television captioning would be cut in half.


WGBH

When not gutting their captioning operations, our friends in Boston have been busy. This FCC intervention joins the lengthy list of WGBH efforts that I support in broad strokes. In fact, there’s not a lot in it I don’t support.

[T]he feedback loop between caption consumers and program providers and producers is very weak. Communication between caption consumers and program distributors requires clear points of contact, widely-published voice and TTY numbers and knowledge of relay services, staffing into the prime-time and nighttime hours, and knowledge and understanding of caption quality issues by program providers, local cable operators and local TV stations. Without those means of communication and points of contact, it’s no wonder that complaints rarely reach the providers. Without such a feedback loop, many providers may assume that a low-cost and low-quality captioning service is adequate, not hearing otherwise from consumers.

Complaints from consumers more frequently arrive on the doorstep of the captioning agency, which may or may not be responsible for faulty caption services. That caption agency is reliant on the good will of their clients and has an inherent conflict in bringing complaints to the large corporations which are funding them.

And another vote for limiting the use of scrollup captioning. This is kind of a litmus-test issue, you know: You can pretty much dismiss (in fact, you can categorically and immediately dismiss) any captioner that claims all or most or even a lot of programming can be captioned in scrollup. Live shows, sure. A wide swath of nonfiction programming, sure. But on the whole? Nope.

Caption styles should be used based on program type (i.e., pop-on captions with placement and/or speaker IDs for dramatic and comedy programs and movies, roll-up captions for talk shows, documentaries, news)


NCI

When not gutting their own operations, NCI too finds time to intervene. (Of course that’s unfair. The intervention happened before the actual gutting.) Just two quick items:

  1. The production standards used to review created captions include many of the elements listed… as nontechnical aspects of captioning – specifically accuracy of transcription of the audio, spelling, grammar (when not verbatim), punctuation, and the inclusion of identification of nonverbal sounds. The remaining elements on the list [of] nontechnical aspects of captioning may be better considered as style issues (placement, how nonverbal sounds are indicated, pop-on/roll-up style of captioning, and verbatim/reading-speed-edited captions…. [O]n mixed-case vs. upper-case captioning is a style issue….

    Again, don’t expect these people to respect a century’s research on reading. Then again, what would one expect of NCI?

  2. NCI has not seen a shortage of individuals to create, or who can be trained to create, captions. With each advance of the required level of mandated captioning, expansion of the industry capacity seemingly has occurred, and required captioned programming has not gone uncaptioned from the lack of qualified captioners.

    Note well that latter point, NCI terminated workforce. There is no “shortage of individuals” like you.

Media Captioning Services

A nice incendiary document here. Really, it’s a favourite.

Although many video programmers have accepted closed captioning as a cost of doing business, many have not…. We can cite two instances of predatory activity in the captioning market as follows:

  1. In 2002, MCS was invited to bid on 12,000 hours per annum of video programming. The video programmer’s objective was to receive, as noted in documentation sent to MCS, a no-cost proposal. In fact, a competitor captioning company [that] proposed to caption four networks of this video programmer at no cost to the video programmer won the bid for this business [costing MCS over $400,000 a year in lost revenue]. […]

  2. In 2004, MCS bid unsuccessfully on over 3,000 hours of national daytime news programming…. [T]he winning bid was “significantly below $95 per hour.” In fact, we learned shortly thereafter that funding for captioning by the caption company [that] won the bid was forthcoming from the U.S. Department of Education…. [F]ederal funds… were used by the captioning company to apparently subsidize the private-sector bid to the video programmer.

The captioner in that second example sounds a lot like NCI, but of course I have no information on that company’s recent grants or any proof of a decade’s use of grant money as a predatory weapon to undercut the competition.

MCS also proposes compensating very small captioners using the relay-service fund or from sale of analogue spectrum. One problem here is that pretty much all captioners are small businesses as classified under U.S. government terms, so this could end up funding MCS’s competitors. Even “three dominant companies in the industry” – presumably NCI, WGBH, and Vitac – “would still be classified on small entities.”


CBS

A few years ago I had the honour of visiting Black Rock (almost as momentous as visiting Parliament, but, in retrospect, less so than walking through the front door of Number 10 several times), and was given a grand tour of the CBS production facility. I met various captioning staff and was, on the whole, impressed to the point of stupefaction.

For real-time captioning… CBS provides its captioning agencies advance access to news scripts and rundowns, and provides CBS liaisons to the agencies to keep them apprised of planning and changes in programming prior to and during the airing of both regularly-scheduled CBS News programs and CBS News special reports.

Yes, they really are saying that somebody sits there all day and makes sure the captioners know what’s coming up. I have met that person. What’s your network doing?

Since 2001, the networks’ new contracts… have required that each real-time stenocaptioner assigned to broadcasts must be certified by the National Court Reporters Association (NCRA). The contracts reserve the right to request a change in stenocaptioners if they fail to perform adequately.

NBC Telemundo (sic)

I would not place NBC anywhere near CBS in the seriousness with which it addresses captioning and especially audio description. It is known that NBC selected a major captioning vendor after a reverse auction, i.e., for the lowest price that the bidding companies could tolerate.

NBC provides yet another encomium for a technology that does not now exist and will not exist in our lifetimes, speaker-independent voice recognition, as a captioning technique. I suppose this indicates contempt for captioning, captioners, and captioning cost. Think of how much tidier it will be when we can simply have a computer do all the boring stuff. I would not want to be deaf in that era. (Or deaf in a classroom that uses such technology.)

NBC also soundly disputes that quality captioning is even something we can agree on, let alone legislate:

[T]he Commission should not risk stifling further technological development by imposing stringent accuracy rates…. [P]rogress in technology can be delayed or stopped by regulation that poses significant and unexpected challenges to implementation of that technology.

Shorter NBC: If you insist that our captioning technology actually work, we may not be able to use it.

New accuracy requirements will pose such issues, especially as NBC Telemundo’s own analysis indicates that stations currently cannot expect leading real-time-captioning services to deliver more than 84% accuracy, and the 2006 advent of the 100%-captioning requirement and the current state of the captioning industry may require the use of undertrained personnel that cannot maintain even existing levels of accuracy.

Shorter NBC: We aren’t even at 85% accuracy as it is, and if you insist on stringency we may have to hire illiterate twits who can’t even manage that.

A subjective standard, or one that requires captioning of certain types of silent action, correct grammar or capitalization, or inarticulate sounds, is too dependent on the perceptions and biases of the person reviewing the captioning to be suitable for either technological development or Commission action.

Shorter NBC: Because one person may disagree with the captioning on a program, captions don’t have to be remotely correct, and you can’t require correct captioning of any technology or TV station.


HBO

HBO’s in-house captioners tend to simply leave out words they don’t understand (less so nowadays, but check early episodes of Six Feet Under). They also think that question marks and exclamation points are preceded by spaces and that “alright” is a word. Nonetheless, here is how they describe their quality control (emphasis added):

HBO’s quality-control program monitors 16 categories of potential technical issues associated with video, audio and closed captioning, the occurrence of any one of which is considered a disruption to the service. HBO’s goal is to have each of the 30 linear programming feeds it originates experience no more than 5.5 minutes of programming disruption per year – a reliability factor of more than 99.999%. Over the past two and a half years, more than 25 of HBO’s programming feeds have met this reliability goal consistently. Those linear feeds that fell short of the goal missed it by an insignificant amount on an annual basis….

HBO has found that closed-captioning errors on its feeds account for less than 10% of all disruption events (i.e., less than 30 seconds per year). In fact, in HBO’s experience, the errors in closed captioning are fewer than the miniscule amount of audio discrepancies. […]

Approximately 90% of… theatrical titles are provided to HBO with closed captions on the master tape which are checked for quality by the provider. On an as-needed basis, HBO outsources the remaining 10% of the theatrical titles (provided to HBO without captions) to third-party vendors who create and screen the captions for quality. HBO’s original programming is either closed-captioned in-house or is outsourced to third-party vendors. In both instances, the completed captions are screened for quality.
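HBO’s reliability figures are easy to sanity-check. What follows is a minimal sketch, assuming a 365-day year and a feed that runs around the clock; the constant and function names are hypothetical, and only the 5.5-minute budget and the 10% share come from the filing.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap year, 24-hour feed


def reliability_percent(disruption_minutes, minutes_in_year=MINUTES_PER_YEAR):
    """Percentage of the year a feed runs without disruption."""
    return 100.0 * (1 - disruption_minutes / minutes_in_year)


def caption_share_seconds(disruption_minutes, caption_fraction):
    """Seconds per year of disruption attributable to captioning errors."""
    return disruption_minutes * 60 * caption_fraction


budget = 5.5  # minutes of disruption per feed per year, from the filing

print(f"Reliability at the budget: {reliability_percent(budget):.5f}%")
print(f"10% share for captioning:  {caption_share_seconds(budget, 0.10):.0f} seconds")
```

Under these assumptions, 5.5 minutes a year gives roughly 99.99895% reliability, which rounds to HBO’s 99.999% but does not quite exceed it. A full 10% share of 5.5 minutes is 33 seconds, so “less than 30 seconds” holds only if captioning accounts for somewhat under 10% of disruptions.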

TDI’s reply comments

TDI, the original petitioner, pretty much blows out of the water the idea that viewers can simply complain about problems:

In order to bring a complaint, a consumer needs to (1) know to whom a complaint should be directed, and (2) have the means of transmitting the complaint to that person. At a minimum, consumers should be able to direct a complaint either to the Commission and/or to the distributor. The methods by which complaints can be made should include all of the following, with the expectation that such complaints are investigated upon receipt: E-mail, fax, TTY, mail, phone, and, preferably, a Web site designed to process such complaints.

Consumers often have difficulty determining where they need to file their complaints. Because of the complexities of television programming distribution, the average consumer often does not know who is responsible for compliance with the captioning obligations – most consumers are at a loss as to whether a complaint needs to go to the local station, a national network provider, a cable network or the local cable franchise. While it may be advantageous for consumers who are savvy enough to know how to bring their complaints to the appropriate entity in the video industry to do so before going to the FCC, all consumers should have the option of bringing their complaints to the FCC, wherein the complaint can be re-directed to the appropriate distributor for response.

