Peter Staley posted a scan of a foundational document from ACT UP, A National AIDS Treatment Research Agenda. I improved his version to something resembling archival quality.


  • Crappy scanned PDF with skewed pages deskewed. Poor-quality recognized text included. Tags added. Almost complies with PDF/A standard, but not quite

  • Thoroughly copy-edited HTML version:

    • merged pages one and two of original, because HTML is pageless

    • regularized some spellings (payers, Treatment IND)

    • altered copy inasmuch as hyphenation and nonbreaking spaces were added

    • could not reasonably regularize all capitalization and punctuation in original (the latter especially in headings); title case troublesome here, as it is with all amateur writers (Who Think All Headings Have To Be Capitalized Like This – They Don’t)

    • did not number everything the original does, because, while valid in HTML, the result is unexpected; specifically, numbered headings were reduced to the minimum possible

    • corrected inconsequential errors and marked up consequential errors found in original
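The nonbreaking-space step in the list above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the method actually used on this document: the function name `add_nonbreaking_spaces` and the `KEEP_TOGETHER` list are hypothetical, and only the phrase "Treatment IND" comes from the list itself.

```python
import re

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# Hypothetical list of phrases that should never wrap across lines.
# Only "Treatment IND" is taken from the post; extend as needed.
KEEP_TOGETHER = ["Treatment IND"]


def add_nonbreaking_spaces(text, phrases=KEEP_TOGETHER):
    """Replace the whitespace inside each listed phrase with U+00A0."""
    for phrase in phrases:
        words = phrase.split()
        # Match the words separated by any run of whitespace. In Python 3,
        # \s also matches U+00A0, so running the function twice is harmless.
        pattern = re.compile(r"\s+".join(re.escape(w) for w in words))
        replacement = NBSP.join(words)
        # A function replacement avoids backslash processing in the result.
        text = pattern.sub(lambda match: replacement, text)
    return text


sample = "Approve the Treatment IND protocol."
fixed = add_nonbreaking_spaces(sample)
```

When writing the result out as HTML, the U+00A0 character can be emitted directly in a UTF-8 file or as the `&nbsp;` entity; both are valid.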

If you’re concerned that the HTML version does not look, feel, and act like the PDF, understand that it cannot and should not.


I explained previously that the ACT UP Oral History Project is making a serious mistake – attributable to poor computer knowledge and Windows use – by publishing transcripts of interviews with ACT UP members as PDFs. I explained the need to archive these important documents correctly and the need for an upgraded codebase for the project’s site.

Nothing changed. Jim Hubbard and Sarah Schulman are not about to take my advice on anything even if it means that, someday soon, nearly 200 priceless interviews and transcripts will disappear or become unreadable. They would prefer to keep on doing what they’re doing if the alternative is taking advice from me. I get this a lot.

But for demonstration purposes, I decided to correct the archival errors in A National AIDS Treatment Research Agenda. My improved PDF is somewhat more likely to be readable in the future than Staley’s, and is already much more usable. For the rest of our lives and probably our children’s lives, as long as computerization still exists, my valid-HTML version will be readable. I guarantee what I just said. Valid HTML will not become less comprehensible.

Repairing this 16-page document took half a week of part-time effort. It was possible only because I am good at PDF manipulation (though with middling skills compared to people I know) and am an expert in HTML. I suspect I am the last person publishing nothing but valid HTML.

This was just one document. How do we solve the entire archive problem? I imagined a scenario in which we applied to Tim Gill for, say, a million dollars to correctly archive all AIDS-activist materials in the United States and Canada – yes, even of my enemies in the former AIDS Action Now. I then wondered which exact institution would apply for that money and how we’d get our hands on the original materials. Despite its being my idea, I knew I would be excluded from the outset or fired the day the cheque cleared. People who don’t know the difference would hire other people who don’t know the difference to do the work (as in captioning, a cadre of identical 25-year-old females with humanities degrees).

You can see, then, why it is obvious to me this will never be repeated. No archival ACT UP documents will ever be given this treatment again. This one was.

The foregoing posting appeared on Joe Clark’s personal Weblog on 2014.06.18 12:14. This presentation was designed for printing and omits components that make sense only onscreen. (If you are seeing this on a screen, then the page stylesheet was not loaded or not loaded properly.) The permanent link is:





Copyright © 2004–2024