I am the biggest pigheaded rat bastard in the world when it comes to Web semantics. Nobody holds a candle to my shit.

  1. I go to enormous lengths to use every XHTML element correctly, up to and including oddball Dickensian orphans like var, samp, and dfn.
  2. I have elaborate BBEdit keystrokes and glossary entries for semantic XHTML. I’ve got a new plug-in, to be released shortly and written by an esteemed colleague, that adds id attributes to everything on your page.
  3. I put lists inside lists. Three days ago I had one blockquote cite="" nested inside another. I’ve used markup sequences as complicated as ol li blockquote dl dd p dfn before. J the Z had to rewrite his stylesheet to make them work!
  4. I use correct HTML in comments on other people’s Weblogs. I correct people’s code when I quote them here. I encode copy-edits and alterations of quoted text inside ins, complete with visually distinct shading. You can’t do that in print; it’s a power unique to the Web. (Or to structured documents, I suppose, but most of those are on the Web.)
  5. As I wrote in my first book, “What’s the best way to keep your text accessible? Use proper markup.” And by God I do.
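For the curious, here is roughly what item 4 looks like in practice. This is only a sketch: the quoted sentence, the cite URL, and the shading colour are all invented for illustration, and the style rule would normally live in the page stylesheet rather than beside the quotation.

```html
<!-- Hypothetical stylesheet rule: shade editorial insertions instead of underlining them -->
<style type="text/css">
  ins { background-color: #ffc; text-decoration: none; }
</style>

<!-- Hypothetical quotation; the wording and the cite URL are made up -->
<blockquote cite="http://example.com/original-post">
  <p>Definition lists are <ins>[not]</ins> limited to glossary terms.</p>
</blockquote>
```

The point of the ins element is that the alteration is recorded in the markup itself, so it survives even where the shading does not.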

Now, as standardistas will be aware, oldschool HTML and the extensible XHTML (which virtually nobody actually extends) do not come equipped with every element you could ever want. We are expected to pound square-pegged documents into round-holed code. Sometimes one must simply approximate. The poster child for such approximations is the definition list, which, by spec, is not limited to terms being defined and their definition phrases; in fact, dl can and should be used for appositional pairs. Tantek disagrees, but Tantek is sometimes wrong. Or is that merely Tantek’s informed opinion, at variance with everyone else’s?
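A minimal sketch of the kind of appositional pairing I mean follows. The pairs themselves are invented for illustration. (The HTML 4.01 spec itself suggests dl for marking up dialogue, with each dt naming a speaker and each dd holding the words spoken, so glossaries are plainly not the only sanctioned use.)

```html
<!-- Hypothetical label/value pairs; none of these is a term being "defined" -->
<dl>
  <dt>Publication</dt>
  <dd>Spy</dd>
  <dt>Section</dt>
  <dd>Letters to the editor</dd>
</dl>
```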

That’s important. If it’s truly informed opinion, and you the author are clearly using semantic HTML correctly everywhere else, and if you are forced by circumstance to approximate, then you’re probably doing the right thing.

And yes, dear friends, that includes the b and i elements currently much discussed. (And u, not at all discussed, though again I was way out there early.) People act as though these disputed elements are forbidden even though they remain valid HTML.

What did Paul Ford have to say about it?

Why is [em] better than i? When I’m publishing content from 1901 and it’s in italics, it’s in italics, not emphasized. Typography has a semantics that is subtle, changing, and deeply informed by history. The current state of Web ignores this more or less completely, and repeatedly seeks to encode typographic standards and ideas into tree-based data structures, like in a q (quote) tag.

Why are some semantic constructs more privileged than others? Why are the blockquote, [em], strong, and q tags more essential than the nonexistent event, note, footnote, or fact tags? Because HTML tried to inherit the implied semantics of typography, that’s why! And those semantics are far more subtle and complex than most people (outside of the TEI folks, and their text-aware kind) will acknowledge. But sticking with them means we have a typographically and semantically immature web… oh, it is madness, madness.

I have faced the same dilemma Ford did: Every time I retype a passage from Spy, I have to impose 21st-century markup on a magazine published when compact discs were viewed as impossibly space-age. I use headings, lists, the whole shebang. They were never there in the source document; I inferred them from structure and graphic design. That is my job as a standards-compliant reviewer of these sacred texts.

One usage for which I decided to apply i for italics was in Spy’s responses to letters to the editor (all inside blockquote). True to Fordian form, such responses are not emphasis, although, amusingly, I use cite and em inside that i markup. The use of italic is a graphic-design standard.
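Something like the following is the pattern, assuming an invented letter and an invented reply (this is not actual Spy copy). The i carries the graphic-design convention for the editors’ response, while cite and em inside it still do their usual semantic jobs.

```html
<blockquote>
  <!-- Hypothetical reader letter -->
  <p>Your last issue got my name wrong.</p>
  <!-- Hypothetical editors' reply, set in italics by house convention, not for emphasis -->
  <p><i>We regret the error, which appeared in <cite>Spy</cite> and which we <em>never</em> intend to repeat.</i></p>
</blockquote>
```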

My esteemed colleague MC May Techno Dance Remix is thus quite correct in telling us that Chicago exhaustively lists the permitted usage of italics. Except Chicago is working at the phrase level (the level of typography) rather than the document or page level (the level of graphic design).

(Remember the old saw: “It’s typography when it’s this close [holds paper to nose] and graphic design when it’s this close [holds paper at arm’s length].”)

They’re telling you how to mark up the trees, not the forest. Some of us have to mark up forests.

In another Spyism, I manage to miscegenate a disputed element and two approved elements inside a single phrase:

<em>You <strong>Are <u>There</u>!</strong></em>

Such was my best approximation of the original printed source.

And that is what we must sometimes do: Approximate. It’s better to approximate using an element that, by spec, is already approximately correct than by using something like span or div, which, again by spec, is so generic it is correct everywhere and nowhere.

Further, the default visual and auditory renderings of b and i are themselves based on their typographic antecedents. It is proper to use those renderings as a basis in your decision-making. (Interestingly, the current Opera 7.50 won’t italicize a citation marked up with cite if an italic font isn’t available, which makes some of my sentences hard to understand. Is that really better?)

So please stop being holier-than-thou and please get off our cases. If smart, informed people are using b or i, it’s because they have made smart, informed decisions to do so. We’re not slacking off; we’re not making a mistake; we’re not harming the grand ideals of semantics or accessibility. We’re not doing anything but using b or i. Get over it.

Fun fact: In a couple of weeks, when a long-delayed set of documents I wrote for the TILE project is finally released, you’ll find a DTD and tutorial custom-created by the Literary Moose. He works, behind his pseudonymous shield, to add semantics to XHTML for literary usage so that we in fact will be using the right markup. All to the good, I should think. That’s what extensibility is for.

The foregoing posting appeared on Joe Clark’s personal Weblog on 2004.05.16 15:18.


Copyright © 2004–2024