microformats and accessibility

A while back, Bruce Lawson and James Craig wrote hAccessibility, about the abbr design pattern and potential problems that it raises for people who browse the web with screen readers. Since then, there has been some useful discussion on how to make microformats more friendly to screen readers, which is great.

The core problem is this – certain microformats require machine-readable data to be encoded in the HTML of a web page. For example, hCalendar stores a machine-readable date which can then be used to export your event data to a calendar. Cool stuff, but the readers of your web page get a bunch of gobbledygook presented to them in a little tooltip. Try hovering your mouse over the following phrase to see what I mean – this Wednesday. A blind reader hears the phrase ‘this Wednesday’ expanded as a bunch of meaningless numbers. Someone looking at the web page (and using a mouse) will see a tooltip with a bunch of meaningless numbers (I’m assuming here that most people don’t automatically recognise ISO-formatted dates, which is what 2007-08-08 is).

A solution to this problem would be to fix screen readers so that they read out the string ‘2007-08-08’ as ‘8th August 2007’, or whatever the local equivalent for publishing a date is wherever you happen to be reading this. Effectively, it seems to me, the code above needs to be transformed so that its title reads ‘8th August 2007’ – this Wednesday.

I wonder, though, if this transformation isn’t better performed by the web browser itself, then sent to the screen reader to be read? The web page that you read is not the original HTML sent from the web server, but the resulting DOM after it has been parsed by your browser’s HTML parser and processed by any scripts or browser extensions that happen to be running. The machine-readable date stored in the abbreviation title in the original HTML could be processed by a script, or browser extension, and transformed into a localised, human-readable title, after the HTML is loaded but before it’s rendered to the viewer. Writing some JavaScript to do this shouldn’t be hugely difficult.
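To make the idea concrete, here is a rough sketch of what such a script might look like, written in TypeScript. It is only an illustration under a few assumptions of mine: that the markup follows the abbr design pattern, something like `<abbr class="dtstart" title="2007-08-08">this Wednesday</abbr>`, that the title holds a plain date with no time part, and that it is acceptable to park the original value in a `data-machine-date` attribute (a name I have made up for this example). The function names `humaniseIsoDate` and `humaniseTitles` are hypothetical too.

```typescript
// Minimal shape of the elements we touch, so the sketch also runs outside a browser.
interface TitledElement {
  getAttribute(name: string): string | null;
  setAttribute(name: string, value: string): void;
}

// Turn an ISO date such as "2007-08-08" into a localised, human-readable string.
// Returns null for anything that is not a plain, valid YYYY-MM-DD date.
function humaniseIsoDate(iso: string, locale: string): string | null {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(iso.trim());
  if (match === null) {
    return null;
  }
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Reject impossible dates such as 2007-02-30, which Date would silently roll over.
  if (
    date.getUTCFullYear() !== year ||
    date.getUTCMonth() !== month - 1 ||
    date.getUTCDate() !== day
  ) {
    return null;
  }
  // Format in UTC so the reader's time zone cannot shift the day.
  return new Intl.DateTimeFormat(locale, {
    day: "numeric",
    month: "long",
    year: "numeric",
    timeZone: "UTC",
  }).format(date);
}

// Rewrite the title of each element whose title is an ISO date.
// Returns how many titles were changed; other titles are left alone.
function humaniseTitles(elements: ArrayLike<TitledElement>, locale: string): number {
  let changed = 0;
  for (let i = 0; i < elements.length; i++) {
    const el = elements[i];
    const title = el.getAttribute("title");
    if (title === null) {
      continue;
    }
    const readable = humaniseIsoDate(title, locale);
    if (readable === null) {
      continue;
    }
    // Assumption: keep the machine-readable value in a made-up attribute.
    el.setAttribute("data-machine-date", title);
    el.setAttribute("title", readable);
    changed++;
  }
  return changed;
}
```

In a browser, this could be called once the page has loaded with something like `humaniseTitles(document.querySelectorAll("abbr.dtstart, abbr.dtend"), navigator.language)`. One caveat with this approach: any microformat parser that runs after the script would no longer find the ISO date in the title, so the two would need to agree on where the machine-readable value lives.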

This approach solves the fundamental problem of misusing abbreviation titles to contain unreadable content – surely a breach of the accessibility and usability guidelines which ask us to write our content in the clearest, simplest language for our users? The title attribute is content, after all.

However, I still have reservations. Firstly, a microformatted date that has been processed by a script might not be readable by JAWS or Window-Eyes. Steve Faulkner recently demonstrated some cases in which the parsed DOM is modified with a script, but screen readers don’t pick up the changes. Personally, I think this has to be addressed by the makers of assistive technology. A screen reader which can’t read out scripted changes to the DOM is pretty much crippled on the modern web. It is a real, practical issue, though – if a solution works for sighted users in IE/Firefox/Safari/Opera etc. but doesn’t work in a popular screen reader, such as JAWS, then is it a useful solution? It’s certainly not an accessible solution, unfortunately.

Secondly, does the person reading the page need a date expanded as an abbreviation in the first place? Reading the phrase ‘this Wednesday’ above, did you really need to be told that this refers to ‘8th August 2007’? The abbr design pattern is a hack, and not a particularly elegant one, to make up for HTML 4 not containing a tag, or attribute, to semantically describe dates and locations. So the microformat is very useful to machines, but I’m not aware of any usability studies which show that it enhances the value of the page for people.

Finally, taking hCalendar as a specific example, there doesn’t seem to be a lot of demonstrated value in microformats at the moment. They are nice, really really nice, and some great things could be done with them. But I’m not sure that I can, for example, use hCalendar on the National Maritime Museum’s event calendar and have the events automatically recognised by a search engine, or harvested by a service such as Upcoming. So I’m not sure that there’s a great incentive to publish microformatted data yet – it still seems to me very much like a geek, niche thing. Lots of people are publishing them, but is anyone doing anything useful when it comes to consuming them? Because if that started to happen on a wider scale, then there would be a real, demonstrable value to adding microformats to a web page. At the moment, the lack of this, coupled with the accessibility problems, makes me nervous about using microformats myself.