Posted on January 4th, 2010 at 04:50 9 comments
Last week I posted a rather detailed discussion of the Custom XML modifications that Microsoft was implementing in Word 2007, to comply with a court order finding that MS violated a patent held by i4i.
Now, at looooooong last, we have technical details about what’s changing in Word 2007 (and therefore in Office 2007). Knowledge Base article 978951 addresses the issue:
Versions of Office Word 2007 that are distributed by Microsoft after January 10, 2010 no longer read the custom XML markup that may be contained within .docx, .docm, or .xml files. The new versions of Office Word 2007 can still open these files, but any custom XML markup is removed. Custom XML markup in Word documents is visible in the Office Word user interface as pink (the default color) tag names surrounding text in a document…
Office Word content controls are not affected by this update. Content controls are a common method of structuring document content and mapping content to the XML data that is stored in a document…
Custom XML markup that is stored within Word 97-2003 document (*.doc) files is not affected by this update.
Ribbon XML and Ribbon Extensibility are not affected by this update. The Word object model is not changed by this update. However some Word object model methods that deal with custom XML markup may return different results.
Sound confusing? Yeah, it is, particularly because MS isn’t changing content controls, but it is zapping manually defined custom XML – but only in Word 2007 docx, docm and XML files.
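If you want to see which side of that line a particular document falls on, here's a rough sketch. It assumes that the custom XML markup the KB article talks about appears in the main document part as w:customXml elements, and that content controls appear as w:sdt elements, both in the WordprocessingML namespace. The function name and the sample document are made up for illustration; they aren't from the KB article.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (assumption: both kinds of markup live here)
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_markup(document_xml: str) -> dict:
    """Count custom XML tags and content controls in a document.xml string."""
    root = ET.fromstring(document_xml)
    return {
        # Assumed to be the markup the update strips out
        "custom_xml": len(list(root.iter(f"{{{W_NS}}}customXml"))),
        # Assumed to be the content controls the update leaves alone
        "content_controls": len(list(root.iter(f"{{{W_NS}}}sdt"))),
    }


# Hypothetical, heavily simplified document body for illustration only
SAMPLE = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:customXml w:element="invoice">
      <w:p><w:r><w:t>Tagged text</w:t></w:r></w:p>
    </w:customXml>
    <w:sdt>
      <w:sdtContent><w:p><w:r><w:t>Content control text</w:t></w:r></w:p></w:sdtContent>
    </w:sdt>
  </w:body>
</w:document>"""

if __name__ == "__main__":
    print(count_markup(SAMPLE))
```

A non-zero custom_xml count would suggest the document contains the kind of markup the updated Word 2007 removes on open; a document with only content controls should come through untouched.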
I have absolutely no idea how these changes map to the patent infringement judgment, and would welcome any enlightening words in the Comments to this post.
Posted on December 23rd, 2009 at 12:37 7 comments
There’s a lot of misinformation about this in the press, so let’s start with the basics.
You know about markup languages, yes? In its most basic form, a markup language lets you turn plain text into fancy text. For example, if you want the word Mxyzptlk to appear in bold italic, a markup language might understand something like:
<bold><italic>Mxyzptlk</italic></bold>
and display Mxyzptlk the way you want. Those thingies inside the < brackets > are called tags.
(If you’re an old WordPerfect user, you might remember a feature called “Reveal Codes.” In many ways, WordPerfect’s reveal codes are just a particular kind of markup language. When Microsoft introduced Word 1.0, it determined that Reveal Codes were harmful and hateful and fattening, and banished them from Word. Much wailing and gnashing of teeth emanated from the WordPerfect camp. But the worm has turned.)
XML, the eXtensible Markup Language, goes one step further and lets you define your own tags. So for example, you could create a formulation like this:
<bit>blah</bit> == <bold><italic>blah</italic></bold>
and the new tag <bit> suddenly takes on meaning.
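To make the idea concrete, here's a toy sketch of a program that gives a home-made tag its meaning by expanding it into tags it already understands. The CUSTOM_TAGS table and the function names are hypothetical, and the sketch ignores attributes entirely; it illustrates the "define your own tag" idea, not how any real XML tool or Word itself handles tag definitions.

```python
import xml.etree.ElementTree as ET

# Hypothetical definitions: each custom tag expands to a chain of known tags
CUSTOM_TAGS = {"bit": ["bold", "italic"]}


def expand(elem: ET.Element) -> ET.Element:
    """Return a copy of elem with custom tags replaced by their definitions."""
    # Unknown tags expand to themselves
    chain = CUSTOM_TAGS.get(elem.tag, [elem.tag])
    outer = ET.Element(chain[0])
    inner = outer
    for tag in chain[1:]:
        inner = ET.SubElement(inner, tag)
    # The text and children end up inside the innermost tag of the chain
    inner.text = elem.text
    for child in elem:
        inner.append(expand(child))
    outer.tail = elem.tail
    return outer


def expand_markup(markup: str) -> str:
    """Parse a markup string, expand custom tags, and serialize it again."""
    return ET.tostring(expand(ET.fromstring(markup)), encoding="unicode")


if __name__ == "__main__":
    print(expand_markup("<bit>blah</bit>"))
```

Feed it a bit tag and you get the bold-plus-italic nesting back; any tag that isn't in the table passes through unchanged.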
In Office 2007, Microsoft introduced a new set of file formats based on XML. The .docx, .docm, .xlsx, .pptx and other formats you’ve probably sworn at, embody Microsoft’s attempt to move from a document file format that absolutely nobody could understand, to one that’s at least somewhat less inscrutable. If you crack open an Office XML file, you find that – to a first approximation anyway, and with a few if’s and but’s – it consists of a bunch of zipped text files, and a little bit of glue that holds the zipped text together. If you save a PowerPoint presentation in .pptx format, for example, each slide becomes its own zipped text file inside the pptx file.
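You can see the "bunch of zipped text files" for yourself with a few lines of code. The sketch below builds a tiny stand-in package in memory and lists what's inside. The part names mimic the layout of a .pptx file, but the contents are placeholders and the helper names are invented for this example.

```python
import io
import zipfile


def build_fake_package() -> bytes:
    """Build a tiny in-memory zip that mimics the layout of a .pptx file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # Placeholder contents; a real package holds much richer XML
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("ppt/presentation.xml", "<presentation/>")
        zf.writestr("ppt/slides/slide1.xml", "<slide/>")
        zf.writestr("ppt/slides/slide2.xml", "<slide/>")
    return buf.getvalue()


def list_parts(data: bytes) -> list:
    """Return the names of the files stored inside a zipped Office package."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


if __name__ == "__main__":
    for name in list_parts(build_fake_package()):
        print(name)
```

To try it on a real presentation, read the file's bytes (for example with open(path, "rb").read()) and hand them to list_parts; each slide should show up as its own entry.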
With me so far?
Now for Custom XML. You can create your own custom XML tags and stick them inside one of the new Office 2007 files. Not many people have the insane desire to write custom XML, but programmers (who may or may not have insane desires) use them on occasion. One example that Microsoft gives is for PowerPoint: if your company has a gazillion PowerPoint slides, you could write a program that scans the slides and sticks data inside custom XML tags that describes the slides. The data would be stored in the .pptx file, so it travels wherever the slides go. You could then write another program that asks a lowly human for his or her preferences, then scans all the slides in a particular slide dump, and assembles a new presentation based on whatever criteria the human had the temerity to give. The Really Neat Thing about PowerPoint Custom XML tags is that the data can be associated with a specific slide: the Custom XML contents get stored in a zipped file inside the pptx file, but the glue that holds the presentation together creates links between the Custom XML zipped file and the zipped file that holds the individual slide. Thus, the programmer can reach into the presentation and gather slides like daisies in May and – this part is important – the program never has to use PowerPoint itself. The bloat and overhead that come with dealing with PowerPoint never rear their ugly heads.
So now you understand why Custom XML can be important, especially in big companies, and why mere mortals rarely use it. You can probably also see that there has to be a way for the glue inside the pptx file to bring together the file itself and the Custom XML data.
Back to the headlines. Back in June, 1994 (!), a little company in Toronto, Ontario (in Canada, eh?) applied for a US patent on a specific method for making the glue that binds parts of the documents and add-on files. Ends up that the method they invented is very close to the method Microsoft uses to bind pieces of Office 2007 documents and their embedded Custom XML zipped files. On May 20, a federal jury in Tyler, Texas, found Microsoft guilty of violating the i4i patent, and ordered Microsoft to pay i4i $200 million. Microsoft appealed. On August 11, Judge Leonard (no relation) Davis, citing Microsoft’s lawyers’ hijinx, slapped another $40 million onto the judgment for willful infringement, and cited $37 million in pre-judgment interest. Microsoft appealed, and lost its appeal yesterday.
Microsoft’s press release gives a very succinct and (far as I can tell) accurate assessment of the situation:
This injunction applies only to copies of Microsoft Word 2007 and Microsoft Office 2007 sold in the U.S. on or after the injunction date of January 11, 2010. Copies of these products sold before this date are not affected.
I’ve been searching up and down, and can’t find out why the injunction specifically applies to Word 2007, without also bringing down the wrath of the Court on Excel 2007 and PowerPoint 2007. My conjecture – and it’s only a conjecture – is that the case was so difficult, technically, that the i4i attorneys didn’t try to cloud the issue with the other products.
Microsoft’s been preparing for this eventuality for a long time. For example, companies that put together PCs with Office 2007 pre-installed have been installing versions of Office 2007 without Custom XML since October, per this advisory. (Thanks, Susan!) As of a couple of minutes ago, I can’t get through to that page. It’s possible that Microsoft took it down. If you can’t get to it either, here’s what it says:
Microsoft has released a supplement for Office 2007 (October 2009). The following patch is *required* for the United States. /The patch will work with all Office 2007 languages/.
After this patch is installed, Word will no longer read the Custom XML elements contained within DOCX, DOCM, or XML files. These files will continue to open, but any Custom XML elements will be removed. The ability to handle custom XML markup is typically used in association with automated server based processing of Word documents. Custom XML is not typically used by most end users of Word.
Note that this patch is only for OEMs – the companies that put together new PCs. It doesn’t affect any customers, like you and me.
Several of you have asked what I think will happen next. Obviously, Microsoft’s attorneys are burning the midnight oil, trying to reverse the Federal Circuit Court of Appeals decision – but at this point their options are very limited: getting the Fed Court of Appeals to re-hear the case seems very unlikely, and the Supreme Court looks to me like an even longer shot.
Will i4i go back and try to get damages for copies of Word and Office sold prior to January 11? Hell, if I was in their shoes, I would try. Apparently the Custom XML Schema technology in Word 2003 may infringe on the patent, as well. And if Word 2007’s a dirty patent-buster, Excel 2007 and PowerPoint 2007 must be in the same pigpen.
I think it’s highly unlikely that Microsoft will cut a deal with i4i – which they obviously should’ve done from the get-go. I also don’t think that the Redmondians will have a sudden change of heart, decide that they shouldn’t have violated the patent in the first place, apologize, and compensate i4i. Naw. Never happen. Too many Microsoft lawyers making too much money off this one.
Funny. Sometimes the American legal system actually works.