XML File Formats in Office 12

Brian Jones, a program manager on the Word team, posted news about the new XML file formats in Office 12. I have posted a few times on XML file formats, so I am interested in what Office is doing. Some interesting points:

  • Full fidelity. As good as the binary format.
  • Compressed. The file parts are stored in a structured ZIP archive, similar to OpenDocument.
  • Fully documented. Whitepapers, documentation and XSD schemas.
  • Robust. You are much less likely to end up with a corrupt file from which no pieces can be recovered.

IMO, the above points could be true for any product using an XML package format, but the fact that Office is doing it means millions of users can escape the data roach motel.

Scoble has more on Channel 9 as well as his own blog. Jones plans to link to whitepapers as soon as they become available on MSDN.

Granted, Office is following OpenDocument here, but this is a really good thing.

3 Replies to “XML File Formats in Office 12”

  1. Speaking of XML and serialization, I’ve been using it for a while now in a couple of projects I’ve been working on, and I think it works pretty smoothly.

    My approach is to just let each serializable object support the following two basic functions:

    procedure WriteToXML(XMLNode: IXMLNode); virtual;
    procedure ReadFromXML(XMLNode: IXMLNode); virtual;

    Then, to stream out an object tree, I just call WriteToXML on the root object, and everything else follows automatically.

    Works pretty well, I think. But I still haven’t started thinking about how to deal with changes in the object structure, or with backwards file compatibility.
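The recursive pattern described in this comment can be sketched roughly as follows. This is a minimal illustration in Python (not the Delphi of the declarations above), using the standard `xml.etree.ElementTree` module. The `Shape` and `Group` classes and the `save`/`load` helpers are hypothetical names invented for the example, not part of the commenter's code.

```python
import xml.etree.ElementTree as ET


class Shape:
    """Hypothetical leaf object; the name and fields are illustrative."""

    def __init__(self, name=""):
        self.name = name

    def write_to_xml(self, node):
        node.set("name", self.name)

    def read_from_xml(self, node):
        self.name = node.get("name", "")


class Group:
    """Hypothetical container: streams itself, then delegates to its children."""

    def __init__(self, title="", shapes=None):
        self.title = title
        self.shapes = shapes if shapes is not None else []

    def write_to_xml(self, node):
        node.set("title", self.title)
        for shape in self.shapes:
            # Each child writes itself into its own sub-element.
            shape.write_to_xml(ET.SubElement(node, "Shape"))

    def read_from_xml(self, node):
        self.title = node.get("title", "")
        self.shapes = []
        for child in node.findall("Shape"):
            shape = Shape()
            shape.read_from_xml(child)
            self.shapes.append(shape)


def save(root_obj):
    """Serialize the whole tree by calling write_to_xml on the root only."""
    root = ET.Element("Group")
    root_obj.write_to_xml(root)
    return ET.tostring(root, encoding="unicode")


def load(text):
    """Rebuild the tree by calling read_from_xml on a fresh root."""
    group = Group()
    group.read_from_xml(ET.fromstring(text))
    return group
```

Note that the XML layout here mirrors the class layout exactly, which is what makes the approach convenient and also what makes later structural changes harder to absorb.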

  2. If you look back over some of my posts, you’ll see that I favor separating file/data structure from object structure, mainly for versioning reasons.

    Your approach could also be problematic when trying to write or read a fragment of the object structure (i.e. not starting at the root). I know because I have done the exact same thing in the past. Hooking up shared pointers is much easier when dealing with the entire object structure instead of a partial object structure.
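One plausible reading of the "shared pointers" issue is that several objects reference the same instance, so the file has to store an id and the reader has to hook the references back up. The sketch below, in Python with `xml.etree.ElementTree`, assumes that reading. The `Style` and `Shape` classes, the `styleRef` attribute, and the two-pass reader are all hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET


class Style:
    """Hypothetical object that many shapes may point at."""

    def __init__(self, style_id, color):
        self.style_id = style_id
        self.color = color


class Shape:
    def __init__(self, name, style):
        self.name = name
        self.style = style  # possibly shared with other shapes


def write_document(shapes):
    root = ET.Element("Document")
    # Write each distinct style once, keyed by its id.
    styles = {shape.style.style_id: shape.style for shape in shapes}
    for style in styles.values():
        ET.SubElement(root, "Style", id=style.style_id, color=style.color)
    # Shapes store only a reference to the style, not a copy of it.
    for shape in shapes:
        ET.SubElement(root, "Shape", name=shape.name,
                      styleRef=shape.style.style_id)
    return ET.tostring(root, encoding="unicode")


def read_document(text):
    root = ET.fromstring(text)
    # Pass 1: build every shared object and index it by id.
    styles = {n.get("id"): Style(n.get("id"), n.get("color"))
              for n in root.findall("Style")}
    # Pass 2: resolve references. If the input were a fragment that
    # omitted the Style element, this lookup would raise KeyError.
    return [Shape(n.get("name"), styles[n.get("styleRef")])
            for n in root.findall("Shape")]
```

With the whole document available, both shapes end up pointing at the same `Style` instance again. With only a fragment, the reader needs some other way to find the target of `styleRef`, which is the difficulty mentioned above.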

  3. I AM currently using an object-bound streaming model to read and write separate fragments of the object tree, and I don’t see any NEW problems arising from this particular use of it. At least not so far.

    Among other things, I’m using the model to create a generic history model where I store individual history states of objects by writing them to separate XML streams that I keep in memory. Works pretty OK so far.
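A history model of this kind might look roughly like the sketch below. It is written in Python with `xml.etree.ElementTree`. The `History` helper and the `Shape` class are hypothetical, and the commenter's actual design is not shown in the thread. Each snapshot is one serialized XML string held in memory.

```python
import xml.etree.ElementTree as ET


class Shape:
    """Hypothetical object supporting the two streaming methods."""

    def __init__(self, name=""):
        self.name = name

    def write_to_xml(self, node):
        node.set("name", self.name)

    def read_from_xml(self, node):
        self.name = node.get("name", "")


class History:
    """Keeps serialized states of one object in memory (assumed design)."""

    def __init__(self, target, tag="State"):
        self.target = target
        self.tag = tag
        self.states = []

    def snapshot(self):
        # Stream the object's current state into a standalone XML string.
        node = ET.Element(self.tag)
        self.target.write_to_xml(node)
        self.states.append(ET.tostring(node, encoding="unicode"))

    def undo(self):
        # Restore the most recent snapshot; report whether one existed.
        if not self.states:
            return False
        self.target.read_from_xml(ET.fromstring(self.states.pop()))
        return True
```

Because `History` only relies on the two streaming methods, it works for any object that implements them, which is presumably what makes the model generic.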

    However, I DO anticipate some problems once I start dealing with versioning and backwards compatibility. I’m thinking that I’ll EITHER have to let each object support multiple streaming versions as needed, or do as you say and disconnect the file/data storage from the object structure itself.
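The first option, where each object understands several streaming versions, could be sketched as below. This is Python with `xml.etree.ElementTree`. The `version` attribute, the version 1 layout with a single `size` attribute, and the `Shape` fields are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

CURRENT_VERSION = 2  # hypothetical current layout: width and height


class Shape:
    def __init__(self, name="", width=0.0, height=0.0):
        self.name = name
        self.width = width
        self.height = height

    def write_to_xml(self, node):
        # Always write the newest layout and stamp it with its version.
        node.set("version", str(CURRENT_VERSION))
        node.set("name", self.name)
        node.set("width", repr(self.width))
        node.set("height", repr(self.height))

    def read_from_xml(self, node):
        # Files written before versioning existed are treated as version 1.
        version = int(node.get("version", "1"))
        self.name = node.get("name", "")
        if version == 1:
            # Assumed old layout: one "size" attribute describing a square.
            self.width = self.height = float(node.get("size", "0"))
        elif version == 2:
            self.width = float(node.get("width", "0"))
            self.height = float(node.get("height", "0"))
        else:
            raise ValueError(f"unsupported Shape version {version}")
```

The cost of this option is that every old layout stays in every class's reader indefinitely. The alternative mentioned above moves that knowledge into a separate file-format layer instead.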

    Also, I didn’t quite get what you meant by shared pointers.

    Elling
    elling.bjastad@gmail.com
