Wednesday, October 11, 2006

Petabytes of Personal Data

A couple of days ago, I posted a note about Microsoft having 5.5 petabytes of crash dumps of the Vista release candidate, collected presumably from all around the world from members of their beta test program. If each crash dump is a gigabyte, that's 5.5 million individual crash dumps.
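
That estimate is easy to check with a quick sketch. The one-gigabyte dump size is only the guess made above, not a figure from Microsoft, and decimal (SI) units are assumed; the function name `dump_count` is just an illustrative helper.

```python
# Back-of-the-envelope check of the dump count.
# Assumptions: decimal (SI) units, and 1 GB per dump (a guess, not a measured size).
PETABYTE = 10**15
GIGABYTE = 10**9


def dump_count(archive_bytes: int, dump_bytes: int) -> int:
    """Return how many whole dumps of dump_bytes fit in archive_bytes."""
    return archive_bytes // dump_bytes


# 5.5 PB, kept in integer arithmetic to avoid floating-point rounding.
archive = 55 * PETABYTE // 10

print(dump_count(archive, GIGABYTE))  # 5500000
```

If the typical dump were smaller, say 100 MB, the same archive would hold ten times as many dumps.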

My first reaction, as an engineer, was to be impressed and a little envious. Even as we near a terabyte per spindle, building a multi-petabyte archive, collected over the Internet in half a year or so, and processing it is quite an accomplishment. It's an incredible engineering resource, and it must be fascinating to write tools that accelerate debugging by leaping from dump to dump, looking for data that will confirm or disprove a hypothesis about a particular problem. Certainly a problem related to a specific hardware configuration must stick out like a sore thumb.

My second thought, as a smug Linux user, was that it would take a really long time to get 5.5 million crashes, even if everybody in the world switched tomorrow.

Then this evening it occurred to me that Microsoft now has the memory contents of millions of people's PCs. I wonder what's in there? Bank account info? IM from a congressman? Crypto keys? It seems likely that Intel and Oracle have extensive beta test programs; perhaps part or all of a chip design or database product strategy?

You don't have to be a conspiracy theorist, or even loathe Bill Gates, to think that any one organization collecting the memory contents of millions of computers is a questionable idea. It has to be a tempting target for hackers, ambitious Justice Department folks, or even curious Microsoft employees.

I'm sure there are people out there who are members of the beta program. Were you made aware that your memory contents would be sent to Microsoft in the event of a crash, and were you warned not to use it for sensitive work? Was there an agreement you had to assent to? I wonder if this violates any EU privacy laws... Does anybody know the technical details of what is and is not included in a crash dump? (e.g., is the screen memory dumped?)
