Didn't really know which section of the forums to post this in, so I stuck it in "Friends"!
The article linked at the bottom of this post makes some interesting points - according to a study it cites, only 0.003% of our data today is stored anywhere other than on some form of computer.
Needless to say, that's a somewhat disturbing statistic. Having had loads of experience with computing hardware, I've come to realize that most of the hardware out there is actually rather unreliable - hard drives, for instance. Anything that has to spin at over 3,000 RPM for a couple of years on end is not suited to long-term data storage. On top of that, the magnetic flux patterns that hold most of our data today could be wiped out by a single big magnet (not counting CDs, of course).
Even CD-R media, purportedly the saviour of mass data storage (and DVD-R[W] along with them), last a century at most - and that's an optimistic estimate. What will future historians find when they dig up our civilisation's remains? Nothing. All our data will have dissipated like motes of dust before the wind. Our society today is defined by information - try living for a year without your credit card or bank account and without touching a computer. It's virtually impossible.
This all brings me to a related question: what is being done to back up this place? This DB? I'm not trying to be an upstart, I'm just curious how the words we spend so much time writing are being backed up... and on a similar note, out of curiosity, how large is the information we've stored on this forum in total? That's a question for Simon.
As usual, there is a rather simple solution to this problem. All we need to do is build a generalized cross-computer "fuzzy storage" system - others have written about this idea before too. Such a system would connect your computer to a very large number of other computers over a very big pipe - something like today's T1, or perhaps faster, would be required for decent latencies across such a large network. This network would specialize in abstracting the storage of data: a single large file, for instance, would be stored in distributed fashion on a thousand other computers, thoroughly scrambled and encrypted, with only the computer that owns it having access to the data. On a corporate network the administrators might have access to all the files, but if such a system were implemented properly using some sort of hybrid public-key scheme, it would be fairly simple to ensure anonymity, user control over files, and very good data redundancy. For fairness, each person would be allowed to store exactly a certain fraction of the total storage capacity THEIR computer brings to the network (perhaps a service would spring up that sells people super-redundant disk space - such services do in fact already exist, just not in the super-redundant form proposed here...).
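To make the idea concrete, here's a toy sketch of the scatter step in Python. Everything in it is made up for illustration - the fragment size, the replica count, the node names, and the hash-based placement rule are all my own assumptions, and encryption is left out entirely (see the hybrid-key sketch near the end of this post):

```python
import hashlib
import os

FRAGMENT_SIZE = 4096   # bytes per fragment (arbitrary for this toy)
REPLICAS = 5           # redundant copies of each fragment (also arbitrary)

def scatter(data: bytes, node_ids: list[str]) -> dict[str, list[tuple[str, bytes]]]:
    """Split data into fragments and hand each fragment to REPLICAS nodes.

    Placement here is just a hash of the fragment name; a real system
    would derive it from a key only the owner holds, so nobody else can
    work out where the pieces of a given file ended up.
    """
    placement: dict[str, list[tuple[str, bytes]]] = {n: [] for n in node_ids}
    fragments = [data[i:i + FRAGMENT_SIZE] for i in range(0, len(data), FRAGMENT_SIZE)]
    for index, fragment in enumerate(fragments):
        frag_id = hashlib.sha256(f"file-42:{index}".encode()).hexdigest()
        for replica in range(REPLICAS):
            # Pick a node from the fragment id and the replica number.
            # (This can pick the same node twice for one fragment; a real
            # system would skip duplicates to keep copies truly separate.)
            digest = hashlib.sha256(f"{frag_id}:{replica}".encode()).hexdigest()
            node = node_ids[int(digest, 16) % len(node_ids)]
            placement[node].append((frag_id, fragment))
    return placement

nodes = [f"node-{i}" for i in range(1000)]
placement = scatter(os.urandom(100_000), nodes)
print(sum(len(frags) for frags in placement.values()), "fragment copies scattered")
```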
The core of this whole concept is each computer following enough levels of indirection that the sheer volume of data flooding the network makes it virtually impossible to intercept transmissions and decrypt them. By the very nature of its distribution, such a network would already make tracking down all the fragments of a file very difficult - only the central algorithm on everyone's computer, primed with the correct one-way public key, would be able to integrate and maintain a virtual filesystem on the network. Mathematics like this is rather beyond me, but no doubt somebody smart at MIT can figure out a workable way to accomplish all of this with minimal overhead. We already have journalled filesystems - this just distributes the journalled filesystem across a giant network of computers, giving us ultra-high-redundancy data storage. Just like the Internet, if there are enough nodes, and a sufficiently high number of redundant copies of each file fragment are kept, there will be virtually no way to destroy the data short of destroying the entire network or a very large region of it.
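The redundancy arithmetic is actually the easy part, at least under naive assumptions. If each node independently has some probability p of disappearing before a fragment gets re-replicated, a fragment with k copies is only lost when all k copies vanish at once - so the loss probability falls off as p^k. The figures below are invented purely to show the shape of the curve:

```python
p = 0.05              # assumed chance a given node is gone (made-up figure)
for k in (1, 3, 5, 10):
    print(f"{k:2d} copies -> P(fragment lost) = {p ** k:.2e}")

# A whole file survives only if every fragment survives. Even with
# 25,000 fragments (a 100 MB file at 4 KB per fragment), ten copies
# apiece makes the overall loss probability vanishingly small.
fragments, k = 25_000, 10
print(f"P(whole file intact) = {(1 - p ** k) ** fragments:.10f}")
```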
Unfortunately, all of this piggybacks on technology we already have today, such as magnetic hard drives and flash memory. These media are notorious for their unreliability - far from suitable for very long-term storage and archival. However, the Abstracted Redundant Data Network (ARDN) (oh, I love making up acronyms!) would give us an entry point to truly redundant data storage. ARDN already abstracts and fuzzifies the location of our data - now all we need to do is archive the changes made to the data on the network, bit by bit, in a more permanent medium, such as a holographic storage device. The technology to do this is still being developed. What it would effectively mean is that the further back you want to go in the data's history, the more holocubes you'd have to process - you'd unfold the history like flower petals, until eventually you reach the very first modification. Naturally this raises privacy issues: the data would be laid bare for all to access!
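Here's what I mean by unfolding, as a toy sketch: an append-only journal where every write records the bytes it overwrote, so history can be peeled back one write at a time. The class and method names are of course just mine for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class JournalEntry:
    offset: int
    old: bytes   # the bytes this write destroyed - the "petal" we keep
    new: bytes

@dataclass
class ArchivedFile:
    data: bytearray
    journal: list[JournalEntry] = field(default_factory=list)

    def write(self, offset: int, new: bytes) -> None:
        # Save the overwritten bytes before applying the change.
        old = bytes(self.data[offset:offset + len(new)])
        self.journal.append(JournalEntry(offset, old, new))
        self.data[offset:offset + len(new)] = new

    def unfold(self, steps_back: int) -> bytes:
        """Peel the last N writes off, newest first, like flower petals."""
        snapshot = bytearray(self.data)
        for entry in reversed(self.journal[len(self.journal) - steps_back:]):
            snapshot[entry.offset:entry.offset + len(entry.old)] = entry.old
        return bytes(snapshot)

f = ArchivedFile(bytearray(b"hello world"))
f.write(0, b"HELLO")
f.write(6, b"ARDN!")
print(bytes(f.data))   # b'HELLO ARDN!'
print(f.unfold(2))     # b'hello world' - all the way back to the start
```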
But not if the ARDN implements the fragmentation process properly. Of course, there is no such thing as an unbreakable encryption scheme, especially now that quantum processors are being theorized about and researched - but the level of protection ARDN would afford is very high indeed, far higher than today's average unencrypted FAT32, NTFS, ext3, or ReiserFS filesystem!
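As for what the "hybrid public-key system" I mentioned earlier might look like, here's a sketch using Python's third-party cryptography package (so this one needs a pip install; the fragment contents and key sizes are just placeholders). Each fragment gets a fresh symmetric key, and only that small key gets wrapped with the owner's public key:

```python
# pip install cryptography
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# The owner's keypair: the public half can be handed to the network;
# the private half never leaves the owner's machine.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

# 1. Encrypt the fragment with a fresh symmetric key (fast, any size).
fragment = b"one fragment of a much larger file"
sym_key = Fernet.generate_key()
ciphertext = Fernet(sym_key).encrypt(fragment)

# 2. Wrap the small symmetric key with the owner's public key (slow, tiny).
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = public_key.encrypt(sym_key, oaep)

# The network stores (ciphertext, wrapped_key); neither is readable
# without the private key, which only the owner holds.
recovered = Fernet(private_key.decrypt(wrapped_key, oaep)).decrypt(ciphertext)
assert recovered == fragment
```

The point of the hybrid split is that the expensive public-key operation only ever touches a few dozen bytes, while the bulk data goes through fast symmetric encryption - which is how you'd keep the overhead tolerable across thousands of fragments.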
So what do you guys think about this? Am I totally OT? Am I just a smartass with too much time on his hands? Speak up!