The interesting thing about archiving and long-term storage is that we are incapable of reconciling long-term thinking with our day-to-day business.
Remember: each day we see the effects of Moore’s Law: storage becomes dead cheap, CPU power becomes immense, bandwidth and online access become ubiquitous. Every year. No exception.
And now we think about long-term storage. Hey folks, this isn’t a two-year time span here. Our customers are talking about 10-50 years. And of course we all assure them that their data is safe for that time and that in 50 years they will still be able to read it.
Sheer folly. Nobody knows what the world will look like in 50 years, or whether storage then will bear any similarity to today’s. Essentially, we can only say that we will do our best to ensure that their data survives all the evolutions and revolutions in between, and that in 50 years their implanted data processor will render some kind of representation of today’s document onto their retina. As I said: do our best.
Let’s look at some areas which come to mind where this problem hits home and where people offer solutions which, frankly speaking, will be obsolete within the decade:
WORM storage media
Simple question: do you still have WORM media from the 90s? You remember, the big 5.25-inch disks with a whopping 600 MB capacity. Of course you don’t. Aside from the fact that one of the large optical libraries from the 90s would now fit smoothly on a 50-buck USB stick, all of that stuff has been migrated at least twice in the meantime. And that was just 10-15 years ago.
So when I see people today campaigning for WORM storage platters of whatever material, promising durability for 50 years, I just have to say: so what? Even if the thing still existed in 50 years, the last drive capable of reading it would have perished about 35 years earlier. Again: just try to read a WORM from ’98. Good luck. And do you remember what your customers got in the 90s when a WORM medium was damaged? Yes, exactly: a new medium. Data? What data? :-)
So: for storage media we must embrace the fact that every 5-10 years, data will be, and has to be, migrated to the next better storage medium. We must ensure that the medium prevents accidental or malicious tampering, but there is no need at all for extreme durability, as long as the data is safe for those 5-10 years.
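What such a migration step must guarantee can be sketched in a few lines. This is a minimal, hypothetical illustration (the file names and the `migrate` helper are made up for this post); the point is the verification, not the copy:

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def migrate(src: Path, dst: Path) -> None:
    """Copy a record to the next storage medium and verify it bit-for-bit.

    The copy is trivial; the essential part of every 5-10 year migration
    is proving that nothing was lost or altered in transit.
    """
    before = hashlib.sha256(src.read_bytes()).hexdigest()
    shutil.copy2(src, dst)
    after = hashlib.sha256(dst.read_bytes()).hexdigest()
    if before != after:
        raise RuntimeError(f"migration of {src} corrupted data")

# Usage sketch with throwaway files standing in for old and new media:
tmp = Path(tempfile.mkdtemp())
old_medium = tmp / "old.dat"
new_medium = tmp / "new.dat"
old_medium.write_bytes(b"customer archive record")
migrate(old_medium, new_medium)
```

Chain such verified hops every few years and the data outlives any single medium without that medium ever needing 50-year durability.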
Signature based authentication
Now that’s my absolute favorite. People go to extremes creating technology and infrastructure to cope with the fact that signature algorithms (or rather, the encryption schemes they use) will grow weak over time.
A whole industry revolves around this topic, and around new technical specs such as the infamous TR-VELS in Germany, now named TR-ESOR (and quickly moving to TR-ASH, hopefully).
Now the first joke is that at least part of this machinery targets the hash algorithms becoming weaker over time. That is probably true, but largely irrelevant. Even if you could crack a hash algorithm by creating some second document with the same hash code, what you actually need is a meaningful manipulated document with that hash: one that adds the interesting signature or removes the incriminating sentences, and still keeps the hash. Good luck. Not very likely, to say the least.
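To make that concrete, here is a tiny sketch (the sentences are invented for illustration) of why a meaningful forgery is so hard: even a one-byte edit scrambles the digest completely, so the forger is hunting for one specific digest among 2^256 possibilities:

```python
import hashlib

original = b"I hereby transfer 100 EUR to Alice."
tampered = b"I hereby transfer 900 EUR to Alice."

h1 = hashlib.sha256(original).hexdigest()
h2 = hashlib.sha256(tampered).hexdigest()

# A single changed byte yields a completely unrelated digest; the
# attacker would need a *meaningful* forged text whose digest is
# exactly h1, not merely any colliding pair of random blobs.
assert h1 != h2
```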
And now comes the actual problem: the asymmetric encryption algorithm used to authenticate the originator of a document. Remember, the hash code vouches for the content, and the RSA signature tells you that the document was indeed created by the sender.
All the hash tree algorithms and the legal requirements to re-sign (pun originally unintended) documents stem from the assumption that in 20-50 years you will still be using the same kind of signature method, i.e. hashing plus public-key encryption.
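For readers who haven’t met them: the hash trees in question fold many document hashes pairwise into one root, so that re-signing a single root covers a whole archive. A minimal sketch (this is the generic Merkle construction, not the specific scheme any regulation prescribes):

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(docs: list[bytes]) -> bytes:
    """Fold a list of documents pairwise into a single root hash."""
    level = [sha256(doc) for doc in docs]
    while len(level) > 1:
        if len(level) % 2:            # duplicate the last node on odd levels
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

# Hypothetical archive contents:
docs = [b"contract-0001", b"contract-0002", b"contract-0003"]
root = merkle_root(docs)
# Re-signing only this 32-byte root re-protects every document at once.
```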
Unfortunately, public-key encryption is not only attacked by Moore’s Law, which you could counter by simply increasing key lengths, but also by the advent of quantum computing. Quantum computers running Shor’s algorithm will effectively crush RSA and the like, because they are so much faster at factoring the large products of primes these schemes depend on. There are ways to counter that, but they will surely not have the same architecture as the stuff used today (e.g. quantum cryptography).
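To see why factoring is the whole ballgame, here is textbook RSA with a deliberately tiny modulus; a toy sketch, nothing like real key sizes or padded signatures. Once the modulus is factored, the private signing key falls out immediately, and Shor’s algorithm performs exactly that factoring step efficiently on a quantum computer:

```python
# Toy textbook RSA -- illustrative only. Real RSA uses 2048+ bit moduli;
# a quantum computer would factor those about as easily as we
# brute-force this tiny one classically.
p, q = 61, 53                  # secret primes
n = p * q                      # public modulus (3233)
e = 17                         # public exponent
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)            # private exponent (Python 3.8+ modular inverse)

msg = 42
sig = pow(msg, d, n)           # "sign" with the private key
assert pow(sig, e, n) == msg   # anyone can verify with the public key

# The attack: factor n, and the private key falls out.
f = next(i for i in range(2, n) if n % i == 0)
p2, q2 = f, n // f
d_recovered = pow(e, -1, (p2 - 1) * (q2 - 1))
assert d_recovered == d        # the forger can now sign anything
```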
So that is what is happening with these signature systems today: a grossly expensive infrastructure is being created which will become somewhat less viable than the Dodo once quantum computers arrive.
To sum it up: when you think about long-term storage, attack the problem decade by decade, and do not attempt to create technology, infrastructure, or legislation which only has value if it still exists in 30 years. That is no value at all. The customer needs a solution now, and it must be cost-effective.
And let’s be honest: constant change is what happens, and long-term-storage can only be provided by constant adaptation to short-term technological advancement.
So keep it simple, keep it stupid and invest the savings into adaptation.