Monday, February 4, 2008

Data erasing woes

Wiping used disks has become a real burden for sysadmins, as exploiting carelessly discarded information becomes more common. In Japan, people are extremely sensitive about how their personal data is kept secret and controlled. The reality, however, is that data disclosure incidents keep happening no matter how the government enforces the law, and companies keep issuing apologies for them. Once the data gets out, you can't take it back.

I think part of the reason these incidents continue is that erasing data is simply a difficult task. For example, even if you have stored only 10 GB on a 160 GB disk, you need to sweep the whole 160 GB thoroughly to guarantee that all the data is cleared. You also have to watch out for hidden data that the erasing software cannot access. And if you really want to make the data completely unrecoverable, you need to erase the electromagnetic residue on the hard disk platters, which is hard to do in an ordinary business or office environment. Physically destroying a usable device is not an environmentally friendly practice either.
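To illustrate why the whole capacity has to be swept, here is a minimal Python sketch of a single overwrite pass. It is only an illustration under assumptions: the function name `overwrite_with_zeros` and the 1 MB chunk size are my own choices, and it works on a regular file as a stand-in for a disk. A real disk would have to be addressed through its device node with the right privileges, and even then this kind of software pass would not reach the hidden areas mentioned above.

```python
import os

def overwrite_with_zeros(path, chunk_size=1024 * 1024):
    """Overwrite every byte of an existing regular file with zeros, in place.

    Hypothetical helper for illustration only. Returns the number of
    bytes that were overwritten.
    """
    size = os.path.getsize(path)      # the full size, not just the "used" part
    zeros = bytes(chunk_size)         # a reusable buffer of zero bytes
    with open(path, "r+b") as f:      # open for update without truncating
        remaining = size
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(zeros[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())          # ask the OS to push the data to the device
    return size
```

The point is that the loop runs over the full size of the target, regardless of how much of it ever held meaningful data.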

Another problem is that the sweeping process is slow, because the physical write speed governs the whole performance. For example, one of my old portable 40 GB hard disks can be written at only 20 MB per second, so a single sweep of the whole disk takes at least 2000 seconds, or about 33 minutes. You need to do this multiple times to make sure the residue of the data is not easily detectable, so the whole process may take 2 or 3 hours. 3 hours for just 40 GB. 12 hours for 160 GB. (sigh)
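The arithmetic can be written down as a rough Python sketch. The function name `sweep_seconds` and the pass count of 5 are my own assumptions (I only said "multiple times" above), and the result is a lower bound that ignores everything except the raw write speed.

```python
GB = 10**9   # decimal gigabyte, as disk vendors count it
MB = 10**6   # decimal megabyte

def sweep_seconds(capacity_bytes, write_bytes_per_sec, passes=1):
    """Lower bound on the time needed to overwrite a whole disk `passes` times."""
    return capacity_bytes * passes / write_bytes_per_sec

# One pass over the 40 GB disk at 20 MB per second.
one_pass = sweep_seconds(40 * GB, 20 * MB)

# Assumed 5 passes, for the 40 GB disk and for a 160 GB disk at the same speed.
small_disk_hours = sweep_seconds(40 * GB, 20 * MB, passes=5) / 3600
large_disk_hours = sweep_seconds(160 * GB, 20 * MB, passes=5) / 3600

print(one_pass, small_disk_hours, large_disk_hours)
```

With these assumed numbers, one pass comes to 2000 seconds, and five passes come to a bit under 3 hours for 40 GB and a bit over 11 hours for 160 GB, which is in line with the rough figures above.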

The amount of storage an individual has to manage is getting bigger every year, and I wonder how people cope with this. Do you sweep and erase a used disk before you resell it or give it away to somebody?