Concurrently Chaotic: February 2008

Thursday, February 21, 2008

Teleworking technology: we already have it

I've been thinking for months about what technology can help people discover and get the most out of teleworking. I'm getting closer and closer to the following conclusion:

We've already built the teleworking infrastructure, at least in my home country Japan and in the many other nations and regions that have already deployed the Internet. Period. We need to focus on how to break down the barriers and remove the impediments that keep people from embracing teleworking technologies as their own tools.

Let me put it simply: teleworking is no longer something special. Writing this blog is itself a very good example of teleworking technology at work. All I need is a decent Web browser capable of running JavaScript for Blogger's writing environment, and I can publish and show you what I write, like what you are reading now.

Collaboration is getting easier and easier, too. LUNARR allows you to flip an electronic document over and lets you and other people write something on the back. This is a rather intuitive way to co-author a document, which I had never thought about until I saw it. Geeks will do the whole thing in geekier ways using SSH, rsync, CVS, Subversion, or whatever else they use for software development, but it's not only for geeks anymore. Anyone who has to write a document can do it in a teleworking environment.

So let's stop saying "I'm not allowed to work from (wherever you want other than your office) because blah blah blah..." and talk about how you can raise your productivity by introducing teleworking into your lifestyle. If you and your employer think teleworking is something restricted to special people, that mindset is archaic and no longer applicable to the 21st-century lifestyle. And let's not take commuting as a duty; it should be a choice, and it has to be, to reduce the gross amount of time and energy wasted by being forced to commute.

But beware that teleworking is not necessarily a duty either; don't do it the way an xkcd comic depicts. Don't leave your partner alone just because "something is wrong on/with the Internet" (grin).

Sunday, February 17, 2008

The art and limit of dependencies

If you want to get a job done, you need to list the necessary things and tools, and the procedures to use and apply them. In other words, you have many dependencies on them.

In modern software development, no piece of software can stand without depending on other pieces. You do not want to write a C program without the standard I/O library. Dependencies on tools are also important and critical; if your code includes parts written in FORTRAN, C, C++, and Java, you need four compilers and language execution environments.

Computer programmers have been making tools to resolve dependencies automatically. make is a popular one, derived from the UNIX programming environment. It parses a ruleset called a Makefile and determines whether you need to rebuild a result from its source files by comparing their timestamps. If make finds that one of the source files is newer than the result, it invokes the command to rebuild the result.
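The timestamp rule can be sketched in a few lines of Python. This is only an illustration of the comparison described above, not how make itself is implemented; the function name needs_rebuild is a made-up helper, and a real make also follows chains of rules, which this sketch ignores.

```python
import os

def needs_rebuild(target, sources):
    """Return True if the target is missing or older than any source file.

    Hypothetical helper: it mimics only the timestamp comparison,
    not the rule parsing or command execution that make performs.
    """
    if not os.path.exists(target):
        return True  # nothing built yet, so a build is required
    target_mtime = os.path.getmtime(target)
    # Rebuild when at least one source was modified after the target.
    return any(os.path.getmtime(src) > target_mtime for src in sources)
```

A caller would run the build command only when needs_rebuild(...) returns True, which is the decision make takes for each rule.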

Version control systems such as Subversion, CVS, and RCS are other good examples of dependency management tools. You can save the history of changes to a file, a directory, or a set of directories. You can make a software package by checking out a set of tagged files; you can even make multiple branches of the code.

The FreeBSD operating system has its own dependency management system for externally contributed programs, called ports and packages. A port is a set of rules, configuration files, and source code necessary to build a program. A package is the derivative of a port, built by another computer. Many essential parts of FreeBSD subsystems, including the X Window System and the Perl programming language, are installed as packages, because they are not considered core parts of FreeBSD.

I'm always pleasantly surprised when a very complicated port, such as the Japanese version of LaTeX, a set of typesetting and documentation programs, can be built without major glitches, including the automatic installation of the programs it depends on, such as Ghostscript. For most of the ports, the FreeBSD volunteers are doing an outstanding job.
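To give a feel for what "automatic installation of the programs it depends on" involves, here is a small sketch that orders a dependency map so that every dependency is built before the thing that needs it. It is not how the ports system works internally; the names in the map are invented for illustration, and it assumes Python 3.9 or later for the standard graphlib module.

```python
from graphlib import TopologicalSorter, CycleError

def build_order(depends_on):
    """Return names ordered so that each dependency precedes its dependents.

    depends_on maps a name to the set of names it needs first.
    Raises CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())

# Hypothetical dependency map, not taken from the real ports tree.
ports = {
    "ja-latex": {"latex", "ghostscript"},
    "latex": {"fonts"},
    "ghostscript": {"fonts"},
    "fonts": set(),
}

order = build_order(ports)
print(order)  # "fonts" comes first and "ja-latex" comes last
```

A circular dependency makes build_order raise CycleError, which is roughly the situation where no automatic tool can decide what to build first.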

I feel very annoyed, however, when I have to untangle the web of dependencies while installing or building a software toolset from scratch. Unfortunately, Xorg 7.3, a free implementation of the X Window programs, fonts, and tools, failed to build in my environment. I had to copy three prebuilt sets of files to finish building the necessary programs: C/C++ include files, shared/static libraries, and the very basic fonts.

In my case of Xorg kitbuilding, some very old fonts that have existed since the late 1980s were not successfully compiled and converted into the various ISO 8859 part codesets, and I could not build the font handling libraries. This glitch killed the whole automatic compilation task, which was supposed to untangle the enormous list of dependencies. I also found that the target directory had been merged into /usr/local from the traditional /usr/X11R6, and I wasn't sure whether I had safely moved the old files and subdirectories under the old directory to the new one.

Untangling the web of dependencies is a very hard task, since the parameters you need to examine are scattered around all the directories, in all the ports, and you often also have to set an environment variable to do something extraordinary. While I can guess what I should do because I've done a lot of this kind of task, I never want to do it, because scanning my memory and poking around the files is mentally very painful. In today's hostile computing environment, vulnerabilities can easily sneak into such complicated and unformalized tasks.

How do you manage your computers? Do you care about the dependencies of the tools, programs, configuration files, and other objects?

Sunday, February 10, 2008

Accountability of an SNS user

On mixi, a major Japanese SNS, I feel I have always been exposed to a strong peer pressure, which I find common in Japanese society: you should not argue with others unless you really want to break up with those people. This sort of fuzzy feeling covers the whole of Japanese society, including its part on the Internet.

I really don't like the fuzzy Japanese social atmosphere, though I don't favor the hostile and negative environment I always find in anonymous and open bulletin boards and blogs in Japan nowadays either. I always want to have a creative discussion with constructive criticism. Most of the time, however, people tend to go the opposite way. This is what I've learned from both real-world and Internet communication.

One of my friends claimed that the warm fuzziness that surrounded him in the 1980s in the Japanese online community of bulletin boards was something completely different from what came after the popularization of the Internet in Japan. He told me that the fuzziness I don't like actually helped the users govern themselves. I asked him why.

He told me that the bulletin board operators could individually locate and even pursue each account holder, because they exchanged a written agreement on paper and charged the access fee individually to each user's bank account, automatically withdrawn each month. He said he had many anonymous or pseudonymous friends there, because he could trust those people even without knowing their real names, thanks to the guarantee of financial and social accountability for each user that the BBS operator provided. In other words, the participants were expected to behave nicely online, and they could keep each other from extreme disputes.

The friend and I agreed, however, that you can no longer expect such a high degree of individual responsibility on the current SNSes and other Internet communities. Many people belong to many different systems, and those systems are usually supposed to keep personal information undisclosed except in response to a legal request from the authorities. This means that every dispute between Internet participants has to become a legal issue, or a lawsuit.

And even if you win a legal battle, you can't really collect the compensation the law entitles you to, unless law enforcement officers actually make the defendant pay. The legally responsible individual for one of the largest anonymous BBSes in Japan keeps refusing to pay the money the courts have ordered him to pay, even though he has lost multiple lawsuits. He is actually winning the battle by not complying with the court orders, because those lawsuits are all civil cases, not criminal ones. This is a good example of the limits of depending on legal procedures to resolve individual disputes over the Internet. Forcing people to use "real" names will not solve this problem at all.

How can an SNS user be held accountable for his/her legal and social activities? This is not directly a technology issue, but technology can and must help resolve it. And I think tagging individuals and restricting online identifiers will not necessarily work effectively on this issue.

Friday, February 8, 2008

CALLing disaster during MySQL upgrade

I've been upgrading one of my daily-use servers by migrating the running environment between two PCs. It's not a mission-critical server because I don't run a public service there, but I need to do the upgrade carefully anyway because the versions of FreeBSD and the other applications have changed. I want to keep the old environment as long as possible to use it as a reference, so I'm doing the migration manually by recompiling and reconfiguring the software.

One of the glitches I faced was about MySQL. It's not about the bugs in MySQL, because the database server load is very small. I was trying to transfer a dataset from the old version 4.0 to the new version 5.1 of MySQL. The mysqldump output from the old 4.0 server could not be reloaded into the 5.1 server. I could not even perform CREATE TABLE. The reason: the column and database identifiers were not properly backquoted.

The database was for my amateur radio activity. Amateur radio stations have callsigns, and in a popular contact log exchange format called ADIF, the other party's callsign is represented by the identifier CALL. Unfortunately, CALL is a reserved word in MySQL 5.1, while in version 4.0 it apparently was not. This is a tragedy for an amateur radio enthusiast, and for a careless programmer like me who tends to omit proper (back-)quotation.

After investigating for a few minutes, it occurred to me to refer to the output of SHOW CREATE TABLE from the 4.0 server. I did, and fortunately the table definition was properly backquoted, so at least I could rebuild the database skeleton. The dumped data was a set of INSERT INTO statements with VALUES, all properly quoted, so I could rebuild the database.

The output of the new mysqldump command in MySQL 5.1 looks much better: it properly backquotes and quotes all the necessary strings, and even includes the SQL statements to lock and unlock the tables. The entire dataset is represented by a single INSERT INTO statement with one set of VALUES, so you've got to be careful when you want to use only part of the data set.

I should note that I had to rewrite all the software that generated the SQL statements to use proper (back-)quotation. This meant a handful of complicated fixes to Bourne shell and Perl scripts.
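The kind of fix involved can be sketched as follows. My actual scripts were Bourne shell and Perl, so this Python version is only an illustration of the idea; the helper names, the table name qso_log, and the column types are all made up for the example. The one rule it relies on is that MySQL quotes identifiers with backquotes and escapes an embedded backquote by doubling it.

```python
def quote_identifier(name):
    """Backquote a MySQL identifier, doubling any embedded backquote."""
    return "`" + name.replace("`", "``") + "`"

def create_table_sql(table, columns):
    """Build a CREATE TABLE statement with every identifier backquoted.

    columns is a list of (column_name, column_type) pairs. The column
    types are passed through as-is, so they must come from trusted code.
    """
    cols = ", ".join(f"{quote_identifier(name)} {ctype}" for name, ctype in columns)
    return f"CREATE TABLE {quote_identifier(table)} ({cols})"

# Hypothetical table for a contact log; CALL is the ADIF field name.
print(create_table_sql("qso_log", [("CALL", "VARCHAR(16)"), ("QSO_DATE", "DATE")]))
# CREATE TABLE `qso_log` (`CALL` VARCHAR(16), `QSO_DATE` DATE)
```

With the identifier always wrapped this way, a column named after a reserved word no longer depends on which words a given server version reserves.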

And I realize why SQL injection is so popular for attacking database servers. Parsing SQL correctly is a non-trivial process. A word can be either part of a directive or a target identifier, depending on its position in an SQL statement.
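Here is a small demonstration of how a value can turn into part of the directive. It uses Python's built-in sqlite3 module as a stand-in, since it needs no server; this is an assumption made for the sake of a runnable example, and the table, function names, and callsigns are invented. The unsafe function pastes the value into the statement text, while the safe one passes it as a parameter so it is never parsed as SQL.

```python
import sqlite3

def make_log(callsigns):
    """Create an in-memory table holding the given (made-up) callsigns."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE log ("CALL" TEXT)')
    conn.executemany('INSERT INTO log ("CALL") VALUES (?)', [(c,) for c in callsigns])
    return conn

def find_unsafe(conn, callsign):
    """Build the statement by string concatenation (vulnerable)."""
    sql = "SELECT \"CALL\" FROM log WHERE \"CALL\" = '" + callsign + "'"
    return [row[0] for row in conn.execute(sql)]

def find_safe(conn, callsign):
    """Pass the value as a bound parameter, so it stays plain data."""
    sql = 'SELECT "CALL" FROM log WHERE "CALL" = ?'
    return [row[0] for row in conn.execute(sql, (callsign,))]

conn = make_log(["TEST1", "TEST2"])
hostile = "x' OR '1'='1"           # the quote ends the string early
print(find_unsafe(conn, hostile))  # every row matches: the input became SQL
print(find_safe(conn, hostile))    # no row matches: the input stayed a value
```

Note that bound parameters cover values only. Identifiers such as column names cannot be passed this way, which is why they still need careful quoting of their own.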

So when you want to store CALLs in a MySQL database, you've got to do it carefully with (back-)quotation.

Monday, February 4, 2008

Data erasing woes

Clearing used disks has become a real burden for sysadmins, as exploitation of carelessly discarded information becomes more common. In Japan, people are extremely sensitive about how their personal data records are kept secret and controlled. The reality is, however, that data disclosure incidents keep happening, no matter how the government enforces the law, and companies make frequent apologies for those incidents. Once the data get out, you won't be able to take them back.

I think part of the reason for the continuing data disclosure incidents is that erasing data is simply a difficult task. For example, even if you store only 10G bytes on a 160G-byte disk, you need to thoroughly sweep the whole 160G bytes to guarantee all the data are cleared. Also, you've got to be careful about hidden data that the data erasing software cannot access. If you really want to make the data completely unrecoverable, you need to erase the traces of magnetic residue on the hard disk platters, which is hard to do in a usual business or office environment. Destroying a usable device is not an environmentally friendly practice either.

Another problem is that the sweeping process is slow. The physical writing speed governs the whole performance. For example, one of my old portable 40G-byte hard disks can be written at only 20M bytes per second. So it will take at least 2000 seconds, or about 33 minutes, to sweep the whole disk once. You need to do this multiple times to ensure the residue of the data is not easily detectable, so the whole process may take 2 or 3 hours. 3 hours for just 40G bytes. 12 hours for 160G bytes. (sigh)
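The arithmetic is simple enough to put in a tiny calculator. This sketch assumes decimal units (1G byte = 10^9 bytes, 1M byte = 10^6 bytes) and a constant write speed, so the result is a lower bound; the function name and the pass count in the usage note are just examples.

```python
def sweep_seconds(capacity_bytes, write_bytes_per_sec, passes=1):
    """Lower bound on the time to overwrite a whole disk `passes` times."""
    return capacity_bytes / write_bytes_per_sec * passes

GB = 10**9  # assumption: decimal gigabytes
MB = 10**6  # assumption: decimal megabytes

one_pass = sweep_seconds(40 * GB, 20 * MB)
print(one_pass, "seconds, about", round(one_pass / 60), "minutes")
# 2000.0 seconds, about 33 minutes
```

Calling sweep_seconds(160 * GB, 20 * MB, passes=3) gives 24000 seconds, which is already more than six and a half hours at the same speed.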

The data storage that an individual has to manage is getting bigger and bigger every year. I wonder how people can cope with this. Do you sweep and erase the data on a used disk before you resell it or give it away to somebody?