Concurrently Chaotic: 2009

Wednesday, December 30, 2009

Persistent Erlang processes or process pairs

Erlang processes are the minimal unit of execution in the Erlang BEAM virtual machine. Each process has its own ID, can send and receive messages, can register a name with the BEAM node it is running on, and can link with another process for error handling and monitoring. It even has its own process dictionary.

I've been thinking about a question for a few days: can you keep a process alive on computers for more than 100 years? The word process here does not necessarily have to mean the Erlang one, but an Erlang process would be a good candidate, because the working environment it has to carry around with it is minimal and much smaller than that of a UNIX process.

Keeping a process alive for a long time would not be possible if the process were confined to a single machine; the machine's failure means the immediate death of the process. So the process should be able to move between multiple machines and clone itself on its own. Using shared memory should be avoided as much as possible. Realizing these characteristics with Erlang is less difficult than in other computer language systems.

I discovered that the idea of a persistent computer process is actually not original to me. A Google search tells me that Jim Gray published a technical report in 1985, when he was at Tandem (PDF copy from HP Labs) (a scanned text file of the report), with an idea of persistent process-pairs as a part of his model of transactions for simple fault-tolerant execution.

In Gray's report, he describes a much smarter approach: making two persistent processes a pair to realize the persistency. If one of the pair fails, the other one will step in and take over the actions of the failed one. This idea is much wiser than trying to keep a single process alive.
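
The takeover behavior described above can be illustrated with a toy simulation. The following Python sketch is not Gray's actual protocol and not Erlang code; the names `Worker` and `ProcessPair` are hypothetical, the two "processes" are plain objects in one interpreter, and the checkpoint is a simple copy of a running total from the primary to the backup.

```python
"""Toy simulation of a process pair: a primary does the work and
checkpoints its state to a backup, which takes over when the primary fails.
All names here are hypothetical; this is not Gray's protocol in detail."""


class Worker:
    def __init__(self, name, state=0):
        self.name = name
        self.state = state      # the work done so far (a running total)
        self.alive = True

    def add(self, n):
        if not self.alive:
            raise RuntimeError(self.name + " is dead")
        self.state += n
        return self.state


class ProcessPair:
    def __init__(self):
        self.primary = Worker("A")
        self.backup = Worker("B")
        self.generation = 2     # number of workers created so far

    def add(self, n):
        """Apply one operation; on primary failure, let the backup take over."""
        try:
            result = self.primary.add(n)
        except RuntimeError:
            self._take_over()
            result = self.primary.add(n)
        # Checkpoint: copy the new state to the backup after each operation.
        self.backup.state = self.primary.state
        return result

    def _take_over(self):
        # The backup becomes the new primary, and a fresh backup is created
        # from its state, so that the pair is a pair again.
        self.primary = self.backup
        self.generation += 1
        self.backup = Worker("W%d" % self.generation, self.primary.state)


pair = ProcessPair()
pair.add(10)
pair.primary.alive = False      # simulate a crash of the primary
total = pair.add(5)             # served by the former backup
print(pair.primary.name, total)
```

In a real system the two members of the pair would live on different machines, and the checkpoint would be a message rather than an assignment; the sketch only shows that the caller of `add` never sees the failure.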

So now I have found a way to realize persistent processes; next I need to learn how to implement them. It'll be a part of my New Year's resolution for 2010. A happy new year to all of you, the readers.

Monday, December 21, 2009

DNS operation is utterly neglected by many people

The Twitter outage caused by DNS hijacking showed another case of a common symptom: DNS operation is simply neglected by people doing business on the Internet.

I was doing research on DNS transport security from 2002 to 2008. One of the reasons I quit focusing on the research was that most, if not all, DNS problems are caused by operational failures, not necessarily by technical deficiencies of the DNS protocols and systems. In short, it's too political and social to do technological experiments over DNS.

I still think DNS transport protocol issues are critical for stable Internet operation. But solving those issues does not help in recovering from human errors, such as lame delegation (a missing link) within the domain name hierarchy. And stable operation of DNS systems is very difficult to maintain without stable hardware, software, networks, and operators.

I notice many small companies (especially in Japan) keep their authoritative servers inside their offices, which is not good from the point of view of stability. Actually, for many small Internet sites, including mine, not many DNS zone records have to be exposed to the public. So I've already outsourced my DNS authoritative servers, while I periodically watch whether those servers do the right thing.
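
Periodic watching of outsourced authoritative servers can be quite simple. The following Python sketch assumes that the SOA serial returned by each name server for the zone has already been collected by some other means (for example, by hand with a lookup tool); the function name `check_zone` and the server names are hypothetical, and the serials are compared as plain integers, which ignores the wrap-around rules of real serial number arithmetic.

```python
def check_zone(answers):
    """answers maps a name server to the SOA serial it returned for the zone,
    or None if it gave no authoritative answer (a lame server)."""
    lame = sorted(ns for ns, serial in answers.items() if serial is None)
    serials = {ns: s for ns, s in answers.items() if s is not None}
    newest = max(serials.values(), default=None)
    # Servers still holding an older copy of the zone than the newest one seen.
    stale = sorted(ns for ns, s in serials.items() if s != newest)
    return {"lame": lame, "stale": stale, "newest_serial": newest}


# Hypothetical observations for a zone served by three name servers.
observed = {
    "ns1.example.net": 2009122101,
    "ns2.example.net": 2009121501,   # has not picked up the latest zone yet
    "ns3.example.net": None,         # does not answer authoritatively
}
report = check_zone(observed)
print(report)
```

A stale server may simply be in the middle of a zone transfer, so a report like this is a prompt for a human to look, not a verdict.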

DNS is by definition a distributed system, and the standard of its management is much lower than what people (and even Internet engineers) believe. For further details of how DNS is not well managed, I suggest you read a more detailed commentary on how important DNS is as an asset, by Danny McPherson of Arbor Networks.

Sunday, December 6, 2009

Bruce Schneier's speech at IWSEC2009

I had a chance to meet Bruce Schneier face-to-face for the first time when I attended his invited talk session at the IWSEC2009 conference in Toyama, Japan, on October 28, 2009.

I once worked on translating Schneier's book Email Security (published in 1995, and now declared outdated by him) into Japanese. At that time he was a cryptography technologist. The keynote speech in Toyama showed, however, that he had become more interested in psychology and human behavior, which is not necessarily logically predictable and is often considered erroneous from a technological point of view.

While I read that a few people had apparently tweeted that Schneier's speech was boring, I found his speech on the psychology of security rather refreshing and interesting. Maybe that's because I've been frequently disillusioned by how technological solutions often backfire. Of course, the speech was not about the details of cryptography or other security protocols, which are the primary topics of IWSEC, so it might have been boring for the majority of the participants.

I won't go into the details of Schneier's speech, because most of the individual topics are frequently covered in his blog. Let me write about one of the things that intrigued me the most: risk heuristics. People are risk-averse and try to secure a sure gain when they stand to gain something. At the same time, they prefer a probabilistic loss, that is, risk-taking behavior, when they face the possibility of losing something. With this heuristic way of thinking, people usually don't want to pay for a less risky life, and this is exactly one of the reasons why security products don't sell well.

After the speech, I asked him why he had converted from a pure technologist to more of a scientist of broader topics including psychology and sociology. Unfortunately I didn't get a definitive answer on what made him do so; he only emphasized that the sociological aspects of security were as important and critical as the technological ones. Maybe I could find the answer in one of his books, unless the reason is a highly personal one, which no one will ever know.

Saturday, December 5, 2009

Erlang and Github

Erlang/OTP is now officially maintained in a Github repository, as of release R13B03. I think this is a milestone for the language, because the Ericsson development team finally decided to show the interim results of what they are currently working on.

One of the characteristics I like about Erlang is that the language specification and libraries have been maintained by a single entity, Ericsson's Erlang/OTP Development Team. I do not want anarchy in computer languages and operating systems. I prefer BSDism to Linuxism in this sense; I think pieces of code should be controlled by the core people, while sufficiently accepting improvements from other developers.

The old Erlang/OTP daily snapshot archives, however, are no longer sufficient to catch up with the daily development cycles. And many non-Ericsson authors, including me, have contributed patches to Erlang. So there had to be some system to accept user feedback.

Using an open repository system such as Github is a wise idea for incorporating new code into Erlang/OTP and showing the official status of modifications. Git is flexible enough to allow per-user and per-purpose branches, and Github allows users to fork each other's repositories. The Ericsson team doesn't have to build and publicize its own code repository system for Erlang/OTP, which would cost them a significant amount of human and financial resources.

And now I have an official requirement to learn Git: catching up with the Erlang/OTP development cycles.

Wednesday, September 2, 2009

The definition of eventually secure systems

I've been using Web services built on a new assumption about integrity these days, one which allows data inconsistency over a span of a few minutes. The designers of those systems relax the data consistency condition in order to give higher priority to availability and to tolerance of splits between the database subsystems within a cluster representing an integrated database.

Then a question comes to my mind: what does it mean for a database to be secure while allowing an unstable condition lasting a few minutes? Of course, guaranteeing unconditional access restriction is one way to claim that a database is secure, provided each party allowed to access the database does not harm its integrity at all. This sort of strict access limitation, however, is impractical for a public system. So a new notion of security, probably called eventually secure systems, should be introduced. But how? I still have no idea.

Traditionally, databases are designed under the restriction of Atomicity, Consistency, Isolation and Durability (ACID) for every query and update operation. The ACID policy demands locking of critical sections between conflicting database requests and causes performance degradation.

On the other hand, Gilbert and Lynch [1] show in their treatment of the CAP Theorem for distributed databases that three properties of a database cannot all be realized at the same time: data consistency, availability, and tolerance to network partition. BASE [2], which stands for basically available, soft state, eventually consistent, is an example of an anti-ACID design policy based on the CAP Theorem, giving higher priority to availability and tolerance to network partition than to data consistency.

Vogels [3] also explains the idea of eventual consistency, or an eventually consistent change of states, by analogy to the Domain Name System (DNS). Clients querying this distributed database may see inconsistency while database update events propagate, but the inconsistency will be resolved in a finite period determined by the configuration of the replication network between the database caches.
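
The DNS analogy can be made concrete with a toy model. The following Python sketch is a deliberately simplified illustration, not how any real DNS resolver or database is implemented; the classes `Authority` and `Cache`, the `ttl` parameter, and the explicit `now` clock are all assumptions made for the example.

```python
class Authority:
    """The authoritative copy of the data."""
    def __init__(self):
        self.records = {}

    def update(self, key, value):
        self.records[key] = value


class Cache:
    """A replica that re-reads the authority only after its copy expires."""
    def __init__(self, authority, ttl):
        self.authority = authority
        self.ttl = ttl
        self.entries = {}   # key -> (value, time the value was fetched)

    def query(self, key, now):
        cached = self.entries.get(key)
        if cached is not None and now - cached[1] < self.ttl:
            return cached[0]            # possibly stale, but not yet expired
        value = self.authority.records[key]
        self.entries[key] = (value, now)
        return value


auth = Authority()
auth.update("www", "192.0.2.1")
cache = Cache(auth, ttl=300)

first = cache.query("www", now=0)       # fetched from the authority
auth.update("www", "192.0.2.2")         # the record changes at the source
stale = cache.query("www", now=100)     # the cache still serves the old value
fresh = cache.query("www", now=300)     # TTL expired: the cache converges
print(first, stale, fresh)
```

In this toy the window of inconsistency is bounded by `ttl`, because there is a single cache and the authority is always reachable; neither assumption holds for a real replication network.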

While the CAP Theorem, BASE, and the notion of eventually consistent systems are effective in relaxing the boundary conditions of data inconsistency for building very large-scale systems, those ideas will not solve the core issue: how to keep a database cluster consistent within a finite, predictable time range. I understand many applications do not require atomic consistency of data, especially those for casual conversation, such as Twitter or Facebook. I don't think, however, that a banking system can be created under the BASE principle, unless the maximum allowance of temporal data inconsistency or the maximum time of eventual convergence is given and proven.

And I think that in running large-scale systems, things often become eventually inconsistent and disintegrated, rather than eventually consistent. I still wonder how we can solve this problem consistently.

References:

[1] Gilbert, S. and Lynch, N. 2002. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News 33, 2 (Jun. 2002), 51-59. DOI=http://doi.acm.org/10.1145/564585.564601
[2] Pritchett, D. 2008. BASE: An Acid Alternative. Queue 6, 3 (May. 2008), 48-55. DOI=http://doi.acm.org/10.1145/1394127.1394128
[3] Vogels, W. 2008. Eventually Consistent. Queue 6, 6 (Oct. 2008), 14-19. DOI=http://doi.acm.org/10.1145/1466443.1466448

Wednesday, August 12, 2009

Current and outdated references of secure C programming

C is the modern assembly language for many architectures, and still the most useful computer language for me. C does not have a rigid grammar and has a lot of variants and local dialects; it has been revised a few times, from the old UNIX C through ANSI C 1989, which first introduced prototypes, to C99. Finding the de facto standard elements of C is complicated work. You can find a bunch of different indentation and writing styles in C code. I do not recommend a specific coding style in this article; I can only recommend that you follow the mainstream style of the project you are working in. Sometimes you have to read books to discover the right thing to do. I currently recommend the following books for C programming:

C: A Reference Manual (5th edition) (2002), which has a lot of precise and detailed examples of ambiguous usage;
Secure Programming Cookbook for C and C++ (2003), which explains what to do and what not to do to write secure code; and
Advanced Programming in the UNIX Environment (2nd Edition) (2005), which explains usages of modern UNIX system calls and libraries.

For practical programming, however, depending on books is not enough. Actually, the books I recommended above are 4 to 7 years old as of 2009, so if you want to know the cutting-edge details of programming, you should read the latest software. Consulting a C compiler manual and well-written source code such as that of the BSD kernels is a must if you want to write efficient code (those are freely available). One thing to which you have to pay special attention is that books eventually but surely become outdated. Books are not Web articles; they are static and will not change. The lifespan of a reference book on computer science is typically very short these days, due to the rapid change of technologies. Books about C are no exception. And I should confess that a few days ago I decided to sell the following old worn-out books because I found them simply outdated (and I no longer recommend them):

The C Programming Language (2nd edition) (1988)
The Standard C Library (1991)

The reasons I found them outdated are as follows:

They are old, written approximately 20 years ago, and they do not reflect the changes of C99 and other additional elements;
They do not mention secure programming at all, including
- avoiding reference to non-existent data objects,
- preventing buffer overflows,
- limiting the length of a string;
and
The C library structure and source have changed a lot over these 20 years.

Frankly speaking, I loved those old books, especially as I referred to them the most during my apprenticeship in the language in the late 1980s. Those books were the only source before the Web. I had to read the old bestsellers many times to discover the details. I do respect the authors of those books. They are pioneers of UNIX and C programming. Nothing is eternal, however, and I suggest you stop using outdated reference books ASAP for every subject, not only for programming.

Monday, March 30, 2009

Moving away from heavyweight blogs

I admit I've been dormant on blogs. Instead I've been active on Twitter and Tumblr; you can be reactive and post short messages on those so-called microblogging systems. Time is the scarcest resource for me, and I want something with smaller overhead to write on.

I will not entirely remove the contents of this blog, but I will not post new articles here until further notice.

FYI, here are the pointers to my microblogging sites:
(Note: the Tumblr sites were deleted by July 2009)

~~jj1bdx: quote of technology~~
~~Cyberperiscope Reviews (in Japanese)~~
~~Twitter: jj1bdx (in Japanese)~~
Twitter: kenji_rikitake (in English)
Concurrently Chaotic (in English)

Monday, January 26, 2009

Open-plan office and peer-monitoring socialism against creativity

When I started working as a programmer after I graduated from college in the 1990s, I was fortunate enough to have a wall-separated booth, though without a door. This is something which workers at research laboratories in the USA or Canada have taken for granted. But things have been different in Japan, where I live and work.

Having a separate space for individuals has been considered a luxury in Japanese companies, where people think space is money. So I should emphasize that I was fortunate, because corporate offices in Japan are still mostly open-plan: everybody sees each other with no walls, there is a whole bunch of noise, and everyone is forced to listen to each other.

I had to work in an open-plan office in Japan in the 1980s as an intern, and I thought working in the office would surely hurt my body and degrade the quality of my thinking. If I had just been moving around and doing ordinary tasks, I wouldn't have given it much thought. But I had to think there to write a technical report. So I thought something had to be changed.

I do not reject the idea of shared meeting space or the importance of face-to-face meetings. Those are vital factors of successful companies. But without a place for solitude, nobody would be able to think. Without thinking, no innovation will come, and no new idea will emerge. How can you think without being alone?

Recently I found an article on the Web which says working in an open-plan office makes you sick and is hazardous to your health.

A recent study by Dr. Vinesh Oommen and his group at Queensland University of Technology shows the following results:

Results: Research evidence shows that employees face a multitude of problems such as the loss of privacy, loss of identity, low work productivity, various health issues, overstimulation and low job satisfaction when working in an open plan work environment.

Well said.

Tom DeMarco and Timothy Lister also write in one of their classics, Peopleware (2nd Edition, 1999, Dorset House Publishing), as follows (in Chapter 12):

Management, at its best, should make sure there is enough space, enough quiet, and enough ways to ensure privacy so that people can create their own sensible workspace.

I read the 1st edition of Peopleware (published 1987) in 1989, so the workplace privacy issue has been well known for at least 20 years.

On the other hand, the Japanese workplace has changed little over the past 20 years. I still see many open-plan offices, especially among non-engineering workers.

I suspect Japanese open-plan offices are designed for managers to put their subordinates under surveillance during working hours. This is an example of the dark side of Japanese workplace socialism.

In a typical office layout, the manager of a team has his or her own desk beside the cluster of desks for the team members. A team member can't take a rest or make a physical movement during working hours. I think this sort of desk layout does not respect the health of the team members, let alone their privacy or productivity.

I've found quite a few articles about this open-plan office sickness issue on the Web. So I think this is a matter of concern for many people. Maybe this is a sort of backlash due to the recent economic depression.

I'd rather work alone than be put into an open-plan office every day again, so long as my brain and my ideas are the source of my income.

Saturday, January 10, 2009

The risks of systems left alone and untested

Computer systems left alone and unmaintained are a prime source of risk. Those systems may cause a serious crisis and a major service disruption.

On September 14, 2008, All Nippon Airways (ANA), a major Japanese airline, suffered a disruption of its ticketing service due to the expiration of cryptographic function software (as they announced in the Japanese press release). This can reasonably be assumed to be about a PKI certificate, according to other Japanese press reports such as ITmedia's and Nikkei ITpro's.

The chilling fact revealed was that ANA had left the cryptographic function unused for 2 years and did not review its expiration at all when they activated the function for the terminals used by the ticketing agents. This is an awful example of software development indeed.

I wrote about the service disruption for RISKS-DIGEST 25.34 just after the incident occurred. And I recently learned that the article was quoted by another blog article yesterday.

The expiration issue is not only about PKI certificates; domain name registration is another source of expiration risk. An expired domain can be taken over by attackers and abused for phishing. Software licenses are another good example. In general, expiration is a part of overall misconfiguration. So when did you last review the expiration dates of the system resources under your control?
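
Such a review can be partly automated once the expiration dates are written down in one place. The following Python sketch assumes a hand-maintained inventory; the function name `expiring`, the 30-day window, and the resource names and dates are all made up for illustration.

```python
from datetime import date, timedelta


def expiring(resources, today, within_days=30):
    """Return (name, days_left) for resources that have already expired or
    will expire within `within_days`, soonest first."""
    limit = today + timedelta(days=within_days)
    due = [(name, (expires - today).days)
           for name, expires in resources.items() if expires <= limit]
    return sorted(due, key=lambda item: item[1])


# Hypothetical inventory; names and dates are made up for illustration.
inventory = {
    "TLS certificate for www": date(2009, 1, 20),
    "domain registration": date(2009, 12, 1),
    "compiler license": date(2009, 1, 5),
}
report = expiring(inventory, today=date(2009, 1, 10))
for name, days_left in report:
    print(name, days_left)
```

A negative `days_left` means the resource has already expired. The hard part, of course, is not the arithmetic but keeping the inventory complete, including functions that are installed but not yet activated.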

Monday, January 5, 2009

Chain of distrust

Communication rests on a collection of trust between the parties involved. Unfortunately, trust is eroding on the Internet, and in society itself; and I see chains of distrust emerging.

An idea called the chain of trust is a practical implementation of authentication. Let me put it this way: when Alice trusts Bob and Bob trusts Carol, then Alice assumes Carol is trustworthy. In this way, Alice doesn't have to directly authenticate Carol. The Internet is another good example of a chain of trust; each router assumes that its peer routers will forward the packets it originates.
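
The Alice-Bob-Carol example amounts to following direct trust links transitively. The following Python sketch shows this as a breadth-first search over a small graph; the function name `trusts_via_chain` and the dictionary representation are assumptions for illustration, and real authentication systems attach conditions to each link that this toy ignores.

```python
from collections import deque


def trusts_via_chain(trust, source, target):
    """Return the chain of direct trust links from source to target,
    or None if no such chain exists."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for peer in trust.get(path[-1], ()):
            if peer not in seen:
                seen.add(peer)
                queue.append(path + [peer])
    return None


# Direct trust relationships: Alice trusts Bob, and Bob trusts Carol.
direct_trust = {"Alice": ["Bob"], "Bob": ["Carol"]}
chain = trusts_via_chain(direct_trust, "Alice", "Carol")
print(chain)
```

Note that the links are directed: in this graph there is no chain from Carol back to Alice, and the existence of a chain says nothing about what Alice and Carol think of each other directly.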

But the chain of trust should not be taken at face value in the real world. In the Alice-Bob-Carol case of the previous paragraph, a peer-to-peer trust relationship between Alice and Carol is not necessarily established; distrust between Alice and Carol is even possible, and they may not want to talk to each other. Communication through a proxy is in fact quite common between two distrusting parties. Should I call this a chain of trust? I should rather call this a chain of distrust.

The current Internet is full of chains of distrust. Maybe I should rephrase that for accuracy: chains of limited trust. For example, your employer will not unconditionally trust you to protect the employer's privacy, so you have to communicate outside the employer's network through a firewall, usually made of packet filters and proxy servers. Your employer gives you limited trust for external communication. This sort of limitation may cause you to distrust your employer, but the employer usually considers it a security feature that protects its relationship with you. The difference in how the situation of limited trust is interpreted can be a source of distrust.

In a set of trusted parties of limited size, the parties do not have to spend time authenticating each other for every packet they exchange. The trust is proven through the physical connection and perimeters. The Internet's packet forwarding system extends this idea of physical connection to chains of trust by reliable communication with discrete packet deliveries, and the idea has worked well in a limited community where people can trust each other. The end-to-end principle [1] has worked so effectively that the engineers of the Internet firmly believe in it.

The reality we are facing, however, is that people can no longer trust each other and rather distrust one another. People are seeking a safe haven by creating chains of distrust, which apparently gives a false sense of security, considering that a chain of distrust is easily broken if the proxy between the two distrusting parties has malicious intent.

We are heading into very difficult times, in which security engineers ought to secure the chains of distrust as well as the chains of trust.

Reference:
[1] Blumenthal, M. S. and Clark, D. D. 2001. Rethinking the design of the Internet: the end-to-end arguments vs. the brave new world. ACM Trans. Internet Technol. 1, 1 (Aug. 2001), 70-109. DOI=http://doi.acm.org/10.1145/383034.383037