This chapter is about protecting your Certificate Directory, which is assumed here to be a public directory, from privacy abuses. We first need to discuss this concept on an abstract level. How can public information be of privacy concern? On the surface, this might look like a contradiction. This introduction borrows some ideas from ``The Ethics of Information: Protecting Privacy in the Computer Age'' [36].
The concepts of private information and public information are key to this discussion. The traditional notation of privacy is that of protecting private information. Here, we will try to argue that, in the light of computer technology, the notation of privacy needs to be reconsidered.
Information about a person such as name, address, telephone number, employment, passport photo, etc are by the strictest sense of the definition public. These seemingly unrelated pieces of information, taken together, present a privacy concern to most people. Traditionally the ``taken together'' part means costly intelligence work. It would cost a lot of time and money. Traditionally, you are safe. But with public databases and high-speed connections, it is easy to collect information about a vast number of people at the same time and process it locally. All such databases with ``public'' information are a privacy concern.
Today we thus have to replace the ``time and money'' factor that secured privacy in older times, with a new factor that makes ``data mining'' infeasible.
Now, if we return to our application, we have a certificate directory. Certificates often carry additional information, used to authenticate the certificate holder for certain purposes. Examples of additional information are qualification, licenses (attorney, doctor, ...), official approvals (vehicle driving licenses, ...). One can imagine a credit card vendor using certificates containing users credit card numbers. This information presents the privacy concern.
Secure DNS [20] contains a serious problem in this regard, when used as a Certificate Directory. This chapter presents this problem in detail, and a new idea to solve this problem.
Secure DNS is a recent development in the DNS field. It is currently in testing for deployment by various organizations [67] [76]. The following three distinct services are the goal of Secure DNS [20]:
Authentication of data is provided by cryptographically signing data stored in DNS. Both keys and signatures are stored in DNS, together with the data it authenticates. Since keys are used in authenticating these signatures, they need to be authenticated just as other DNS data. Of course, this chain of keys and signatures must have a locally trusted root, or the data cannot be trusted at all. Secure DNS aware clients are usually configured with a (small) number of trusted keys.
The keys used to sign data in DNS are attached to each collection of DNS data (a ``DNS zone''). This is in contrast to attaching keys to individual servers involved in querying the information. This design allows for ``off line signing'', so that even a compromise of the servers involved does not have to affect the security of the data.
Data non-existence services are also provided. This means that it is possible to strongly secure that a certain name does not exist. E.g., a client asking for maria.josefsson.org should be able to trust that this domain does not exist (if that is the case).
To support Data Origin Authentication, Secure DNS defines a method of storing keys in DNS known as ``KEY records''. Thus DNS is used as a public key distribution mechanism (a ``Public Key Infrastructure''). This PKI is used to support the public-key cryptographic operations required by Secure DNS itself, but it can also be used for other protocols.
Each individual transaction can also be protected by means of Secure DNS. This is in contrast to only protecting the data involved. This makes it possible for a DNS client to be sure it is at least getting responses from the server it thinks it queried, and that responses are to the query it sent.
Outside the goals of Secure DNS, and hence not implemented, are means for confidentiality of queries or responses (e.g., information is considered to be public), and protection against denial of service attacks.
This section gives an abstract description of how Secure DNS achieves strong data non-existence. Again, ``data non-existence'' is how clients (securely) trust that a certain domain, say maria.josefsson.org, does not exist.
We illustrate this by presenting a scenario, a naive first attempt at a solution to this scenario, and then discuss some problem with that solution. We proceed to describe the currently implemented solution, NXT records [20, section 5], that overcomes these problems. Since we have found further problems with this solution, we describe these problems. We conclude by presenting our own solution, that overcomes these new problems [50].
Scenario: Consider a set of keysthat uniquely identify a domain name (e.g., kalle.josefsson.org) from an alphabet (strings of alphabetic characters, ``a''-''z'') that each is attached to one or more data item(s) (e.g., certificates). Client entities (DNS resolvers) can send queries to server entities (DNS servers) for a key, expecting the corresponding data back. The data is cryptographically signed to provide authentication of data origin. However, if a certain key does not have any attached data, a message need to be returned with that information. The problem of data non-existence is to strongly authenticate this last piece of information.
The naive implementation, illustrated in figure 5.1, is for a client A to query a server B and expect a cryptographically signed answer back. To prevent replay-attacks to occur after new names have been introduced, the signature should have limited life length.
First we convince ourself that the naive implementation does indeed work. It does, since we have:
Three major limitations that this naive implementation faces, and that
Secure DNS had to overcome, are the following:
That is, available for signing operations when answering queries. This would be a security concern, since breaking into the servers would mean compromising the security of Secure DNS.
This brings up two related problems. First, it would be easy to overload a server by repeatedly asking questions, a so called ``Denial of Service'' attack. Secondly, today large servers may answer several thousand queries per second, cryptographic signing hardware with this kind of performance is not commonly available (today).
A negative answer for a domain may be recorded, and if the domain is later added, it may be possible to use the signed negative response to deny existence for a domain. (This may be solved by adding a challenge/response scheme for each DNS query, but DNS does not include one today.)
These problems are easily translated into requirements for a better solution:
We first mention that a naive approach of signing all possible keys is
not feasible, since this is practically a infinite set. We are now
going to study the currently implemented solution, NXT records, but
first we need som data to work with. Consider table
5.1.
By introducing a canonical ordering of keys, we can enumerate all keys that have data attached to them. By ordering keys we can construct links between each key in the sorting order. This piece of information can be signed and used as a ``data non-existence proof'' (discussion below). To illustrate this, consider table 5.2 where some new keys and data are added to our previous table.
|
We now describe how this new information is used in a clever way to
achieve a ``non-existence proof''. Entity A queries for a non-existent
key maria.josefsson.org. The server replies with a tuple,
(lotta.josefsson.org, Next key: simon.josefsson.
org). This is illustrated in figure 5.2. The ``next''
tuple is cryptographically signed, to provide authentication of data
non-existence. These ``Next'' tuples can be calculated in advance, and
be cryptographically signed off line. Thus it fulfills our two earlier
requirements.
Now, how does the entity A know that maria.josefsson.org does not exist? By using the canonical ordering process, it knows that maria.josefsson.org sort later than lotta.josefsson.org and earlier than simon.josefsson.org. Since these ``next'' records were created using all existing keys, the entity A now can be certain that maria.josefsson.org indeed do not exist. (Of course, assuming that the cryptographic signature was correctly verified.)
This form of non-existence proof raises an immediate problem. From the previous paragraph, the entity A that queried for maria.josefsson.org learned more than that it does not exist. Explicitly, it learned that both lotta.josefsson.org and simon.josefsson.org do exist. Continuing by asking for a domain that sorts earlier than lotta.josefsson.org, it is easy to see how this can be abused to learn the entire content of the entire data set.
Besides privacy concerns, gaining knowledge of a victim's DNS information opens up more direct forms of attacks [6]. In short, the knowledge of a certain organization's DNS information provides clues that help an intruder. Although most of the dangers in the scenarios described in the document have become obsolete (like allowing zone transfers), NXT chaining in Secure DNS re-introduces one of the concerns.
An alternative that reduces information revealed by non-existence responses is needed. The next section presents our idea.
The idea is to replace the sensitive information of NXT records of figure 5.2 with something that serves the same purpose, but cannot be used to discover names of other keys in the data collection. This is illustrated in figure 5.2.
Obviously the function ``f'' of figure 5.2 should not be possible to invert, or the idea of replacing the original data is lost. We suggest using a well-known cryptographic hash function, SHA-1 [77]. We illustrate the corresponding DNS data in table 5.3. We see encoded SHA-1 values as valid keys, thus re-using the same canonical ordering process of Secure DNS.
|
We first verify that this solution still works. We do this by describing how a client builds trust that its query for maria.josefsson.org does not exist. Assume this key hashes to 142. As illustrated in figure 5.4, the server can respond to the query with a tuple (127.josefsson.org, Next key: 156.josefsson.org). A client now makes sure a hash calculation of its query hashes to a value between the two returned hash values. If this is the case, it can be certain that the query did not ave any corresponding data. (Of course, it must also verify the corresponding cryptographic signature.)
|
We return to our problem with NXT records, that a client learns additional information from the non-existence proof. This is now replaced by only learning the hash value of two existing keys in the data collection. This can be abused to chain through hash values of all existing names in the zone. Assuming the hash function is not invertible, this is not a problem. However, it is possible to use this to determine the number of names in the zone.
The previous idea has been described in a standards document [50, work in progress]. It introduces two further technicalities that are of interest here:
To reduce size of NO records, hash values can be truncated.
To reduce number of records, hash values from several consecutive NO records can be merged into one, larger, record.
Besides mentioning these, going more into the details of the technical description is outside the scope of this chapter. The technical document, with a complete description, can be found in Appendix A.