How to Measure Deployment Ratio of Domain Authentications

Writer: WIDE antispam WG
Editor: Kazu Yamamoto
Written: January 6, 2006

Each domain authentication technology consists of a sender side and a receiver side. There is no way to see whether a site has introduced a domain authentication technology on its receiver side. In contrast, since the sender side declares authentication information in the DNS, we can tell whether a domain has introduced the technology by checking for that information.

The WIDE project signed a collaborative research contract with JPRS and has been measuring the deployment ratio of domain authentication since April 2005. JPRS gives us a list of domains under .jp, and the WIDE project checks whether authentication information exists for each domain.

Suppose one of the domains given by JPRS is "example.jp". We cannot be sure whether this domain is used directly for email or whether subdomains are defined for email. However, it is typical for sites other than universities not to define subdomains. In fact, when we checked the MX RRs (resource records) of the domains provided by JPRS, about 80% had one. Thus about 80% of the domains are probably used for email without subdomains. We can therefore consider that measuring the deployment ratio of domain authentication gives appropriate results even if we do not guess their subdomains.
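As a rough illustration of the MX check, the following Python sketch computes the fraction of domains that have at least one MX RR. It is not the measurement program itself. The `lookup_mx` callable is a hypothetical stand-in for a real resolver query, and the sample data is invented for illustration, not taken from the measurement.

```python
from typing import Callable, Iterable, List


def mx_ratio(domains: Iterable[str],
             lookup_mx: Callable[[str], List[str]]) -> float:
    """Return the fraction of domains for which lookup_mx returns any MX RR."""
    names = list(domains)
    if not names:
        return 0.0
    with_mx = sum(1 for name in names if lookup_mx(name))
    return with_mx / len(names)


# Invented sample data (not measurement results): domain -> MX RRs.
SAMPLE_MX = {
    "a.example.jp": ["10 mail.a.example.jp."],
    "b.example.jp": ["10 mx.b.example.jp."],
    "c.example.jp": [],
}


def sample_lookup_mx(name: str) -> List[str]:
    """Hypothetical resolver stand-in that reads from SAMPLE_MX."""
    return SAMPLE_MX.get(name, [])


if __name__ == "__main__":
    print(mx_ratio(SAMPLE_MX, sample_lookup_mx))
```

Passing the lookup as a parameter keeps the counting logic separate from the DNS access, so the same function could be driven by a real resolver.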

We target the following domain authentication technologies:

SPF, Sender ID
DomainKeys, DKIM

We are using a PC provided by JPRS, with "bind" running on the machine as a resolver. The resolver obtains data from the authoritative DNS servers by resolving each query recursively.

SPF, Sender ID

For both SPF and Sender ID, an SPF RR (type 99) should be declared. Since the SPF RR is not widely deployed, a TXT RR is typically used instead. The characteristics of SPF and Sender ID are as follows:

SPF has "v=spf1" on the right side of an SPF RR
Sender ID has "spf2" on the right side of an SPF RR (Note that "v=" must not be written)

On the receiver side of Sender ID, "v=spf1" is treated as "spf2.0/mfrom,pra". We believe we do not have to distinguish SPF from Sender ID, so we treat them as the same in our measurement.
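The distinction between the two record forms, and the way a Sender ID receiver reads "v=spf1", can be sketched as follows. This is a minimal illustration: the function names are hypothetical, and the case-insensitive prefix match is an assumption rather than something the measurement relies on.

```python
def classify_record(value: str) -> str:
    """Classify the right side of an SPF or TXT RR.

    Returns "spf", "sender-id", or "other".
    Assumption: the version prefix is matched case-insensitively.
    """
    text = value.strip().lower()
    if text == "v=spf1" or text.startswith("v=spf1 "):
        return "spf"
    if text.startswith("spf2"):
        return "sender-id"
    return "other"


def sender_id_view(value: str) -> str:
    """Show a record as a Sender ID receiver would treat it.

    A "v=spf1" record is treated as "spf2.0/mfrom,pra";
    any other record is returned unchanged.
    """
    text = value.strip()
    if classify_record(text) == "spf":
        return "spf2.0/mfrom,pra" + text[len("v=spf1"):]
    return text


if __name__ == "__main__":
    for record in ("v=spf1 mx -all", "spf2.0/pra mx -all", "hello"):
        print(classify_record(record), "|", sender_id_view(record))
```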

To measure the deployment ratio of SPF/Sender ID, the following procedure is repeated for each element of the domain list.

Look up an SPF RR for the domain.
- If an SPF RR is contained in the answer section, introduced. Go to the next candidate.
- Otherwise, fall through.
Look up a TXT RR for the domain.
- If a TXT RR is contained in the answer section and the value of the TXT RR contains the string "spf", introduced.
- Otherwise, not introduced.
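The procedure above can be sketched in Python as follows. This is an illustration under stated assumptions, not the measurement program: `lookup` is a hypothetical callable that returns the values found in the answer section for a name and RR type, and `SAMPLE_RRS` is invented test data standing in for the DNS.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical lookup signature: (name, rr_type) -> values in the answer section.
Lookup = Callable[[str, str], List[str]]


def spf_introduced(domain: str, lookup: Lookup) -> bool:
    """Return True if the domain declares SPF/Sender ID information."""
    # Step 1: an SPF RR (type 99) in the answer section means "introduced".
    if lookup(domain, "SPF"):
        return True
    # Step 2: otherwise look for a TXT RR whose value contains "spf".
    return any("spf" in value for value in lookup(domain, "TXT"))


# Invented sample data: (name, rr_type) -> answer values.
SAMPLE_RRS: Dict[Tuple[str, str], List[str]] = {
    ("spf-type.example.jp", "SPF"): ["v=spf1 mx -all"],
    ("txt-only.example.jp", "TXT"): ["spf2.0/pra mx -all"],
    ("other-txt.example.jp", "TXT"): ["some unrelated text"],
}


def sample_lookup(name: str, rr_type: str) -> List[str]:
    """Resolver stand-in that reads from SAMPLE_RRS (empty answer by default)."""
    return SAMPLE_RRS.get((name, rr_type), [])


if __name__ == "__main__":
    for name in ("spf-type.example.jp", "txt-only.example.jp",
                 "other-txt.example.jp", "no-rr.example.jp"):
        print(name, spf_introduced(name, sample_lookup))
```

Because the check only looks for the substring "spf", both "v=spf1" and "spf2" records count, which matches the decision not to distinguish SPF from Sender ID.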

DomainKeys, DKIM

When using DomainKeys, the sender side publishes its public key in the DNS. For instance, the name of the public key for "example.jp" can be "<selector>._domainkey.example.jp". We can only find <selector> in the signature of messages signed by "example.jp". Since <selector> cannot be guessed, it is quite hard to tell whether a site has introduced DomainKeys from the existence of its public key.
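To make the naming concrete, the sketch below builds the public key name from a signature. It assumes the DomainKeys signature format, in which the `s=` tag carries the selector and the `d=` tag carries the signing domain. The header value, the selector "sel2005", and the function names are invented for illustration.

```python
from typing import Dict


def parse_signature_tags(value: str) -> Dict[str, str]:
    """Split a "tag=value; tag=value" signature into a dictionary."""
    tags: Dict[str, str] = {}
    for part in value.split(";"):
        if "=" in part:
            # Split on the first "=" only, since values may contain "=".
            name, _, tag_value = part.partition("=")
            tags[name.strip()] = tag_value.strip()
    return tags


def public_key_name(selector: str, domain: str) -> str:
    """Return the DNS name under which the public key is published."""
    return f"{selector}._domainkey.{domain}"


# Invented signature value; a real one comes from a signed message.
SAMPLE_SIGNATURE = "a=rsa-sha1; s=sel2005; d=example.jp; c=nofws; q=dns; b=AbCd=="

if __name__ == "__main__":
    tags = parse_signature_tags(SAMPLE_SIGNATURE)
    print(public_key_name(tags["s"], tags["d"]))
```

Without a signed message in hand there is no value to pass as `selector`, which is exactly why the key name cannot be used for the measurement.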

The Old Measurement Algorithm

We can also declare our policy for DomainKeys in the DNS. The name of the policy is "_domainkey.example.jp", and thus there are no parts to be guessed. When we started the measurement in April 2005, we used an algorithm that checks whether policies exist. Since the declaration of policies is optional, this algorithm cannot pick up domains which have introduced DomainKeys without declaring their policies.
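A minimal sketch of this policy check, and of its blind spot, might look like the following. The `lookup_txt` callable is a hypothetical resolver stand-in, the policy value "o=~" is only an illustrative DomainKeys policy string, and the sample data is invented.

```python
from typing import Callable, Dict, List


def policy_name(domain: str) -> str:
    """Return the DNS name of the DomainKeys policy for a domain."""
    return f"_domainkey.{domain}"


def old_algorithm_introduced(domain: str,
                             lookup_txt: Callable[[str], List[str]]) -> bool:
    """Old algorithm: "introduced" if a policy TXT RR exists."""
    return bool(lookup_txt(policy_name(domain)))


# Invented sample data: DNS name -> TXT values.
SAMPLE_TXT: Dict[str, List[str]] = {
    # Domain that declares a policy.
    "_domainkey.with-policy.example.jp": ["o=~"],
    # Domain that publishes only a key, under a selector we cannot guess.
    "sel2005._domainkey.key-only.example.jp": ["k=rsa; p=..."],
}


def sample_lookup_txt(name: str) -> List[str]:
    """Resolver stand-in that reads from SAMPLE_TXT."""
    return SAMPLE_TXT.get(name, [])


if __name__ == "__main__":
    for name in ("with-policy.example.jp", "key-only.example.jp"):
        print(name, old_algorithm_introduced(name, sample_lookup_txt))
```

The second sample domain has introduced DomainKeys but is reported as not introduced, because it declares no policy.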

The New Measurement Algorithm

We have been using a new improved algorithm which checks the magic subdomain "_domainkey" since October 2005. This new algorithm can pick up some domains which have introduced DomainKeys without declaring their policies.

There are two ways to define the subdomain:

1. Create the "_domainkey.example.jp" zone
2. Describe "_domainkey.example.jp" in the "example.jp" zone

If an SOA RR for "_domainkey.example.jp" exists, we can conclude that this is case 1. Conversely, if we look up an SOA RR for "_domainkey.example.jp" and "name error" is returned, the case is neither 1 nor 2.

In other cases, if we look up a TXT RR for "_domainkey.example.jp" and "no error" is returned, this may be case 2. However, if any RR is defined for "*.example.jp", "no error" can be returned even though "_domainkey.example.jp" does not exist. We therefore look up a TXT RR for "*.example.jp" next. If "name error" is returned, it means "*.example.jp" does not exist, and we can tell that "_domainkey.example.jp" exists.

If both "_domainkey.example.jp" and "*.example.jp" result in "no error", there are the following possibilities:

"_domainkey.example.jp" exists and "*.example.jp" exists, too.
"_domainkey.example.jp" does not exist and "*.example.jp" exists.

To distinguish these two, we should look up some RRs (a TXT RR, for example) for both and compare the answers. If any inconsistency is found, we can conclude that this case is the former. If they are consistent, this case is possibly the latter.

Summary of the New Measurement Algorithm

With the new measurement algorithm for the deployment ratio of DomainKeys, the following procedure is repeated for each element of the domain list.

Look up an SOA RR for "_domainkey.<domain-name>"
- If "no error" is returned and its answer section contains an SOA RR, introduced. Go to the next candidate.
- If "no error" is returned and its answer section is empty, fall through.
- If "name error" is returned, go to the next candidate.
Look up a TXT RR for "_domainkey.<domain-name>".
- If "no error" is returned, save its answer section, say (a), and fall through.
Look up a TXT RR for "*.<domain-name>".
- If "name error" is returned, introduced. Go to the next candidate.
- If "no error" is returned, save its answer section, say (b), and fall through.
Compare (a) and (b). If they are identical, not introduced. Otherwise, introduced.
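The procedure above can be sketched in Python as follows. This is a simplified illustration, not the measurement program. The `lookup` callable is a hypothetical resolver stand-in returning a response status and the answer section. Treating any status the procedure does not mention as "not introduced" is an assumption, and `SAMPLE_DNS` is invented test data.

```python
from typing import Callable, Dict, List, Tuple

NO_ERROR = "no error"
NAME_ERROR = "name error"

# Hypothetical lookup signature: (name, rr_type) -> (status, answer section).
Response = Tuple[str, List[str]]
Lookup = Callable[[str, str], Response]


def domainkeys_introduced(domain: str, lookup: Lookup) -> bool:
    """Return True if the "_domainkey" subdomain appears to exist."""
    subdomain = f"_domainkey.{domain}"
    wildcard = f"*.{domain}"

    # Step 1: an SOA RR means "_domainkey.<domain-name>" is its own zone.
    status, answer = lookup(subdomain, "SOA")
    if status == NO_ERROR and answer:
        return True
    if status == NAME_ERROR:
        return False

    # Step 2: TXT lookup for the subdomain; keep the answer as (a).
    status, answer_a = lookup(subdomain, "TXT")
    if status != NO_ERROR:
        return False  # assumption: the procedure leaves this case open

    # Step 3: TXT lookup for the wildcard; keep the answer as (b).
    status, answer_b = lookup(wildcard, "TXT")
    if status == NAME_ERROR:
        return True  # no wildcard, so the subdomain itself exists
    if status != NO_ERROR:
        return False  # assumption: the procedure leaves this case open

    # Step 4: identical answers suggest only the wildcard matched.
    return sorted(answer_a) != sorted(answer_b)


# Invented sample data: (name, rr_type) -> response.
SAMPLE_DNS: Dict[Tuple[str, str], Response] = {
    ("_domainkey.zone.example.jp", "SOA"): (NO_ERROR, ["ns.zone.example.jp. ..."]),
    ("_domainkey.inzone.example.jp", "SOA"): (NO_ERROR, []),
    ("_domainkey.inzone.example.jp", "TXT"): (NO_ERROR, ["o=~"]),
    ("_domainkey.wild.example.jp", "SOA"): (NO_ERROR, []),
    ("_domainkey.wild.example.jp", "TXT"): (NO_ERROR, ["hello"]),
    ("*.wild.example.jp", "TXT"): (NO_ERROR, ["hello"]),
    ("_domainkey.both.example.jp", "SOA"): (NO_ERROR, []),
    ("_domainkey.both.example.jp", "TXT"): (NO_ERROR, ["o=~"]),
    ("*.both.example.jp", "TXT"): (NO_ERROR, ["hello"]),
}


def sample_lookup(name: str, rr_type: str) -> Response:
    """Resolver stand-in: names missing from SAMPLE_DNS give "name error"."""
    return SAMPLE_DNS.get((name, rr_type), (NAME_ERROR, []))


if __name__ == "__main__":
    for name in ("zone", "inzone", "wild", "both", "none"):
        domain = f"{name}.example.jp"
        print(domain, domainkeys_introduced(domain, sample_lookup))
```

The five sample domains cover the separate zone, the subdomain described in the parent zone, a wildcard only, a wildcard together with the subdomain, and a domain with neither.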

Since ill-behaved DNS servers exist, the actual algorithm is more complicated. For backward compatibility with the old algorithm, the new one also checks for the existence of policies.

DKIM

DKIM is the same as DomainKeys except that the name of its policy is "_policy._domainkey.example.jp". It is impossible to distinguish DKIM from DomainKeys by checking the existence of the subdomain "_domainkey". Our measurement thus does not distinguish them.