I'd be happy to expand on the theoretical background for you.
As you mentioned, there are two key components to the Spectre algorithm. They each have a role to play. Let us begin with the
site-key, which is the secret we are primarily interested in, since we use it to derive a password for the site:
MAC( user-key, site-salt ) = site-key
We need two inputs: a user-scoped secret
user-key, and a site-scoped identifier
site-salt whose purpose is to convert the user-scoped key into a site-scoped key.
To evaluate the relevant security properties, let's consider what's required for it to fail:
- Since the
site-salt is not a secret identifier (Spectre uses the site's domain name), this component's strength rests solely on the
user-key's ability to remain uncompromised.
- An attacker who can identify both the MAC algorithm and a target
site-salt could pre-compute a large table which randomly maps
site-keys for a given
site-salt. This table by itself is not very useful, but:
- If the site's passwords get leaked, this table could be used to aid in a reverse lookup, to identify the
user-key for any leaked passwords that match a
site-key in the precomputed table.
- As a result, a single site passwords leak might result in identifying
How realistic is this scenario? As it turns out, whether computing such a table is feasibly useful depends a lot on the size of the
site-key. Let's explore how:
site-key is small, it would be possible to build a complete table exhaustively mapping all
site-key is large, this becomes a computationally impossible task, to the point where you could be building terabytes of mapping data and still not come close to creating a single useful reverse mapping. Spectre's
site-key is 32 bytes, ie. 256 bits. The amount of different
site-keys that can exist is
^ indicates power-of, feel free to put this into your calculator). If we need to store a mapping from each of these to a
user-key, that is at least 32 bytes +
user-key bytes for each of those keys (feel free to put
(32+64)^(2^256) in your calculator). We find that exhaustively computing the
site-keys is not realistic. As a homework assignment for the curious, try to find the largest size table one could reasonably compute in a year and how useful that table would be. It's not great. Moving on!
user-key is small, it would be possible to build a complete table mapping all
user-key is large, this again becomes computationally impossible. However, here we hit a snag: at face value it would appear that we cannot employ the same solution as the
site-key: we want our user secret to be memorable by the user, so they do not depend upon state and can recover their passwords after losing all state. A typical user password is not that high in entropy. We might consider a sentence of four random English words (assuming any of the top 20,000 words) a very strong password, but its entropy is only about 57 bits-worth: far smaller than our
site-key at 256. Consequently, we need to introduce an additional step in order to reinforce this side of the equation.
To do this, Spectre introduces a step called key-derivation. As you mentioned, we use a special memory-hard algorithm called Scrypt. Here is where we need to introduce the other step in the algorithm:
KDF( user-secret, user-salt ) = user-key
Spectre derives a 64-byte
user-key through this method. As a result, just like we had an impossible to exhaust
site-key, we now have an impossible to exhaust
user-key. This concludes the analysis of the MAC portion, with the conclusion that its security comes from the fact that its search space is too large to exhaust. But this is only true if this KDF step is sufficiently strong! So let's see what's required for this component to stand up to attack.
Compared to a typical efficient MAC such as HMAC-256, the KDF takes a long time to compute a single
user-key from a short
user-secret combined with a
user-salt. How is this any better than passing the
user-secret directly into the first step above? It is better in the sense that deriving the
user-key from the
user-secret is extremely expensive (in time, as constrained by both computation and memory). Your phone can do it quickly for a single attempt, but to do it exhaustively for all of our "small" 57-bit sample set of possible English sentences would take an impossibly long time. We have traded space size for time size. As a result, it now becomes impossible to exhaustively make a table of all
site-keys from all possible
user-secrets, even though the
user-secret is a small secret (empirical details below).
Can this KDF step fail at its goal? Yes, it can, and to understand how, we need to repeat the analysis we did above in step 1, but for this step. Can we make an exhaustive table of
user-secrets? No, we cannot, the
user-key is 64-bytes, which makes it too large to search exhaustively as established before. But the
user-secret is only 57 bit, so it's not that large.
As we said before, the KDF is extremely expensive, so exhaustively searching a 57-bit space would still be very hard. You could imagine a massive compute cluster trying to build a table for this. It would be insane to do so for hacking a single user key, but building a table that can be sold and used for breaking any Spectre user's secrets, this might be lucrative.
Here is where we need to introduce the
user-salt: its purpose is to make building the table
user-key infeasible. How does it do this? Not by being secret!
user-salt is allowed to be public knowledge: its only purpose is to require the attacker to build multiple
user-key tables. One for every unique
user-salt, to be specific. With the introduction of the
user-salt, an attacker cannot build a table to attack any Spectre user, they need to build a table for each individual Spectre user.
57bit -[scrypt]-> 64byte table for a single individual is such an expensive operation it is simply not ever going to be worth it.
How expensive? A GTX 1080 Ti could perhaps be optimized to compute about 150 table entries per second (numbers based on scrypt hashcat performance). If you'd like to find out how long it would take to compute the table for a single Spectre user using a Spectre secret of up-to 57-bit, put this into your calculator:
seconds to populate your table = (2^57 permutations) / 150 entries per second
356 * 24 * 3600 to find the amount in years.
Second homework assignment, how large a compute cluster to complete the job in a single year? And what's the bill on this operation? (Never mind where you'll source the silicon)
Let me know if anything remains unclear!