Cracking BIP39 Seed Phrases

by Steve Marx on

In my last post about BIP39, I mentioned that the checksum could be used to help in case you lost a word from your seed phrase. In this post, I’ll expand on exactly how the checksum helps.

Algorithm for recovering a lost word

Because seed phrases use a finite word list, you can simply guess the missing word:

  1. Pick a word from the word list.
  2. Insert that word where the missing word is in the phrase.
  3. Use PBKDF2 to derive a seed from the completed phrase.
  4. If that’s not the seed you’re looking for, go back to the first step.

But we can do better! Step 3 is where the checksum can speed things up.

Using the checksum

The checksum can check the validity of a seed phrase. This functions as a shortcut to skip words that can’t work. Here’s the modified algorithm:

  1. Pick a word from the word list.
  2. Insert that word where the missing word is in the phrase.
  3. Compute the checksum. If it doesn’t match, go back to the first step.
  4. Use PBKDF2 to derive a seed from the completed phrase.
  5. If that’s not the seed you’re looking for, go back to the first step.

I’ve added a step, but it saves quite a bit of time. This is because PBKDF2 is the most expensive step in the process. The purpose of PBKDF2 with a high difficulty setting—BIP39 uses 2,048 iterations of HMAC-SHA512—is that it makes guessing passwords slow.

For a 12-word seed phrase, the checksum is 4 bits long, so there are 24==16 possible checksum values. This means that, on average, only 1 out of 16 guesses will result in a matching checksum. For the rest, PBKDF2 can be skipped entirely.

What if you don’t know where the missing word goes?

Say you wrote down your seed phrase, but it’s somehow only 11 words long. This is an invalid BIP39 seed phrase because the minimum length is 12 words. Can you recover the missing word if you don’t know where it goes?

The answer is yes, but you’ll just have to try each position. There are 12 possible places the missing word could go, so it would take 12 times as long as if you already knew where the word belonged.

What if the checksum itself is missing?

You might wonder what happens if the last word of the seed phrase is the one that’s gone missing. That’s where the checksum is, so how can we use the checksum if we don’t know it?

This case can be safely ignored. The checksum has to match the rest of the phrase, so the validity check looks like this:

checksum(<all but last n bits>) == <last n bits>

This same process works no matter which bits we’re guessing.

However, this case does present an opportunity for optimization. Each word in the mnemonic phrase represents 11 bits of data. If I need to guess the last word in a 12-word phrase, 4 of those bits are actually the checksum. This means that instead of guessing and testing every combination of 11 bits, I can guess just 7 bits and then fill the other 4 in with the checksum. Longer phrases use more checksum bits, so this advantage gets even greater.

In the code I’ll link to at the bottom of this post, I didn’t implement this optimization.

What if you lost the password?

As I mentioned in my previous post, the mnemonic phrase can optionally be combined with a password in the key derivation step. If you’ve lost that password, you’re in a bit of trouble.

PBKDF2 is used with a parameterized difficulty to make password guessing expensive. BIP39 specifies 2,048 rounds of HMAC-SHA512. This is too slow to guess a large number of passwords. And, unlike the finite word list for seed phrases, passwords can be absolutely anything.

If you can narrow your password down to a relatively small number of choices, guessing could be feasible again.

Implementation and results

I added seed phrase cracking functionality to my example code, and it shows the expected speedup from using checksums. I made no attempt to optimize the implementation itself. Notably, there’s no concurrency, so the code can’t take advantage of multiple processors.

On my laptop (Pixelbook Go), guessing one missing word from a 12-word seed phrase without taking advantage of checksums takes about 5.6 seconds. When using the checksum, the same task takes only 0.5 seconds.

With two missing words, guessing with the help of a checksum takes 13 minutes and 15 seconds. I didn’t attempt this without the checksum’s help, and I didn’t attempt to guess more than two missing words. (The algorithm’s time complexity grows exponentially with the number of missing words.)

Me. In your inbox?

Admit it. You're intrigued.


Related posts