A Wordle Solver

Introduction

I probably don't need to tell you what Wordle is, but I will anyways. Wordle is a daily word game where you have six tries to guess an unknown five-letter word. When you make a guess, each letter's tile turns one of three colors, and each color has its own meaning. If a tile turns gray, that letter is not in the word. If it turns yellow, the letter is in the word but not in that spot, and if it's green, the letter is in the word and in the correct position. If you haven't tried it yet, it's pretty fun, and playing it will make the rest of this make more sense. So here's a link: Wordle

Now that you've tried it, we can move on to what I've been doing. I was talking to my family one night, and they said making a solver would be easy. I insisted that it would be a lot harder than they thought. But when they left, I went ahead and gave it a shot, and it actually wasn't that hard. So they were right. Thanks for the idea too!

So here I will explain how my solver works and share some of the insights into the game that I've gathered from running a few thousand simulations with it. So without any further preamble, let's begin.

The Solver

The solver works by eliminating words that don't match what it knows is true, then scoring each of the remaining words and choosing one as its next guess. It repeats this process until it arrives at the right answer.
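Everything the solver knows comes from the tile colors, so it helps to see how a guess gets colored. Here is a minimal sketch in Python; the function name `feedback`, the `g`/`y`/`-` encoding, and the stand-in solution `cable` are all my own assumptions for illustration, not code from the repository.

```python
from collections import Counter

def feedback(guess: str, solution: str) -> str:
    """Return one character per tile: 'g' green, 'y' yellow, '-' gray."""
    tiles = ["-"] * len(guess)
    # Solution letters that were not matched exactly are the only ones
    # that can still produce a yellow tile.
    unmatched = Counter(s for g, s in zip(guess, solution) if g != s)
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            tiles[i] = "g"
    for i, g in enumerate(guess):
        if tiles[i] != "g" and unmatched[g] > 0:
            tiles[i] = "y"
            unmatched[g] -= 1
    return "".join(tiles)

# 'cable' is a hypothetical solution used only for this example.
print(feedback("irate", "cable"))
```

Doing the greens first and then handing out yellows from what is left keeps repeated letters from being colored more times than they appear in the solution.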

Eliminating Wrong Answers

There are four steps that my solver uses to eliminate all words that can't be the answer.

Removing words that contain letters that aren't in the solution
Keeping only the words with letters that must be in the solution
Removing words that don't have the known letters in the right place
Removing words with letters in the same position as yellow tiles

If all of that was Wordle salad to you, don't worry. We're going to go through an example. Let's say the solution is:

Of course, we don't know that at the beginning. Let's start with my typical starting word, irate. We would get this: I, R, and T come back gray, A comes back yellow, and E comes back green.

Right now, before any analysis, we have 12,947 words to choose from. It's pretty unlikely that we'd make a good guess now: we would only have a 0.0077 percent chance of guessing correctly. So let's start by filtering out some of those words. The first thing we do is get rid of words that contain letters that came back gray. So words like

are now going to be removed because they all contain letters that we know are not in the solution. After removing these words we are left with 6,429 words. We now have a 0.016 percent chance of guessing correctly, which is a lot better, but still very bad.

Now, let's focus on getting rid of words that don't have both A and E. We would remove words like

because they don't contain the letters that we know should be in our answer. That takes us all the way down to 1,067 words. We have a 0.094 percent chance of guessing the word correctly now. That's pretty good, but we can do better. We know our word has an A, but that the A is not in the middle position. So we remove words like

because we already know the A is not in the center. Doing this allows us to eliminate some extra words.

So let's throw out all the words that have an A in the middle. After doing that, we only have 814 words left. We're now at a whopping 0.12 percent chance of guessing the right word.

For our last step, we know our final word will look like this:

So we can remove all words that don't end in E. That takes us down to only 184 words to guess from. That's a 0.5 percent chance of guessing the right answer. That's not easy but it's a lot better than having way too many 0's.
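The four filters from this walkthrough can be sketched as one function. This is my own minimal version, assuming the tiles are encoded as a string of `g` (green), `y` (yellow), and `-` (gray); `filter_words` is a hypothetical name, and the word list below is a made-up toy list rather than the real 12,947-word one.

```python
def filter_words(words, guess, tiles):
    """Keep only the words that are consistent with one guess's tiles."""
    # Letters that came back yellow or green must appear in the answer.
    required = {g for g, t in zip(guess, tiles) if t in "gy"}
    # Gray letters are banned, unless the same letter also scored elsewhere
    # in the guess (a repeated letter can be gray in one spot, green in another).
    banned = {g for g, t in zip(guess, tiles) if t == "-"} - required

    survivors = []
    for word in words:
        letters = set(word)
        if banned & letters:          # contains a gray letter
            continue
        if not required <= letters:   # missing a required letter
            continue
        if any(t == "g" and word[i] != g
               for i, (g, t) in enumerate(zip(guess, tiles))):
            continue                  # a green letter is not in its place
        if any(t == "y" and word[i] == g
               for i, (g, t) in enumerate(zip(guess, tiles))):
            continue                  # a yellow letter sits in the ruled-out spot
        survivors.append(word)
    return survivors

toy_words = ["cable", "alone", "trade", "shake", "boule", "angle"]
print(filter_words(toy_words, "irate", "--y-g"))
```

On the toy list, trade goes because of its gray letters, boule because it has no A, and shake because its A is in the middle, which leaves cable, alone, and angle.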

Choosing the next guess

So now we have 184 words to choose from. We could choose a random word, but we can do better than that. We want the word that eliminates the largest number of words possible, so we need to estimate how much information each word would give us. We also penalize words that have the same letter twice, because a repeated letter teaches us less than a new one would. This penalty makes sure we learn as much as possible at every step.

Each letter has a score now. As you can see, A and E score highest because they are in every word left in this list, but L and S are also very high scorers. We can now take each of the remaining words and give it a score of its own.
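Here is one way that kind of scoring could look. The exact formula is an assumption on my part: each letter is scored by how many remaining words contain it, and a word's score is the sum over its distinct letters, so a doubled letter earns nothing extra. The names `letter_scores`, `word_score`, and `rank` are hypothetical, and the word list is a toy example.

```python
from collections import Counter

def letter_scores(words):
    """Score each letter by how many of the remaining words contain it."""
    scores = Counter()
    for word in words:
        scores.update(set(word))  # set(): count a letter once per word
    return scores

def word_score(word, scores):
    """Sum the scores of a word's distinct letters.

    A repeated letter is only counted once, which acts as the penalty
    for words that use the same letter twice.
    """
    return sum(scores[letter] for letter in set(word))

def rank(words):
    """Return the words sorted from highest score to lowest."""
    scores = letter_scores(words)
    return sorted(words, key=lambda w: word_score(w, scores), reverse=True)

remaining = ["alone", "anole", "salve", "angle", "sable", "eagle"]
for word in rank(remaining):
    print(word, word_score(word, letter_scores(remaining)))
```

Under this scheme anagrams always tie, and a word like eagle drops to the bottom of the toy list because its second E adds nothing.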

Using these scores, the solver says the top five words are anole, alone, salve, angle, and sable, with scores of 539, 539, 536, 536, and 535 respectively. Now is a good time to note that while there are 12,947 words we can guess from, only about 2,500 are actually Wordle answers. Most of the chosen words are words that people actually know, so we can safely ignore anole. Let's choose alone.

This process is repeated until we get to the end. The full result for this game would look like:

It's not a very good game, but it did solve it. So that's how a Wordle solver works!
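To tie the pieces together, here is a rough sketch of the whole guess, filter, and score loop. It is not the code from the repository. For brevity it swaps the four filters for a single check (keep a word only if it would have produced the same tiles), and the scoring formula, the function names, and the toy word list are all assumptions.

```python
from collections import Counter

def feedback(guess, solution):
    """One character per tile: 'g' green, 'y' yellow, '-' gray."""
    tiles = ["g" if g == s else "-" for g, s in zip(guess, solution)]
    unmatched = Counter(s for g, s in zip(guess, solution) if g != s)
    for i, g in enumerate(guess):
        if tiles[i] == "-" and unmatched[g] > 0:
            tiles[i] = "y"
            unmatched[g] -= 1
    return "".join(tiles)

def next_guess(candidates):
    """Pick the candidate whose distinct letters are the most common."""
    scores = Counter()
    for word in candidates:
        scores.update(set(word))
    return max(candidates, key=lambda w: sum(scores[c] for c in set(w)))

def solve(solution, words, first_guess="irate", max_turns=6):
    """Play one game and return the list of (guess, tiles) pairs."""
    candidates = list(words)
    guess = first_guess
    history = []
    for _ in range(max_turns):
        tiles = feedback(guess, solution)
        history.append((guess, tiles))
        if guess == solution:
            break
        # Keep only the words that would have produced the same tiles.
        candidates = [w for w in candidates
                      if w != guess and feedback(guess, w) == tiles]
        if not candidates:
            break
        guess = next_guess(candidates)
    return history

toy_words = ["cable", "alone", "angle", "sable", "shake", "trade"]
for guess, tiles in solve("cable", toy_words):
    print(guess, tiles)
```

Running a loop like this over every possible answer is roughly how you would set up the kind of simulations mentioned in the introduction.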

If you want to try it out, the GitHub repository is here: repo
