Crossword Clone
Intro
In this blog we will create a NY Times style dense crossword. I will focus more on generating the crossword than the presentation. I use Go programming language because finding the word combinations is a little computation heavy. Go also has built in HTTP server that I can use to serve the API.
I will use the word lists I generated previously. If you want to see how to generate a word list, check out my post Create word list.
Wave Function Collapse (WFC)
We will use a simplified version of wave function collapse. If you are not familiar with the algorithm, check out this article by Robert Heaton: https://robertheaton.com/2018/12/17/wavefunction-collapse-algorithm/
So, how do we use the WFC for our crossword? There are three things we need to take note of in WFC:
- The variables we are trying to fill. This would be entries for our crossword.
- The domain which has all possible for the variables. This would be our word lists.
- The constraints which narrow down the possible values for the variables. This would be existing letters in the crossword. For example a five letter entry with no letters filled will have about 20k choices, but a file letter entry with the middle letter filled to ‘A’ will only 2k choices.
We get a list of all unfilled entries from the crossword and find then number of all possible choices for those entries, still staying true to the constraints. For the entry with the lowest number of entries, we will fill it with a random word from possible choices. We will keep repeating this until all the entries are filled.
We might run into situations where there might not be any words that fit into an entry because of the constraints. To solve this issue, we will save the crossword state at each iteration and revert back to the previous state and try with a different choice. We will also blacklist the previous choice for that state, so we don’t pick it again if we have to revert back to this state again.
Code
Crossword
We will start with creating a Go module
go mod init generator
touch crossword.go
In crossword.go create crossword
package and Crossword
struct
package crossword
type Crossword struct {
Height int
Width int
Cells [][]byte
}
[!TIP]
You can use
[][]rune
instead of[][]byte
if you are not worried about memory and performance. You might need to make some changes to other parts of the code if you use[][]rune
. I will use[][]byte
since I am only using ASCII characters.
We will create a function to create a new crossword. We will also initialize the cell values to .
(dot) (ASCII 46). This will be useful when using regular expressions to find matches in word lists.
func New(height, width int) *Crossword {
c := Crossword{Height: height, Width: width}
c.Cells = make([][]byte, height)
for i := 0; i < height; i++ {
c.Cells[i] = make([]byte, width)
}
for i := range c.Height {
for j := range c.Width {
c.Cells[i][j] = 46
}
}
return &c
}
Create functions to export crossword as byte array and initialize from byte array
func (c *Crossword) ToByteArray() []byte {
return bytes.Join(c.Cells, nil)
}
func (c *Crossword) FromByteArray(b []byte) {
for row := range c.Height {
copy(c.Cells[row], b[row*c.Width:(row+1)*c.Width])
}
}
Create functions to get the words at starting at a row and column. We will use -
(hyphen) (ASCII 45) to denote black squares
func (c *Crossword) getWordAcross(row, col int) string {
b := make([]byte, c.Width)
i := col
for i < c.Width && c.Cells[row][i] != 45 {
b[i] = c.Cells[row][i]
i++
}
return string(b[col:i])
}
Create a function to get entries
type CrosswordEntry struct {
Row int
Column int
Across bool
Word string
}
func (c *Crossword) GetEntries() []CrosswordEntry {
entries := make([]CrosswordEntry, 0)
for row := range c.Height {
for col := range c.Width {
if c.Cells[row][col] != 45 {
if col == 0 || c.Cells[row][col-1] == 45 {
word := c.getWordAcross(row, col)
if len(word) > 1 {
entries = append(entries, CrosswordEntry{Row: row, Column: col, Across: true, Word: word})
}
}
if row == 0 || c.Cells[row-1][col] == 45 {
word := c.getWordDown(row, col)
if len(word) > 1 {
entries = append(entries, CrosswordEntry{Row: row, Column: col, Across: false, Word: word})
}
}
}
}
}
return entries
}
Create a type State
to store the crossword state. cbytes
is the flattened cells array. target
is the word that will be filled in the current iteration. picked
is the word chosen from the word lists. kaput
is an array of words that led to a dead end in the later iterations and have been blacklisted.
type Target struct {
Row int
Column int
Across bool
}
type State struct {
cbytes []byte
target Target
picked string
kaput []string
}
Let’s create the last function to fill the crossword. Start by creating a state array with an initial state
func (c *Crossword) Fill(wordList *WordList) bool {
sa := []State{}
sa = append(sa, State{c.ToByteArray(), Target{2, 0, true}, "", []string{}})
}
Next we setup the loop that will fill the crossword. We can limit the number of iterations if needed. The wordList
is an array of list of words of length equal to the index of the array
type WordList []string
func (c *Crossword) Fill(wordList *WordList) bool {
sa := []State{}
sa = append(sa, State{c.ToByteArray(), Target{2, 0, true}, "", []string{}})
var done = false
for iterations := 0; !done && iterations < 1000; iterations++ {
}
return done
}
Every iteration of the loop begins with using the most recent state in the state array and ends with adding an empty state to the state array
func (c *Crossword) Fill(wordList *WordList) bool {
...
for iterations := 0; !done && iterations < 1000; iterations++ {
s := &sa[len(sa)-1]
...
newState := State{c.ToByteArray(), Target{0, 0, true}, "", []string{}}
sa = append(sa, newState)
}
...
}
Inside the loop, we get the crossword entries and then find the number of words matching each entry. We only try to search words for entries that have a .
(dot) in it. The regular expression is simply the entry word since we initialized them to a .
(dot).
At the same time, we also save the minimum number of choices found for the entries and the index of the entries
entries := c.GetEntries()
numMatches := make([]int, len(entries))
for i := range numMatches {
numMatches[i] = math.MaxInt
}
minMatch := math.MaxInt
minMatchIndex := -1
for i, entry := range entries {
if strings.Contains(entry.Word, ".") {
re := regexp.MustCompile(entry.Word)
numMatches[i] = len(re.FindAllStringIndex((*wordList)[len(entry.Word)], -1))
if numMatches[i] < minMatch {
minMatch = numMatches[i]
minMatchIndex = i
}
}
}
If there are no more matches found, check if the crossword is complete and break out if it is. If it is not complete, delete the current state and go back to the previous state and add the previous state chosen word to the kaput array and continue the loop. This handles the dead ends by backtracking to the previous state and blacklisting the word that led to the dead end.
if minMatchIndex == -1 {
correct := true
for _, entry := range entries {
re := regexp.MustCompile(entry.Word)
found := re.FindAllString((*wordList)[len(entry.Word)], -1)
if len(found) == 0 {
correct = false
break
}
}
if correct {
done = true
break
} else {
sa = sa[:len(sa)-1]
s = &sa[len(sa)-1]
s.kaput = append(s.kaput, s.picked)
c.FromByteArray(s.cbytes)
continue
}
}
If there are matches found, we will grab the matches and find the usable array (difference between matches found and the kaput array)
entry := entries[minMatchIndex]
matches := regexp.MustCompile(entry.Word).FindAllString((*wordList)[len(entry.Word)], -1)
usable := slices.DeleteFunc(matches, func(match string) bool {
return slices.Contains(s.kaput, match)
})
Like before, if there are no usable words, revert to previous state
if len(usable) == 0 {
sa = sa[:len(sa)-1]
s = &sa[len(sa)-1]
s.kaput = append(s.kaput, s.picked)
c.FromByteArray(s.cbytes)
continue
}
If there are words we can use, pick a random one
s.picked = usable[rand.Intn(len(usable))]
s.target = Target{entry.Row, entry.Column, entry.Across}
c.SetWord(entry.Row, entry.Column, entry.Across, []byte(s.picked))
OpenAI Clue Generation
We can OpenAI for generating clues. The OpenAI official Go library is still in beta, so this code might change later.
Initialize the OpenAI Client. This code uses environment variable OPENAI_API_KEY
which has the API key generated from the OpenAI dashboard.
client := openai.NewClient(
option.WithAPIKey(
os.Getenv("OPENAI_API_KEY"),
),
)
Create an array words from entries
entries := c.GetEntries()
answers := []string{}
for _, entry := range entries {
answers = append(answers, entry.Word)
}
Use the chat completion endpoint to generate the hints. We provide a generic system prompt and an example of the clues generated. We also ask OpenAI to respond with JSON
response, err := client.Chat.Completions.New(
context.Background(),
openai.ChatCompletionNewParams{
Model: openai.F(openai.ChatModelGPT4o),
Messages: openai.F([]openai.ChatCompletionMessageParamUnion{
openai.SystemMessage("You are a crossword clue generator. You will read a comma separated list of words and generate clues for them. You will respond in json format with the each word as name and its clue as value. Some words are a combination of multiple words. Pay attention to prefixes, suffixes and plurals. Try to be accurate and concise. Do not use the word or the form of the abbreviations in the clue."),
openai.UserMessage(`DOG,AG,ETC`),
openai.AssistantMessage(`{"DOG":"Man's best friend","AG":"Symbol for silver","ETC":"Other similar things (abbr)"}`),
openai.UserMessage(strings.Join(answers, ",")),
}),
},
)
if err != nil {
log.Fatal(err)
}
clues := map[string]string{}
json.Unmarshal([]byte(response.Choices[0].Message.Content), &clues)
Server
The last part is to use HTTP server to create the endpoint for new crossword generation.
We start with reading the word lists so they can be used for the lifetime of the server
func initWordList() {
wordList = make([]string, 5+1)
for i := 2; i < 5+1; i++ {
tmp, err := os.ReadFile("../generator/words-" + strconv.Itoa(i) + ".txt")
if err != nil {
log.Fatal(err)
}
wordList[i] = string(tmp)
}
}
func main() {
initWordList()
...
}
Add handler for /new
url path
function main() {
...
http.HandleFunc("/new", func(w http.ResponseWriter, r *http.Request) {
...
})
}
In the handler, we will create the crossword, set some black cells, and then fill it using the WFC algorithm. If the fill fails then we send HTTP status 500.
function main() {
...
http.HandleFunc("/new", func(w http.ResponseWriter, r *http.Request) {
...
numBlanks := rand.Intn(2)
for i := 0; i < numBlanks+1; i++ {
x := rand.Intn(5)
y := rand.Intn(5)
c.SetCell(x, y, 45)
c.SetCell(4-x, 4-y, 45)
}
if !c.Fill(&wordList) {
fmt.Println("Failed to fill crossword")
w.WriteHeader(http.StatusInternalServerError)
return
}
...
})
}
Next we generate clues as shown in the previous section, format the output and return JSON reponse
function main() {
...
http.HandleFunc("/new", func(w http.ResponseWriter, r *http.Request) {
...
for _, entry := range c.GetEntries() {
if entry.Across {
across = append(across, []interface{}{entry.Row, entry.Column, len(entry.Word), clues[entry.Word]})
} else {
down = append(down, []interface{}{entry.Row, entry.Column, len(entry.Word), clues[entry.Word]})
}
}
data := NewCrosswordResponse{Rows: c.Height, Columns: c.Width, Grid: c.ToByteArray(), Across: across, Down: down}
j, err := json.Marshal(data)
if err != nil {
log.Fatal(err)
}
w.Header().Set("Access-Control-Allow-Origin", "*")
w.Header().Set("Content-Type", "application/json")
w.Write(j)
...
})
}
Finally, we can start the server with
http.ListenAndServe(":8080", nil)
I will create a separate post for the frontend for the crossword. The code for the frontend is also in the same Github repo.