Word List

Intro

In this post, we will create lists of two- to five-letter words for use in other projects. We will use the words file available on macOS and Linux. This file is usually at /usr/share/dict/words.

Create word list in Go

We are going to use the VS Code IDE for this project. The Go extension for VS Code provides language and formatting support.

Create a directory for the files

mkdir generator
cd generator

Create Go module

go mod init generator
go: creating new go.mod: module generator

Create Go file

touch generator.go

In generator.go, start by adding the main package and the main function

package main

func main() {
}

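All of the following code goes inside our main function. The snippets use the log, os, regexp, strconv, and strings packages from the standard library, so these need to be imported at the top of the file.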

Read the words file into a local variable, and exit with an error message on failure

f, err := os.ReadFile("/usr/share/dict/words")

if err != nil {
  log.Fatal(err)
}

Our local variable f is of type []byte. Convert the contents of the file to a string

words := string(f)

We will use a for loop and a regular expression to find the words of a specific length.

(?m) enables multi-line mode, so ^ and $ apply to each line rather than to the whole text
^ and $ match the beginning and end of a line
[a-zA-Z]{5} matches exactly five ASCII letters

Our regex will be (?m)^[a-zA-Z]{` + strconv.Itoa(i) + `}$ with i being the loop index, so each pass through the loop matches words of exactly i letters.

for i := 2; i < 6; i++ {
  re := regexp.MustCompile(`(?m)^[a-zA-Z]{` + strconv.Itoa(i) + `}$`)
  tmp := re.FindAllString(words, -1)
}

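The regex FindAllString method returns the matches as a slice of strings. Passing -1 as the second argument asks for all matches. Until tmp is used in the next steps, the Go compiler will reject it as an unused variable.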
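To see the pattern at work without touching the real words file, here is a minimal sketch that runs the three-letter version of the regex over a small made-up sample. The helper name threeLetterWords and the sample text are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"regexp"
)

// threeLetterWords is a hypothetical helper: it returns every line of text
// that consists of exactly three ASCII letters.
func threeLetterWords(text string) []string {
	// (?m) makes ^ and $ match at the start and end of each line.
	re := regexp.MustCompile(`(?m)^[a-zA-Z]{3}$`)
	// -1 means "return all matches".
	return re.FindAllString(text, -1)
}

func main() {
	// Made-up sample that stands in for the contents of the words file.
	sample := "a\nan\ncat\nDog\nit's\nfour\nhouse\n"

	matches := threeLetterWords(sample)
	fmt.Println(matches)      // [cat Dog]
	fmt.Println(len(matches)) // 2
}
```

Note that "it's" is skipped because the apostrophe is not in [a-zA-Z], and longer words are skipped because ^ and $ anchor the match to the whole line.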

We will create a new file for each iteration of the loop, named after the word length. Edit the for loop created above

for i := 2; i < 6; i++ {
  re := regexp.MustCompile(`(?m)^[a-zA-Z]{` + strconv.Itoa(i) + `}$`)
  tmp := re.FindAllString(words, -1)

  w, err := os.Create("words-" + strconv.Itoa(i) + ".txt")
  if err != nil {
    log.Fatal(err)
  }
}

Join the matched strings with newline characters, convert them to upper case, write the result to the file we created, and we are done

for i := 2; i < 6; i++ {
  re := regexp.MustCompile(`(?m)^[a-zA-Z]{` + strconv.Itoa(i) + `}$`)
  tmp := re.FindAllString(words, -1)

  w, err := os.Create("words-" + strconv.Itoa(i) + ".txt")
  if err != nil {
    log.Fatal(err)
  }

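  // Write the upper-cased, newline-separated matches and check for errors.
  if _, err := w.WriteString(strings.ToUpper(strings.Join(tmp, "\n"))); err != nil {
    log.Fatal(err)
  }
  if err := w.Close(); err != nil {
    log.Fatal(err)
  }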
}

Run the main function with

go run .

This should create the files words-2.txt, words-3.txt, words-4.txt, and words-5.txt in the same folder.

You can change the loop parameters to create word lists of any length.
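As a rough sketch of what that could look like, here is the whole program in one piece, including the imports, with the loop bounds pulled out into constants. The names wordsOfLength, minLen, and maxLen are assumptions and do not appear in the steps above, and os.WriteFile is swapped in for the separate Create, WriteString, and Close calls.

```go
package main

import (
	"log"
	"os"
	"regexp"
	"strconv"
	"strings"
)

// wordsOfLength is a hypothetical helper: it returns every line of words
// that consists of exactly n ASCII letters.
func wordsOfLength(words string, n int) []string {
	re := regexp.MustCompile(`(?m)^[a-zA-Z]{` + strconv.Itoa(n) + `}$`)
	return re.FindAllString(words, -1)
}

func main() {
	// Assumed names for the loop bounds; change them to get other lengths.
	const minLen, maxLen = 2, 5

	f, err := os.ReadFile("/usr/share/dict/words")
	if err != nil {
		log.Fatal(err)
	}
	words := string(f)

	for i := minLen; i <= maxLen; i++ {
		matches := wordsOfLength(words, i)
		out := strings.ToUpper(strings.Join(matches, "\n"))

		// os.WriteFile creates (or truncates) the file, writes the data,
		// and closes the file in a single call.
		name := "words-" + strconv.Itoa(i) + ".txt"
		if err := os.WriteFile(name, []byte(out), 0o644); err != nil {
			log.Fatal(err)
		}
	}
}
```

Setting minLen and maxLen to 6 and 8, for example, would produce words-6.txt through words-8.txt instead.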

Create word list in Java

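Here is a similar example in Java that uses neither a counting loop nor regular expressions. It reads the words file line by line and writes each line to one of three output files (words3.txt, words4.txt, or words5.txt) depending on its length. Unlike the Go version, it checks only the length of each line, so lines containing non-letter characters are kept.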

import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.util.Scanner;

public class Main {
    public static void main(String[] args) {
        File inFile = new File("/usr/share/dict/words");
        File outFile3 = new File("words3.txt");
        File outFile4 = new File("words4.txt");
        File outFile5 = new File("words5.txt");
        
        boolean write3 = true;
        boolean write4 = true;
        boolean write5 = true;

        FileWriter writer3 = null;
        FileWriter writer4 = null;
        FileWriter writer5 = null;
        try {
            outFile3.createNewFile();
            writer3 = new FileWriter(outFile3);
            writer3.write("");
        } catch (Exception e) {
            write3 = false;
            e.printStackTrace();
        }

        try {
            outFile4.createNewFile();
            writer4 = new FileWriter(outFile4);
            writer4.write("");
        } catch (Exception e) {
            write4 = false;
            e.printStackTrace();
        }

        try {
            outFile5.createNewFile();
            writer5 = new FileWriter(outFile5);
            writer5.write("");
        } catch (Exception e) {
            write5 = false;
            e.printStackTrace();
        }


        try (Scanner scanner = new Scanner(inFile)) {
            while (scanner.hasNextLine()) {
                String word = scanner.nextLine();
                if (word.length() == 3 && write3) {
                    try {
                        writer3.write(word.toUpperCase() + "\n");
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                } else if (word.length() == 4 && write4) {
                    try {
                        writer4.write(word.toUpperCase() + "\n");
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                } else if (word.length() == 5 && write5) {
                    try {
                        writer5.write(word.toUpperCase() + "\n");
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }
            // No explicit close is needed: try-with-resources closes the scanner.
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }

        try {
            if (write3) {
                writer3.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

        try {
            if (write4) {
                writer4.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

        try {
            if (write5) {
                writer5.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}