spellchecker

package module
v2.0.1 Latest
Published: Sep 27, 2025 License: MIT Imports: 13 Imported by: 0

README

Spellchecker

Yet another spellchecker written in Go.

Features:

  • very compact database: ~1 MB for 30,000 unique words
  • average time to fix a single word: ~35 µs
  • achieves about 70–74% accuracy on Peter Norvig’s test sets (see benchmarks)

Installation

go get -v github.com/f1monkey/spellchecker/v2

Usage

Quick start

package main

import (
	"fmt"
	"os"

	"github.com/f1monkey/spellchecker/v2"
)

func main() {
	// Create a new instance
	sc, err := spellchecker.New(
		"abcdefghijklmnopqrstuvwxyz1234567890", // allowed symbols, other symbols will be ignored
	)
	if err != nil {
		panic(err)
	}

	// The weight increases the likelihood that the word will be chosen as a correction.
	weight := uint(1)

	// Load data from any io.Reader
	in, err := os.Open("data/sample.txt")
	if err != nil {
		panic(err)
	}
	defer in.Close()

	if err := sc.AddFrom(&spellchecker.AddOptions{Weight: weight}, in); err != nil {
		panic(err)
	}
	// OR pass nil instead of the options (a reader can only be consumed once):
	// err = sc.AddFrom(nil, in)

	// Add words manually
	sc.Add(nil, "lock", "stock", "and", "two", "smoking", "barrels")

	// Check if a word is valid
	result := sc.IsCorrect("coffee")
	fmt.Println(result) // true

	// Correct a single word
	fixed, isCorrect := sc.Fix(nil, "awepon")
	fmt.Println(isCorrect) // false
	fmt.Println(fixed) // weapon

	// Find up to 10 suggestions for a word
	matches := sc.Suggest(nil, "rang", 10)
	for _, m := range matches.Suggestions {
		fmt.Println(m.Value) // e.g. range, orange
	}
}

Options

See options.go for the list of available options.

Save/load

	sc, err := spellchecker.New("abc")
	if err != nil {
		panic(err)
	}

	// Save data to any io.Writer
	out, err := os.Create("data/out.bin")
	if err != nil {
		panic(err)
	}
	if err := sc.Save(out); err != nil {
		panic(err)
	}
	out.Close()

	// Load data back from io.Reader
	in, err := os.Open("data/out.bin")
	if err != nil {
		panic(err)
	}
	defer in.Close()

	sc, err = spellchecker.Load(in)
	if err != nil {
		panic(err)
	}

Custom score function

You can provide a custom scoring function if needed:

	var fn spellchecker.FilterFunc = func(src, candidate []rune, count uint) (float64, bool) {
		// you can calculate Levenshtein distance here (see defaultFilterFunc in options.go for example)

		return 1.0, true // constant score
	}

	sc, err := spellchecker.New("abc")
	if err != nil {
		// handle err
	}

	// Pass the function through SearchOptions
	sc.Fix(&spellchecker.SearchOptions{
		MaxErrors:  spellchecker.DefaultMaxErrors,
		FilterFunc: fn,
	}, "word")
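
The comment in the snippet above suggests computing a Levenshtein distance inside the scoring function. A minimal sketch of that idea follows. It is not the library's defaultFilterFunc: the names levenshtein and scoreByDistance are hypothetical, the local FilterFunc type only mirrors the documented signature so the example compiles without the module, and "higher score is better" is an assumption.

```go
package main

import "fmt"

// FilterFunc is a local stand-in that mirrors the documented
// spellchecker.FilterFunc signature.
type FilterFunc func(src, candidate []rune, count uint) (float64, bool)

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the edit distance between a and b, where each
// insertion, deletion and substitution costs 1.
func levenshtein(a, b []rune) int {
	prev := make([]int, len(b)+1)
	curr := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(a); i++ {
		curr[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(b)]
}

// scoreByDistance (hypothetical) rejects candidates more than maxDistance
// edits away and gives closer candidates a higher score.
// The count argument is ignored in this sketch.
func scoreByDistance(maxDistance int) FilterFunc {
	return func(src, candidate []rune, count uint) (float64, bool) {
		d := levenshtein(src, candidate)
		if d > maxDistance {
			return 0, false
		}
		return 1 / float64(1+d), true
	}
}

func main() {
	fn := scoreByDistance(2)
	score, ok := fn([]rune("awepon"), []rune("weapon"), 1)
	fmt.Println(score, ok)
	_, ok = fn([]rune("awepon"), []rune("barrels"), 1)
	fmt.Println(ok)
}
```

With the real package, a function of this shape would be assigned to the FilterFunc field of SearchOptions.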

Benchmarks

Tests are based on data from Peter Norvig's article about spelling correction.

Test set 1:
Running tool: /usr/bin/go test -benchmem -run=^$ -bench ^Benchmark_Norvig1$ github.com/f1monkey/spellchecker -count=1

goos: linux
goarch: amd64
pkg: github.com/f1monkey/spellchecker
cpu: 13th Gen Intel(R) Core(TM) i9-13980HX
Benchmark_Norvig1-32    	     357	   3305052 ns/op	        74.44 success_percent	       201.0 success_words	       270.0 total_words	  768899 B/op	   13302 allocs/op
PASS
ok  	github.com/f1monkey/spellchecker	3.801s

Test set 2:
Running tool: /usr/bin/go test -benchmem -run=^$ -bench ^Benchmark_Norvig2$ github.com/f1monkey/spellchecker -count=1

goos: linux
goarch: amd64
pkg: github.com/f1monkey/spellchecker
cpu: 13th Gen Intel(R) Core(TM) i9-13980HX
Benchmark_Norvig2-32    	     236	   5257185 ns/op	        71.25 success_percent	       285.0 success_words	       400.0 total_words	 1201260 B/op	   19346 allocs/op
PASS
ok  	github.com/f1monkey/spellchecker	4.350s
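
The success_percent figures in both runs are consistent with success_words divided by total_words. A quick check of that arithmetic, assuming this is how the metric is derived (the benchmark source is not shown here, and successPercent is a hypothetical helper):

```go
package main

import "fmt"

// successPercent returns the share of corrected words as a percentage.
func successPercent(successWords, totalWords float64) float64 {
	return successWords / totalWords * 100
}

func main() {
	fmt.Printf("%.2f\n", successPercent(201, 270)) // 74.44 (test set 1)
	fmt.Printf("%.2f\n", successPercent(285, 400)) // 71.25 (test set 2)
}
```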

Documentation

Constants

const DefaultAlphabet = "abcdefghijklmnopqrstuvwxyz"
const DefaultMaxErrors = 2

Types

type AddOptions

type AddOptions struct {
	Weight uint
	// Splitter is a splitter func for AddFrom() reader
	Splitter bufio.SplitFunc
}

type FilterFunc

type FilterFunc func(src, candidate []rune, count uint) (float64, bool)

type Match

type Match struct {
	Value string
	Score float64
}

type SearchOptions

type SearchOptions struct {
	// MaxErrors — the maximum allowed difference in bits
	// between the "search word" and a "dictionary word".
	// - deletion is a 1-bit change (proble → problem)
	// - insertion is a 1-bit change (problemm → problem)
	// - substitution is a 2-bit change (problam → problem)
	// - transposition is a 0-bit change (problme → problem)
	//
	// It is not recommended to set this value greater than 2,
	// as it can significantly affect performance.
	MaxErrors int

	// FilterFunc compares the source word with a candidate word.
	// It returns the candidate's score and a boolean flag.
	// If the flag is false, the candidate will be completely filtered out.
	FilterFunc FilterFunc
}
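
One way to read the "difference in bits" rules in the MaxErrors comment is as a letter-presence bitmask: one bit per alphabet letter that occurs in the word, compared with XOR. The sketch below works under that assumption only; it is not taken from the library's source, and letterMask and bitDiff are hypothetical names.

```go
package main

import (
	"fmt"
	"math/bits"
	"strings"
)

// alphabet has the same value as DefaultAlphabet in this package.
const alphabet = "abcdefghijklmnopqrstuvwxyz"

// letterMask sets one bit per distinct alphabet letter present in word.
// Characters outside the alphabet are ignored.
func letterMask(word string) uint32 {
	var mask uint32
	for _, r := range word {
		if i := strings.IndexRune(alphabet, r); i >= 0 {
			mask |= 1 << uint(i)
		}
	}
	return mask
}

// bitDiff counts the bits that differ between the masks of a and b.
func bitDiff(a, b string) int {
	return bits.OnesCount32(letterMask(a) ^ letterMask(b))
}

func main() {
	fmt.Println(bitDiff("proble", "problem"))  // 1: the letter m is missing
	fmt.Println(bitDiff("problam", "problem")) // 2: a is extra, e is missing
	fmt.Println(bitDiff("problme", "problem")) // 0: same set of letters
}
```

In this simplified model a repeated letter leaves the mask unchanged, so `problemm` and `problem` differ by 0 bits here, while an extra letter that is new to the word costs 1 bit.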

type Spellchecker

type Spellchecker struct {
	// contains filtered or unexported fields
}

func Load

func Load(reader io.Reader) (*Spellchecker, error)

Load reads spellchecker data from the provided reader and decodes it.

func New

func New(alphabet string) (*Spellchecker, error)

func (*Spellchecker) Add

func (m *Spellchecker) Add(opts *AddOptions, words ...string)

Add adds the provided words to the dictionary with a custom weight.

func (*Spellchecker) AddFrom

func (m *Spellchecker) AddFrom(opts *AddOptions, input io.Reader) error

AddFrom reads input, splits it with the spellchecker's splitter func, and adds the resulting words to the dictionary.
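
The Splitter field of AddOptions is a bufio.SplitFunc, so the standard library splitters can show what different choices yield. The sketch below uses only bufio.Scanner. How AddFrom applies the splitter internally, and which splitter it uses when none is set, are not stated in this document; tokens, words and lines are hypothetical helpers.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// tokens scans input with the given split function and collects the tokens.
func tokens(input string, split bufio.SplitFunc) []string {
	scanner := bufio.NewScanner(strings.NewReader(input))
	scanner.Split(split)
	var out []string
	for scanner.Scan() {
		out = append(out, scanner.Text())
	}
	return out
}

// words splits on whitespace, one token per word.
func words(input string) []string { return tokens(input, bufio.ScanWords) }

// lines splits on line breaks, one token per line.
func lines(input string) []string { return tokens(input, bufio.ScanLines) }

func main() {
	input := "lock stock\nand two\nsmoking barrels\n"
	fmt.Println(len(words(input))) // 6
	fmt.Println(len(lines(input))) // 3
}
```

Under these assumptions, passing `&spellchecker.AddOptions{Splitter: bufio.ScanLines}` would presumably hand each whole line to the dictionary as one entry, which only makes sense for files with one word per line.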

func (*Spellchecker) Fix

func (s *Spellchecker) Fix(opts *SearchOptions, word string) (string, bool)

func (*Spellchecker) IsCorrect

func (s *Spellchecker) IsCorrect(word string) bool

IsCorrect checks whether the provided word is in the dictionary.

func (*Spellchecker) Save

func (m *Spellchecker) Save(w io.Writer) error

Save encodes spellchecker data and writes it to the provided writer.

func (*Spellchecker) Suggest

func (s *Spellchecker) Suggest(opts *SearchOptions, word string, n int) SuggestionResult

Suggest finds the top n suggestions for the word. It returns the spellchecker scores along with the words.

type SuggestionResult

type SuggestionResult struct {
	ExactMatch  bool // if true, the word is correct
	Suggestions []Match
}
