tokenizer

package
v0.5.0
Published: Feb 9, 2026 License: MIT Imports: 6 Imported by: 0

Documentation

Overview

Package tokenizer scans SQL source code into tokens.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func IsKeyword

func IsKeyword(s string) bool

IsKeyword reports whether the provided string matches a known keyword.
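The documentation does not say how the lookup is implemented; a check of this shape is commonly a case-insensitive match against a fixed set. A minimal self-contained sketch (the keyword set here is illustrative, not the package's actual list):

```go
package main

import (
	"fmt"
	"strings"
)

// keywords is an illustrative subset; the real package presumably
// carries a full SQL keyword list.
var keywords = map[string]bool{
	"SELECT": true, "FROM": true, "WHERE": true,
	"CREATE": true, "TABLE": true, "NOT": true, "NULL": true,
}

// IsKeyword reports whether s matches a known keyword. SQL keywords
// are case-insensitive, so the input is uppercased before lookup.
func IsKeyword(s string) bool {
	return keywords[strings.ToUpper(s)]
}

func main() {
	fmt.Println(IsKeyword("select")) // true
	fmt.Println(IsKeyword("users"))  // false
}
```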

func NormalizeIdentifier

func NormalizeIdentifier(text string) string

NormalizeIdentifier removes optional quoting from an identifier and unescapes its quoted content.
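The set of quoting styles the package accepts is not documented; SQL dialects variously allow double quotes, backticks, and square brackets, with doubled quote characters as escapes. A self-contained sketch under those assumptions:

```go
package main

import (
	"fmt"
	"strings"
)

// NormalizeIdentifier strips one layer of optional SQL identifier
// quoting and unescapes doubled quote characters. The quote styles
// handled here ("...", `...`, [...]) are an assumption about the
// package, not a documented contract.
func NormalizeIdentifier(text string) string {
	if len(text) < 2 {
		return text
	}
	first, last := text[0], text[len(text)-1]
	inner := text[1 : len(text)-1]
	switch {
	case first == '"' && last == '"':
		return strings.ReplaceAll(inner, `""`, `"`)
	case first == '`' && last == '`':
		return strings.ReplaceAll(inner, "``", "`")
	case first == '[' && last == ']':
		return inner
	}
	return text
}

func main() {
	fmt.Println(NormalizeIdentifier(`"my ""table"""`)) // my "table"
	fmt.Println(NormalizeIdentifier("plain"))          // plain
}
```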

func ScanSeq

func ScanSeq(path string, src []byte, captureDocs bool) iter.Seq[Token]

ScanSeq returns an iterator over tokens in the source. This is memory-efficient for large files and enables early termination. Use this when you only need to process tokens sequentially. For random access, use Scan() instead.

Example:

for tok := range tokenizer.ScanSeq(path, src, false) {
    if tok.Kind == tokenizer.KindEOF {
        break
    }
    process(tok)
}

Types

type Error

type Error struct {
	Path    string
	Line    int
	Column  int
	Message string
}

Error describes a positional scanning error suitable for diagnostics.

func (*Error) Error

func (e *Error) Error() string

Error returns the printable representation of the tokenizer error.
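The exact message layout is not documented; the conventional Go diagnostic format is path:line:column followed by the message. A sketch on that assumption, with the struct fields copied from the documentation above:

```go
package main

import "fmt"

// Error describes a positional scanning error suitable for diagnostics
// (fields copied from the package documentation).
type Error struct {
	Path    string
	Line    int
	Column  int
	Message string
}

// Error formats the error in the conventional path:line:column style;
// the package's exact format is an assumption here.
func (e *Error) Error() string {
	return fmt.Sprintf("%s:%d:%d: %s", e.Path, e.Line, e.Column, e.Message)
}

func main() {
	err := &Error{Path: "schema.sql", Line: 3, Column: 7, Message: "unterminated string"}
	fmt.Println(err) // schema.sql:3:7: unterminated string
}
```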

type Kind

type Kind int

Kind represents the classification of a scanned token.

const (
	// KindInvalid represents an unrecognized or placeholder token.
	KindInvalid Kind = iota
	// KindIdentifier represents bare or quoted identifiers.
	KindIdentifier
	// KindKeyword represents SQL keywords normalized to uppercase.
	KindKeyword
	// KindNumber represents numeric literals.
	KindNumber
	// KindString represents string literals using single quotes.
	KindString
	// KindBlob represents blob literals of the form X'...'.
	KindBlob
	// KindSymbol represents punctuation or operator symbols.
	KindSymbol
	// KindParam represents PostgreSQL-style positional parameters ($1, $2, etc.).
	KindParam
	// KindDocComment represents a documentation comment captured for a following statement.
	KindDocComment
	// KindEOF marks the logical end of the input.
	KindEOF
)

func (Kind) String

func (k Kind) String() string
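String carries no doc comment; stringer-style methods conventionally return the constant's name. A self-contained sketch with the constants copied from above (whether the package keeps the "Kind" prefix in the output is an assumption):

```go
package main

import "fmt"

// Kind represents the classification of a scanned token
// (constants copied from the package documentation).
type Kind int

const (
	KindInvalid Kind = iota
	KindIdentifier
	KindKeyword
	KindNumber
	KindString
	KindBlob
	KindSymbol
	KindParam
	KindDocComment
	KindEOF
)

// kindNames maps each Kind constant to its name, in declaration order.
var kindNames = [...]string{
	"KindInvalid", "KindIdentifier", "KindKeyword", "KindNumber",
	"KindString", "KindBlob", "KindSymbol", "KindParam",
	"KindDocComment", "KindEOF",
}

// String returns the constant's name, falling back to Kind(n) for
// out-of-range values, as stringer-generated code does.
func (k Kind) String() string {
	if k < 0 || int(k) >= len(kindNames) {
		return fmt.Sprintf("Kind(%d)", int(k))
	}
	return kindNames[k]
}

func main() {
	fmt.Println(KindKeyword) // KindKeyword
	fmt.Println(Kind(42))    // Kind(42)
}
```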

type Scanner

type Scanner struct {
	// contains filtered or unexported fields
}

Scanner maintains scanning state over a schema source.

type Span

type Span struct {
	File        string
	StartLine   int
	StartColumn int
	EndLine     int
	EndColumn   int
}

Span represents a best-effort start and end position within a source file.

func NewSpan

func NewSpan(tok Token) Span

NewSpan returns a span covering a single token.

func SpanBetween

func SpanBetween(start, end Token) Span

SpanBetween returns a span that covers both the start and end tokens, inclusive.

func (Span) Extend

func (s Span) Extend(tok Token) Span

Extend expands the span to include the provided token.
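The three span constructors compose naturally: SpanBetween can be written as NewSpan followed by Extend. A self-contained sketch, under the assumption that a token's end column is its start column plus the length of its text (the Token and Span fields follow the documentation; Token.Kind is simplified to int here):

```go
package main

import "fmt"

// Token is a unit emitted by the scanner with positional metadata.
type Token struct {
	Kind   int
	Text   string
	File   string
	Line   int
	Column int
}

// Span represents a best-effort start and end position within a file.
type Span struct {
	File        string
	StartLine   int
	StartColumn int
	EndLine     int
	EndColumn   int
}

// NewSpan returns a span covering a single token. Treating the end
// column as Column+len(Text) is an assumption about the package.
func NewSpan(tok Token) Span {
	return Span{
		File:        tok.File,
		StartLine:   tok.Line,
		StartColumn: tok.Column,
		EndLine:     tok.Line,
		EndColumn:   tok.Column + len(tok.Text),
	}
}

// SpanBetween returns a span covering both tokens, inclusive.
func SpanBetween(start, end Token) Span {
	return NewSpan(start).Extend(end)
}

// Extend expands the span to include the provided token, advancing
// the end position only if the token ends later.
func (s Span) Extend(tok Token) Span {
	end := NewSpan(tok)
	if end.EndLine > s.EndLine ||
		(end.EndLine == s.EndLine && end.EndColumn > s.EndColumn) {
		s.EndLine, s.EndColumn = end.EndLine, end.EndColumn
	}
	return s
}

func main() {
	a := Token{Text: "CREATE", File: "schema.sql", Line: 1, Column: 1}
	b := Token{Text: "TABLE", File: "schema.sql", Line: 1, Column: 8}
	fmt.Printf("%+v\n", SpanBetween(a, b))
}
```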

type Token

type Token struct {
	Kind   Kind
	Text   string
	File   string
	Line   int
	Column int
}

Token is a unit emitted by the scanner with positional metadata.

func Scan

func Scan(path string, src []byte, captureDocs bool) ([]Token, error)

Scan tokenizes the provided schema source and returns the token stream.
