Documentation ¶
Overview ¶
Package avro encodes and decodes data per the Apache Avro specification.
Parse an Avro JSON schema with Parse (or MustParse for package-level vars), then call Schema.Encode / Schema.Decode for binary encoding, or Schema.EncodeJSON / Schema.DecodeJSON for JSON encoding. Use SchemaFor to infer a schema from a Go struct type, or Schema.Root to inspect a parsed schema's structure.
Basic usage ¶
schema := avro.MustParse(`{
"type": "record",
"name": "User",
"fields": [
{"name": "name", "type": "string"},
{"name": "age", "type": "int"}
]
}`)
type User struct {
Name string `avro:"name"`
Age int `avro:"age"`
}
// Encode
data, err := schema.Encode(&User{Name: "Alice", Age: 30})
// Decode
var u User
_, err = schema.Decode(data, &u)
JSON encoding ¶
Schema.EncodeJSON is schema-aware and handles bytes, unions, and NaN/Infinity floats correctly — use it instead of encoding/json.Marshal when serializing decoded Avro data to JSON. Options control the output format: TaggedUnions for Avro JSON union wrappers ({"type": value}), TagLogicalTypes for qualified branch names, and LinkedinFloats for the goavro NaN/Infinity convention.
Encoding from JSON input ¶
Data from encoding/json.Unmarshal (map[string]any with float64 numbers and string timestamps) can be encoded directly. Missing map keys are filled from schema defaults, encoding/json.Number is accepted for all numeric types, and timestamp fields accept RFC 3339 strings. String fields accept encoding.TextAppender and encoding.TextMarshaler implementations (with encoding.TextUnmarshaler on decode).
Schema evolution ¶
Avro data is always written with a specific schema — the "writer schema." When you read that data later, your application may expect a different schema — the "reader schema." For example, you may have added a field, removed one, or widened a type from int to long. The data on disk doesn't change, but your code expects the new layout.
Resolve bridges this gap. Given the writer and reader schemas, it returns a new schema that knows how to decode the old wire format and produce values in the reader's layout:
- Fields in the reader but not the writer are filled from defaults.
- Fields in the writer but not the reader are skipped.
- Fields that exist in both are matched by name (or alias) and decoded, with type promotion applied where needed (e.g. int → long).
You typically get the writer schema from the data itself: an OCF file header embeds it, and schema registries store it by ID or fingerprint.
As a concrete example, suppose v1 of your application wrote User records with just a name:
var writerSchema = avro.MustParse(`{
"type": "record", "name": "User",
"fields": [
{"name": "name", "type": "string"}
]
}`)
In v2, you added an email field with a default:
var readerSchema = avro.MustParse(`{
"type": "record", "name": "User",
"fields": [
{"name": "name", "type": "string"},
{"name": "email", "type": "string", "default": ""}
]
}`)
type User struct {
Name string `avro:"name"`
Email string `avro:"email"`
}
To read old v1 data with your v2 struct, resolve the two schemas:
resolved, err := avro.Resolve(writerSchema, readerSchema)
// Decode v1 data: "email" is absent in the old data, so it gets
// the reader default ("").
var u User
_, err = resolved.Decode(v1Data, &u)
// u == User{Name: "Alice", Email: ""}
If you just want to check whether two schemas are compatible without building a resolved schema, use CheckCompatibility.
Struct tags ¶
Use the "avro" struct tag to control field mapping and schema inference. The format is avro:"[name][,option]..." where the name maps the Go field to the Avro field name (empty = use Go field name, "-" = exclude).
Encoding/decoding options:
avro:"name"      // map to Avro field "name"
avro:"-"         // exclude field
avro:",inline"   // flatten nested struct fields into parent record
avro:",omitzero" // encode zero values as the schema default
Schema inference options (used by SchemaFor):
avro:",default=value"    // set field default (must be last option; scalars only)
avro:",alias=old_name"   // field alias for evolution (repeatable)
avro:",timestamp-micros" // override logical type (also: timestamp-nanos, date, time-millis, time-micros)
avro:",decimal(10,2)"    // decimal logical type with precision and scale
avro:",uuid"             // UUID logical type
When encoding a map[string]any as a record, missing keys are filled from the schema's default values. For structs, omitzero does the same for zero-valued fields (or fields whose IsZero() method returns true).
Embedded (anonymous) struct fields are automatically inlined. To prevent inlining, give the field an explicit name tag. When multiple fields at different depths resolve to the same name, the shallowest wins; among fields at the same depth, a tagged field wins over an untagged one.
Custom types ¶
CustomType registers custom Go type conversions for logical types, domain types, or to replace built-in behavior. A matching custom type replaces the built-in logical type handler entirely — callbacks receive raw Avro-native values, not enriched types. Use NewCustomType for type-safe primitive conversions, or the CustomType struct directly for complex cases (records, fixed types, property-based dispatch). Custom types are registered per-schema via SchemaOpt.
Parsing options ¶
Parse and SchemaCache.Parse accept WithLaxNames to allow non-standard characters in type and field names.
Errors ¶
Encode and decode errors can be inspected with errors.As:
- *SemanticError: type mismatch (includes a dotted field path for nested records)
- *ShortBufferError: input truncated mid-value
- *CompatibilityError: schema evolution incompatibility
Other features ¶
- Schema Cache: SchemaCache accumulates named types across Parse calls for schema registry workflows
- Schema Introspection: Schema.Root returns a SchemaNode; Schema.String returns the original JSON
- Single Object Encoding: Schema.AppendSingleObject, Schema.DecodeSingleObject
- Fingerprinting: Schema.Canonical, Schema.Fingerprint, NewRabin
- Object Container Files: the github.com/twmb/avro/ocf sub-package
Example ¶
package main
import (
"fmt"
"log"
"github.com/twmb/avro"
)
func main() {
schema := avro.MustParse(`{
"type": "record",
"name": "User",
"fields": [
{"name": "name", "type": "string"},
{"name": "age", "type": "int"}
]
}`)
type User struct {
Name string `avro:"name"`
Age int32 `avro:"age"`
}
data, err := schema.Encode(&User{Name: "Alice", Age: 30})
if err != nil {
log.Fatal(err)
}
var u User
if _, err := schema.Decode(data, &u); err != nil {
log.Fatal(err)
}
fmt.Printf("%s is %d\n", u.Name, u.Age)
}
Output: Alice is 30
Index ¶
- Variables
- func CheckCompatibility(writer, reader *Schema) error
- func NewRabin() hash.Hash64
- func SingleObjectFingerprint(data []byte) (fp [8]byte, rest []byte, err error)
- type CompatibilityError
- type CustomType
- type Duration
- type Opt
- type Schema
- func (s *Schema) AppendEncode(dst []byte, v any, opts ...Opt) ([]byte, error)
- func (s *Schema) AppendEncodeJSON(dst []byte, v any, opts ...Opt) ([]byte, error)
- func (s *Schema) AppendSingleObject(dst []byte, v any, opts ...Opt) ([]byte, error)
- func (s *Schema) Canonical() []byte
- func (s *Schema) Decode(src []byte, v any, opts ...Opt) ([]byte, error)
- func (s *Schema) DecodeJSON(src []byte, v any, opts ...Opt) error
- func (s *Schema) DecodeSingleObject(data []byte, v any, opts ...Opt) ([]byte, error)
- func (s *Schema) Encode(v any, opts ...Opt) ([]byte, error)
- func (s *Schema) EncodeJSON(v any, opts ...Opt) ([]byte, error)
- func (s *Schema) Fingerprint(h hash.Hash) []byte
- func (s *Schema) Root() SchemaNode
- func (s *Schema) String() string
- type SchemaCache
- type SchemaField
- type SchemaNode
- type SchemaOpt
- type SemanticError
- type ShortBufferError
Examples ¶
Constants ¶
This section is empty.
Variables ¶
var ErrSkipCustomType = errors.New("avro: skip custom type")
ErrSkipCustomType is returned from a CustomType Encode or Decode function to indicate the value is not handled by this custom type. The library falls through to the next matching custom type or to built-in behavior.
Functions ¶
func CheckCompatibility ¶
CheckCompatibility reports whether data written with the writer schema can be read by the reader schema. It returns nil on success or a *CompatibilityError describing the first incompatibility.
See Resolve for a note on argument order.
Types ¶
type CompatibilityError ¶
type CompatibilityError struct {
// Path is the dotted path to the incompatible element (e.g. "User.address.zip").
Path string
// ReaderType is the Avro type in the reader schema.
ReaderType string
// WriterType is the Avro type in the writer schema.
WriterType string
// Detail describes the specific incompatibility.
Detail string
}
CompatibilityError describes an incompatibility between a reader and writer schema, as returned by CheckCompatibility and Resolve.
func (*CompatibilityError) Error ¶
func (e *CompatibilityError) Error() string
type CustomType ¶ added in v1.3.0
type CustomType struct {
// LogicalType narrows matching to schema nodes with this logicalType.
LogicalType string
// AvroType narrows matching to schema nodes of this Avro type
// (e.g. "long", "bytes", "record"). Also used by SchemaFor to
// infer the underlying Avro type.
AvroType string
// GoType adds an encode-time filter: when set, the Encode function
// only fires when the value's concrete type matches GoType. Values
// of other types pass through to the underlying serializer unchanged.
// If nil, Encode fires for all values on matched schema nodes
// (those matching LogicalType/AvroType).
//
// [SchemaFor] uses GoType to match struct fields: when a field's Go
// type equals GoType, SchemaFor emits AvroType + LogicalType (or
// Schema) instead of the default type mapping. If nil, the custom
// type does not affect schema generation, but is still wired into
// the returned [*Schema] for encode/decode.
GoType reflect.Type
// Schema is the full schema to emit in SchemaFor. Only needed for
// types requiring extra metadata (fixed needs name+size, decimal
// needs precision+scale, records need fields). If nil, SchemaFor
// infers from AvroType + LogicalType.
Schema *SchemaNode
// Encode converts a custom Go value to an Avro-native value,
// called before serialization. Return [ErrSkipCustomType] to fall
// through to the next matching custom type or built-in behavior.
// Any other non-nil error is fatal. If nil, default encoding is used.
Encode func(v any, schema *SchemaNode) (any, error)
// Decode converts an Avro-native value to a custom Go value,
// called after deserialization. Return [ErrSkipCustomType] to fall
// through. Any other non-nil error is fatal. If nil, default
// decoding is used.
Decode func(v any, schema *SchemaNode) (any, error)
// contains filtered or unexported fields
}
CustomType defines a custom conversion between a Go type and an Avro type. Use this when you need full control over the type mapping — for example, to map a custom Go struct to/from an Avro fixed or record, to handle complex Avro types (records, arrays, maps) as backing types, or to dispatch on schema properties rather than logical type names. For simpler cases where the backing type is a primitive, prefer NewCustomType which infers the wiring from type parameters.
Pass to Parse or SchemaFor as a SchemaOpt.
Matching at parse time: LogicalType and AvroType are checked against schema nodes. All non-empty criteria must match.
- LogicalType only: matches any schema node with that logicalType
- LogicalType + AvroType: matches that logicalType on that Avro type
- AvroType only: matches all nodes of that Avro type
- Neither: matches ALL schema nodes (use with ErrSkipCustomType for property-based dispatch like Kafka Connect types)
At encode time, GoType is also checked: the Encode function only fires when the value's type matches GoType. This prevents the codec from intercepting native values (e.g. a raw int64 passes through without conversion for a custom-typed long field).
A matching custom type replaces the built-in logical type handler entirely. Encode and Decode callbacks receive raw Avro-native values (int32 for int, int64 for long, []byte for bytes/fixed, etc.), not the enriched types that built-in handlers produce (time.Time, time.Duration, etc.). Among user registrations, first match wins.
For custom types backed by complex Avro types (records, arrays, maps), use the struct form directly — the Encode function can return map[string]any, []any, etc. NewCustomType is limited to primitive backing types.
Example (Override) ¶
package main
import (
"fmt"
"github.com/twmb/avro"
)
func main() {
// Use CustomType directly to override a built-in logical type handler.
// Here we suppress the timestamp-millis → time.Time conversion and
// keep the raw int64 epoch millis.
schema := avro.MustParse(`{
"type": "record", "name": "Event",
"fields": [
{"name": "ts", "type": {"type": "long", "logicalType": "timestamp-millis"}}
]
}`, avro.CustomType{
LogicalType: "timestamp-millis",
Decode: func(v any, _ *avro.SchemaNode) (any, error) {
return v, nil // pass through raw int64
},
})
data, _ := schema.Encode(map[string]any{"ts": int64(1767225600000)})
var out any
schema.Decode(data, &out)
m := out.(map[string]any)
fmt.Printf("ts type: %T\n", m["ts"])
}
Output: ts type: int64
Example (PropertyDispatch) ¶
package main
import (
"fmt"
"github.com/twmb/avro"
)
func main() {
// CustomType with no LogicalType/AvroType/GoType matches ALL schema
// nodes. Use ErrSkipCustomType to selectively handle nodes based on
// schema properties, e.g. Kafka Connect type annotations.
ct := avro.CustomType{
Decode: func(v any, node *avro.SchemaNode) (any, error) {
if node.Props["connect.type"] == "double-it" {
return v.(int64) * 2, nil
}
return nil, avro.ErrSkipCustomType
},
}
// Properties on the type object are available via node.Props in the
// custom type callback.
schema := avro.MustParse(`{
"type": "record", "name": "R",
"fields": [
{"name": "x", "type": {"type": "long", "connect.type": "double-it"}},
{"name": "y", "type": "long"}
]
}`, ct)
data, _ := schema.Encode(map[string]any{"x": int64(5), "y": int64(5)})
var out any
schema.Decode(data, &out)
m := out.(map[string]any)
fmt.Printf("x=%d y=%d\n", m["x"], m["y"])
}
Output: x=10 y=5
Example (SchemaFor) ¶
package main
import (
"fmt"
"log"
"reflect"
"github.com/twmb/avro"
)
func main() {
// Setting GoType lets SchemaFor infer the Avro schema for struct
// fields of that type. Without GoType, SchemaFor doesn't know that
// a Cents field should map to {"type":"long","logicalType":"money"}.
type Cents int64
ct := avro.CustomType{
LogicalType: "money",
AvroType: "long",
GoType: reflect.TypeFor[Cents](),
Encode: func(v any, _ *avro.SchemaNode) (any, error) {
return int64(v.(Cents)), nil
},
Decode: func(v any, _ *avro.SchemaNode) (any, error) {
return Cents(v.(int64)), nil
},
}
type Order struct {
Price Cents `avro:"price"`
}
schema, err := avro.SchemaFor[Order](ct)
if err != nil {
log.Fatal(err)
}
fmt.Println(schema.Root().Fields[0].Type.LogicalType)
}
Output: money
func NewCustomType ¶ added in v1.3.0
func NewCustomType[G, A any](
	logicalType string,
	encode func(G, *SchemaNode) (A, error),
	decode func(A, *SchemaNode) (G, error),
) CustomType
NewCustomType returns a type-safe CustomType for the common case of mapping a custom Go type to/from a primitive Avro type. For example, use this to decode Avro longs into a domain-specific ID type, or to encode a Money type as Avro bytes with a "decimal" logical type.
G is the custom Go type (e.g. Money). A is the Avro-native Go type: int32 for int, int64 for long, float32 for float, float64 for double, string for string, []byte for bytes, bool for boolean.
GoType and AvroType are inferred from the type parameters. If A is not a supported Avro-native type, Parse or SchemaFor returns an error.
Note: AvroType is inferred from A's Go kind, which may not match the Avro schema's type for logical types backed by smaller types. For example, time-millis uses Avro "int" but time.Duration is int64 (which infers "long"). Use int32 as A, or use the CustomType struct directly with an explicit AvroType.
For fixed, records, or types needing extra schema metadata, use the CustomType struct directly.
Example ¶
package main
import (
"fmt"
"log"
"github.com/twmb/avro"
)
type ExMoney struct {
Cents int64
}
func main() {
// NewCustomType is the easiest way to map a custom Go type to/from a
// primitive Avro type. The type parameters wire everything up:
// G = your Go type, A = the Avro-native Go type it maps to.
//
// A is the raw type on the wire:
// int32 → Avro int float32 → Avro float bool → Avro boolean
// int64 → Avro long float64 → Avro double string → Avro string
// []byte → Avro bytes
//
// The first argument is the logicalType to match. Pass "" to match
// all schema nodes of the inferred Avro type.
moneyType := avro.NewCustomType[ExMoney, int64]("money",
func(m ExMoney, _ *avro.SchemaNode) (int64, error) { return m.Cents, nil },
func(c int64, _ *avro.SchemaNode) (ExMoney, error) { return ExMoney{Cents: c}, nil },
)
schema := avro.MustParse(`{
"type": "record", "name": "Order",
"fields": [
{"name": "price", "type": {"type": "long", "logicalType": "money"}}
]
}`, moneyType)
type Order struct {
Price ExMoney `avro:"price"`
}
data, err := schema.Encode(&Order{Price: ExMoney{Cents: 1999}})
if err != nil {
log.Fatal(err)
}
var out Order
if _, err := schema.Decode(data, &out); err != nil {
log.Fatal(err)
}
fmt.Printf("%d cents\n", out.Price.Cents)
}
Output: 1999 cents
type Duration ¶
Duration represents the Avro duration logical type: a 12-byte fixed value containing three little-endian unsigned 32-bit integers representing months, days, and milliseconds.
func DurationFromBytes ¶ added in v1.3.0
DurationFromBytes decodes a 12-byte little-endian fixed value into a Duration. Returns zero Duration if b is shorter than 12 bytes.
type Opt ¶ added in v1.3.0
type Opt interface {
// contains filtered or unexported methods
}
Opt configures encoding and decoding behavior. See each option's documentation for which functions it affects. Inapplicable options are silently ignored.
func LinkedinFloats ¶ added in v1.3.0
func LinkedinFloats() Opt
LinkedinFloats encodes NaN as JSON null and ±Infinity as ±1e999 in Schema.EncodeJSON, matching the linkedin/goavro convention. Without this option, NaN is encoded as the JSON string "NaN" and ±Infinity as "Infinity"/"-Infinity", following the Java Avro convention. Schema.DecodeJSON always accepts both conventions regardless of this option.
func TagLogicalTypes ¶ added in v1.3.0
func TagLogicalTypes() Opt
TagLogicalTypes qualifies union branch names with their logical type (e.g. "long.timestamp-millis" instead of "long"). This applies to Schema.EncodeJSON with TaggedUnions and to Schema.Decode with TaggedUnions. Without this option, branch names use the base Avro type per the specification. This option has no effect without TaggedUnions.
func TaggedUnions ¶ added in v1.3.0
func TaggedUnions() Opt
TaggedUnions wraps non-null union values as {"type_name": value}.
In Schema.EncodeJSON, this produces tagged JSON union output. In Schema.Decode and Schema.DecodeJSON to *any, this wraps union values as map[string]any{branchName: value}.
Without this option, union values are bare in all cases. Schema.DecodeJSON always accepts both tagged and bare input regardless of this option.
type Schema ¶
type Schema struct {
// contains filtered or unexported fields
}
Schema is a compiled Avro schema. Create one with Parse or MustParse, then use Schema.Encode / Schema.Decode to convert between Go values and Avro binary. A Schema is safe for concurrent use.
func MustSchemaFor ¶ added in v1.1.0
MustSchemaFor is like SchemaFor but panics on error.
func Parse ¶
Parse parses an Avro JSON schema string and returns a compiled *Schema. The input can be a primitive name (e.g. `"string"`), a JSON object (record, enum, array, map, fixed), or a JSON array (union). Named types may self-reference. The schema is fully validated: unknown types, duplicate names, invalid defaults, etc. all return errors.
To parse schemas that reference named types from other schemas, use SchemaCache.
func Resolve ¶
Resolve returns a schema that decodes data written with the writer schema and produces values matching the reader schema's layout. The writer schema is what the data was encoded with (typically from an OCF file header or a schema registry); the reader schema is what your application expects now.
Decoding with the returned schema handles field addition (defaults), field removal (skip), renaming (aliases), reordering, and type promotion. Encoding with it uses the reader's format.
If the schemas have identical canonical forms, reader is returned as-is. Otherwise CheckCompatibility is called first and any incompatibility is returned as a *CompatibilityError. See the package-level documentation for a full example.
Note: the argument order is (writer, reader), matching source-then-destination convention and Java's GenericDatumReader. This differs from the Avro spec text and hamba/avro, which put reader first.
Example ¶
package main
import (
"fmt"
"log"
"github.com/twmb/avro"
)
func main() {
// v1 wrote User with just a name.
writerSchema := avro.MustParse(`{
"type": "record", "name": "User",
"fields": [{"name": "name", "type": "string"}]
}`)
// v2 added an email field with a default.
readerSchema := avro.MustParse(`{
"type": "record", "name": "User",
"fields": [
{"name": "name", "type": "string"},
{"name": "email", "type": "string", "default": ""}
]
}`)
resolved, err := avro.Resolve(writerSchema, readerSchema)
if err != nil {
log.Fatal(err)
}
// Encode a v1 record (name only).
v1Data, err := writerSchema.Encode(map[string]any{"name": "Alice"})
if err != nil {
log.Fatal(err)
}
// Decode old data into the new layout; email gets the default.
type User struct {
Name string `avro:"name"`
Email string `avro:"email"`
}
var u User
if _, err := resolved.Decode(v1Data, &u); err != nil {
log.Fatal(err)
}
fmt.Printf("name=%s email=%q\n", u.Name, u.Email)
}
Output: name=Alice email=""
func SchemaFor ¶ added in v1.1.0
SchemaFor infers an Avro schema from the Go type T. T must be a struct.
Field names are taken from the avro struct tag, falling back to the Go field name. The following tag options are supported:
- avro:"-" excludes the field
- avro:",inline" flattens a nested struct's fields into the parent
- avro:",omitzero" is recorded but does not affect the schema
- avro:",alias=old_name" adds a field alias (repeatable)
- avro:",default=value" sets the field's default value (must be last option; scalars only)
- avro:",timestamp-millis" overrides the logical type (also: timestamp-micros, timestamp-nanos, date, time-millis, time-micros)
- avro:",decimal(precision,scale)" sets the decimal logical type
- avro:",uuid" sets the uuid logical type
Type inference:
- bool → boolean
- int8, int16, int32 → int
- int, int64, uint32 → long
- uint8, uint16 → int
- float32 → float
- float64 → double
- string → string
- []byte → bytes
- [N]byte → fixed (size N)
- *T → ["null", T] union
- []T → array
- map[string]T → map
- struct → record (recursive)
- time.Time → long with timestamp-millis (override with tag)
- time.Duration → int with time-millis (override with tag)
- *big.Rat → requires explicit decimal(p,s) tag
- [16]byte with uuid tag → string with uuid logical type
Example ¶
package main
import (
"fmt"
"log"
"time"
"github.com/twmb/avro"
)
func main() {
type Event struct {
ID int64 `avro:"id"`
Name string `avro:"name,default=unnamed"`
Source string `avro:"source,default=web"`
Time time.Time `avro:"ts"`
Meta *string `avro:"meta"` // *T becomes ["null", T] union
}
schema := avro.MustSchemaFor[Event](avro.WithNamespace("com.example"))
// Encode, then decode back.
meta := "test"
data, err := schema.Encode(&Event{
ID: 1,
Name: "click",
Source: "mobile",
Time: time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC),
Meta: &meta,
})
if err != nil {
log.Fatal(err)
}
var out Event
if _, err := schema.Decode(data, &out); err != nil {
log.Fatal(err)
}
fmt.Printf("id=%d name=%s source=%s meta=%s\n", out.ID, out.Name, out.Source, *out.Meta)
// Inspect the inferred schema.
root := schema.Root()
for _, f := range root.Fields {
if f.HasDefault {
fmt.Printf("field %s: default=%v\n", f.Name, f.Default)
}
}
}
Output:

id=1 name=click source=mobile meta=test
field name: default=unnamed
field source: default=web
func (*Schema) AppendEncode ¶
AppendEncode appends the Avro binary encoding of v to dst. See Schema.Decode for the Go-to-Avro type mapping. In addition to the types listed there, encoding also accepts:
- encoding/json.Number for any numeric Avro type (int, long, float, double)
- RFC 3339 strings for timestamp and date logical types
- *big.Rat, big.Rat, float64, encoding/json.Number, and numeric strings for decimal logical types
- encoding.TextAppender, encoding.TextMarshaler, and []byte for string types (and vice versa for encoding.TextUnmarshaler)
Example ¶
package main
import (
"fmt"
"log"
"github.com/twmb/avro"
)
func main() {
schema := avro.MustParse(`"string"`)
// AppendEncode reuses a buffer across calls, avoiding allocation.
var buf []byte
var err error
for _, s := range []string{"hello", "world"} {
buf, err = schema.AppendEncode(buf[:0], s)
if err != nil {
log.Fatal(err)
}
fmt.Printf("encoded %q: %d bytes\n", s, len(buf))
}
}
Output:

encoded "hello": 6 bytes
encoded "world": 6 bytes
func (*Schema) AppendEncodeJSON ¶ added in v1.3.0
AppendEncodeJSON is like Schema.EncodeJSON but appends to dst.
func (*Schema) AppendSingleObject ¶
AppendSingleObject appends a Single Object Encoding of v to dst: 2-byte magic, 8-byte CRC-64-AVRO fingerprint, then the Avro binary payload.
Example ¶
package main
import (
"fmt"
"log"
"github.com/twmb/avro"
)
func main() {
schema := avro.MustParse(`{
"type": "record",
"name": "Event",
"fields": [
{"name": "id", "type": "long"},
{"name": "name", "type": "string"}
]
}`)
type Event struct {
ID int64 `avro:"id"`
Name string `avro:"name"`
}
// Encode: 2-byte magic + 8-byte fingerprint + Avro payload.
data, err := schema.AppendSingleObject(nil, &Event{ID: 1, Name: "click"})
if err != nil {
log.Fatal(err)
}
// Decode.
var e Event
if _, err := schema.DecodeSingleObject(data, &e); err != nil {
log.Fatal(err)
}
fmt.Printf("id=%d name=%s\n", e.ID, e.Name)
}
Output: id=1 name=click
func (*Schema) Canonical ¶
Canonical returns the Parsing Canonical Form of the schema, stripping doc, aliases, defaults, and other non-essential attributes. The result is deterministic and suitable for comparison and fingerprinting.
func (*Schema) Decode ¶
Decode reads Avro binary from src into v and returns the remaining bytes. v must be a non-nil pointer to a type compatible with the schema:
- null: any (always decodes to nil)
- boolean: bool, any
- int, long: int, int8–int64, uint8–uint64, any
- float: float32, float64, any
- double: float64, float32, any
- string: string, []byte, any; also encoding.TextUnmarshaler
- bytes: []byte, string, any
- enum: string, int/uint (ordinal), any
- fixed: [N]byte, []byte, any
- array: slice, any
- map: map[string]T, any
- union: any, *T (for ["null", T] unions), or the matched branch type
- record: struct (matched by field name or `avro` tag), map[string]any, any
When decoding into *any, primitive types become nil, bool, int32, int64, float32, float64, string, []byte, []any, or map[string]any (for records). Logical types decode to their natural Go equivalents:
- date, timestamp-millis/micros/nanos, local-timestamp-*: time.Time (UTC)
- time-millis, time-micros: time.Duration
- decimal: encoding/json.Number
- duration: Duration
To produce JSON from decoded *any data, use Schema.EncodeJSON rather than encoding/json.Marshal. EncodeJSON is schema-aware and converts these types back to their Avro representations (e.g. time.Time to epoch integers, []byte to \uXXXX strings).
func (*Schema) DecodeJSON ¶ added in v1.2.0
DecodeJSON decodes Avro JSON from src into v. It unwraps union wrappers, converts bytes/fixed strings, and coerces numeric types to match the schema. When v is *any, the result is returned directly. For typed targets (structs, etc.), the value is round-tripped through binary encode/decode.
DecodeJSON also accepts the non-standard union branch naming used by linkedin/goavro (e.g. "long.timestamp-millis" instead of "long").
DecodeJSON accepts all input formats (tagged and bare unions, Java and goavro NaN/Infinity conventions). Pass TaggedUnions to wrap decoded union values when the target is *any.
Example ¶
package main
import (
"fmt"
"log"
"github.com/twmb/avro"
)
func main() {
schema := avro.MustParse(`{
"type": "record",
"name": "User",
"fields": [
{"name": "name", "type": "string"},
{"name": "email", "type": ["null", "string"]}
]
}`)
type User struct {
Name string `avro:"name"`
Email *string `avro:"email"`
}
// DecodeJSON accepts both bare and tagged union formats.
var u1, u2 User
if err := schema.DecodeJSON([]byte(`{"name":"Alice","email":"[email protected]"}`), &u1); err != nil {
log.Fatal(err)
}
if err := schema.DecodeJSON([]byte(`{"name":"Bob","email":{"string":"[email protected]"}}`), &u2); err != nil {
log.Fatal(err)
}
fmt.Printf("%s: %s\n", u1.Name, *u1.Email)
fmt.Printf("%s: %s\n", u2.Name, *u2.Email)
}
Output:

Alice: [email protected]
Bob: [email protected]
func (*Schema) DecodeSingleObject ¶
DecodeSingleObject decodes a Single Object Encoding message into v after verifying the magic and fingerprint match this schema.
func (*Schema) Encode ¶
Encode encodes v as Avro binary. It is shorthand for AppendEncode(nil, v).
Example (TextMarshaler) ¶
package main
import (
"fmt"
"log"
"net"
"github.com/twmb/avro"
)
func main() {
// Types implementing encoding.TextMarshaler are encoded as Avro
// strings, and encoding.TextUnmarshaler types decode from them.
schema := avro.MustParse(`{
"type": "record",
"name": "Server",
"fields": [
{"name": "name", "type": "string"},
{"name": "ip", "type": "string"}
]
}`)
type Server struct {
Name string `avro:"name"`
IP net.IP `avro:"ip"`
}
data, err := schema.Encode(&Server{
Name: "web-1",
IP: net.IPv4(192, 168, 1, 1),
})
if err != nil {
log.Fatal(err)
}
var out Server
if _, err := schema.Decode(data, &out); err != nil {
log.Fatal(err)
}
fmt.Printf("%s: %s\n", out.Name, out.IP)
}
Output: web-1: 192.168.1.1
func (*Schema) EncodeJSON ¶ added in v1.2.0
EncodeJSON encodes v as JSON using the schema for type-aware encoding. By default, union values are written as bare JSON values and bytes/fixed fields use \uXXXX escapes for non-ASCII bytes. Options can modify the output format; see Opt for details.
NaN and Infinity float values are encoded as JSON strings "NaN", "Infinity", and "-Infinity" by default (Java Avro convention), or as null/±1e999 with LinkedinFloats. Standard encoding/json.Marshal cannot represent these values; use EncodeJSON instead.
EncodeJSON accepts the same Go types as Schema.Encode. Map key order in the output is non-deterministic, as with encoding/json.Marshal.
Example ¶
package main
import (
"fmt"
"log"
"github.com/twmb/avro"
)
func main() {
schema := avro.MustParse(`{
"type": "record",
"name": "User",
"fields": [
{"name": "name", "type": "string"},
{"name": "email", "type": ["null", "string"]}
]
}`)
type User struct {
Name string `avro:"name"`
Email *string `avro:"email"`
}
email := "[email protected]"
u := User{Name: "Alice", Email: &email}
// Default: bare union values.
bare, err := schema.EncodeJSON(&u)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(bare))
// TaggedUnions: wrapped as {"type": value}.
tagged, err := schema.EncodeJSON(&u, avro.TaggedUnions())
if err != nil {
log.Fatal(err)
}
fmt.Println(string(tagged))
}
Output:

{"name":"Alice","email":"[email protected]"}
{"name":"Alice","email":{"string":"[email protected]"}}
func (*Schema) Fingerprint ¶
Fingerprint hashes the schema's canonical form with h. Use NewRabin for CRC-64-AVRO or crypto/sha256 for cross-language compatibility.
The result is big-endian per hash.Hash.Sum. Single Object Encoding uses little-endian fingerprints; use Schema.DecodeSingleObject or SingleObjectFingerprint for that format.
func (*Schema) Root ¶ added in v1.2.0
func (s *Schema) Root() SchemaNode
Root returns the SchemaNode representation of a parsed schema by re-parsing the original schema JSON. This preserves all metadata including doc strings, namespaces, and custom properties.
Root re-parses the JSON on each call. Cache the result if you need to access it repeatedly (e.g. in a per-message processing loop).
type SchemaCache ¶
type SchemaCache struct {
// contains filtered or unexported fields
}
SchemaCache accumulates named types across multiple SchemaCache.Parse calls, allowing schemas to reference types defined in previously parsed schemas. This is useful for Schema Registry integrations where schemas have references to other schemas.
Schemas must be parsed in dependency order: referenced types must be parsed before the schemas that reference them.
Parsing the same schema string multiple times is allowed and returns the previously parsed result. This handles diamond dependencies in schema reference graphs (e.g. A→B→D, A→C→D) without requiring callers to track which schemas have already been parsed. Deduplication normalizes the JSON (whitespace and key order) but not the Avro canonical form: schemas that differ only in formatting are deduplicated, while schemas that differ in non-canonical fields such as doc or aliases are treated as distinct and return a duplicate type error.
The returned *Schema from each Parse call is fully resolved and independent of the cache — it can be used for Schema.Encode and Schema.Decode without the cache.
The zero value is ready to use. A SchemaCache is safe for concurrent use.
Example ¶
package main
import (
"fmt"
"log"
"github.com/twmb/avro"
)
func main() {
cache := new(avro.SchemaCache)
// Parse the Address type first.
if _, err := cache.Parse(`{
"type": "record",
"name": "Address",
"fields": [
{"name": "street", "type": "string"},
{"name": "city", "type": "string"}
]
}`); err != nil {
log.Fatal(err)
}
// User references Address by name.
schema, err := cache.Parse(`{
"type": "record",
"name": "User",
"fields": [
{"name": "name", "type": "string"},
{"name": "address", "type": "Address"}
]
}`)
if err != nil {
log.Fatal(err)
}
type Address struct {
Street string `avro:"street"`
City string `avro:"city"`
}
type User struct {
Name string `avro:"name"`
Address Address `avro:"address"`
}
data, err := schema.Encode(&User{
Name: "Alice",
Address: Address{Street: "123 Main St", City: "Springfield"},
})
if err != nil {
log.Fatal(err)
}
var u User
if _, err := schema.Decode(data, &u); err != nil {
log.Fatal(err)
}
fmt.Printf("%s lives at %s, %s\n", u.Name, u.Address.Street, u.Address.City)
}
Output: Alice lives at 123 Main St, Springfield
func (*SchemaCache) Parse ¶
func (c *SchemaCache) Parse(schema string, opts ...SchemaOpt) (*Schema, error)
Parse parses a schema string, registering any named types (records, enums, fixed) in the cache. Named types from previous Parse calls are available for reference resolution. On failure, the cache is not modified.
type SchemaField ¶ added in v1.2.0
type SchemaField struct {
Name string // field name
Type SchemaNode // field schema
Default any // default value (only meaningful when HasDefault is true)
HasDefault bool // true if a default value is defined in the schema
Aliases []string // field aliases for schema evolution
Order string // sort order: "ascending" (default), "descending", or "ignore"
Doc string // documentation string
Props map[string]any // custom properties (any JSON value)
}
SchemaField represents a field in an Avro record schema.
type SchemaNode ¶ added in v1.2.0
type SchemaNode struct {
Type string // Avro type or named type reference
LogicalType string // e.g. date, timestamp-millis, decimal, uuid; empty if none
Name string // name for record, enum, fixed
Namespace string // namespace for named types
Aliases []string // alternate names for named types (record, enum, fixed)
Doc string // documentation string
Fields []SchemaField // record fields
Items *SchemaNode // array element schema
Values *SchemaNode // map value schema
Branches []SchemaNode // union member schemas
Symbols []string // enum symbols
Size int // fixed byte size
EnumDefault string // default symbol for enum schema evolution
HasEnumDefault bool // true if an enum default is defined
Precision int // decimal precision
Scale int // decimal scale
Props map[string]any // custom properties (any JSON value)
}
SchemaNode is a read-write representation of an Avro schema. It can be obtained from a parsed schema via Schema.Root, or constructed directly and converted to a *Schema via the SchemaNode.Schema method.
The Type field determines which other fields are relevant:
- Primitives (null, boolean, int, long, float, double, string, bytes): LogicalType, Precision, Scale, and Props are optional. Other fields are ignored.
- record/error: Name and Fields are required. Namespace, Doc, and Props are optional.
- enum: Name and Symbols are required. Namespace, Doc, and Props are optional.
- array: Items is required.
- map: Values is required.
- fixed: Name and Size are required. LogicalType, Precision, Scale, Namespace, and Props are optional.
- union: Branches lists the member schemas.
A named type (record, enum, fixed) that has already been defined elsewhere in the schema can be referenced by setting Type to its full name (e.g. com.example.Address) with no other fields.
type SchemaOpt ¶ added in v1.1.0
type SchemaOpt interface {
// contains filtered or unexported methods
}
SchemaOpt configures schema construction via Parse, SchemaCache.Parse, or SchemaFor. Inapplicable options are silently ignored.
func WithCustomType ¶ added in v1.3.0
func WithCustomType(ct CustomType) SchemaOpt
WithCustomType registers a custom type conversion for use with Parse, SchemaCache.Parse, or SchemaFor. CustomType and NewCustomType both satisfy SchemaOpt directly, so this wrapper is optional — it exists for discoverability.
func WithLaxNames ¶
WithLaxNames relaxes name validation in Parse and SchemaCache.Parse, overriding the default requirement that names match the Avro strict name regex [A-Za-z_][A-Za-z0-9_]*. If fn is nil, only non-empty names are required. If fn is non-nil, it is called for each name component and should return an error for invalid names. Dot-separated fullnames are split before calling fn. Ignored by SchemaFor.
type SemanticError ¶
type SemanticError struct {
// GoType is the Go type involved, if applicable.
GoType reflect.Type
// AvroType is the Avro schema type (e.g. "int", "record", "boolean").
AvroType string
// Field is the dotted path to the record field (e.g. "address.zip"),
// if the error occurred within a record.
Field string
// Err is the underlying error.
Err error
}
SemanticError indicates a Go type is incompatible with an Avro schema type during encoding or decoding.
func (*SemanticError) Error ¶
func (e *SemanticError) Error() string
func (*SemanticError) Unwrap ¶
func (e *SemanticError) Unwrap() error
type ShortBufferError ¶
type ShortBufferError struct {
// Type is what was being read (e.g. "boolean", "string", "uint32").
Type string
// Need is the number of bytes required (0 if unknown).
Need int
// Have is the number of bytes available.
Have int
}
ShortBufferError indicates the input buffer is too short for the value being decoded.
func (*ShortBufferError) Error ¶
func (e *ShortBufferError) Error() string