jsontype

package module
v0.1.1
Published: Jan 25, 2026 License: MIT Imports: 15 Imported by: 0

README

JSONType

JSONType is a CLI tool that analyzes one or more JSON files, infers their data types and structure, merges nested objects and arrays, and produces a clear, searchable description of the resulting JSON schema and type layout.

It is designed for exploration, debugging, reverse‑engineering unknown JSON, and documenting real‑world data formats that don't come with schemas.

Installation

go install github.com/4nd3r5on/jsontype/cmd/jsontype@latest

The binary will be installed as jsontype in your $GOBIN (or $GOPATH/bin if GOBIN is unset).

Basic Usage

Analyze a single file

jsontype ./parseme.json
jsontype parseme.json

Pipe input from stdin

cat file.json | jsontype
curl https://api.example.com/data | jsontype

Analyze multiple files

Multiple files can be provided as arguments and will be merged into a single inferred structure:

jsontype parseme1.json parseme2.json parseme3.json
jsontype "parse me.json" parseme1.json parseme2.json

Control output and logging

# Enable verbose diagnostics
jsontype -log-level debug parseme.json

# Write output to a file
jsontype -out schema.txt parseme.json

CLI Flags

-ignore-objects string
    Space-separated JSON paths to ignore
    Example: 'metadata debug.info'

-parse-objects string
    Space-separated JSON paths to explicitly parse
    Example: 'users data.items'

-max-depth int
    Maximum depth to parse (0 = unlimited)

-no-string-analysis
    Disable extended string type detection (UUID, email, IP addresses, etc.)

-log-level string
    debug | info | warn | error (default: "info")

-out string
    Output file (default: stdout)

Type Detection

Basic Types

JSONType detects the following basic JSON types:

  • unknown - Unable to determine type
  • null - JSON null value (or possibly missing)
  • string - Text string
  • bool - Boolean (true/false)
  • int32 - 32-bit integer
  • int64 - 64-bit integer
  • float64 - Floating point number

Container Types

  • object - JSON object with string keys
  • object_int - Object with integer keys (map-like structures)
  • array - JSON array
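
To make the int32 / int64 / float64 distinction concrete, here is a minimal sketch of how a JSON number literal could be sorted into the narrowest of those three types. The helper classifyNumber is hypothetical and is not part of jsontype's API; the rule it applies (integers that fit in 32 bits are int32, other integers are int64, everything else is float64) is an assumption about how such a classification might work, not a description of the package's internals.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// classifyNumber is a hypothetical helper (not part of jsontype's API).
// It returns the narrowest of "int32", "int64", and "float64" that can
// hold the JSON number literal n, or "unknown" if n is not parseable.
func classifyNumber(n json.Number) string {
	// Int64 fails for literals with a fraction or exponent, such as "1.5".
	if i, err := n.Int64(); err == nil {
		if i >= math.MinInt32 && i <= math.MaxInt32 {
			return "int32"
		}
		return "int64"
	}
	if _, err := n.Float64(); err == nil {
		return "float64"
	}
	return "unknown"
}

func main() {
	for _, lit := range []json.Number{"1", "3000000000", "1.5"} {
		fmt.Println(lit, classifyNumber(lit))
	}
}
```

Under this rule a literal such as 1.0 is reported as float64 because it is judged by its spelling, not its value; a real tool may choose differently.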

Extended String Types

By default, JSONType performs analysis of string values to detect common patterns. This can be disabled with the -no-string-analysis flag. String type detection is heuristic-based and may not be 100% accurate for all edge cases.

Common Formats

  • string-uuid - UUID identifiers
    Example: f3a9c2e7-6b4d-4f81-9a6c-2d8e5b71c0fa

  • string-filepath-windows - Windows file paths
    Example: C:/user/documents/file.txt

  • string-email - Email addresses
    Example: [email protected]

  • string-phone - Phone numbers
    Example: +380661153394

Web and Networking

  • string-link - URLs and web links
    Example: https://google.com or google.com/search

  • string-domain - Domain names
    Example: google.com

  • string-ipv4 - IPv4 addresses
    Example: 127.0.0.1

  • string-ipv4-with-mask - IPv4 with CIDR notation
    Example: 127.0.0.1/32

  • string-ipv6 - IPv6 addresses
    Example: 2a03:2880:21ff:001f:face:b00c:dead:beef

  • string-ipv4-port-pair - IPv4 address with port
    Example: 127.0.0.1:443

  • string-ipv6-port-pair - IPv6 address with port
    Example: [2a03:2880:21ff:1f:face:b00c:dead:beef]:43792

  • string-mac - MAC addresses
    Example: 9e:3b:74:a1:5f:c2
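
As an illustration of how the networking formats above can be told apart, the following sketch leans on Go's standard net and net/netip parsers. The function classifyNetwork is hypothetical and the order of checks is an assumption; jsontype's own detection lives in its source and may use different rules (regular expressions, for instance).

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
)

// classifyNetwork is a hypothetical helper (not jsontype's implementation).
// It maps a string to one of the networking type names listed above,
// falling back to plain "string" when nothing matches.
func classifyNetwork(s string) string {
	// Address:port pairs are tried first because they contain a full address.
	if ap, err := netip.ParseAddrPort(s); err == nil {
		if ap.Addr().Is4() {
			return "string-ipv4-port-pair"
		}
		return "string-ipv6-port-pair"
	}
	// CIDR notation such as 127.0.0.1/32.
	if p, err := netip.ParsePrefix(s); err == nil && p.Addr().Is4() {
		return "string-ipv4-with-mask"
	}
	if a, err := netip.ParseAddr(s); err == nil {
		if a.Is4() {
			return "string-ipv4"
		}
		return "string-ipv6"
	}
	if _, err := net.ParseMAC(s); err == nil {
		return "string-mac"
	}
	return "string"
}

func main() {
	for _, s := range []string{
		"127.0.0.1",
		"127.0.0.1/32",
		"127.0.0.1:443",
		"2a03:2880:21ff:001f:face:b00c:dead:beef",
		"[2a03:2880:21ff:1f:face:b00c:dead:beef]:43792",
		"9e:3b:74:a1:5f:c2",
	} {
		fmt.Println(s, "->", classifyNetwork(s))
	}
}
```

Using strict parsers keeps false positives low, at the cost of rejecting loosely formatted values that a more permissive heuristic might accept.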

Encoding Formats

  • string-hex - Hexadecimal strings
    Example: a9f3c2e7b4d81f6a

  • string-b64-std - Base64 standard encoded data
    Example: wqFIb2xhL+S4lueVjCtHbyE=

  • string-b64-url - Base64 URL encoded data
    Example: wqFIb2xhL-S4lueVjCtHbyE=

  • string-b64-raw-std - Base64 raw standard encoded data
    Example: wqFIb2xhL+S4lueVjCtHbyE

  • string-b64-raw-url - Base64 raw URL encoded data
    Example: wqFIb2xhL-S4lueVjCtHbyE
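
The four Base64 variants differ only in alphabet (+ and / versus - and _) and in whether = padding is required. The sketch below shows one way to see which variants accept a given string by trying each standard-library encoding in turn. The function detectBase64Variants and the variants table are hypothetical; they are loosely modelled on the idea of pairing a type name with an encoding and are not the package's actual detection code.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// variant pairs a type name from the list above with a stdlib encoding.
// This table is an illustrative assumption, not jsontype's own data.
type variant struct {
	name string
	enc  *base64.Encoding
}

var variants = []variant{
	{"string-b64-std", base64.StdEncoding},
	{"string-b64-url", base64.URLEncoding},
	{"string-b64-raw-std", base64.RawStdEncoding},
	{"string-b64-raw-url", base64.RawURLEncoding},
}

// detectBase64Variants returns every variant whose decoder accepts s.
// Several variants can match when s avoids '+', '/', '-', '_' and
// needs no padding.
func detectBase64Variants(s string) []string {
	var matches []string
	for _, v := range variants {
		if _, err := v.enc.DecodeString(s); err == nil {
			matches = append(matches, v.name)
		}
	}
	return matches
}

func main() {
	for _, s := range []string{
		"wqFIb2xhL+S4lueVjCtHbyE=",
		"wqFIb2xhL-S4lueVjCtHbyE",
		"aGVsbG8h",
	} {
		fmt.Println(s, detectBase64Variants(s))
	}
}
```

Because short alphanumeric strings decode under several variants at once, a decode check alone cannot pick a single type; any real detector needs an extra tie-breaking rule.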

JSON Path Format

JSONType uses a simple, readable JSON path syntax to refer to specific locations in a document.

Basic form

$.obj1.obj2.array[0].name

  • . separates object keys
  • [0] refers to a specific array index

Wildcards

$.obj1.obj2.array[].id

[] means any element of:

  • an array, or
  • an object with integer keys

This is especially useful for describing homogeneous arrays or map-like objects.

Root prefix

$

  • $. is an optional root prefix
  • Paths may be written with or without it

These are equivalent:

users[].id
$.users[].id
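
To show how the path syntax above breaks down into segments, here is a minimal sketch of a splitter. The function splitPath is hypothetical; the package exposes its own StringToPath, whose exact behavior is not documented here. Representing the [] wildcard as an empty segment is an assumption made for this example, and keys that themselves contain dots or brackets are not handled.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPath is a hypothetical illustration of the path syntax, not the
// package's StringToPath. It drops the optional "$" root prefix, splits
// on ".", and turns each "[...]" suffix into its own segment, so "[0]"
// becomes "0" and the wildcard "[]" becomes "".
func splitPath(s string) []string {
	s = strings.TrimPrefix(s, "$")
	s = strings.TrimPrefix(s, ".")
	var out []string
	for _, part := range strings.Split(s, ".") {
		if part == "" {
			continue
		}
		open := strings.IndexByte(part, '[')
		if open < 0 {
			out = append(out, part)
			continue
		}
		if open > 0 {
			out = append(out, part[:open])
		}
		rest := part[open:]
		for strings.HasPrefix(rest, "[") {
			end := strings.IndexByte(rest, ']')
			if end < 0 {
				break // unterminated bracket: the remainder is ignored in this sketch
			}
			out = append(out, rest[1:end])
			rest = rest[end+1:]
		}
	}
	return out
}

func main() {
	fmt.Printf("%q\n", splitPath("$.obj1.obj2.array[0].name"))
	fmt.Printf("%q\n", splitPath("users[].id"))
	fmt.Printf("%q\n", splitPath("$.users[].id"))
}
```

With this splitter the last two calls produce the same segments, which matches the statement that the root prefix is optional.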

Selective Parsing

NOTE: Paths can be separated by commas, spaces, or tabs. You can use quotes, and backslashes to escape characters.

Parse only specific subtrees

jsontype -parse-objects "users events.items" data.json

Only the listed paths will be analyzed; everything else is skipped.

Ignore specific paths

jsontype -ignore-objects "metadata debug.info" data.json

Ignored paths are completely excluded from inference and merging.

Limit parsing depth

jsontype -max-depth 5 data.json

Stops analyzing nested structures beyond the specified depth.

Typical Use Cases

  • Reverse‑engineering undocumented APIs - Discover the structure of API responses without documentation
  • Exploring logs or event streams - Understand the shape of log data and event payloads
  • Debugging unexpected JSON shape changes - Identify schema drift between versions
  • Creating documentation for real-world JSON formats - Generate schema documentation from examples
  • Validating assumptions before writing parsers - Verify data types before implementing deserialization
  • Analyzing network traffic - Detect IP addresses, UUIDs, and other structured data in JSON payloads

Documentation

Overview

Package jsontype provides functionality for inferring the types of values inside a JSON file.

Variables

var Base64Variants = []Base64Variant{
	{
		Type:     TypeBase64Std,
		Regexp:   reBase64Std,
		Encoding: base64.StdEncoding,
	},
	{
		Type:     TypeBase64URL,
		Regexp:   reBase64URL,
		Encoding: base64.URLEncoding,
	},
	{
		Type:     TypeBase64RawStd,
		Regexp:   reBase64RawStd,
		Encoding: base64.RawStdEncoding,
	},
	{
		Type:     TypeBase64RawURL,
		Regexp:   reBase64RawURL,
		Encoding: base64.RawURLEncoding,
	},
}

Functions

func ComparePaths

func ComparePaths(path1, path2 []string) string

ComparePaths shows the difference between two paths

func DiagnoseFieldInfo

func DiagnoseFieldInfo(field *FieldInfo, name string)

DiagnoseFieldInfo prints detailed diagnostic info about a FieldInfo tree

func FieldInfoToString

func FieldInfoToString(field *FieldInfo, indent string) string

FieldInfoToString converts a FieldInfo tree to a readable string

func IsContainerType

func IsContainerType(t DetectedType) bool

func IsDelim

func IsDelim(token json.Token, expectDelim json.Delim) bool

func IsMixedContainer

func IsMixedContainer(field *FieldInfo) bool

IsMixedContainer returns true if the container's child elements have different types.

func MergerToString

func MergerToString(m *Merger, indent string, isLast bool) string

MergerToString converts a Merger to a human-readable tree

func PathToString

func PathToString(path []string) string

func PlanToString

func PlanToString(plan *MergePlan, indent string, isLast bool) string

PlanToString converts a MergePlan to a human-readable ASCII tree. This makes future regressions scream immediately.

func PrintMergerTree

func PrintMergerTree(m *Merger, prefix string, w io.Writer)

func SnapshotMerger

func SnapshotMerger(name string, m *Merger) string

SnapshotMerger creates a golden test snapshot of a merger

func SnapshotPlan

func SnapshotPlan(name string, field *FieldInfo, logger *slog.Logger) string

SnapshotPlan creates a golden test snapshot of a plan

func StringToPath

func StringToPath(s string) []string

func TypesToString

func TypesToString(types []DetectedType) []string

Types

type ArrayStrategy

type ArrayStrategy int
const (
	ArrayCollapse    ArrayStrategy = iota // use ""
	ArrayKeepIndices                      // use "0", "1", ...
)

type Base64Encoding added in v0.1.1

type Base64Encoding interface {
	DecodeString(s string) ([]byte, error)
	EncodeToString(src []byte) string
}

type Base64Variant added in v0.1.1

type Base64Variant struct {
	Type     DetectedType
	Regexp   *regexp.Regexp
	Encoding Base64Encoding
}

type DefaultStream

type DefaultStream struct {
	*json.Decoder
}

func NewJSONStream

func NewJSONStream(r io.Reader) *DefaultStream

func (*DefaultStream) SkipValue

func (s *DefaultStream) SkipValue() error

type DetectedType

type DetectedType string
const (
	TypeUnknown DetectedType = "unknown"
	TypeNull    DetectedType = "null"
	TypeString  DetectedType = "string"
	TypeBool    DetectedType = "bool"
	TypeInt32   DetectedType = "int32"
	TypeInt64   DetectedType = "int64"
	TypeFloat64 DetectedType = "float64"
	// Containers
	TypeObj    DetectedType = "object"
	TypeObjInt DetectedType = "object_int"
	TypeArray  DetectedType = "array"
)
const (
	// Common
	TypeUUID            DetectedType = "string-uuid"             // f3a9c2e7-6b4d-4f81-9a6c-2d8e5b71c0fa
	TypeFilepathWindows DetectedType = "string-filepath-windows" // C:/user/file
	TypeEmail           DetectedType = "string-email"            // [email protected]
	TypePhone           DetectedType = "string-phone"            // +380661153394

	// Web
	TypeLink   DetectedType = "string-link"   // https://google.com or google.com/search
	TypeDomain DetectedType = "string-domain" // google.com

	// Encoding
	TypeHEX          DetectedType = "string-hex"         // "a9f3c2e7b4d81f6a"
	TypeBase64Std    DetectedType = "string-b64-std"     // wqFIb2xhL+S4lueVjCtHbyE=
	TypeBase64URL    DetectedType = "string-b64-url"     // wqFIb2xhL-S4lueVjCtHbyE=
	TypeBase64RawStd DetectedType = "string-b64-raw-std" // wqFIb2xhL+S4lueVjCtHbyE
	TypeBase64RawURL DetectedType = "string-b64-raw-url" // wqFIb2xhL-S4lueVjCtHbyE

	// Networking
	TypeIPv4         DetectedType = "string-ipv4"           // 127.0.0.1
	TypeIPv4WithMask DetectedType = "string-ipv4-with-mask" // 127.0.0.1/32
	TypeIPv6         DetectedType = "string-ipv6"           // 2a03:2880:21ff:001f:face:b00c:dead:beef
	TypeIPv4PortPair DetectedType = "string-ipv4-port-pair" // 127.0.0.1:443
	TypeIPv6PortPair DetectedType = "string-ipv6-port-pair" // [2a03:2880:21ff:1f:face:b00c:dead:beef]:43792
	TypeMAC          DetectedType = "string-mac"            // 9e:3b:74:a1:5f:c2
)

Extended string detection/analysis. Detection code is at ./detect_str_type.go

func DetectBase64 added in v0.1.1

func DetectBase64(s string) (DetectedType, bool)

func DetectHex added in v0.1.1

func DetectHex(s string) (DetectedType, bool)

func DetectStrType added in v0.0.5

func DetectStrType(s string) DetectedType

DetectStrType detects the type of a string value

func GetChilderTypes

func GetChilderTypes(field *FieldInfo) (types []DetectedType)

func InferMergedContainerType

func InferMergedContainerType(children []*FieldInfo, key string) DetectedType

type FieldInfo

type FieldInfo struct {
	Parent *FieldInfo
	// Full path like ["obj1", "obj2", "field"]
	Path []string
	// If container -- will contain one of the following types: TypeObj, TypeObjInt, TypeArray
	Type DetectedType

	// Container-specific info
	Children []*FieldInfo // Ordered children for objects/arrays
	// Key: "0", "1", "2" for arrays; "1", "5", "12" for obj_int
	ChildrenMap map[string]*FieldInfo // Quick lookup by name/key
}

FieldInfo represents a single field/path in the JSON structure. It serves only for a run through a single file (since)

func ParseStream

func ParseStream(
	s Stream,
	parseObjects, ignoreObjects [][]string,
	maxDepth int,
	noStringAnalysis bool,
	logger *slog.Logger,
) (root *FieldInfo, err error)

type MergePlan

type MergePlan struct {
	Kind PlanKind

	// For arrays
	ArrayStrategy ArrayStrategy
	Elem          *MergePlan

	// For objects
	Fields map[string]*MergePlan
}

MergePlan describes the shape of the merged result

func PlanShape

func PlanShape(field *FieldInfo, logger *slog.Logger) *MergePlan

PlanShape is a logging wrapper for planShape

type Merger

type Merger struct {
	// full path of this node (immutable after creation)
	Path []string
	// map( label : set(type) )
	LabeledTypesMap map[string]map[DetectedType]struct{}
	// how much times each type was met
	TypesMap map[DetectedType]struct{}
	// children keyed by the immediate child key (for arrays use "0", "1", etc as keys).
	// if type isn't mixed (for some labels) -- all data is written under the same key ""
	ChildrenMap map[string]*Merger
	// tracks the order in which children were added
	ChildrenKeys []string
}

Merger represents aggregated information for a single JSON path. Path is the full path (slice of keys). Children map contains sub-path nodes.
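
The map-of-empty-structs fields in Merger are Go's usual way of modelling a set. The sketch below shows that idiom in isolation: a node that records which types were seen overall and per label. The node type and its methods are hypothetical stand-ins, not the Merger implementation, and treating a label as the name of an input file is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// typeSet is a set of type names, stored as map keys with empty values.
type typeSet map[string]struct{}

// node is a hypothetical, simplified stand-in for one aggregated path.
type node struct {
	labeled map[string]typeSet // label -> types seen under that label
	all     typeSet            // every type seen at this path
}

func newNode() *node {
	return &node{labeled: map[string]typeSet{}, all: typeSet{}}
}

// addTypes records types under a label; duplicates collapse naturally
// because writing an existing map key is a no-op for a set.
func (n *node) addTypes(label string, types ...string) {
	if n.labeled[label] == nil {
		n.labeled[label] = typeSet{}
	}
	for _, t := range types {
		n.labeled[label][t] = struct{}{}
		n.all[t] = struct{}{}
	}
}

// sortedTypes returns all seen types in a stable order for printing.
func (n *node) sortedTypes() []string {
	out := make([]string, 0, len(n.all))
	for t := range n.all {
		out = append(out, t)
	}
	sort.Strings(out)
	return out
}

func main() {
	n := newNode()
	n.addTypes("a.json", "int32")
	n.addTypes("b.json", "string", "int32")
	fmt.Println(n.sortedTypes())
}
```

A path whose set ends up holding more than one type is exactly the kind of mixed field that a merged report needs to call out.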

func ExecuteMerge

func ExecuteMerge(
	plan *MergePlan,
	label string,
	fields []*FieldInfo,
	logger *slog.Logger,
) *Merger

ExecuteMerge executes a merge plan on the given field infos. parentPath is the path we're building (with wildcards for collapsed arrays). fields are the FieldInfos at this level.

func MergeFieldInfo

func MergeFieldInfo(m *Merger, label string, field *FieldInfo, logger *slog.Logger) *Merger

MergeFieldInfo is the main entry point: it creates a plan and executes it.

func NewMerger

func NewMerger(path []string) *Merger

func (*Merger) AddChild

func (m *Merger) AddChild(key string, label string, child *Merger) *Merger

func (*Merger) AddTypes

func (m *Merger) AddTypes(label string, types ...DetectedType)

type ObjectMergeStrategy

type ObjectMergeStrategy struct {
	// contains filtered or unexported fields
}

func NewObjectMergeStrategy

func NewObjectMergeStrategy(logger *slog.Logger) *ObjectMergeStrategy

func (*ObjectMergeStrategy) Merge

func (s *ObjectMergeStrategy) Merge(
	path []string,
	label string,
	plan *MergePlan,
	fields []*FieldInfo,
) *Merger

Merge handles all object merging logic

type PlanKind

type PlanKind int
const (
	PlanPrimitive PlanKind = iota
	PlanArray
	PlanObject
)

type Stream

type Stream interface {
	// Token returns the next token from a JSON stream.
	// A Token holds a value of one of these types:
	//
	//   - [Delim], for the four JSON delimiters [ ] { }
	//   - bool, for JSON booleans
	//   - float64, for JSON numbers
	//   - [Number], for JSON numbers
	//   - string, for JSON string literals
	//   - nil, for JSON null
	//
	// At the end of the stream returns EOF
	Token() (json.Token, error)
	More() bool
	SkipValue() error
}
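
To illustrate how a type can satisfy this interface, here is a minimal sketch that embeds *json.Decoder (which already provides Token and More) and adds a SkipValue method that consumes one complete value by tracking delimiter depth. The tokenStream type and the topLevelKeys helper are hypothetical; they are not the package's DefaultStream, whose implementation may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// tokenStream is a hypothetical Stream-like type for illustration.
// Token and More come from the embedded decoder.
type tokenStream struct {
	*json.Decoder
}

// SkipValue consumes exactly one JSON value, including any nested
// objects and arrays, by counting opening and closing delimiters.
func (s *tokenStream) SkipValue() error {
	depth := 0
	for {
		tok, err := s.Token()
		if err != nil {
			return err
		}
		if d, ok := tok.(json.Delim); ok {
			switch d {
			case '{', '[':
				depth++
			case '}', ']':
				depth--
			}
		}
		if depth <= 0 {
			return nil
		}
	}
}

// topLevelKeys lists the keys of a top-level object, skipping every value.
func topLevelKeys(doc string) ([]string, error) {
	s := &tokenStream{Decoder: json.NewDecoder(strings.NewReader(doc))}
	if _, err := s.Token(); err != nil { // consume the opening '{'
		return nil, err
	}
	var keys []string
	for s.More() {
		tok, err := s.Token()
		if err != nil {
			return nil, err
		}
		if key, ok := tok.(string); ok {
			keys = append(keys, key)
		}
		if err := s.SkipValue(); err != nil {
			return nil, err
		}
	}
	return keys, nil
}

func main() {
	keys, err := topLevelKeys(`{"a": {"x": [1, 2, {"y": 3}]}, "b": 7, "c": "s"}`)
	fmt.Println(keys, err)
}
```

Skipping at the token level means an ignored subtree is never decoded into Go values, which is the property that makes path filters and depth limits cheap on large inputs.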

Directories

Path            Synopsis
cmd/jsontype    command
