csv

package module

v3.5.2 (latest) Published: Dec 15, 2025 License: MIT Imports: 12 Imported by: 0

Repository

github.com/josephcopenhaver/csv-go

README ¶

csv-go

This package is a highly flexible and performant single-threaded, UTF-8 friendly CSV stream reader and writer. It opts for strictness, with nearly all options off by default. Reader and Writer constructors use functional options to maximize flexibility while validating configuration at initialization rather than at runtime. This keeps exported types behavior-oriented (methods over public fields), avoids leaking rigid internal implementation details, and improves coupling and cohesion. Once created, parsing and writing strategies are immutable, which lets maintainers evolve the implementation substantially over time while keeping interface contracts stable. It has been battle-tested in production for both correctness and speed, so feel free to use it in any way you like.

Both the reader and writer are more performant than the standard Go encoding/csv package when the two are compared in an apples-to-apples configuration. The writer also has several optimizations for non-string type serialization via the fluent API returned by csv.Writer.NewRecord() and via FieldWriters(). I expect mileage here to vary over time. My primary goal with this library was to solve my own edge-case problems, such as suspect encodings and loose rules, and to offer something back that is more aligned with others who think as I do about reducing allocations and GC pauses and increasing efficiency.

package main

// this is a toy example that reads a csv file and writes to another

import (
	"os"

	"github.com/josephcopenhaver/csv-go/v3"
)

func main() {
	r, err := os.Open("input.csv")
	if err != nil {
		panic(err)
	}
	defer r.Close()

	cr, err := csv.NewReader(
		csv.ReaderOpts().Reader(r),
		// by default quotes have no meaning
		// so must be specified to match RFC 4180
		// csv.ReaderOpts().Quote('"'),
	)
	if err != nil {
		panic(err)
	}
	defer cr.Close()

	w, err := os.Create("output.csv")
	if err != nil {
		panic(err)
	}
	defer func() {
		if err := w.Close(); err != nil {
			panic(err)
		}
	}()

	cw, err := csv.NewWriter(
		csv.WriterOpts().Writer(w),
	)
	if err != nil {
		panic(err)
	}
	defer func() {
		if err := cw.Close(); err != nil {
			panic(err)
		}
	}()

	for row := range cr.IntoIter() {
		if _, err := cw.WriteRow(row...); err != nil {
			panic(err)
		}
	}
	if err := cr.Err(); err != nil {
		panic(err)
	}
}

See the Reader and Writer examples for more in-depth usages.

Reader Features

Name	option(s)
Zero allocations during processing	BorrowRow + BorrowFields + InitialRecordBuffer + InitialRecordBufferSize + NumFields
Format Specification	Comment + CommentsAllowedAfterStartOfRecords + Escape + FieldSeparator + Quote + RecordSeparator + NumFields
Format Discovery	DiscoverRecordSeparator
Data Loss Prevention	ClearFreedDataMemory
Byte Order Marker Support	RemoveByteOrderMarker + ErrorOnNoByteOrderMarker
Headers Support	ExpectHeaders + RemoveHeaderRow + TrimHeaders
Reader Buffer tuning	ReaderBuffer + ReaderBufferSize
Format Validation	ErrorOnNoRows + ErrorOnNewlineInUnquotedField + ErrorOnQuotesInUnquotedField
Security Limits	MaxFields + MaxRecordBytes + MaxRecords + MaxComments + MaxCommentBytes

Writer Features

Name	option(s)
Zero allocations	InitialRecordBufferSize + InitialRecordBuffer
Header and Comment Specification	CommentRune + CommentLines + IncludeByteOrderMarker + Headers + TrimHeaders
Format Specification	CommentRune + Escape + FieldSeparator + Quote + RecordSeparator + NumFields
Data Loss Prevention	ClearFreedDataMemory
Encoding Validation	ErrorOnNonUTF8
Security Limits	planned

Note that the writer also has WriteFieldRow*() functions (WriteFieldRow, WriteFieldRowBorrowed) to reduce allocations when converting non‑string types to human‑readable CSV field values via the FieldWriter generating functions under csv.FieldWriters().

Note that beyond a certain number of columns, the WriteFieldRow*() calls flush less efficiently: their arguments can escape to the heap, and the cost of staging the non-serialized forms in a slice of wide structs can add up quickly. To address this case, the csv.Writer instance offers a fluent API: .NewRecord() returns a RecordWriter instance for writing a single record. In a single-threaded fashion it locks the writer until Write() or Rollback() is called. Each field can be buffered for writing via the type-named methods on the RecordWriter instance (for example String(), Int64(), or Bool()). Only one RecordWriter instance can be alive at a time for any given Writer.

Performance testing should be utilized to choose which writing methodology is ideal for your case. In general choose the method most sympathetic to your hardware and data formats. For most cases, csv.Writer.NewRecord() should achieve a nice balance that scales very high in terms of both utility and efficiency.

CHANGELOG

Here's the same example as above adjusted to optimize throughput via additional configurations.

package main

// this is a toy example that reads a csv file and writes to another without making allocations while processing

import (
	"bufio"
	"os"

	"github.com/josephcopenhaver/csv-go/v3"
)

func main() {
	r, err := os.Open("input.csv")
	if err != nil {
		panic(err)
	}
	defer r.Close()

	// using a buffered reader to avoid hot io pipes / reading less than the system storage device block size or ideal network protocol packet payload size
	// could instead use something async powered to get concurrent behaviors
	br := bufio.NewReader(r)

	var cr csv.Reader
	{
		op := csv.ReaderOpts()
		cr, err = csv.NewReader(
			op.Reader(br),
			op.RecordSeparator("\n"), // simplifies the execution plan ever so slightly and ensures consistent parsing rather than depending on automatic discovery
			op.InitialRecordBufferSize(4*1024*1024), // seeds the reading record buffer to a particular initial capacity
			op.BorrowRow(true),                      // evades allocations BUT makes it unsafe to store/use the resulting slice past the next call to Scan
			op.BorrowFields(true),                   // evades allocations BUT makes it unsafe to store/use the resulting character content of each slice element result anywhere past the next call to Scan
			op.NumFields(2),                         // simplifies the execution plan ever so slightly
			// by default quotes have no meaning
			// so must be specified to match RFC 4180
			// op.Quote('"'),
		)
		if err != nil {
			panic(err)
		}
		defer func() {
			if err := cr.Close(); err != nil {
				panic(err)
			}
		}()
	}

	w, err := os.Create("output.csv")
	if err != nil {
		panic(err)
	}
	defer func() {
		if err := w.Close(); err != nil {
			panic(err)
		}
	}()

	// using a buffered writer to avoid hot io pipes / writing less than the system storage device block size or ideal network protocol packet payload size
	// could instead use something async powered to get concurrent behaviors
	bw := bufio.NewWriterSize(w, 4*1024*1024)
	defer func() {
		if err := bw.Flush(); err != nil {
			panic(err)
		}
	}()

	var cw *csv.Writer
	{
		op := csv.WriterOpts()
		cw, err = csv.NewWriter(
			op.Writer(bw),
			op.InitialRecordBufferSize(4*1024*1024), // seeds the writing record buffer to a particular initial capacity
		)
		if err != nil {
			panic(err)
		}
		defer func() {
			if err := cw.Close(); err != nil {
				panic(err)
			}
		}()
	}

	// using Scan instead of the iterator sugar to avoid allocation of the iterator closures
	for cr.Scan() {
		// if BorrowRow=true or BorrowFields=true then implementation reading rows from the Reader MUST NOT keep the rows or byte sub-slices alive beyond the next call to cr.Scan()

		// I could also use cw.WriteRow here in this example since
		// the input is a slice of strings, but for most contexts
		// persons will have varying input data types in which
		// case NewRecord offers the most utility for a small
		// overhead cost. If you always have strings already on the
		// heap or you know they do not escape, then use WriteRow
		// instead.
		rw, err := cw.NewRecord()
		if err != nil {
			// note if you are just going to panic or are certain
			// the Writer state never errors unexpectedly / becomes
			// hard-locked, consider MustNewRecord() instead of
			// NewRecord()
			panic(err)
		}
		for _, s := range cr.Row() {
			rw.String(s)
		}
		if _, err := rw.Write(); err != nil {
			panic(err)
		}
	}
	if err := cr.Err(); err != nil {
		panic(err)
	}
}

Documentation ¶

Index ¶

Constants
Variables
type FieldWriter
- func (w *FieldWriter) AppendText(p []byte) ([]byte, error)
- func (w *FieldWriter) MarshalText() ([]byte, error)
type FieldWriterFactory
- func FieldWriters() FieldWriterFactory
- func (FieldWriterFactory) Bool(b bool) FieldWriter
- func (FieldWriterFactory) Bytes(p []byte) FieldWriter
- func (FieldWriterFactory) Duration(d time.Duration) FieldWriter
- func (FieldWriterFactory) Float64(f float64) FieldWriter
- func (FieldWriterFactory) Int(i int) FieldWriter
- func (FieldWriterFactory) Int64(i int64) FieldWriter
- func (FieldWriterFactory) Rune(r rune) FieldWriter
- func (FieldWriterFactory) String(s string) FieldWriter
- func (FieldWriterFactory) Time(t time.Time) FieldWriter
- func (FieldWriterFactory) Uint64(i uint64) FieldWriter
- func (FieldWriterFactory) UncheckedUTF8Bytes(p []byte) FieldWriter
- func (FieldWriterFactory) UncheckedUTF8String(s string) FieldWriter
type Reader
- func NewReader(options ...ReaderOption) (Reader, error)
type ReaderOption
type ReaderOptions
- func ReaderOpts() ReaderOptions
- func (ReaderOptions) BorrowFields(b bool) ReaderOption
- func (ReaderOptions) BorrowRow(b bool) ReaderOption
- func (ReaderOptions) ClearFreedDataMemory(b bool) ReaderOption
- func (ReaderOptions) Comment(r rune) ReaderOption
- func (ReaderOptions) CommentsAllowedAfterStartOfRecords(b bool) ReaderOption
- func (ReaderOptions) DiscoverRecordSeparator(b bool) ReaderOption
- func (ReaderOptions) ErrorOnNewlineInUnquotedField(b bool) ReaderOption
- func (ReaderOptions) ErrorOnNoByteOrderMarker(b bool) ReaderOption
- func (ReaderOptions) ErrorOnNoRows(b bool) ReaderOption
- func (ReaderOptions) ErrorOnQuotesInUnquotedField(b bool) ReaderOption
- func (ReaderOptions) Escape(r rune) ReaderOption
- func (ReaderOptions) ExpectHeaders(h ...string) ReaderOption
- func (ReaderOptions) FieldSeparator(r rune) ReaderOption
- func (ReaderOptions) InitialRecordBuffer(v []byte) ReaderOption
- func (ReaderOptions) InitialRecordBufferSize(v int) ReaderOption
- func (ReaderOptions) MaxCommentBytes(n int) ReaderOption
- func (ReaderOptions) MaxComments(n int) ReaderOption
- func (ReaderOptions) MaxFields(v uint) ReaderOption
- func (ReaderOptions) MaxRecordBytes(n int) ReaderOption
- func (ReaderOptions) MaxRecords(n uint64) ReaderOption
- func (ReaderOptions) NumFields(n int) ReaderOption
- func (ReaderOptions) Quote(r rune) ReaderOption
- func (ReaderOptions) Reader(r io.Reader) ReaderOption
- func (ReaderOptions) ReaderBuffer(v []byte) ReaderOption
- func (ReaderOptions) ReaderBufferSize(v int) ReaderOption
- func (ReaderOptions) RecordSeparator(s string) ReaderOption
- func (ReaderOptions) RemoveByteOrderMarker(b bool) ReaderOption
- func (ReaderOptions) RemoveHeaderRow(b bool) ReaderOption
- func (ReaderOptions) TerminalRecordSeparatorEmitsRecord(b bool) ReaderOption
- func (ReaderOptions) TrimHeaders(b bool) ReaderOption
type RecordWriter
- func (rw *RecordWriter) Bool(b bool) *RecordWriter
- func (rw *RecordWriter) Bytes(p []byte) *RecordWriter
- func (rw *RecordWriter) Duration(d time.Duration) *RecordWriter
- func (rw *RecordWriter) Empty() *RecordWriter
- func (rw *RecordWriter) Err() error
- func (rw *RecordWriter) Float64(f float64) *RecordWriter
- func (rw *RecordWriter) Int(i int) *RecordWriter
- func (rw *RecordWriter) Int64(i int64) *RecordWriter
- func (rw *RecordWriter) Rollback()
- func (rw *RecordWriter) Rune(r rune) *RecordWriter
- func (rw *RecordWriter) String(s string) *RecordWriter
- func (rw *RecordWriter) Time(t time.Time) *RecordWriter
- func (rw *RecordWriter) Uint64(i uint64) *RecordWriter
- func (rw *RecordWriter) UncheckedUTF8Bytes(p []byte) *RecordWriter
- func (rw *RecordWriter) UncheckedUTF8Rune(r rune) *RecordWriter
- func (rw *RecordWriter) UncheckedUTF8String(s string) *RecordWriter
- func (rw *RecordWriter) Write() (int, error)
type WriteHeaderOption
type WriteHeaderOptions
- func WriteHeaderOpts() WriteHeaderOptions
- func (WriteHeaderOptions) CommentLines(s ...string) WriteHeaderOption
- func (WriteHeaderOptions) CommentRune(r rune) WriteHeaderOption
- func (WriteHeaderOptions) Headers(h ...string) WriteHeaderOption
- func (WriteHeaderOptions) IncludeByteOrderMarker(b bool) WriteHeaderOption
- func (WriteHeaderOptions) TrimHeaders(b bool) WriteHeaderOption
type Writer
- func NewWriter(options ...WriterOption) (*Writer, error)
- func (w *Writer) Close() error
- func (w *Writer) MustNewRecord() *RecordWriter
- func (w *Writer) NewRecord() (*RecordWriter, error)
- func (w *Writer) WriteFieldRow(row ...FieldWriter) (int, error)
- func (w *Writer) WriteFieldRowBorrowed(row []FieldWriter) (int, error)
- func (w *Writer) WriteHeader(options ...WriteHeaderOption) (int, error)
- func (w *Writer) WriteRow(row ...string) (int, error)
type WriterOption
type WriterOptions
- func WriterOpts() WriterOptions
- func (WriterOptions) ClearFreedDataMemory(b bool) WriterOption
- func (WriterOptions) CommentRune(r rune) WriterOption
- func (WriterOptions) ErrorOnNonUTF8(v bool) WriterOption
- func (WriterOptions) Escape(r rune) WriterOption
- func (WriterOptions) FieldSeparator(v rune) WriterOption
- func (WriterOptions) InitialFieldBuffer(v []byte) WriterOption
- func (WriterOptions) InitialFieldBufferSize(v int) WriterOption
- func (WriterOptions) InitialRecordBuffer(v []byte) WriterOption
- func (WriterOptions) InitialRecordBufferSize(v int) WriterOption
- func (WriterOptions) NumFields(v int) WriterOption
- func (WriterOptions) Quote(v rune) WriterOption
- func (WriterOptions) RecordSeparator(s string) WriterOption
- func (WriterOptions) Writer(v io.Writer) WriterOption

Constants ¶

const (

	// ReaderMinBufferSize is the minimum value a ReaderBufferSize
	// option will allow. It is also the minimum length for any
	// ReaderBuffer slice argument. This is exported so
	// configuration which may not be hardcoded by the utilizing
	// author can more easily define validation logic and cite
	// the reason for the limit.
	//
	// Algorithms used in this lib cannot work with a smaller buffer
	// size than this - however in general ReaderBufferSize and
	// ReaderBuffer options should be used to tune and balance mem
	// constraints with performance gained via using larger amounts
	// of buffer space.
	ReaderMinBufferSize = utf8.UTFMax + rMaxOverflowNumBytes
)

Variables ¶

var (
	// classifications
	ErrIO         = errors.New("io error")
	ErrParsing    = errors.New("parsing error")
	ErrFieldCount = errors.New("field count error")
	ErrBadConfig  = errors.New("bad config")
	ErrSecOp      = errors.New("security error")

	// instances
	ErrTooManyFields                = errors.New("too many fields")
	ErrSecOpRecordByteCountAboveMax = errors.New("record byte count exceeds max")
	// ErrSecOpFieldCountAboveMax is a sub-instance of ErrTooManyFields
	ErrSecOpFieldCountAboveMax     = errors.New("field count exceeds max")
	ErrSecOpRecordCountAboveMax    = errors.New("record count exceeds max")
	ErrSecOpCommentBytesAboveMax   = errors.New("comment byte count exceeds max")
	ErrSecOpCommentsAboveMax       = errors.New("comment line count exceeds max")
	ErrNotEnoughFields             = errors.New("not enough fields")
	ErrReaderClosed                = errors.New("reader closed")
	ErrUnexpectedHeaderRowContents = errors.New("header row values do not match expectations")
	ErrBadRecordSeparator          = errors.New("record separator can only be one valid utf8 rune long or \"\\r\\n\"")
	ErrIncompleteQuotedField       = fmt.Errorf("incomplete quoted field: %w", io.ErrUnexpectedEOF)
	ErrQuoteInUnquotedField        = errors.New("quote found in unquoted field")
	ErrInvalidQuotedFieldEnding    = errors.New("unexpected character found after end of quoted field") // expecting field separator, record separator, quote char, or end of file if field count matches expectations
	ErrNoHeaderRow                 = fmt.Errorf("no header row: %w", io.ErrUnexpectedEOF)
	ErrNoRows                      = fmt.Errorf("no rows: %w", io.ErrUnexpectedEOF)
	ErrNoByteOrderMarker           = errors.New("no byte order marker")
	ErrNilReader                   = errors.New("nil reader")
	ErrInvalidEscSeqInQuotedField  = errors.New("invalid escape sequence in quoted field")
	ErrNewlineInUnquotedField      = errors.New("newline rune found in unquoted field")
	ErrUnexpectedQuoteAfterField   = errors.New("unexpected quote after quoted+escaped field")
	ErrUnsafeCRFileEnd             = fmt.Errorf("ended in a carriage return which must be quoted when record separator is CRLF: %w", io.ErrUnexpectedEOF)
)

var (
	ErrWriteHeaderFailed         = errors.New("write header failed")
	ErrRowNilOrEmpty             = errors.New("row is nil or empty")
	ErrNonUTF8InRecord           = errors.New("non-utf8 characters in record")
	ErrNonUTF8InComment          = errors.New("non-utf8 characters in comment")
	ErrWriterClosed              = errors.New("writer closed")
	ErrHeaderWritten             = errors.New("header already written")
	ErrInvalidFieldCountInRecord = errors.New("invalid field count in record")
	ErrInvalidRune               = errors.New("invalid rune")
	// ErrWriterNotReady describes the state when a writer is locked for use by an external writing implement such as a RecordWriter.
	ErrWriterNotReady = errors.New("writer not ready")
)

var (
	ErrInvalidFieldWriter = errors.New("invalid field writer")
)

var (
	// ErrRecordWriterClosed indicates that the RecordWriter has completed its lifecycle and cannot be used for further writing.
	// It does not indicate that a record was successfully written or not;
	// it only indicates that the RecordWriter instance is no longer usable for writing and is ready for garbage collection.
	//
	// For write success status, check the error return value from Write.
	ErrRecordWriterClosed = errors.New("record writer closed")
)

Functions ¶

This section is empty.

Types ¶

type FieldWriter ¶ added in v3.0.1

type FieldWriter struct {
	// contains filtered or unexported fields
}

func (*FieldWriter) AppendText ¶ added in v3.0.1

func (w *FieldWriter) AppendText(p []byte) ([]byte, error)

func (*FieldWriter) MarshalText ¶ added in v3.0.1

func (w *FieldWriter) MarshalText() ([]byte, error)

type FieldWriterFactory ¶ added in v3.0.1

type FieldWriterFactory struct{}

func FieldWriters ¶ added in v3.0.1

func FieldWriters() FieldWriterFactory

func (FieldWriterFactory) Bool ¶ added in v3.0.1

func (FieldWriterFactory) Bool(b bool) FieldWriter

func (FieldWriterFactory) Bytes ¶ added in v3.0.1

func (FieldWriterFactory) Bytes(p []byte) FieldWriter

Bytes creates a concrete FieldWriter that serializes the provided byte slice as a UTF-8 string.

WARNING: This method is likely to lead to additional allocations if the input data is not already on the heap due to how the Go compiler handles escape analysis. (consider using the fluent API on RecordWriter instead to avoid this issue)

func (FieldWriterFactory) Duration ¶ added in v3.0.1

func (FieldWriterFactory) Duration(d time.Duration) FieldWriter

func (FieldWriterFactory) Float64 ¶ added in v3.0.1

func (FieldWriterFactory) Float64(f float64) FieldWriter

func (FieldWriterFactory) Int ¶ added in v3.0.1

func (FieldWriterFactory) Int(i int) FieldWriter

func (FieldWriterFactory) Int64 ¶ added in v3.0.1

func (FieldWriterFactory) Int64(i int64) FieldWriter

func (FieldWriterFactory) Rune ¶ added in v3.0.1

func (FieldWriterFactory) Rune(r rune) FieldWriter

Rune value must be a valid utf8 rune value otherwise attempting to write the rune will result in an ErrInvalidRune error.
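As a rough illustration of what "valid" means for a rune, the sketch below uses utf8.ValidRune from the standard library. It is an assumption that the package's check is equivalent to this function; isWritableRune is a hypothetical helper, not part of csv-go.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isWritableRune applies the standard library's definition of a valid rune.
// Assumption: the library's own check behaves the same way.
func isWritableRune(r rune) bool {
	return utf8.ValidRune(r)
}

func main() {
	fmt.Println(isWritableRune('é'))      // true
	fmt.Println(isWritableRune(0xD800))   // false: surrogate half
	fmt.Println(isWritableRune(0x110000)) // false: above unicode.MaxRune
}
```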

func (FieldWriterFactory) String ¶ added in v3.0.1

func (FieldWriterFactory) String(s string) FieldWriter

func (FieldWriterFactory) Time ¶ added in v3.0.1

func (FieldWriterFactory) Time(t time.Time) FieldWriter

func (FieldWriterFactory) Uint64 ¶ added in v3.0.1

func (FieldWriterFactory) Uint64(i uint64) FieldWriter

func (FieldWriterFactory) UncheckedUTF8Bytes ¶ added in v3.2.0

func (FieldWriterFactory) UncheckedUTF8Bytes(p []byte) FieldWriter

UncheckedUTF8Bytes serializes the same way as Bytes except that the content is not validated for utf8 compliance in any way.

Please consider this to be a micro optimization and prefer Bytes instead should there be any uncertainty in the encoding of the byte contents.

WARNING: Using this method with invalid UTF-8 data will produce invalid CSV output.

WARNING: This method is likely to lead to additional allocations if the input data is not already on the heap due to how the Go compiler handles escape analysis. (consider using the fluent API on RecordWriter instead to avoid this issue)

func (FieldWriterFactory) UncheckedUTF8String ¶ added in v3.2.0

func (FieldWriterFactory) UncheckedUTF8String(s string) FieldWriter

UncheckedUTF8String serializes the same way as String except that the content is not validated for utf8 compliance in any way.

Please consider this to be a micro optimization and prefer String instead should there be any uncertainty in the encoding of the byte contents.

WARNING: Using this method with invalid UTF-8 data will produce invalid CSV output.

type Reader ¶

type Reader interface {
	Close() error
	Err() error
	IntoIter() iter.Seq[[]string]
	Row() []string
	Scan() bool
}

func NewReader ¶

func NewReader(options ...ReaderOption) (Reader, error)

NewReader creates a new instance of a CSV reader which is not safe for concurrent reads.

type ReaderOption ¶

type ReaderOption func(*rCfg)

type ReaderOptions ¶

type ReaderOptions struct{}

ReaderOptions should never be instantiated manually

Instead call ReaderOpts()

This is only exported to allow godocs to discover the exported methods.

ReaderOptions will never have exported members and the zero value is not part of the semver guarantee. Instantiate it incorrectly at your own peril.

Calling the function is a no-op that is compiled away anyway, so instantiating the struct manually optimizes nothing. Use ReaderOpts()!

func ReaderOpts ¶

func ReaderOpts() ReaderOptions

func (ReaderOptions) BorrowFields ¶

func (ReaderOptions) BorrowFields(b bool) ReaderOption

BorrowFields alters the Row function to return strings that directly reference the internal buffer without copying. This is UNSAFE and can lead to memory corruption if not handled properly.

WARNING: Specifying this option as true while BorrowRow is false will result in an error.

DANGER: Only set to true if you guarantee that field strings are NEVER used after the next call to Scan or Close. Otherwise, you MUST clone both the slice AND the strings within it via strings.Clone(). Failure to do so can lead to memory corruption as the underlying buffer will be reused.

Example of safe usage:

for reader.Scan() {
  row := reader.Row()
  // Process row immediately without storing references
  processRow(row[0], row[1])
}
if reader.Err() != nil { ... }

Example of UNSAFE usage that will lead to bugs:

var savedStrings []string
for reader.Scan() {
  row := reader.Row()
  savedStrings = append(savedStrings, row[0]) // WRONG! Will be corrupted
}
if reader.Err() != nil { ... }

This should be considered a micro-optimization only for performance-critical code paths where profiling has identified string copying as a bottleneck.
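The hazard described above can be simulated without the library. The sketch below is not csv-go's implementation: borrow is a hypothetical helper that uses unsafe.String to build a string sharing memory with a byte buffer, and overwriting that buffer stands in for the next call to Scan. strings.Clone produces the private copy that remains safe to keep.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// borrow returns a string that shares memory with buf instead of copying it.
// It imitates a "borrowed" field and deliberately breaks string immutability.
func borrow(buf []byte) string {
	return unsafe.String(unsafe.SliceData(buf), len(buf))
}

// demo returns the borrowed and cloned values after the buffer is reused.
func demo() (borrowed, cloned string) {
	buf := []byte("alice")
	borrowed = borrow(buf)
	cloned = strings.Clone(borrowed) // private copy, safe to retain

	copy(buf, "bobby") // stands in for the next Scan reusing the buffer
	return borrowed, cloned
}

func main() {
	borrowed, cloned := demo()
	// cloned is still "alice"; borrowed now reflects the overwritten buffer.
	fmt.Println(borrowed, cloned)
}
```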

func (ReaderOptions) BorrowRow ¶

func (ReaderOptions) BorrowRow(b bool) ReaderOption

BorrowRow alters the Row function to return the same slice instance each time with the strings inside set to different values.

Only set to true if the returned row slice is never used or modified after the next call to Scan or Close. You must clone the slice if doing otherwise.

See BorrowFields() if you wish to also remove allocations related to cloning strings into the slice.

Please consider this to be a micro-optimization in most circumstances because it tightens the usage contract of the returned row in ways most would not normally consider.
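To make the BorrowRow contract concrete, here is a minimal stand-alone sketch. reusingScanner is a hypothetical type that imitates a reader returning the same row slice on every call; it is not the library's Reader. Retaining rows without cloning leaves every saved entry pointing at the last record, while slices.Clone preserves each one.

```go
package main

import (
	"fmt"
	"slices"
)

// reusingScanner is a hypothetical stand-in for a reader that reuses
// one row slice across calls to Scan.
type reusingScanner struct {
	records [][]string
	next    int
	row     []string
}

func (s *reusingScanner) Scan() bool {
	if s.next >= len(s.records) {
		return false
	}
	// Reuse the backing array of the previous row.
	s.row = append(s.row[:0], s.records[s.next]...)
	s.next++
	return true
}

func (s *reusingScanner) Row() []string { return s.row }

// collect keeps every row; clone controls whether each row is copied first.
func collect(s *reusingScanner, clone bool) [][]string {
	var out [][]string
	for s.Scan() {
		row := s.Row()
		if clone {
			row = slices.Clone(row)
		}
		out = append(out, row)
	}
	return out
}

func main() {
	data := [][]string{{"a", "1"}, {"b", "2"}}
	fmt.Println(collect(&reusingScanner{records: data}, false)) // [[b 2] [b 2]]
	fmt.Println(collect(&reusingScanner{records: data}, true))  // [[a 1] [b 2]]
}
```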

func (ReaderOptions) ClearFreedDataMemory ¶

func (ReaderOptions) ClearFreedDataMemory(b bool) ReaderOption

ClearFreedDataMemory ensures that whenever a shared memory buffer that contains data goes out of scope that zero values are written to every byte within the buffer.

This may significantly degrade performance and is recommended only for sensitive data or long-lived processes.

func (ReaderOptions) Comment ¶

func (ReaderOptions) Comment(r rune) ReaderOption

func (ReaderOptions) CommentsAllowedAfterStartOfRecords ¶

func (ReaderOptions) CommentsAllowedAfterStartOfRecords(b bool) ReaderOption

func (ReaderOptions) DiscoverRecordSeparator ¶

func (ReaderOptions) DiscoverRecordSeparator(b bool) ReaderOption

func (ReaderOptions) ErrorOnNewlineInUnquotedField ¶

func (ReaderOptions) ErrorOnNewlineInUnquotedField(b bool) ReaderOption

func (ReaderOptions) ErrorOnNoByteOrderMarker ¶

func (ReaderOptions) ErrorOnNoByteOrderMarker(b bool) ReaderOption

func (ReaderOptions) ErrorOnNoRows ¶

func (ReaderOptions) ErrorOnNoRows(b bool) ReaderOption

ErrorOnNoRows causes cr.Err() to return ErrNoRows should the reader stream terminate before any data records are parsed.

func (ReaderOptions) ErrorOnQuotesInUnquotedField ¶

func (ReaderOptions) ErrorOnQuotesInUnquotedField(b bool) ReaderOption

func (ReaderOptions) Escape ¶

func (ReaderOptions) Escape(r rune) ReaderOption

Escape is useful for specifying what character is used to escape a quote in a field and the literal escape character itself.

Without this option, a quote character is expected to be escaped by doubling it while the overall field is wrapped in quote characters.

This is mainly useful when processing a Spark CSV file, as it does not strictly follow RFC 4180.

Set it to '\\' if you have this need.

It is not valid to use this option without specifically setting a quote. Doing so will result in an error being returned on Reader creation.

func (ReaderOptions) ExpectHeaders ¶

func (ReaderOptions) ExpectHeaders(h ...string) ReaderOption

ExpectHeaders causes the first row to be recognized as a header row.

If the slice of header values does not match then the reader will error.

func (ReaderOptions) FieldSeparator ¶

func (ReaderOptions) FieldSeparator(r rune) ReaderOption

func (ReaderOptions) InitialRecordBuffer ¶

func (ReaderOptions) InitialRecordBuffer(v []byte) ReaderOption

InitialRecordBuffer is a hint to pre-allocate record buffer space once externally and pipe it in to reduce the number of re-allocations when processing a reader and reuse it at a later time after the reader is closed.

This option should generally not be used. It only exists to assist with processing large numbers of CSV files should memory be a clear constraint. There is no guarantee this buffer will always be used till the end of the csv Reader's lifecycle.

Please consider this to be a micro-optimization in most circumstances because it tightens the usage contract of the csv Reader in ways most would not normally consider.

func (ReaderOptions) InitialRecordBufferSize ¶

func (ReaderOptions) InitialRecordBufferSize(v int) ReaderOption

InitialRecordBufferSize is a hint to pre-allocate record buffer space once and reduce the number of re-allocations when processing a reader.

Please consider this to be a micro optimization in most circumstances just because it's not likely that most users will know the maximum total record size they wish to target / be under and it's generally a better practice to leave these details to the go runtime to coordinate via standard garbage collection.

func (ReaderOptions) MaxCommentBytes ¶

func (ReaderOptions) MaxCommentBytes(n int) ReaderOption

MaxCommentBytes is a security option that limits the number of bytes allowed in a comment line before a SecOp error is returned.

func (ReaderOptions) MaxComments ¶

func (ReaderOptions) MaxComments(n int) ReaderOption

MaxComments is a security option that limits the number of comment lines allowed in a stream before a SecOp error is returned.

func (ReaderOptions) MaxFields ¶

func (ReaderOptions) MaxFields(v uint) ReaderOption

MaxFields is a security option that limits the number of fields allowed to be detected automatically before a SecOp error is returned.

Using this option together with the NumFields option results in an error on reader creation, since combining the two is counterintuitive.

func (ReaderOptions) MaxRecordBytes ¶

func (ReaderOptions) MaxRecordBytes(n int) ReaderOption

MaxRecordBytes is a security option that limits the number of bytes allowed to be detected in a record before a SecOp error is returned.

func (ReaderOptions) MaxRecords ¶

func (ReaderOptions) MaxRecords(n uint64) ReaderOption

MaxRecords is a security option that limits the number of records allowed in a stream before a SecOp error is returned.

func (ReaderOptions) NumFields ¶

func (ReaderOptions) NumFields(n int) ReaderOption

func (ReaderOptions) Quote ¶

func (ReaderOptions) Quote(r rune) ReaderOption

func (ReaderOptions) Reader ¶

func (ReaderOptions) Reader(r io.Reader) ReaderOption

func (ReaderOptions) ReaderBuffer ¶

func (ReaderOptions) ReaderBuffer(v []byte) ReaderOption

ReaderBuffer only accepts a slice with a length greater than or equal to ReaderMinBufferSize; otherwise an error is returned when creating the reader instance. Only the length of the slice is used during buffering operations; any capacity beyond the length is ignored.

func (ReaderOptions) ReaderBufferSize ¶

func (ReaderOptions) ReaderBufferSize(v int) ReaderOption

ReaderBufferSize only accepts a value greater than or equal to ReaderMinBufferSize; otherwise an error is returned when creating the reader instance.

func (ReaderOptions) RecordSeparator ¶

func (ReaderOptions) RecordSeparator(s string) ReaderOption

func (ReaderOptions) RemoveByteOrderMarker ¶

func (ReaderOptions) RemoveByteOrderMarker(b bool) ReaderOption

func (ReaderOptions) RemoveHeaderRow ¶

func (ReaderOptions) RemoveHeaderRow(b bool) ReaderOption

RemoveHeaderRow causes the first row to be recognized as a header row.

The row will be skipped over by Scan() and will not be returned by Row().

func (ReaderOptions) TerminalRecordSeparatorEmitsRecord ¶

func (ReaderOptions) TerminalRecordSeparatorEmitsRecord(b bool) ReaderOption

TerminalRecordSeparatorEmitsRecord only exists to acknowledge an edge case when processing csv documents that contain one column. If the file contents end in a record separator it's impossible to determine if that should indicate that a new record with an empty field should be emitted unless that record is enclosed in quotes or a config option like this exists.

In most cases this should not be an issue, unless the dataset is a single column list that allows empty strings for some use case and the writer used to create the file chooses to not always write the last record followed by a record separator. (treating the record separator like a record terminator)
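The ambiguity can be shown with a toy splitter for single-column data. splitSingleColumn is a hypothetical helper that only illustrates the two interpretations of a trailing record separator; it is not how the library parses, and its handling of empty input is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSingleColumn splits single-column data on "\n".
// terminalEmits decides whether a trailing separator yields a final
// empty record or is treated as a terminator for the last record.
func splitSingleColumn(data string, terminalEmits bool) []string {
	if data == "" {
		return nil // assumption: empty input holds no records
	}
	records := strings.Split(data, "\n")
	if !terminalEmits && strings.HasSuffix(data, "\n") {
		records = records[:len(records)-1]
	}
	return records
}

func main() {
	fmt.Printf("%q\n", splitSingleColumn("a\n", false)) // ["a"]
	fmt.Printf("%q\n", splitSingleColumn("a\n", true))  // ["a" ""]
}
```

With more than one column the problem disappears, because an empty final record would still contain field separators.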

func (ReaderOptions) TrimHeaders ¶

func (ReaderOptions) TrimHeaders(b bool) ReaderOption

TrimHeaders causes the first row to be recognized as a header row and all values are returned with whitespace trimmed.

type RecordWriter ¶ added in v3.3.0

type RecordWriter struct {
	// contains filtered or unexported fields
}

RecordWriter instances must always have life-cycles that end with calls to Write and/or Rollback.

Failure to do so will leave the parent writer in a locked state.

func (*RecordWriter) Bool ¶ added in v3.3.0

func (rw *RecordWriter) Bool(b bool) *RecordWriter

Bool appends a bool field to the current record, where true = 1 and false = 0.

func (*RecordWriter) Bytes ¶ added in v3.3.0

func (rw *RecordWriter) Bytes(p []byte) *RecordWriter

Bytes appends a byte slice field to the current record.

The byte slice is treated as UTF-8 encoded data and validated as such before writing unless the Writer was created with the ErrorOnNonUTF8 option set to false.

If the byte slice contains invalid UTF-8 and UTF-8 validation is enabled, the RecordWriter instance will enter an error state retrievable through the Err() method or eventually observable through a terminating Write call.
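A minimal sketch of this kind of check, using only unicode/utf8, is shown below. The names checkField and errInvalidUTF8 are hypothetical, and the errorOnNonUTF8 parameter only mirrors the idea of the Writer option; the package's actual validation path may differ.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// errInvalidUTF8 is a hypothetical error used only in this sketch.
var errInvalidUTF8 = errors.New("field is not valid UTF-8")

// checkField rejects invalid UTF-8 only when validation is enabled.
func checkField(p []byte, errorOnNonUTF8 bool) error {
	if errorOnNonUTF8 && !utf8.Valid(p) {
		return errInvalidUTF8
	}
	return nil
}

func main() {
	bad := []byte{0xff, 0xfe}
	fmt.Println(checkField(bad, true) != nil)              // true: rejected
	fmt.Println(checkField(bad, false) == nil)             // true: validation disabled
	fmt.Println(checkField([]byte("héllo"), true) == nil) // true: valid UTF-8
}
```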

func (*RecordWriter) Duration ¶ added in v3.3.0

func (rw *RecordWriter) Duration(d time.Duration) *RecordWriter

Duration appends a time.Duration field to the current record as its int64 nanosecond count base-10 string representation.
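For reference, the described encoding can be reproduced with the standard library alone. The helper formatDuration is a hypothetical name, and this is a sketch of the resulting value rather than the package's implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// formatDuration renders a duration as its int64 nanosecond count in base 10.
func formatDuration(d time.Duration) string {
	return strconv.FormatInt(int64(d), 10)
}

func main() {
	fmt.Println(formatDuration(1500 * time.Millisecond)) // 1500000000

	// A consumer can recover the duration by parsing the integer back.
	n, err := strconv.ParseInt("1500000000", 10, 64)
	if err != nil {
		panic(err)
	}
	fmt.Println(time.Duration(n)) // 1.5s
}
```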

func (*RecordWriter) Empty ¶ added in v3.4.0

func (rw *RecordWriter) Empty() *RecordWriter

Empty appends an empty (zero-length) field to the current record.

It does not clear or reset the record or RecordWriter; it only appends a new empty field. To abandon the current record, use Rollback.

This function is useful when the calling context is implementing a custom serialization scheme and needs to explicitly write empty fields. It is functionally equivalent to writing an empty string field through the String or Bytes methods but faster and more explicit in intent.

func (*RecordWriter) Err ¶ added in v3.3.0

func (rw *RecordWriter) Err() error

Err returns any error that has occurred during the lifecycle of the RecordWriter instance.

Once Rollback or Write has been called, Err will always return a non-nil error value. If no error occurred during the RecordWriter lifecycle, Err will return ErrRecordWriterClosed. ErrRecordWriterClosed does not indicate that a record was successfully written; it only indicates that the RecordWriter instance is no longer usable for writing. For write success status, check the error return value from Write.

func (*RecordWriter) Float64 ¶ added in v3.3.0

func (rw *RecordWriter) Float64(f float64) *RecordWriter

Float64 appends a base-10 encoded float64 field to the current record using strconv.FormatFloat with fmt='g'.
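The 'g' format switches between plain and exponent notation depending on magnitude, which can surprise consumers of the output. The sketch below assumes a precision of -1 (shortest representation) and a bit size of 64; those two arguments are assumptions, since only the format verb is stated above. The helper formatFloat is a hypothetical name.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatFloat uses the 'g' format; precision -1 and bitSize 64 are assumed here.
func formatFloat(f float64) string {
	return strconv.FormatFloat(f, 'g', -1, 64)
}

func main() {
	fmt.Println(formatFloat(1.5))     // 1.5
	fmt.Println(formatFloat(0.1))     // 0.1
	fmt.Println(formatFloat(1000000)) // 1e+06
}
```

Readers that expect plain decimal notation should be prepared to parse exponent forms such as 1e+06.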

func (*RecordWriter) Int ¶ added in v3.3.0

func (rw *RecordWriter) Int(i int) *RecordWriter

Int appends a base-10 encoded int field to the current record.

func (*RecordWriter) Int64 ¶ added in v3.3.0

func (rw *RecordWriter) Int64(i int64) *RecordWriter

Int64 appends a base-10 encoded int64 field to the current record.

func (*RecordWriter) Rollback ¶ added in v3.3.0

func (rw *RecordWriter) Rollback()

Rollback releases the csv Writer for additional writing through another RecordWriter instance or other means without flushing the record buffer to the io.Writer within the csv Writer instance.

This RecordWriter instance cannot be used for further writing after this call.

This function is always safe to call even if the RecordWriter instance is already closed (write success, errored, or skipped), in some error state, or previously rolled back.

If a previous Write call was attempted, Rollback will have no meaningful effect of any kind. Same applies if the RecordWriter instance was previously rolled back.

Calling Rollback will not change any existing error state in the RecordWriter instance should it be non-nil. If the error state is nil prior to calling Rollback, it will be set to ErrRecordWriterClosed. The error state can be retrieved through the Err() method.

func (*RecordWriter) Rune ¶ added in v3.3.0

func (rw *RecordWriter) Rune(r rune) *RecordWriter

Rune appends a rune field to the current record.

The rune value is treated as UTF-8 encoded data and validated as such before writing.

Please note that only valid runes can be written and attempting to write anything else will lead to an ErrInvalidRune error state in the RecordWriter instance. The error can be retrieved through the Err() method or eventually observable through a terminating Write call.

The Writer option ErrorOnNonUTF8 does not affect this behavior!
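Which rune values count as invalid can be checked with utf8.ValidRune from the standard library. The sketch below assumes the package's notion of a valid rune matches utf8.ValidRune; errInvalidRune is a local stand-in for ErrInvalidRune and checkRune is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// errInvalidRune is a local stand-in for the package's ErrInvalidRune.
var errInvalidRune = errors.New("invalid rune")

// checkRune rejects values that cannot be legally encoded as UTF-8.
func checkRune(r rune) error {
	if !utf8.ValidRune(r) {
		return errInvalidRune
	}
	return nil
}

func main() {
	fmt.Println(checkRune('é') == nil)      // true: ordinary code point
	fmt.Println(checkRune(0xD800) != nil)   // true: surrogate half
	fmt.Println(checkRune(0x110000) != nil) // true: beyond the Unicode range
	fmt.Println(checkRune(-1) != nil)       // true: negative value
}
```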

func (*RecordWriter) String ¶ added in v3.3.0

func (rw *RecordWriter) String(s string) *RecordWriter

String appends a string field to the current record.

The string is treated as UTF-8 encoded data and validated as such before writing unless the Writer was created with the ErrorOnNonUTF8 option set to false.

If the string is invalid UTF-8 and UTF-8 validation is enabled, the RecordWriter instance will enter an error state retrievable through the Err() method or eventually observable through a terminating Write call.

func (*RecordWriter) Time ¶ added in v3.3.0

func (rw *RecordWriter) Time(t time.Time) *RecordWriter

Time appends a time.Time field to the current record as its string representation.

func (*RecordWriter) Uint64 ¶ added in v3.3.0

func (rw *RecordWriter) Uint64(i uint64) *RecordWriter

Uint64 appends a base-10 encoded uint64 field to the current record.

func (*RecordWriter) UncheckedUTF8Bytes ¶ added in v3.3.0

func (rw *RecordWriter) UncheckedUTF8Bytes(p []byte) *RecordWriter

UncheckedUTF8Bytes appends a byte slice field to the current record in a similar manner to Bytes but skips UTF-8 validation.

Please consider this to be a micro optimization and prefer Bytes instead should there be any uncertainty in the encoding of the byte contents.

WARNING: Using this method with invalid UTF-8 data will produce invalid CSV output.

func (*RecordWriter) UncheckedUTF8Rune ¶ added in v3.5.0

func (rw *RecordWriter) UncheckedUTF8Rune(r rune) *RecordWriter

UncheckedUTF8Rune appends a rune field to the current record similarly to Rune but skips rune validation.

It will not set an internal error state if a rune cannot be normally encoded to UTF-8. Instead, invalid UTF-8 runes will be encoded as the UTF-8 replacement character.

Please consider this to be a micro optimization and prefer Rune instead should there be any uncertainty in the rune value being a valid UTF-8 encodable value.

WARNING: Invalid UTF-8 runes will be encoded as the UTF-8 replacement character.
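The replacement behavior matches what the standard library does when asked to encode an invalid rune. The sketch below uses utf8.AppendRune (Go 1.18 or later) to show the resulting bytes; encodeRune is a hypothetical helper and the package's own encoding path is not assumed.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodeRune encodes r as UTF-8 without validating it first.
// Invalid runes come out as U+FFFD, the replacement character.
func encodeRune(r rune) []byte {
	return utf8.AppendRune(nil, r)
}

func main() {
	fmt.Printf("% x\n", encodeRune('A'))    // 41
	fmt.Printf("% x\n", encodeRune(0xD800)) // ef bf bd
}
```

Because the original value is lost once replaced, the checked Rune method is the safer choice whenever the input is not fully trusted.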

func (*RecordWriter) UncheckedUTF8String ¶ added in v3.3.0

func (rw *RecordWriter) UncheckedUTF8String(s string) *RecordWriter

UncheckedUTF8String appends a string field to the current record in a similar manner to String but skips UTF-8 validation.

Please consider this to be a micro optimization and prefer String instead should there be any uncertainty in the encoding of the byte contents.

WARNING: Using this method with invalid UTF-8 data will produce invalid CSV output.

func (*RecordWriter) Write ¶ added in v3.3.0

func (rw *RecordWriter) Write() (int, error)

Write flushes the record buffer to the io.Writer within the csv Writer instance and releases the csv Writer for additional writing through another RecordWriter instance or other means.

If the write is successful then this function will return a nil error. It is the only opportunity to retrieve such a "write success" status from the RecordWriter instance.

This RecordWriter instance cannot be used for further writing after this call.

type WriteHeaderOption ¶

type WriteHeaderOption func(*whCfg)

type WriteHeaderOptions ¶

type WriteHeaderOptions struct{}

WriteHeaderOptions should never be instantiated manually.

Instead call WriteHeaderOpts()

This is only exported to allow godocs to discover the exported methods.

WriteHeaderOptions will never have exported members and the zero value is not part of the semver guarantee. Instantiate it incorrectly at your own peril.

Calling the function is a no-op that is compiled away anyway, so instantiating the type manually optimizes nothing. Use WriteHeaderOpts()!

func WriteHeaderOpts ¶

func WriteHeaderOpts() WriteHeaderOptions

func (WriteHeaderOptions) CommentLines ¶

func (WriteHeaderOptions) CommentLines(s ...string) WriteHeaderOption

func (WriteHeaderOptions) CommentRune ¶

func (WriteHeaderOptions) CommentRune(r rune) WriteHeaderOption

CommentRune specifies that each comment line begins with this specific rune followed by a space when writing a csv document Header.

If you need comment parsing consistency and do not always call WriteHeader then instead use the CommentRune option when creating the writer instance and avoid using this option.

In general you should avoid using this option and instead specify CommentRune when calling NewWriter unless you understand and accept the indeterminism risks.

func (WriteHeaderOptions) Headers ¶

func (WriteHeaderOptions) Headers(h ...string) WriteHeaderOption

func (WriteHeaderOptions) IncludeByteOrderMarker ¶

func (WriteHeaderOptions) IncludeByteOrderMarker(b bool) WriteHeaderOption

func (WriteHeaderOptions) TrimHeaders ¶

func (WriteHeaderOptions) TrimHeaders(b bool) WriteHeaderOption

type Writer ¶

type Writer struct {
	// contains filtered or unexported fields
}

func NewWriter ¶

func NewWriter(options ...WriterOption) (*Writer, error)

NewWriter creates a new instance of a CSV writer which is not safe for concurrent writes.

func (*Writer) Close ¶

func (w *Writer) Close() error

Close should be called after writing all rows successfully to the underlying writer.

Close currently always returns nil, but in the future it may not.

Should any configuration options require post-flight checks they will be implemented here.

It will never attempt to flush or close the underlying writer instance. That is left to the calling context.
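Since flushing is left to the caller, a buffered destination needs an explicit Flush before the data is visible downstream. The sketch below shows the effect with bufio.Writer alone; writeBuffered is a hypothetical helper and no csv Writer is involved.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// writeBuffered writes data through a bufio.Writer, optionally flushes it,
// and reports how many bytes actually reached the destination.
func writeBuffered(data string, flush bool) (int, error) {
	var dst bytes.Buffer
	bw := bufio.NewWriter(&dst)
	if _, err := bw.WriteString(data); err != nil {
		return 0, err
	}
	if flush {
		if err := bw.Flush(); err != nil {
			return 0, err
		}
	}
	return dst.Len(), nil
}

func main() {
	n, _ := writeBuffered("a,b\n", false)
	fmt.Println(n) // 0: the bytes are still held by the bufio.Writer

	n, _ = writeBuffered("a,b\n", true)
	fmt.Println(n) // 4
}
```

In practice this means calling Close on the csv Writer, then flushing and closing the underlying writer in the calling context.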

func (*Writer) MustNewRecord ¶ added in v3.3.0

func (w *Writer) MustNewRecord() *RecordWriter

MustNewRecord is like NewRecord but panics if a RecordWriter cannot be created.

It panics in two situations:

when the Writer is no longer usable because it has already observed an error or has been closed (I/O or lifecycle error), or
when a previous RecordWriter is still active and has not been finalized with Write or Rollback (programmer misuse).

This helper is intended for applications that choose to treat an unusable Writer (closed, already errored, or with an active RecordWriter) as a fatal condition. Callers that need to recover from these states should use NewRecord and handle its error result instead. When using MustNewRecord, the caller is responsible for only invoking it on Writer instances that have not yet observed an error and that do not currently have an active RecordWriter.

func (*Writer) NewRecord ¶ added in v3.3.0

func (w *Writer) NewRecord() (*RecordWriter, error)

NewRecord creates a new RecordWriter instance associated with the parent csv Writer instance.

The returned RecordWriter instance must have its lifecycle ended with a call to Write and/or Rollback.

Concurrent calls to NewRecord are not supported.

While the RecordWriter is active, the parent Writer instance is locked from additional writing until the RecordWriter's lifecycle is ended with a call to Write or Rollback.

If another RecordWriter is already active, NewRecord will return a nil RecordWriter and ErrWriterNotReady.

If the parent Writer instance is in an error state or closed, NewRecord will return a nil RecordWriter and the existing error.
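The locking rule can be pictured with a toy model. Everything in the sketch below (toyWriter, toyRecord, errNotReady, secondRecordAllowed) is hypothetical and exists only to illustrate the lifecycle; it does not reflect the package's internals.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotReady is a local stand-in for ErrWriterNotReady.
var errNotReady = errors.New("writer not ready")

// toyWriter and toyRecord are hypothetical types that model only the lock.
type toyWriter struct{ active bool }

type toyRecord struct {
	w    *toyWriter
	done bool
}

// newRecord fails while another record is still active.
func (w *toyWriter) newRecord() (*toyRecord, error) {
	if w.active {
		return nil, errNotReady
	}
	w.active = true
	return &toyRecord{w: w}, nil
}

// finish models both Write and Rollback: it releases the parent once
// and is harmless to call again.
func (r *toyRecord) finish() {
	if r.done {
		return
	}
	r.done = true
	r.w.active = false
}

// secondRecordAllowed reports whether a second record can be started.
func secondRecordAllowed(finishFirst bool) bool {
	w := &toyWriter{}
	first, _ := w.newRecord()
	if finishFirst {
		first.finish()
	}
	_, err := w.newRecord()
	return err == nil
}

func main() {
	fmt.Println(secondRecordAllowed(false)) // false: first record still active
	fmt.Println(secondRecordAllowed(true))  // true: parent was released
}
```

A common way to guarantee release is to defer Rollback right after a successful NewRecord, since Rollback is documented as safe to call after Write.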

func (*Writer) WriteFieldRow ¶ added in v3.0.1

func (w *Writer) WriteFieldRow(row ...FieldWriter) (int, error)

WriteFieldRow will take a vararg collection of FieldWriter instances and write them as a csv record row.

Each subsequent call to WriteRow, WriteFieldRow, or WriteFieldRowBorrowed should have the same slice length.

This call will copy the provided list of field writers to an internally maintained buffer for amortized access and removal of allocations due to the slice escaping.

If the calling context maintains a reused slice of field writers per write iteration then consider instead using WriteFieldRowBorrowed if performance testing indicates that FieldWriter slice copying is a major contributing bottleneck for your case.

func (*Writer) WriteFieldRowBorrowed ¶ added in v3.0.1

func (w *Writer) WriteFieldRowBorrowed(row []FieldWriter) (int, error)

WriteFieldRowBorrowed is similar to WriteFieldRow except the slice of field writers provided is expected to be externally maintained and reused. In such a case this function will be faster than WriteFieldRow, but it should only be used if performance testing indicates that the copying of field writers that occurs in WriteFieldRow is a bottleneck.

Each subsequent call to WriteRow, WriteFieldRow, or WriteFieldRowBorrowed should have the same slice length.

func (*Writer) WriteHeader ¶

func (w *Writer) WriteHeader(options ...WriteHeaderOption) (int, error)

func (*Writer) WriteRow ¶

func (w *Writer) WriteRow(row ...string) (int, error)

WriteRow writes a vararg collection of strings as a csv record row.

Each subsequent call to WriteRow, WriteFieldRow, or WriteFieldRowBorrowed should have the same slice length.

type WriterOption ¶

type WriterOption func(*wCfg)

type WriterOptions ¶

type WriterOptions struct{}

WriterOptions should never be instantiated manually.

Instead call WriterOpts()

This is only exported to allow godocs to discover the exported methods.

WriterOptions will never have exported members and the zero value is not part of the semver guarantee. Instantiate it incorrectly at your own peril.

Calling the function is a no-op that is compiled away anyway, so instantiating the type manually optimizes nothing. Use WriterOpts()!

func WriterOpts ¶

func WriterOpts() WriterOptions

func (WriterOptions) ClearFreedDataMemory ¶

func (WriterOptions) ClearFreedDataMemory(b bool) WriterOption

ClearFreedDataMemory ensures that whenever a shared memory buffer that contains data goes out of scope, zero values are written to every byte within the buffer.

This may significantly degrade performance and is recommended only for sensitive data or long-lived processes.
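The cost comes from touching every byte of a buffer before it is released. A minimal sketch of such a wipe is shown below; the helper wipe is hypothetical, and zeroing the full capacity rather than only the length is an assumption about what "every byte within the buffer" means.

```go
package main

import "fmt"

// wipe overwrites the whole backing array reachable from buf with zeros,
// including bytes between len(buf) and cap(buf).
func wipe(buf []byte) {
	full := buf[:cap(buf)]
	for i := range full {
		full[i] = 0
	}
}

func main() {
	b := []byte("secret")
	wipe(b)
	fmt.Println(b) // [0 0 0 0 0 0]
}
```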

func (WriterOptions) CommentRune ¶ added in v3.2.0

func (WriterOptions) CommentRune(r rune) WriterOption

CommentRune ensures that even if the WriteHeader function is not called, the output document is still parsable with the comment header enabled.

If you need comment parsing consistency and do not always call WriteHeader then use this option at this level instead of the WriteHeader option also named CommentRune.

func (WriterOptions) ErrorOnNonUTF8 ¶

func (WriterOptions) ErrorOnNonUTF8(v bool) WriterOption

func (WriterOptions) Escape ¶

func (WriterOptions) Escape(r rune) WriterOption

func (WriterOptions) FieldSeparator ¶

func (WriterOptions) FieldSeparator(v rune) WriterOption

func (WriterOptions) InitialFieldBuffer ¶ added in v3.0.1

func (WriterOptions) InitialFieldBuffer(v []byte) WriterOption

InitialFieldBuffer is deprecated and no longer has any effect.

Historically:

InitialFieldBuffer is a hint to pre-allocate field buffer space once externally and pipe it in to reduce the number of re-allocations when processing a writer and reuse it at a later time after the writer is closed.

This option should generally not be used. It only exists to assist with processing large numbers of CSV files should memory be a clear constraint. There is no guarantee this buffer will always be used till the end of the csv Writer's lifecycle.

Please consider this to be a micro optimization in most circumstances just because it tightens the usage contract of the csv Writer in ways most would not normally consider.

func (WriterOptions) InitialFieldBufferSize ¶ added in v3.0.1

func (WriterOptions) InitialFieldBufferSize(v int) WriterOption

InitialFieldBufferSize is deprecated and no longer has any effect.

Historically:

InitialFieldBufferSize is a hint to pre-allocate field buffer space once and reduce the number of re-allocations when processing fields to write.

Please consider this to be a micro optimization in most circumstances just because it's not likely that most users will know the maximum total field size they wish to target / be under and it's generally a better practice to leave these details to the go runtime to coordinate via standard garbage collection.

func (WriterOptions) InitialRecordBuffer ¶ added in v3.0.1

func (WriterOptions) InitialRecordBuffer(v []byte) WriterOption

InitialRecordBuffer is a hint to pre-allocate record buffer space once externally and pipe it in to reduce the number of re-allocations when processing a writer and reuse it at a later time after the writer is closed.

This option should generally not be used. It only exists to assist with processing large numbers of CSV files should memory be a clear constraint. There is no guarantee this buffer will always be used till the end of the csv Writer's lifecycle.

Please consider this to be a micro optimization in most circumstances just because it tightens the usage contract of the csv Writer in ways most would not normally consider.

func (WriterOptions) InitialRecordBufferSize ¶ added in v3.0.1

func (WriterOptions) InitialRecordBufferSize(v int) WriterOption

InitialRecordBufferSize is a hint to pre-allocate record buffer space once and reduce the number of re-allocations when processing fields to write.

Please consider this to be a micro optimization in most circumstances just because it's not likely that most users will know the maximum total record size they wish to target / be under and it's generally a better practice to leave these details to the go runtime to coordinate via standard garbage collection.
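The effect of a size hint can be observed with a plain byte slice. The sketch below counts how often append has to grow the backing array with and without pre-allocated capacity; countGrowths is a hypothetical helper and says nothing about how the Writer manages its record buffer internally.

```go
package main

import "fmt"

// countGrowths appends total bytes to a slice that starts with initialCap
// capacity and counts how many times the backing array is reallocated.
func countGrowths(initialCap, total int) int {
	buf := make([]byte, 0, initialCap)
	growths := 0
	for i := 0; i < total; i++ {
		before := cap(buf)
		buf = append(buf, 'x')
		if cap(buf) != before {
			growths++
		}
	}
	return growths
}

func main() {
	fmt.Println(countGrowths(1024, 1024)) // 0: the hint covered the whole record
	fmt.Println(countGrowths(0, 1024) > 0) // true: the slice had to grow
}
```

The exact number of growths without a hint depends on the Go runtime's growth policy, which is one reason the hint is best treated as a micro optimization.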

func (WriterOptions) NumFields ¶

func (WriterOptions) NumFields(v int) WriterOption

func (WriterOptions) Quote ¶

func (WriterOptions) Quote(v rune) WriterOption

func (WriterOptions) RecordSeparator ¶

func (WriterOptions) RecordSeparator(s string) WriterOption

func (WriterOptions) Writer ¶

func (WriterOptions) Writer(v io.Writer) WriterOption

Directories ¶

Path	Synopsis
internal
cmd/generate command
examples/reader command
examples/writer command
