
YAML

To install:

npm install yaml
# or
yarn add yaml

yaml is a definitive library for YAML, the human-friendly data serialization standard.

The library is released under the ISC open source license, and the code is available on GitHub. It runs on Node.js 6 and later with no external dependencies, and in browsers from IE 11 upwards (Note: @babel/runtime is used only by the "browser" entry point).

API Overview

The API provided by yaml has three layers, depending on how deep you need to go: Parse & Stringify, Documents, and the CST Parser. The first has the simplest API and "just works", the second gets you all the bells and whistles supported by the library along with a decent AST, and the third is the closest to YAML source, making it fast, raw, and crude.

Parse & Stringify

import YAML from 'yaml'
// or
const YAML = require('yaml')

Documents

import { Pair, YAMLMap, YAMLSeq } from 'yaml/types'

CST Parser

import parseCST from 'yaml/parse-cst'

Parse & Stringify

# file.yml
YAML:
  - A human-readable data serialization language
  - https://en.wikipedia.org/wiki/YAML
yaml:
  - A complete JavaScript implementation
  - https://www.npmjs.com/package/yaml

At its simplest, you can use YAML.parse(str) and YAML.stringify(value) just as you'd use JSON.parse(str) and JSON.stringify(value). If that's enough for you, everything else in these docs is really just implementation details.

YAML.parse

import fs from 'fs'
import YAML from 'yaml'

YAML.parse('3.14159')
// 3.14159

YAML.parse('[ true, false, maybe, null ]\n')
// [ true, false, 'maybe', null ]

const file = fs.readFileSync('./file.yml', 'utf8')
YAML.parse(file)
// { YAML:
//   [ 'A human-readable data serialization language',
//     'https://en.wikipedia.org/wiki/YAML' ],
//   yaml:
//   [ 'A complete JavaScript implementation',
//     'https://www.npmjs.com/package/yaml' ] }

YAML.parse(str, options = {}): any

str should be a string with YAML formatting. See Options for more information on the second parameter, an optional configuration object.

The returned value will match the type of the root value of the parsed YAML document, so Maps become objects, Sequences arrays, and scalars result in nulls, booleans, numbers and strings.

YAML.parse may throw on error, and it may log warnings using console.warn. It only supports input consisting of a single YAML document; for multi-document support you should use YAML.parseAllDocuments.

YAML.stringify

YAML.stringify(3.14159)
// '3.14159\n'

YAML.stringify([true, false, 'maybe', null])
// `- true
// - false
// - maybe
// - null
// `

YAML.stringify({ number: 3, plain: 'string', block: 'two\nlines\n' })
// `number: 3
// plain: string
// block: >
//   two
//
//   lines
// `

YAML.stringify(value, options = {}): string

value can be of any type. The returned string will always include \n as the last character, as is expected of YAML documents. See Options for more information on the second parameter, an optional configuration object.

As strings in particular may be represented in a number of different styles, the simplest option for the value in question will always be chosen, depending mostly on the presence of escaped or control characters and leading & trailing whitespace.

To create a stream of documents, you may call YAML.stringify separately for each document's value, and concatenate the documents with the string ...\n as a separator.

Options

YAML.defaultOptions
// { keepBlobsInJSON: true,
//   keepNodeTypes: true,
//   version: '1.2' }

YAML.Document.defaults
// { '1.0': { merge: true, schema: 'yaml-1.1' },
//   '1.1': { merge: true, schema: 'yaml-1.1' },
//   '1.2': { merge: false, schema: 'core' } }


yaml defines options in three places: as an argument of parse, create and stringify calls, in the values of YAML.defaultOptions, and in the version-dependent YAML.Document.defaults object. Values set in YAML.defaultOptions override version-dependent defaults, and argument options override both.

The version option value ('1.2' by default) may be overridden by any document-specific %YAML directive.

Option Type Description
anchorPrefix string Default prefix for anchors. By default 'a', resulting in anchors a1, a2, etc.
customTags Tag[] | function Array of additional (custom) tags to include in the schema
keepBlobsInJSON boolean Allow non-JSON JavaScript objects to remain in the toJSON output. Relevant with the YAML 1.1 !!timestamp and !!binary tags. By default true.
keepCstNodes boolean Include references in the AST to each node's corresponding CST node. By default false.
keepNodeTypes boolean Store the original node type when parsing documents. By default true.
mapAsMap boolean When outputting JS, use Map rather than Object to represent mappings. By default false.
maxAliasCount number Prevent exponential entity expansion attacks by limiting data aliasing count; set to -1 to disable checks; 0 disallows all alias nodes. By default 100.
merge boolean Enable support for << merge keys. By default false for YAML 1.2 and true for earlier versions.
prettyErrors boolean Include line position & node type directly in errors; drop their verbose source and context. By default false.
schema 'core' | 'failsafe' | 'json' | 'yaml-1.1' The base schema to use. By default 'core' for YAML 1.2 and 'yaml-1.1' for earlier versions.
version string The YAML version used by documents without a %YAML directive. By default '1.2'.

Data Schemas

YAML.parse('3') // 3
YAML.parse('3', { schema: 'failsafe' }) // '3'

YAML.parse('No') // 'No'
YAML.parse('No', { schema: 'json' }) // SyntaxError: Unresolved plain scalar "No"
YAML.parse('No', { schema: 'yaml-1.1' }) // false
YAML.parse('No', { version: '1.1' }) // false

YAML.parse('{[1, 2]: many}') // { '[1,2]': 'many' }
YAML.parse('{[1, 2]: many}', { mapAsMap: true }) // Map { [ 1, 2 ] => 'many' }

Aside from defining the language structure, the YAML 1.2 spec defines a number of different schemas that may be used. The default is the core schema, which is the most common one. The json schema is effectively the minimum schema required to parse JSON; both it and the core schema are supersets of the minimal failsafe schema.

The yaml-1.1 schema matches the more liberal YAML 1.1 types (also used by YAML 1.0), including binary data and timestamps as distinct tags as well as accepting greater variance in scalar values (with e.g. 'No' being parsed as false rather than a string value). The !!value and !!yaml types are not supported.

YAML.defaultOptions.merge = true

const mergeResult = YAML.parse(`
source: &base { a: 1, b: 2 }
target:
  <<: *base
  b: base
`)

mergeResult.target
// { a: 1, b: 'base' }

Merge keys are a YAML 1.1 feature that is not a part of the 1.2 spec. To use a merge key, assign an alias node or an array of alias nodes as the value of a << key in a mapping.

Tag Stringifier Options

import { binaryOptions, boolOptions, nullOptions, strOptions } from 'yaml/types'

binaryOptions // Used by !!binary, part of the yaml-1.1 schema
// { defaultType: 'BLOCK_LITERAL', lineWidth: 76 }

boolOptions
// { trueStr: 'true', falseStr: 'false' }

nullOptions
// { nullStr: 'null' }

strOptions
// { defaultType: 'PLAIN',
//   doubleQuoted: { jsonEncoding: false, minMultiLineLength: 40 },
//   fold: { lineWidth: 80, minContentWidth: 20 } }

YAML.stringify({ 'this is': null })
// this is: null

nullOptions.nullStr = '~'
strOptions.defaultType = 'QUOTE_SINGLE'
YAML.stringify({ 'this is': null })
// 'this is': ~

To customise the YAML stringification, some options objects are exported from 'yaml/types'. Note that these values are used by all documents. For example, to disable the automatic line wrapping, set strOptions.fold.lineWidth to 0.

Documents

In order to work with YAML features not directly supported by native JavaScript data types, such as comments, anchors and aliases, yaml provides the YAML.Document API.

Parsing Documents

import fs from 'fs'
import YAML from 'yaml'

const file = fs.readFileSync('./file.yml', 'utf8')
const doc = YAML.parseDocument(file)
doc.contents
// YAMLMap {
//   items:
//    [ Pair {
//        key: Scalar { value: 'YAML', range: [ 0, 4 ] },
//        value:
//         YAMLSeq {
//           items:
//            [ Scalar {
//                value: 'A human-readable data serialization language',
//                range: [ 10, 55 ] },
//              Scalar {
//                value: 'https://en.wikipedia.org/wiki/YAML',
//                range: [ 59, 94 ] } ],
//           tag: 'tag:yaml.org,2002:seq',
//           range: [ 8, 94 ] } },
//      Pair {
//        key: Scalar { value: 'yaml', range: [ 94, 98 ] },
//        value:
//         YAMLSeq {
//           items:
//            [ Scalar {
//                value: 'A complete JavaScript implementation',
//                range: [ 104, 141 ] },
//              Scalar {
//                value: 'https://www.npmjs.com/package/yaml',
//                range: [ 145, 180 ] } ],
//           tag: 'tag:yaml.org,2002:seq',
//           range: [ 102, 180 ] } } ],
//   tag: 'tag:yaml.org,2002:map',
//   range: [ 0, 180 ] }

YAML.parseDocument(str, options = {}): YAML.Document

Parses a single YAML.Document from the input str; used internally by YAML.parse. Will include an error if str contains more than one document. See Options for more information on the second parameter.


YAML.parseAllDocuments(str, options = {}): YAML.Document[]

When parsing YAML, the input string str may consist of a stream of documents separated from each other by ... document end marker lines. YAML.parseAllDocuments will return an array of Document objects that allow these documents to be parsed and manipulated with more control. See Options for more information on the second parameter.


These functions should never throw; errors and warnings are included in the documents' errors and warnings arrays. In particular, if errors is not empty it's likely that the document's parsed contents are not entirely correct.

The contents of a parsed document will always consist of Scalar, Map, Seq or null values.

Creating Documents

new YAML.Document(options = {})

Member Type Description
anchors Anchors Anchors associated with the document's nodes; also provides alias & merge node creators.
commentBefore string? A comment at the very beginning of the document. If not empty, separated from the rest of the document by a blank line when stringified.
comment string? A comment at the end of the document. If not empty, separated from the rest of the document by a blank line when stringified.
contents Node|any The document contents.
errors Error[] Errors encountered during parsing.
schema Schema The schema used with the document.
tagPrefixes Prefix[] Array of prefixes; each will have a string handle that starts and ends with ! and a string prefix that the handle will be replaced by.
version string? The parsed version of the source document; if true-ish, stringified output will include a %YAML directive.
warnings Error[] Warnings encountered during parsing.
const doc = new YAML.Document()
doc.version = true
doc.commentBefore = ' A commented document'
doc.contents = ['some', 'values', { balloons: 99 }]

String(doc)
// # A commented document
// %YAML 1.2
// ---
// - some
// - values
// - balloons: 99

The Document members are all modifiable, though it's unlikely that you'll have reason to change errors, schema or warnings. In particular you may be interested in both reading and writing contents. Although YAML.parseDocument() and YAML.parseAllDocuments() will leave it with Map, Seq, Scalar or null contents, it can be set to anything.

During stringification, a document with a true-ish version value will include a %YAML directive; the version number will be set to 1.2 unless the yaml-1.1 schema is in use.

Document Methods

Method Returns Description
listNonDefaultTags() string[] List the tags used in the document that are not in the default tag:yaml.org,2002: namespace.
parse(cst) Document Parse a CST into this document. Mostly an internal method, modifying the document according to the contents of the parsed cst. Calling this multiple times on a Document is not recommended.
setSchema() void When a document is created with new YAML.Document(), the schema object is not set as it may be influenced by parsed directives; call this to set it manually.
setTagPrefix(handle, prefix) void Set handle as a shorthand string for the prefix tag namespace.
toJSON() any A plain JavaScript representation of the document contents.
toString() string A YAML representation of the document.
const doc = YAML.parseDocument('a: 1\nb: [2, 3]\n')
doc.get('a') // 1
doc.getIn([]) // YAMLMap { items: [Pair, Pair], ... }
doc.hasIn(['b', 0]) // true
doc.addIn(['b'], 4) // -> doc.get('b').items.length === 3
doc.deleteIn(['b', 1]) // true
doc.getIn(['b', 1]) // 4

In addition to the above, the document object also provides the same accessor methods as collections, based on the top-level collection: add, delete, get, has, and set, along with their deeper variants addIn, deleteIn, getIn, hasIn, and setIn. For the *In methods, using an empty path value (i.e. null, undefined, or []) will refer to the document's top-level contents.

To define a tag prefix to use when stringifying, use setTagPrefix(handle, prefix) rather than setting a value directly in tagPrefixes. This will guarantee that the handle is valid (by throwing an error), and will overwrite any previous definition for the handle. Use an empty prefix value to remove a prefix.

const src = '1969-07-21T02:56:15Z'
const doc = YAML.parseDocument(src, { customTags: ['timestamp'] })

doc.toJSON()
// Date { 1969-07-21T02:56:15.000Z }

doc.options.keepBlobsInJSON = false
doc.toJSON()
// '1969-07-21T02:56:15.000Z'

String(doc)
// '1969-07-21T02:56:15\n'

For a plain JavaScript representation of the document, toJSON() is your friend. By default the values wrapped in scalar nodes will not be forced to JSON, so e.g. a !!timestamp will remain a Date in the output. To change this behaviour and enforce JSON values only, set the keepBlobsInJSON option to false.

Conversely, to stringify a document as YAML, use toString(). This will also be called by String(doc). This method will throw if the errors array is not empty.

Working with Anchors

A description of alias and merge nodes is included in the next section.


YAML.Document#anchors

Method Returns Description
createAlias(node: Node, name?: string) Alias Create a new Alias node, adding the required anchor for node. If name is empty, a new anchor name will be generated.
createMergePair(...Node) Merge Create a new Merge node with the given source nodes. Non-Alias sources will be automatically wrapped.
getName(node: Node) string? The anchor name associated with node, if set.
getNode(name: string) Node? The node associated with the anchor name, if set.
newName(prefix: string) string Find an available anchor name with the given prefix and a numerical suffix.
setAnchor(node: Node, name?: string) string? Associate an anchor with node. If name is empty, a new name will be generated.
const src = '[{ a: A }, { b: B }]'
const doc = YAML.parseDocument(src)
const { anchors, contents } = doc
const [a, b] = contents.items
anchors.setAnchor(a.items[0].value) // 'a1'
anchors.setAnchor(b.items[0].value) // 'a2'
anchors.setAnchor(null, 'a1') // 'a1'
anchors.getName(a) // undefined
anchors.getNode('a2')
// { value: 'B', range: [ 16, 18 ], type: 'PLAIN' }
String(doc)
// [ { a: A }, { b: &a2 B } ]

const alias = anchors.createAlias(a, 'AA')
contents.items.push(alias)
doc.toJSON()
// [ { a: 'A' }, { b: 'B' }, { a: 'A' } ]
String(doc)
// [ &AA { a: A }, { b: &a2 B }, *AA ]

const merge = anchors.createMergePair(alias)
b.items.push(merge)
doc.toJSON()
// [ { a: 'A' }, { b: 'B', a: 'A' }, { a: 'A' } ]
String(doc)
// [ &AA { a: A }, { b: &a2 B, <<: *AA }, *AA ]

// This creates a circular reference
merge.value.items.push(anchors.createAlias(b))
doc.toJSON() // [RangeError: Maximum call stack size exceeded]
String(doc)
// [
//   &AA { a: A },
//   &a3 {
//       b: &a2 B,
//       <<:
//         [ *AA, *a3 ]
//     },
//   *AA
// ]

The constructors for Alias and Merge are not directly exported by the library, as they depend on the document's anchors; instead you'll need to use createAlias(node, name) and createMergePair(...sources). You should make sure to only add alias and merge nodes to the document after the nodes to which they refer, or the document's YAML stringification will fail.

It is valid to have an anchor associated with a node even if it has no aliases. yaml will not allow you to associate the same name with more than one node, even though this is allowed by the YAML spec (all but the last instance will have numerical suffixes added). To add or reassign an anchor, use setAnchor(node, name). The second parameter is optional, and if left out either the pre-existing anchor name of the node will be used, or a new one generated. To remove an anchor, use setAnchor(null, name). The function will return the new anchor's name, or null if both of its arguments are null.

While the merge option needs to be true to parse Merge nodes as such, this is not required during stringification.

Content Nodes

After parsing, the contents value of each YAML.Document is the root of an Abstract Syntax Tree of nodes representing the document (or null for an empty document).

Scalar Values

class Node {
  comment: ?string,   // a comment on or immediately after this
  commentBefore: ?string, // a comment before this
  range: ?[number, number],
      // the [start, end] range of characters of the source parsed
      // into this node (undefined for pairs or if not parsed)
  spaceBefore: ?boolean,
      // a blank line before this node and its commentBefore
  tag: ?string,       // a fully qualified tag, if required
  toJSON(): any       // a plain JS representation of this node
}

For scalar values, the tag will not be set unless it was explicitly defined in the source document; this also applies for unsupported tags that have been resolved using a fallback tag (string, Map, or Seq).

class Scalar extends Node {
  format: 'BIN' | 'HEX' | 'OCT' | 'TIME' | undefined,
      // By default (undefined), numbers use decimal notation.
      // The YAML 1.2 core schema only supports 'HEX' and 'OCT'.
  type:
    'BLOCK_FOLDED' | 'BLOCK_LITERAL' | 'PLAIN' |
    'QUOTE_DOUBLE' | 'QUOTE_SINGLE' | undefined,
  value: any
}

A parsed document's contents will have all of its non-object values wrapped in Scalar objects, which themselves may be in some hierarchy of Map and Seq collections. However, this is not a requirement for the document's stringification, which is rather tolerant regarding its input values, and will use YAML.createNode when encountering an unwrapped value.

When stringifying, the node type will be taken into account by !!str and !!binary values, and ignored by other scalars. On the other hand, !!int and !!float stringifiers will take format into account.

Collections

class Pair extends Node {
  key: Node | any,    // key and value are always Node or null
  value: Node | any,  // when parsed, but can be set to anything
  type: 'PAIR'
}

class Map extends Node {
  items: Array<Pair>,
  type: 'FLOW_MAP' | 'MAP' | undefined
}

class Seq extends Node {
  items: Array<Node | any>,
  type: 'FLOW_SEQ' | 'SEQ' | undefined
}

Within all YAML documents, two forms of collections are supported: sequential Seq collections and key-value Map collections. The JavaScript representations of both have an items array: a Map's items must consist of Pair objects, each containing a key and a value of any type (including null), while the items array of a Seq may contain values of any type, including Pair.

When stringifying collections, by default block notation will be used. Flow notation will be selected if type is FLOW_MAP or FLOW_SEQ, the collection is within a surrounding flow collection, or if the collection is in an implicit key.

The yaml-1.1 schema includes additional collections that are based on Map and Seq: OMap and Pairs are sequences of Pair objects (OMap requires unique keys & corresponds to the JS Map object), and Set is a map of keys with null values that corresponds to the JS Set object.

All of the collections provide the following accessor methods:

Method Returns Description
add(value) void Adds a value to the collection. For !!map and !!omap the value must be a Pair instance or a { key, value } object, which may not have a key that already exists in the map.
delete(key) boolean Removes a value from the collection. Returns true if the item was found and removed.
get(key, [keepScalar]) any Returns item at key, or undefined if not found. By default unwraps scalar values from their surrounding node; to disable set keepScalar to true (collections are always returned intact).
has(key) boolean Checks if the collection includes a value with the key key.
set(key, value) any Sets a value in this collection. For !!set, value needs to be a boolean to add/remove the item from the set.
const map = YAML.createNode({ a: 1, b: [2, 3] })
map.add({ key: 'c', value: 4 })
  // => map.get('c') === 4 && map.has('c') === true
map.addIn(['b'], 5) // -> map.getIn(['b', 2]) === 5
map.delete('c') // true
map.deleteIn(['c', 'f']) // false
map.get('a') // 1
map.get(YAML.createNode('a'), true) // Scalar { value: 1 }
map.getIn(['b', 1]) // 3
map.has('c') // false
map.hasIn(['b', '0']) // true
map.set('c', null)
  // => map.get('c') === null && map.has('c') === true
map.setIn(['c', 'x'])
  // throws Error:
  // Expected YAML collection at c. Remaining path: x

For all of these methods, the keys may be nodes or their wrapped scalar values (i.e. 42 will match Scalar { value: 42 }). Keys for !!seq should be non-negative integers, or their string representations. add() and set() do not automatically call createNode() to wrap the value.

Each of the methods also has a variant that requires an iterable as the first parameter, and allows fetching or modifying deeper collections: addIn(path, value), deleteIn(path), getIn(path, keepScalar), hasIn(path), setIn(path, value). getIn and hasIn will return undefined or false (respectively) if any of the intermediate collections is not found or if the key path attempts to extend within a scalar value, but the others will throw an error in such cases. Note that for addIn the path argument points to the collection rather than the item.

Alias Nodes

class Alias extends Node {
  source: Scalar | Map | Seq,
  type: 'ALIAS'
}

const obj = YAML.parse('[ &x { X: 42 }, Y, *x ]')
  // => [ { X: 42 }, 'Y', { X: 42 } ]
obj[2].Z = 13
  // => [ { X: 42, Z: 13 }, 'Y', { X: 42, Z: 13 } ]
YAML.stringify(obj)
  // - &a1
  //   X: 42
  //   Z: 13
  // - Y
  // - *a1

Alias nodes provide a way to include a single node in multiple places in a document; the source of an alias node must be a preceding node in the document. Circular references are fully supported, and where possible the JS representation of alias nodes will be the actual source object.

When directly stringifying JS structures with YAML.stringify(), multiple references to the same object will result in including an autogenerated anchor at its first instance, and alias nodes to that anchor at later references. Directly calling YAML.createNode() will not create anchors or alias nodes, allowing for greater manual control.

class Merge extends Pair {
  key: Scalar('<<'),      // defined by the type specification
  value: Seq<Alias(Map)>, // stringified as *A if length = 1
  type: 'MERGE_PAIR'
}

Merge nodes are not a core YAML 1.2 feature, but are defined as a YAML 1.1 type. They are only valid directly within a Map#items array and must contain one or more Alias nodes that themselves refer to Map nodes. When the surrounding map is resolved as a plain JS object, the key-value pairs of the aliased maps will be included in the object. Earlier Alias nodes override later ones, as do values set in the object directly.

To create and work with alias and merge nodes, you should use the YAML.Document#anchors object.

Creating Nodes

const seq = YAML.createNode(['some', 'values', { balloons: 99 }])
// YAMLSeq {
//   items:
//    [ Scalar { value: 'some' },
//      Scalar { value: 'values' },
//      YAMLMap {
//        items:
//         [ Pair {
//             key: Scalar { value: 'balloons' },
//             value: Scalar { value: 99 } } ] } ] }

const doc = new YAML.Document()
doc.contents = seq
seq.items[0].comment = ' A commented item'
String(doc)
// - some # A commented item
// - values
// - balloons: 99

YAML.createNode(value, wrapScalars?, tag?): Node

YAML.createNode recursively turns objects into collections. Generic objects as well as Map and its descendants become mappings, while arrays and other iterable objects result in sequences. If wrapScalars is undefined or true, it also wraps plain values in Scalar objects; if it is false and value is not an object, it will be returned directly.

To specify the collection type, set tag to its identifying string, e.g. "!!omap". Note that this requires the corresponding tag to be available based on the default options. To use a specific document's schema, use the wrapped method doc.schema.createNode(value, wrapScalars, tag).

The primary purpose of this function is to enable attaching comments or other metadata to a value, or to otherwise exert more fine-grained control over the stringified output. To that end, you'll need to assign its return value to the contents of a Document (or somewhere within said contents), as the document's schema is required for YAML string output.

new Map(), new Seq(), new Pair(key, value)

import YAML from 'yaml'
import { Pair, YAMLSeq } from 'yaml/types'

const doc = new YAML.Document()
doc.contents = new YAMLSeq()
doc.contents.items = [
  'some values',
  42,
  { including: 'objects', 3: 'a string' }
]
doc.contents.items.push(new Pair(1, 'a number'))

doc.toString()
// - some values
// - 42
// - "3": a string
//   including: objects
// - 1: a number

To construct a YAMLSeq or YAMLMap, use YAML.createNode() with array, object or iterable input, or create the collections directly by importing the classes from yaml/types.

Once created, normal array operations may be used to modify the items array. New Pair objects may be created by importing the class from yaml/types and using its new Pair(key, value) constructor.

Comments

const doc = YAML.parseDocument(`
# This is YAML.
---
it has:
  - an array
  - of values
`)

doc.toJSON()
// { 'it has': [ 'an array', 'of values' ] }

doc.commentBefore
// ' This is YAML.'

const seq = doc.contents.items[0].value
seq.items[0].comment = ' item comment'
seq.comment = ' collection end comment'

doc.toString()
// # This is YAML.
//
// it has:
//   - an array # item comment
//   - of values
//   # collection end comment

A primary differentiator between this and other YAML libraries is the ability to programmatically handle comments, which according to the spec "must not have any effect on the serialization tree or representation graph. In particular, comments are not associated with a particular node."

This library does allow comments to be handled programmatically, and does attach them to particular nodes (most often, the following node). Each Scalar, Map, Seq and the Document itself has comment and commentBefore members that may be set to a stringifiable value.

The string contents of comments are not processed by the library, except for merging adjacent comment lines together and prefixing each line with the # comment indicator. Document comments will be separated from the rest of the document by a blank line.

Note: Due to implementation details, the library's comment handling is not completely stable. In particular, when creating, writing, and then reading a YAML file, comments may sometimes be associated with a different node.

Blank Lines

const doc = YAML.parseDocument('[ one, two, three ]')

doc.contents.items[0].comment = ' item comment'
doc.contents.items[1].spaceBefore = true
doc.comment = ' document end comment'

doc.toString()
// [
//   one, # item comment
//
//   two,
//   three
// ]
//
// # document end comment

Similarly to comments, the YAML spec instructs non-content blank lines to be discarded. Instead of doing that, yaml provides a spaceBefore boolean property for each node. If true, the node (and its commentBefore, if any) will be separated from the preceding node by a blank line.

Note that scalar block values with "keep" chomping (i.e. with + in their header) consider any trailing empty lines to be a part of their content, so the spaceBefore setting of a node following such a value is ignored.

Custom Data Types

YAML.parse('!!timestamp 2001-12-15 2:59:43')
// YAMLWarning:
//   The tag tag:yaml.org,2002:timestamp is unavailable,
//   falling back to tag:yaml.org,2002:str
// '2001-12-15 2:59:43'

YAML.defaultOptions.customTags = ['timestamp']

YAML.parse('2001-12-15 2:59:43') // returns a Date instance
// 2001-12-15T02:59:43.000Z

const doc = YAML.parseDocument('2001-12-15 2:59:43')
doc.contents.value.toDateString()
// 'Sat Dec 15 2001'

The easiest way to extend a schema is by defining the additional tags that you wish to support. To do that, the customTags option allows you to provide an array of custom tag objects or tag identifiers. In particular, the built-in tags that are a part of the yaml-1.1 schema but not the default core schema may be referred to by their string identifiers.

For further customisation, customTags may also be a function (Tag[]) => (Tag[]) that may modify the schema's base tag array.

Built-in Custom Tags

Identifier YAML Type JS Type Description
'binary' !!binary Uint8Array Binary data, represented in YAML as base64 encoded characters.
'floatTime' !!float Number Sexagesimal floating-point number format, e.g. 190:20:30.15. To stringify with this tag, the node format must be 'TIME'.
'intTime' !!int Number Sexagesimal integer number format, e.g. 190:20:30. To stringify with this tag, the node format must be 'TIME'.
'omap' !!omap Map Ordered sequence of key: value pairs without duplicates. Using mapAsMap: true together with this tag is not recommended, as it makes the parse → stringify loop non-idempotent.
'pairs' !!pairs Array Ordered sequence of key: value pairs allowing duplicates. To create from JS, you'll need to explicitly use '!!pairs' as the third argument of createNode().
'set' !!set Set Unordered set of non-equal values.
'timestamp' !!timestamp Date A point in time.

Writing Custom Tags

import { stringifyString } from 'yaml/util'

const regexp = {
  identify: value => value instanceof RegExp,
  tag: '!re',
  resolve(doc, cst) {
    const match = cst.strValue.match(/^\/([\s\S]+)\/([gimuy]*)$/)
    return new RegExp(match[1], match[2])
  }
}

const sharedSymbol = {
  identify: value => value.constructor === Symbol,
  tag: '!symbol/shared',
  resolve: (doc, cst) => Symbol.for(cst.strValue),
  stringify(item, ctx, onComment, onChompKeep) {
    const key = Symbol.keyFor(item.value)
    if (key === undefined) throw new Error('Only shared symbols are supported')
    return stringifyString({ value: key }, ctx, onComment, onChompKeep)
  }
}

YAML.defaultOptions.customTags = [regexp, sharedSymbol]

YAML.stringify({
  regexp: /foo/gi,
  symbol: Symbol.for('bar')
})
// regexp: !re /foo/gi
// symbol: !symbol/shared bar

In YAML-speak, a custom data type is represented by a tag. To define your own tag, you need to account for the ways that your data is both parsed and stringified. Furthermore, both of those processes are split into two stages by the intermediate AST node structure.

If you wish to implement your own custom tags, the !!binary and !!set tags provide relatively cohesive examples to study in addition to the simple examples in the sidebar here.

Parsing Custom Data

At the lowest level, YAML.parseCST() will take care of turning string input into a concrete syntax tree (CST). In the CST all scalar values are available as strings, and maps & sequences as collections of nodes. Each schema includes a set of default data types, which handle converting at least strings, maps and sequences into their AST nodes. These are considered to have implicit tags, and are autodetected. Custom tags, on the other hand, should almost always define an explicit tag with which their value will be prefixed. This may be an application-specific local !tag, a shorthand !ns!tag, or a verbatim !<tag:example.com,2019:tag>.

Once identified by matching the tag, the resolve(doc, cstNode): Node | any function will turn a CST node into an AST node. For scalars this is relatively simple, as the stringified node value is directly available and should be converted to its actual value. Collections are trickier: you will almost certainly want to use the parseMap(doc, cstNode) and parseSeq(doc, cstNode) functions exported from 'yaml/util' to first resolve the CST collection into a YAMLMap or YAMLSeq object, and to work with that instead -- this is, for instance, what the YAML 1.1 collections do.
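For instance, a scalar resolve() only needs the CST node's strValue, so its core logic can be sketched and exercised in plain JS. The !date tag and its accepted format here are hypothetical, not part of the library:

```javascript
// Hypothetical !date tag: resolve() turns the CST node's string
// value into a JS Date. The logic is plain JS, so it can be
// exercised directly with a stub CST node.
const date = {
  identify: value => value instanceof Date,
  tag: '!date',
  resolve(doc, cst) {
    const match = cst.strValue.match(/^(\d{4})-(\d{2})-(\d{2})$/)
    if (!match) throw new Error(`Invalid !date value: ${cst.strValue}`)
    const [, year, month, day] = match
    return new Date(Date.UTC(+year, +month - 1, +day))
  }
}

// Exercising resolve() with a stub node, outside the parser:
const d = date.resolve(null, { strValue: '2019-03-01' })
d.toISOString() // '2019-03-01T00:00:00.000Z'
```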

Note that during CST -> AST parsing, the anchors and comments attached to each node are also resolved. This metadata is lost when the values are converted to plain JS objects, so custom collections should extend one of the existing collection classes. They should then either fall back to their parent class's toJSON() method, or define their own, so that their contents can be expressed as the appropriate JS object.

Creating Nodes and Stringifying Custom Data

As with parsing, turning input data into its YAML string representation is a two-stage process: the input is first turned into an AST, which is then stringified. This allows metadata and comments to be attached to each node, and e.g. circular references to be resolved. For scalar values, this means just wrapping the value within a Scalar class while keeping it unchanged.

As values may be wrapped within objects and arrays, YAML.createNode() uses each tag's identify(value): boolean function to detect custom data types. For the same reason, collections need to define their own createNode(schema, value, ctx): Collection functions that may recursively construct their equivalent collection class instances.

Finally, stringify(item, ctx, ...): string defines how your data should be represented as a YAML string, in case the default stringifiers aren't enough. For collections in particular, the default stringifier should be perfectly sufficient. 'yaml/util' exports stringifyNumber(item) and stringifyString(item, ctx, ...), which may be of use for custom scalar data.

Custom Tag API

import {
  findPair, // (items, key) => Pair? -- Given a key, find a matching Pair
  parseMap, // (doc, cstNode) => new YAMLMap
  parseSeq, // (doc, cstNode) => new YAMLSeq
  stringifyNumber, // (node) => string
  stringifyString, // (node, ctx, ...) => string
  toJSON, // (value, arg, ctx) => any -- Recursively convert to plain JS
  Type, // { [string]: string } -- Used as enum for node types
  YAMLReferenceError, YAMLSemanticError, YAMLSyntaxError, YAMLWarning
} from 'yaml/util'

To define your own tag, you'll need to define an object comprising some of the following fields. Those in bold are required:

CST Parser

For ease of implementation and to provide better error handling and reporting, the lowest level of the library's parser turns any input string into a Concrete Syntax Tree of nodes, treating the input as if it were YAML. This level of the API has not been designed to be particularly user-friendly, but it is fast, robust, and not dependent on the rest of the library.

parseCST

import parseCST from 'yaml/parse-cst'

const cst = parseCST(`
sequence: [ one, two, ]
mapping: { sky: blue, sea: green }
---
-
  "flow in block"
- >
 Block scalar
- !!map # Block collection
  foo : bar
`)

cst[0]            // first document, containing a map with two keys
  .contents[0]    // document contents (as opposed to directives)
  .items[3].node  // the last item, a flow map
  .items[3]       // the fourth token, parsed as a plain value
  .strValue       // 'blue'

cst[1]            // second document, containing a sequence
  .contents[0]    // document contents (as opposed to directives)
  .items[1].node  // the second item, a block value
  .strValue       // 'Block scalar\n'

parseCST(string): CSTDocument[]

YAML.parseCST(string): CSTDocument[]

The CST parser will not produce a CST that is necessarily valid YAML, and in particular its representation of collections of items is expected to undergo further processing and validation. The parser should never throw errors, but may include them as a value of the relevant node. On the other hand, if you feed it garbage, you'll likely get a garbage CST as well.

The public API of the CST layer is a single function which returns an array of parsed CST documents. The array and its contained nodes override the default toString method, each returning a YAML string representation of its contents. The same function is exported as a part of the default YAML object, as well as separately at yaml/parse-cst. It has no dependency on the rest of the library, so importing only parseCST should add about 9kB to your gzipped bundle size, while the whole library adds about 27kB.

Care should be taken when modifying the CST, as no error checks are included to verify that the resulting YAML is valid, or that e.g. indentation levels aren't broken. In other words, this is an engineering tool and you may hurt yourself. If you're looking to generate a brand new YAML document, see the section on Creating Documents.

For more usage examples and CST trees, have a look through the extensive test suite included in the project's repository.

Error detection

import YAML from 'yaml'

const cst = YAML.parseCST('this: is: bad YAML')

cst[0].contents[0]  // Note: Simplified for clarity
// { type: 'MAP',
//   items: [
//     { type: 'PLAIN', strValue: 'this' },
//     { type: 'MAP_VALUE',
//       node: {
//         type: 'MAP',
//         items: [
//           { type: 'PLAIN', strValue: 'is' },
//           { type: 'MAP_VALUE',
//             node: { type: 'PLAIN', strValue: 'bad YAML' } } ] } } ] }

const doc = new YAML.Document()
doc.parse(cst[0])
doc.errors
// [ {
//   name: 'YAMLSemanticError',
//   message: 'Nested mappings are not allowed in compact mappings',
//   source: {
//     type: 'MAP',
//     range: { start: 6, end: 18 },
//     ...,
//     rawValue: 'is: bad YAML' } } ]

doc.contents.items[0].value.items[0].value.value
// 'bad YAML'

While the YAML spec considers e.g. block collections within a flow collection to be an error, this error will not be detected by the CST parser. For complete validation, you will need to parse the CST into a YAML.Document. If the document contains errors, they will be included in the document's errors array, and each error will contain a source reference to the CST node where it was encountered. Do note that even if an error is encountered, the document contents might still be available. In such a case, the error will be a YAMLSemanticError rather than a YAMLSyntaxError.

Dealing with CRLF line terminators

import parseCST from 'yaml/parse-cst'

const src = '- foo\r\n- bar\r\n'
const cst = parseCST(src)
cst.setOrigRanges() // true
const { range, valueRange } = cst[0].contents[0].items[1].node

src.slice(range.origStart, range.origEnd)
// 'bar\r\n'

src.slice(valueRange.origStart, valueRange.origEnd)
// 'bar'

CST#setOrigRanges(): bool

The array returned by parseCST() will also include a method setOrigRanges to help deal with input that includes \r\n line terminators, which are converted to just \n before parsing into documents. This conversion will obviously change the total length of the string, as well as the offsets of all ranges. If the method returns false, the input did not include \r\n line terminators and no changes were made. However, if the method returns true, each Range object within the CST will have its origStart and origEnd values set appropriately to refer to the original input string.

CST Nodes

Node type definitions use Flow-ish notation, so + as a prefix indicates a read-only getter property.

class Range {
  start: number,        // offset of first character
  end: number,          // offset after last character
  isEmpty(): boolean,   // true if end is not greater than start
  origStart: ?number,   // set by CST#setOrigRanges(), source
  origEnd: ?number      //   offsets for input with CRLF terminators
}

Note: The Node, Scalar and other values referred to in this section are the CST representations of said objects, and are not the same as those used in preceding parts.

Actual values in CST nodes are stored as start and end indices into the input string. This minimises memory consumption by making string generation completely lazy.
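As a sketch of the idea (SketchNode is hypothetical, not the library's actual implementation), a node's rawValue can be computed on demand from its range, so no substring copies need to be stored during parsing:

```javascript
// Minimal illustration of lazy, range-based value access.
class SketchNode {
  constructor(src, start, end) {
    this.context = { src }      // the full original source
    this.range = { start, end } // span of src covered by this node
  }
  get rawValue() {
    // The string slice is only created when actually asked for
    const { start, end } = this.range
    return this.context.src.slice(start, end)
  }
}

const src = 'key: value\n'
const node = new SketchNode(src, 5, 10)
node.rawValue // 'value'
```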

Node

class Node {
  context: {
    atLineStart: boolean, // is this node the first one on this line
    indent: number,     // current level of indentation (may be -1)
    root: CSTDocument,  // a reference to the parent document
    src: string         // the full original source
  },
  error: ?Error,        // if not null, indicates a parser failure
  props: Array<Range>,  // anchors, tags and comments
  range: Range,         // span of context.src parsed into this node
  type:                 // specific node type
    'ALIAS' | 'BLOCK_FOLDED' | 'BLOCK_LITERAL' | 'COMMENT' |
    'DIRECTIVE' | 'DOCUMENT' | 'FLOW_MAP' | 'FLOW_SEQ' |
    'MAP' | 'MAP_KEY' | 'MAP_VALUE' | 'PLAIN' |
    'QUOTE_DOUBLE' | 'QUOTE_SINGLE' | 'SEQ' | 'SEQ_ITEM',
  value: ?string        // if set to a non-null value, overrides
                        //   source value when stringified
  +anchor: ?string,     // anchor, if set
  +comment: ?string,    // newline-delimited comment(s), if any
  +rangeAsLinePos:      // human-friendly source location
    ?{ start: LinePos, end: ?LinePos },
    // LinePos here is { line: number, col: number }
  +rawValue: ?string,   // an unprocessed slice of context.src
                        //   determining this node's value
  +tag:                 // this node's tag, if set
    null | { verbatim: string } | { handle: string, suffix: string },
  toString(): string    // a YAML string representation of this node
}

type ContentNode =
  Comment | Alias | Scalar | Map | Seq | FlowCollection

Each node in the CST extends a common ancestor Node. Additional undocumented properties are available, but are likely only useful during parsing.

If a node has its value set, that will be used when re-stringifying (initially undefined for all nodes).

Scalars

class Alias extends Node {
  // rawValue will contain the anchor without the * prefix
  type: 'ALIAS'
}

class Scalar extends Node {
  type: 'PLAIN' | 'QUOTE_DOUBLE' | 'QUOTE_SINGLE' |
    'BLOCK_FOLDED' | 'BLOCK_LITERAL'
  +strValue: ?string |  // unescaped string value
    { str: string, errors: YAMLSyntaxError[] }
}

class Comment extends Node {
  type: 'COMMENT',      // PLAIN nodes may also be comment-only
  +anchor: null,
  +comment: string,
  +rawValue: null,
  +tag: null
}

class BlankLine extends Comment {
  type: 'BLANK_LINE',   // may represent multiple consecutive empty
  +comment: null,       //   lines, which may include whitespace
}

While Alias, BlankLine and Comment nodes are not technically scalars, they are parsed as such at this level.

Due to parsing differences, each scalar type is implemented using its own class.

Collections

class MapItem extends Node {
  node: ContentNode | null,
  type: 'MAP_KEY' | 'MAP_VALUE'
}

class Map extends Node {
  // implicit keys are not wrapped
  items: Array<Comment | Alias | Scalar | MapItem>,
  type: 'MAP'
}

class SeqItem extends Node {
  node: ContentNode | null,
  type: 'SEQ_ITEM'
}

class Seq extends Node {
  items: Array<Comment | SeqItem>,
  type: 'SEQ'
}

type FlowChar = '{' | '}' | '[' | ']' | ',' | '?' | ':'

class FlowCollection extends Node {
  items: Array<FlowChar | Comment | Alias | Scalar | FlowCollection>,
  type: 'FLOW_MAP' | 'FLOW_SEQ'
}

Block and flow collections are parsed rather differently, due to their representation differences.

An Alias or Scalar item directly within a Map should be treated as an implicit map key.

In actual code, MapItem and SeqItem are implemented as CollectionItem, and correspondingly Map and Seq as Collection.

Document Structure

class Directive extends Node {
  name: string,  // should only be 'TAG' or 'YAML'
  type: 'DIRECTIVE',
  +anchor: null,
  +parameters: Array<string>,
  +tag: null
}

class CSTDocument extends Node {
  directives: Array<Comment | Directive>,
  contents: Array<ContentNode>,
  type: 'DOCUMENT',
  +anchor: null,
  +comment: null,
  +tag: null
}

The CST tree of a valid YAML document should have a single non-Comment ContentNode in its contents array. Multiple values indicate that the input is malformed in a way that made it impossible to determine the proper structure of the document.

Errors

Nearly all errors and warnings produced by the yaml parser functions contain the following fields:

| Member | Type | Description |
| --- | --- | --- |
| name | string | One of YAMLReferenceError, YAMLSemanticError, YAMLSyntaxError, or YAMLWarning |
| message | string | A human-readable description of the error |
| source | CST Node | The CST node at which this error or warning was encountered. Note that in particular source.context is likely to be a complex object and include some circular references. |

If the prettyErrors option is enabled, source is dropped from the errors and the following fields are added with summary information regarding the error's source node, if available:

| Member | Type | Description |
| --- | --- | --- |
| nodeType | string | A string constant identifying the type of node |
| range | { start: number, end: ?number } | Character offsets in the input string |
| linePos | { start: LinePos, end: ?LinePos } | One-indexed human-friendly source location, where LinePos is { line: number, col: number } |

In rare cases, the library may produce a more generic error. In particular, TypeError may occur when parsing invalid input using the json schema, and ReferenceError when the maxAliasCount limit is encountered.

YAMLReferenceError

An error resolving a tag or an anchor that is referred to in the source. It is likely that the contents of the source node have not been completely parsed into the document. Not used by the CST parser.

YAMLSemanticError

An error related to the metadata of the document, or an error with limitations imposed by the YAML spec. The data contents of the document should be valid, but the metadata may be broken.

YAMLSyntaxError

A serious parsing error; the document contents will not be complete, and the CST is likely to be rather broken.

YAMLWarning

Not an error, but a spec-mandated warning about unsupported directives or a fallback resolution being used for a node with an unavailable tag. Not used by the CST parser.

YAML Syntax

A YAML schema is a combination of a set of tags and a mechanism for resolving non-specific tags, i.e. values that do not have an explicit tag such as !!int. The default schema is the 'core' schema, which is the recommended one for YAML 1.2. For YAML 1.0 and YAML 1.1 documents the default is 'yaml-1.1'.

Tags

YAML.parse('"42"')
// '42'

YAML.parse('!!int "42"')
// 42

YAML.parse(`
%TAG ! tag:example.com,2018:app/
---
!foo 42
`)
// YAMLWarning:
//   The tag tag:example.com,2018:app/foo is unavailable,
//   falling back to tag:yaml.org,2002:str
// '42'

The default prefix for YAML tags is tag:yaml.org,2002:, for which the shorthand !! is used when stringified. Shorthands for other prefixes may also be defined by document-specific directives, e.g. !e! or just ! for tag:example.com,2018:app/, but defining such a shorthand is not required in order to use a tag with a different prefix.

During parsing, unresolved tags should not result in errors (though they will be noted as warnings), with the tagged value being parsed according to the data type that it would have under automatic tag resolution rules. This should not result in any data loss, allowing such tags to be handled by the calling app.

In order to have yaml provide you with automatic parsing and stringification of non-standard data types, it will need to be configured with a suitable tag object. For more information, see Custom Tags.

The YAML 1.0 tag specification is slightly different from that used in later versions, and implements prefixing shorthands rather differently.

Version Differences

This library's parser is based on the 1.2 version of the YAML spec, which is mostly backwards-compatible with YAML 1.1 as well as YAML 1.0. Some specific relaxations have been added for backwards compatibility, but if you encounter an issue please report it.

Changes from YAML 1.1 to 1.2

%YAML 1.1
---
true: Yes
octal: 014
sexagesimal: 3:25:45
picture: !!binary |
 R0lGODlhDAAMAIQAAP//9/X
 17unp5WZmZgAAAOfn515eXv
 Pz7Y6OjuDg4J+fn5OTk6enp
 56enmleECcgggoBADs=
{ true: true,
  octal: 12,
  sexagesimal: 12345,
  picture:
   Buffer [Uint8Array] [
     71, 73, 70, 56, 57, 97, 12, 0, 12, 0, 132, 0, 0,
     255, 255, 247, 245, 245, 238, 233, 233, 229, 102,
     102, 102, 0, 0, 0, 231, 231, 231, 94, 94, 94, 243,
     243, 237, 142, 142, 142, 224, 224, 224, 159, 159,
     159, 147, 147, 147, 167, 167, 167, 158, 158, 158,
     105, 94, 16, 39, 32, 130, 10, 1, 0, 59 ] }

The most significant difference between YAML 1.1 and YAML 1.2 is the introduction of the core data schema as the recommended default, replacing the YAML 1.1 type library:

The other major change has been to make sure that YAML 1.2 is a valid superset of JSON. Additionally there are some minor differences between the parsing rules:

Changes from YAML 1.0 to 1.1

%YAML:1.0
---
date: 2001-01-23
number: !int '123'
string: !str 123
pool: !!ball { number: 8 }
invoice: !domain.tld,2002/^invoice
  customers: !seq
    - !^customer
      given : Chris
      family : Dumars
// YAMLWarning:
//   The tag tag:private.yaml.org,2002:ball is unavailable,
//   falling back to tag:yaml.org,2002:map
// YAMLWarning:
//   The tag tag:domain.tld,2002/^invoice is unavailable,
//   falling back to tag:yaml.org,2002:map
// YAMLWarning:
//   The tag ^customer is unavailable,
//   falling back to tag:yaml.org,2002:map
{ date: '2001-01-23T00:00:00.000Z',
  number: 123,
  string: '123',
  pool: { number: 8 },
  invoice: { customers: [ { given: 'Chris', family: 'Dumars' } ] } }

The most significant difference between these versions is the complete refactoring of the tag syntax:

Additionally, the formal description of the language describing the document structure has been completely refactored between these versions, but the described intent has not changed. Other changes include:

yaml supports parsing and stringifying YAML 1.0 tags, but does not expand tags using the ^ notation. If this is something you'd find useful, please file a GitHub issue about it.