@exodus/bytes
Uint8Array conversion to and from base64, base32, base58, hex, utf8, utf16, bech32 and wif
And a TextEncoder / TextDecoder polyfill
See documentation.
Performs proper input validation, ensures no garbage-in-garbage-out
Tested on Node.js, Deno, Bun, browsers (including Servo), Hermes, QuickJS and barebone engines in CI (how?)
10-20x faster than Buffer polyfill
2-10x faster than iconv-lite
The above was for the js fallback
It's up to 100x when native impl is available
e.g. in utf8fromString on Hermes / React Native or fromHex in Chrome
Also:
3-8x faster than bs58
10-30x faster than @scure/base (or >100x on Node.js <25)
Faster in utf8toString / utf8fromString than Buffer or TextDecoder / TextEncoder on Node.js
See Performance for more info
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding.js'
import { TextDecoderStream, TextEncoderStream } from '@exodus/bytes/encoding.js' // Requires Streams
Less than half the bundle size of text-encoding, whatwg-encoding or iconv-lite (gzipped or not).
Also much faster than all of those.
See also the lite version to get this down to 10 KiB gzipped.
Spec compliant, passing WPT and covered with extra tests.
Moreover, tests for this library uncovered bugs in all major implementations.
Including all three major browser engines being wrong at UTF-8.
See WPT pull request.
It works correctly even in environments that have native implementations broken (that's all of them currently).
Runs (and passes WPT) on Node.js built without ICU.
Faster than Node.js native implementation on Node.js.
The JS multi-byte version is as fast as native impl in Node.js and browsers, but (unlike them) returns correct results.
For encodings where native version is known to be fast and correct, it is automatically used.
Some single-byte encodings are faster than native in all three major browser engines.
See analysis table for more info.
TextDecoder / TextEncoder APIs are lossy by default per spec
These are only provided as a compatibility layer, prefer hardened APIs instead in new code.
TextDecoder can (and should) be used with { fatal: true } option for all purposes demanding correctness / lossless transforms
TextEncoder does not support a fatal mode per spec, it always performs replacement.
That is not suitable for hashing, cryptography or consensus applications.
Otherwise there would be non-equal strings with equal signatures and hashes — the collision is caused by the lossy transform of a JS string to bytes.
Those also survive e.g. JSON.stringify/JSON.parse or being sent over network.
Use strict APIs in new applications, see utf8fromString / utf16fromString below.
Those throw on non-well-formed strings by default.
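The collision described above can be reproduced with the platform's built-in TextEncoder alone. A minimal sketch, not using @exodus/bytes; the helper name hex is made up for illustration:

```javascript
// Two different JS strings: a lone high surrogate, and U+FFFD itself
const a = '\ud800'
const b = '\ufffd'

// Hypothetical helper, only used to print bytes in this sketch
const hex = (bytes) => Array.from(bytes, (x) => x.toString(16).padStart(2, '0')).join('')

const encoder = new TextEncoder() // built-in: always replaces unpaired surrogates
const hexA = hex(encoder.encode(a))
const hexB = hex(encoder.encode(b))

console.log(a === b) // false
console.log(hexA, hexB) // efbfbd efbfbd
console.log(hexA === hexB) // true: distinct strings, identical bytes
```

Any hash or signature computed over those bytes is therefore equal for both strings.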
If you don't need support for legacy multi-byte encodings, you can use the lite import:
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding-lite.js'
import { TextDecoderStream, TextEncoderStream } from '@exodus/bytes/encoding-lite.js' // Requires Streams
This reduces the bundle size 9x:
from 90 KiB gzipped for @exodus/bytes/encoding.js to 10 KiB gzipped for @exodus/bytes/encoding-lite.js.
(For comparison, the text-encoding module is 190 KiB gzipped, and iconv-lite is 194 KiB gzipped.)
It still supports utf-8, utf-16le, utf-16be and all single-byte encodings specified by the spec,
the only difference is support for legacy multi-byte encodings.
@exodus/bytes/utf8.js
UTF-8 encoding/decoding
import { utf8fromString, utf8toString } from '@exodus/bytes/utf8.js'
// loose
import { utf8fromStringLoose, utf8toStringLoose } from '@exodus/bytes/utf8.js'
These methods by design encode/decode BOM (codepoint U+FEFF Byte Order Mark) as-is.
If you need BOM handling or detection, use @exodus/bytes/encoding.js
utf8fromString(string, format = 'uint8')
Encode a string to UTF-8 bytes (strict mode)
Throws on invalid Unicode (unpaired surrogates)
utf8fromStringLoose(string, format = 'uint8')
Encode a string to UTF-8 bytes (loose mode)
Replaces invalid Unicode (unpaired surrogates) with replacement codepoints U+FFFD
per WHATWG Encoding specification.
Such replacement is a non-injective function, is irreversible and causes collisions.
Prefer using strict throwing methods for cryptography applications.
utf8toString(arr)
Decode UTF-8 bytes to a string (strict mode)
Throws on invalid UTF-8 byte sequences
utf8toStringLoose(arr)
Decode UTF-8 bytes to a string (loose mode)
Replaces invalid UTF-8 byte sequences with replacement codepoints U+FFFD
per WHATWG Encoding specification.
Such replacement is a non-injective function, is irreversible and causes collisions.
Prefer using strict throwing methods for cryptography applications.
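The difference between strict and loose decoding can be illustrated with the built-in TextDecoder, where { fatal: true } plays the role of strict mode. This is only an analogy sketched with platform APIs, not the utf8toString implementation:

```javascript
const invalid = Uint8Array.of(0x61, 0xff) // 'a' followed by a byte that never occurs in UTF-8

// Loose: the invalid byte silently becomes U+FFFD
const loose = new TextDecoder('utf-8').decode(invalid)

// Strict: { fatal: true } makes the decoder throw instead of replacing
let strictError = null
try {
  new TextDecoder('utf-8', { fatal: true }).decode(invalid)
} catch (err) {
  strictError = err
}

console.log(loose === 'a\ufffd') // true
console.log(strictError !== null) // true: decoding was rejected
```

In loose mode the bytes 61 ff and 61 fe both decode to the same string, which is the collision the text above warns about.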
@exodus/bytes/utf16.js
UTF-16 encoding/decoding
import { utf16fromString, utf16toString } from '@exodus/bytes/utf16.js'
// loose
import { utf16fromStringLoose, utf16toStringLoose } from '@exodus/bytes/utf16.js'
These methods by design encode/decode BOM (codepoint U+FEFF Byte Order Mark) as-is.
If you need BOM handling or detection, use @exodus/bytes/encoding.js
utf16fromString(string, format = 'uint16')
Encode a string to UTF-16 bytes (strict mode)
Throws on invalid Unicode (unpaired surrogates)
utf16fromStringLoose(string, format = 'uint16')
Encode a string to UTF-16 bytes (loose mode)
Replaces invalid Unicode (unpaired surrogates) with replacement codepoints U+FFFD
per WHATWG Encoding specification.
Such replacement is a non-injective function, is irreversible and causes collisions.
Prefer using strict throwing methods for cryptography applications.
utf16toString(arr, format = 'uint16')
Decode UTF-16 bytes to a string (strict mode)
Throws on invalid UTF-16 byte sequences
Throws on non-even byte length.
utf16toStringLoose(arr, format = 'uint16')
Decode UTF-16 bytes to a string (loose mode)
Replaces invalid UTF-16 byte sequences with replacement codepoints U+FFFD
per WHATWG Encoding specification.
Such replacement is a non-injective function, is irreversible and causes collisions.
Prefer using strict throwing methods for cryptography applications.
Throws on non-even byte length.
@exodus/bytes/single-byte.js
Decode / encode the legacy single-byte encodings according to the Encoding standard (§9, §14.5), and unicode.org iso-8859-* mappings.
import { createSinglebyteDecoder, createSinglebyteEncoder } from '@exodus/bytes/single-byte.js'
import { windows1252toString, windows1252fromString } from '@exodus/bytes/single-byte.js'
import { latin1toString, latin1fromString } from '@exodus/bytes/single-byte.js'
Supports all single-byte encodings listed in the WHATWG Encoding standard:
ibm866, iso-8859-2, iso-8859-3, iso-8859-4, iso-8859-5, iso-8859-6, iso-8859-7, iso-8859-8,
iso-8859-8-i, iso-8859-10, iso-8859-13, iso-8859-14, iso-8859-15, iso-8859-16, koi8-r, koi8-u,
macintosh, windows-874, windows-1250, windows-1251, windows-1252, windows-1253, windows-1254,
windows-1255, windows-1256, windows-1257, windows-1258, x-mac-cyrillic and x-user-defined.
Also supports iso-8859-1, iso-8859-9, iso-8859-11 as defined at unicode.org
(and all other iso-8859-* encodings there as they match WHATWG).
While all iso-8859-* encodings supported by the WHATWG Encoding standard match
unicode.org, the WHATWG Encoding spec doesn't support
iso-8859-1, iso-8859-9, iso-8859-11, and instead maps them as labels to windows-1252, windows-1254, windows-874.
createSinglebyteDecoder() (unlike TextDecoder or legacyHookDecode()) does not do such mapping,
so its results will differ from TextDecoder for those encoding names.
> new TextDecoder('iso-8859-1').encoding
'windows-1252'
> new TextDecoder('iso-8859-9').encoding
'windows-1254'
> new TextDecoder('iso-8859-11').encoding
'windows-874'
> new TextDecoder('iso-8859-9').decode(Uint8Array.of(0x80, 0x81, 0xd0))
'€\x81Ğ' // this is actually decoded according to windows-1254 per TextDecoder spec
> createSinglebyteDecoder('iso-8859-9')(Uint8Array.of(0x80, 0x81, 0xd0))
'\x80\x81Ğ' // this is iso-8859-9 as defined at https://unicode.org/Public/MAPPINGS/ISO8859/8859-9.txt
All WHATWG Encoding spec windows-* encodings are supersets of
corresponding unicode.org encodings, meaning that
they encode/decode all the old valid (non-replacement) strings / byte sequences identically, but can also support
a wider range of inputs.
createSinglebyteDecoder(encoding, loose = false)
Create a decoder for a supported one-byte encoding, given its lowercased name encoding.
Returns a function decode(arr) that decodes bytes to a string.
createSinglebyteEncoder(encoding, { mode = 'fatal' })
Create an encoder for a supported one-byte encoding, given its lowercased name encoding.
Returns a function encode(string) that encodes a string to bytes.
In 'fatal' mode (default), will throw on non well-formed strings or any codepoints which could
not be encoded in the target encoding.
latin1toString(arr)
Decode iso-8859-1 bytes to a string.
There is no loose variant for this encoding, all bytes can be decoded.
Same as:
const latin1toString = createSinglebyteDecoder('iso-8859-1')
Note: this is different from new TextDecoder('iso-8859-1') and new TextDecoder('latin1'), as
those alias to new TextDecoder('windows-1252').
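Why every byte is decodable: iso-8859-1 maps each byte to the code point with the same value. A minimal sketch of that mapping, with a hypothetical helper name (this is not the library's implementation and is only meant for short inputs):

```javascript
// iso-8859-1: byte 0xNN decodes to code point U+00NN, for all 256 byte values
function latin1DecodeSketch(bytes) {
  let out = ''
  for (const byte of bytes) out += String.fromCharCode(byte)
  return out
}

const decoded = latin1DecodeSketch(Uint8Array.of(0x80, 0xe9))

// 0x80 stays U+0080 here; under windows-1252 the same byte would be '€'
console.log(decoded === '\x80\xe9') // true
console.log(decoded.length) // 2
```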
latin1fromString(string)
Encode a string to iso-8859-1 bytes.
Throws on non well-formed strings or any codepoints which could not be encoded in iso-8859-1.
Same as:
const latin1fromString = createSinglebyteEncoder('iso-8859-1', { mode: 'fatal' })
windows1252toString(arr)
Decode windows-1252 bytes to a string.
There is no loose variant for this encoding, all bytes can be decoded.
Same as:
const windows1252toString = createSinglebyteDecoder('windows-1252')
windows1252fromString(string)
Encode a string to windows-1252 bytes.
Throws on non well-formed strings or any codepoints which could not be encoded in windows-1252.
Same as:
const windows1252fromString = createSinglebyteEncoder('windows-1252', { mode: 'fatal' })
@exodus/bytes/multi-byte.js
Decode / encode the legacy multi-byte encodings according to the Encoding standard (§10, §11, §12, §13).
import { createMultibyteDecoder, createMultibyteEncoder } from '@exodus/bytes/multi-byte.js'
Supports all legacy multi-byte encodings listed in the WHATWG Encoding standard:
gbk, gb18030, big5, euc-jp, iso-2022-jp, shift_jis, euc-kr.
createMultibyteDecoder(encoding, loose = false)
Create a decoder for a supported legacy multi-byte encoding, given its lowercased name encoding.
Returns a function decode(arr, stream = false) that decodes bytes to a string.
The returned function will maintain internal state while stream = true is used, allowing it to
handle incomplete multi-byte sequences across multiple calls.
State is reset when stream = false or when the function is called without the stream parameter.
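The stream flag serves the same purpose as the { stream: true } option of the built-in TextDecoder. As an illustration only, here is the idea with the built-in UTF-8 decoder rather than createMultibyteDecoder (whose call shape is decode(arr, stream)): a multi-byte sequence split across two chunks.

```javascript
const decoder = new TextDecoder('utf-8')

// '€' is the three bytes e2 82 ac; split them across two chunks
const first = decoder.decode(Uint8Array.of(0xe2, 0x82), { stream: true })
const second = decoder.decode(Uint8Array.of(0xac)) // no stream flag: final chunk, state is reset

console.log(first === '') // true: the incomplete sequence is kept, nothing is emitted yet
console.log(second === '€') // true: completed with the bytes kept from the previous call
```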
createMultibyteEncoder(encoding, { mode = 'fatal' })
Create an encoder for a supported legacy multi-byte encoding, given its lowercased name encoding.
Returns a function encode(string) that encodes a string to bytes.
In 'fatal' mode (default), will throw on non well-formed strings or any codepoints which could
not be encoded in the target encoding.
@exodus/bytes/bigint.js
Convert between BigInt and Uint8Array
import { fromBigInt, toBigInt } from '@exodus/bytes/bigint.js'
fromBigInt(bigint, { length, format = 'uint8' })
Convert a BigInt to a Uint8Array or Buffer
The output bytes are in big-endian format.
Throws if the BigInt is negative or cannot fit into the specified length.
toBigInt(arr)
Convert a Uint8Array or Buffer to a BigInt
The bytes are interpreted as a big-endian unsigned integer.
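What big-endian unsigned conversion means can be sketched in plain JavaScript. The helper names below are hypothetical and this is not the library's implementation, only an illustration of the byte order and of the two error conditions:

```javascript
// Hypothetical helpers for illustration only
function bigIntToBytesBE(value, length) {
  if (typeof value !== 'bigint' || value < 0n) throw new RangeError('Expected a non-negative BigInt')
  const out = new Uint8Array(length)
  let rest = value
  for (let i = length - 1; i >= 0; i--) {
    out[i] = Number(rest & 0xffn) // least significant byte goes last
    rest >>= 8n
  }
  if (rest !== 0n) throw new RangeError('BigInt does not fit into the requested length')
  return out
}

function bytesToBigIntBE(bytes) {
  let value = 0n
  for (const byte of bytes) value = (value << 8n) | BigInt(byte) // most significant byte first
  return value
}

const bytes = bigIntToBytesBE(258n, 4) // 258 = 0x0102
console.log(Array.from(bytes)) // [ 0, 0, 1, 2 ]
console.log(bytesToBigIntBE(bytes) === 258n) // true
```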
@exodus/bytes/hex.js
Implements Base16 from RFC4648 (no differences from RFC3548).
import { fromHex, toHex } from '@exodus/bytes/hex.js'
fromHex(string, format = 'uint8')
Decode a hex string to bytes
Unlike Buffer.from(), throws on invalid input
toHex(arr)
Encode a Uint8Array to a lowercase hex string
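To show why throwing matters, here is a rough sketch of a strict hex decoder next to the Node.js Buffer behavior. The function name strictFromHexSketch is hypothetical, and accepting uppercase digits is an assumption of this sketch, not a statement about fromHex:

```javascript
// Hypothetical strict decoder for illustration; not the library's implementation
function strictFromHexSketch(string) {
  if (typeof string !== 'string') throw new TypeError('Input is not a string')
  if (string.length % 2 !== 0 || !/^[0-9a-fA-F]*$/.test(string)) {
    throw new SyntaxError('Input is not a valid hex string')
  }
  const out = new Uint8Array(string.length / 2)
  for (let i = 0; i < out.length; i++) {
    out[i] = Number.parseInt(string.slice(i * 2, i * 2 + 2), 16)
  }
  return out
}

console.log(Array.from(strictFromHexSketch('00ff10'))) // [ 0, 255, 16 ]

// Node.js Buffer truncates at the first invalid pair instead of throwing
if (typeof Buffer !== 'undefined') {
  console.log(Buffer.from('12zz34', 'hex').length) // 1
}
```

With silent truncation, '12zz34' and '12' produce the same bytes, which is a garbage-in-garbage-out case that a throwing decoder rules out.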
@exodus/bytes/base64.js
Implements base64 and base64url from RFC4648 (no differences from RFC3548).
import { fromBase64, toBase64 } from '@exodus/bytes/base64.js'
import { fromBase64url, toBase64url } from '@exodus/bytes/base64.js'
import { fromBase64any } from '@exodus/bytes/base64.js'
fromBase64(string, { format = 'uint8', padding = 'both' })
Decode a base64 string to bytes
Operates in strict mode for last chunk, does not allow whitespace
fromBase64url(string, { format = 'uint8', padding = false })
Decode a base64url string to bytes
Operates in strict mode for last chunk, does not allow whitespace
fromBase64any(string, { format = 'uint8', padding = 'both' })
Decode either base64 or base64url string to bytes
Automatically detects the variant based on characters present
toBase64(arr, { padding = true })
Encode a Uint8Array to a base64 string (RFC 4648)
toBase64url(arr, { padding = false })
Encode a Uint8Array to a base64url string (RFC 4648)
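The two variants differ in the last two alphabet characters and, with the defaults above, in padding. A small illustration using the Node.js Buffer (assuming Node.js; this does not call @exodus/bytes):

```javascript
// Two bytes chosen so that the output uses both variant-specific characters
const bytes = Uint8Array.of(0xfb, 0xff)

const base64 = Buffer.from(bytes).toString('base64')
const base64url = Buffer.from(bytes).toString('base64url')

console.log(base64) // +/8=
console.log(base64url) // -_8
```

A decoder that accepts either variant can tell them apart by the presence of + and / versus - and _.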
@exodus/bytes/base32.js
Implements base32 and base32hex from RFC4648 (no differences from RFC3548).
import { fromBase32, toBase32 } from '@exodus/bytes/base32.js'
import { fromBase32hex, toBase32hex } from '@exodus/bytes/base32.js'
fromBase32(string, { format = 'uint8', padding = 'both' })
Decode a base32 string to bytes
Operates in strict mode for last chunk, does not allow whitespace
fromBase32hex(string, { format = 'uint8', padding = 'both' })
Decode a base32hex string to bytes
Operates in strict mode for last chunk, does not allow whitespace
toBase32(arr, { padding = false })
Encode a Uint8Array to a base32 string (RFC 4648)
toBase32hex(arr, { padding = false })
Encode a Uint8Array to a base32hex string (RFC 4648)
@exodus/bytes/bech32.js
Implements bech32 and bech32m from BIP-0173 and BIP-0350.
import { fromBech32, toBech32 } from '@exodus/bytes/bech32.js'
import { fromBech32m, toBech32m } from '@exodus/bytes/bech32.js'
import { getPrefix } from '@exodus/bytes/bech32.js'
getPrefix(string, limit = 90)
Extract the prefix from a bech32 or bech32m string without full validation
This is a quick check that skips most validation.
fromBech32(string, limit = 90)
Decode a bech32 string to bytes
toBech32(prefix, bytes, limit = 90)
Encode bytes to a bech32 string
fromBech32m(string, limit = 90)
Decode a bech32m string to bytes
toBech32m(prefix, bytes, limit = 90)
Encode bytes to a bech32m string
@exodus/bytes/base58.js
Implements base58 encoding.
Supports both standard base58 and XRP variant alphabets.
import { fromBase58, toBase58 } from '@exodus/bytes/base58.js'
import { fromBase58xrp, toBase58xrp } from '@exodus/bytes/base58.js'
fromBase58(string, format = 'uint8')
Decode a base58 string to bytes
Uses the standard Bitcoin base58 alphabet
toBase58(arr)
Encode a Uint8Array to a base58 string
Uses the standard Bitcoin base58 alphabet
fromBase58xrp(string, format = 'uint8')
Decode a base58 string to bytes using XRP alphabet
Uses the XRP variant base58 alphabet
toBase58xrp(arr)
Encode a Uint8Array to a base58 string using XRP alphabet
Uses the XRP variant base58 alphabet
@exodus/bytes/base58check.js
Implements base58check encoding.
import { fromBase58check, toBase58check } from '@exodus/bytes/base58check.js'
import { fromBase58checkSync, toBase58checkSync } from '@exodus/bytes/base58check.js'
import { makeBase58check } from '@exodus/bytes/base58check.js'
On non-Node.js, requires peer dependency @noble/hashes to be installed.
async fromBase58check(string, format = 'uint8')
Decode a base58check string to bytes asynchronously
Validates the checksum using double SHA-256
async toBase58check(arr)
Encode bytes to base58check string asynchronously
Uses double SHA-256 for checksum calculation
fromBase58checkSync(string, format = 'uint8')
Decode a base58check string to bytes synchronously
Validates the checksum using double SHA-256
toBase58checkSync(arr)
Encode bytes to base58check string synchronously
Uses double SHA-256 for checksum calculation
makeBase58check(hashAlgo, hashAlgoSync)
Create a base58check encoder/decoder with custom hash functions
@exodus/bytes/wif.js
Wallet Import Format (WIF) encoding and decoding.
import { fromWifString, toWifString } from '@exodus/bytes/wif.js'
import { fromWifStringSync, toWifStringSync } from '@exodus/bytes/wif.js'
On non-Node.js, requires peer dependency @noble/hashes to be installed.
async fromWifString(string[, version])
Decode a WIF string to WIF data
Returns a promise that resolves to an object with { version, privateKey, compressed }.
The optional version parameter validates the version byte.
Throws if the WIF string is invalid or version doesn't match.
fromWifStringSync(string[, version])
Decode a WIF string to WIF data (synchronous)
Returns an object with { version, privateKey, compressed }.
The optional version parameter validates the version byte.
Throws if the WIF string is invalid or version doesn't match.
async toWifString({ version, privateKey, compressed })
Encode WIF data to a WIF string
toWifStringSync({ version, privateKey, compressed })
Encode WIF data to a WIF string (synchronous)
@exodus/bytes/array.js
TypedArray utils and conversions.
import { typedView } from '@exodus/bytes/array.js'
typedView(arr, format = 'uint8')
Create a view of a TypedArray in the specified format ('uint8' or 'buffer')
Important: does not copy data, returns a view on the same underlying buffer
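The practical consequence of returning a view rather than a copy is that writes go through to the original data. A sketch with plain TypedArray constructors (not calling typedView itself):

```javascript
const words = Uint16Array.of(0x0102, 0x0304)

// A view: same ArrayBuffer, same offset and byte length, nothing is copied
const view = new Uint8Array(words.buffer, words.byteOffset, words.byteLength)
// A copy: independent memory
const copy = Uint8Array.from(view)

view[0] = 0xff // writes through to `words`
copy[1] = 0xee // does not affect `words`

console.log(view.buffer === words.buffer) // true
console.log(words[0] !== 0x0102) // true: the first element changed via the view
console.log(copy.buffer === words.buffer) // false
```

Copy the result explicitly if the caller must not be able to mutate the source.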
@exodus/bytes/encoding.js
Implements the Encoding standard: TextDecoder, TextEncoder, TextDecoderStream, TextEncoderStream, some hooks.
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding.js'
import { TextDecoderStream, TextEncoderStream } from '@exodus/bytes/encoding.js' // Requires Streams
// Hooks for standards
import { getBOMEncoding, legacyHookDecode, labelToName, normalizeEncoding } from '@exodus/bytes/encoding.js'
new TextDecoder(label = 'utf-8', { fatal = false, ignoreBOM = false })
TextDecoder implementation/polyfill.
Decode bytes to strings according to WHATWG Encoding specification.
new TextEncoder()
TextEncoder implementation/polyfill.
Encode strings to UTF-8 bytes according to WHATWG Encoding specification.
new TextDecoderStream(label = 'utf-8', { fatal = false, ignoreBOM = false })
TextDecoderStream implementation/polyfill.
A Streams wrapper for TextDecoder.
Requires Streams to be either supported by the platform or polyfilled.
new TextEncoderStream()
TextEncoderStream implementation/polyfill.
A Streams wrapper for TextEncoder.
Requires Streams to be either supported by the platform or polyfilled.
labelToName(label)
Implements get an encoding from a string label.
Convert an encoding label to its name, as a case-sensitive string.
If an encoding with that label does not exist, returns null.
All encoding names are also valid labels for corresponding encodings.
normalizeEncoding(label)
Convert an encoding label to its name, as an ASCII-lowercased string.
If an encoding with that label does not exist, returns null.
This is the same as decoder.encoding getter, except that it:
Supports the replacement encoding and its labels
Does not throw for invalid labels and instead returns null
It is identical to:
labelToName(label)?.toLowerCase() ?? null
All encoding names are also valid labels for corresponding encodings.
getBOMEncoding(input)
Implements BOM sniff legacy hook.
Given a TypedArray or an ArrayBuffer instance input, returns either of:
'utf-8', if input starts with UTF-8 byte order mark.
'utf-16le', if input starts with UTF-16LE byte order mark.
'utf-16be', if input starts with UTF-16BE byte order mark.
null otherwise.
legacyHookDecode(input, fallbackEncoding = 'utf-8')
Implements decode legacy hook.
Given a TypedArray or an ArrayBuffer instance input and an optional fallbackEncoding
encoding label,
sniffs encoding from BOM with fallbackEncoding fallback and then
decodes the input using that encoding, skipping BOM if it was present.
Notes:
The BOM-sniffed encoding takes precedence over the fallbackEncoding option per spec.
Use with care.
This method is similar to the following code, except that it doesn't support encoding labels and only expects lowercased encoding name:
new TextDecoder(getBOMEncoding(input) ?? fallbackEncoding).decode(input)
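For reference, BOM sniffing itself amounts to comparing the first two or three bytes against the three byte order marks. A minimal sketch with a hypothetical helper name; it only handles Uint8Array input, unlike getBOMEncoding, and is not the library's code:

```javascript
// Hypothetical helper for illustration only
function sniffBOMSketch(bytes) {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) return 'utf-8'
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) return 'utf-16le'
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) return 'utf-16be'
  return null
}

// A detected BOM wins over the fallback, mirroring getBOMEncoding(input) ?? fallbackEncoding
const input = Uint8Array.of(0xff, 0xfe, 0x41, 0x00) // UTF-16LE BOM followed by 'A'
const encoding = sniffBOMSketch(input) ?? 'windows-1252'

console.log(encoding) // utf-16le
console.log(sniffBOMSketch(Uint8Array.of(0x41))) // null
```

This is why the fallback should be treated as a hint only: input that happens to start with BOM bytes is decoded with the sniffed encoding regardless of what the caller passed.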
@exodus/bytes/encoding-lite.js
The exact same exports as @exodus/bytes/encoding.js are also exported as
@exodus/bytes/encoding-lite.js, with the difference that the lite version does not load
multi-byte TextDecoder encodings by default to reduce bundle size 9x (from 90 KiB to 10 KiB gzipped).
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding-lite.js'
import { TextDecoderStream, TextEncoderStream } from '@exodus/bytes/encoding-lite.js' // Requires Streams
// Hooks for standards
import { getBOMEncoding, legacyHookDecode, labelToName, normalizeEncoding } from '@exodus/bytes/encoding-lite.js'
The only affected encodings are: gbk, gb18030, big5, euc-jp, iso-2022-jp, shift_jis
and their labels when used with TextDecoder.
Legacy single-byte encodings are loaded by default in both cases.
TextEncoder and hooks for standards (including labelToName / normalizeEncoding) do not have any behavior
differences in the lite version and support the full range of inputs.
To avoid inconsistencies, the exported classes and methods are exactly the same objects.
> lite = require('@exodus/bytes/encoding-lite.js')
[Module: null prototype] {
TextDecoder: [class TextDecoder],
TextDecoderStream: [class TextDecoderStream],
TextEncoder: [class TextEncoder],
TextEncoderStream: [class TextEncoderStream],
getBOMEncoding: [Function: getBOMEncoding],
labelToName: [Function: labelToName],
legacyHookDecode: [Function: legacyHookDecode],
normalizeEncoding: [Function: normalizeEncoding]
}
> new lite.TextDecoder('big5').decode(Uint8Array.of(0x25))
Uncaught:
Error: Legacy multi-byte encodings are disabled in /encoding-lite.js, use /encoding.js for full encodings range support
> full = require('@exodus/bytes/encoding.js')
[Module: null prototype] {
TextDecoder: [class TextDecoder],
TextDecoderStream: [class TextDecoderStream],
TextEncoder: [class TextEncoder],
TextEncoderStream: [class TextEncoderStream],
getBOMEncoding: [Function: getBOMEncoding],
labelToName: [Function: labelToName],
legacyHookDecode: [Function: legacyHookDecode],
normalizeEncoding: [Function: normalizeEncoding]
}
> full.TextDecoder === lite.TextDecoder
true
> new full.TextDecoder('big5').decode(Uint8Array.of(0x25))
'%'
> new lite.TextDecoder('big5').decode(Uint8Array.of(0x25))
'%'
@exodus/bytes/encoding-browser.js
Same as @exodus/bytes/encoding.js, but in browsers instead of polyfilling just uses whatever the
browser provides, drastically reducing the bundle size (to less than 2 KiB gzipped).
import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding-browser.js'
import { TextDecoderStream, TextEncoderStream } from '@exodus/bytes/encoding-browser.js' // Requires Streams
// Hooks for standards
import { getBOMEncoding, legacyHookDecode, labelToName, normalizeEncoding } from '@exodus/bytes/encoding-browser.js'
Under non-browser engines (Node.js, React Native, etc.) a full polyfill is used as those platforms
do not provide sufficiently complete / non-buggy TextDecoder APIs.
Implementations in browsers have bugs,
but they are fixing them and the expected update window is short.
If you want to circumvent browser bugs, use full @exodus/bytes/encoding.js import.
See GitHub Releases tab