With the book club this past semester I read Daphne du Maurier's Rebecca, Gerald Durrell's My Family and Other Animals, and most of Tony Mendez's Argo. I also read Cormac McCarthy's All the Pretty Horses and Oliver Burkeman's 4000 Weeks on my own.

Percent-Encoding Guide

According to STD 66 RFC 3986, the characters of the string !#$&'()*+,/:;=?@[] could have a special meaning in a URI and are reserved. The ASCII alphanumeric characters and those contained in -._~ are unreserved. Any character outside of these two sets must be percent-encoded before inclusion in a URI. This is what the JavaScript encodeURI() function does, in addition to encoding the square brackets [] which were not yet included in the set of URI characters when the superseded RFC 2396 was written.
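To see these sets in action, here is a small sketch that passes each of them through encodeURI(). The variable names are my own and only there for illustration.

```javascript
// The reserved characters from RFC 3986, and the non-alphanumeric unreserved ones.
const reserved = "!#$&'()*+,/:;=?@[]";
const unreservedMarks = "-._~";

// encodeURI() leaves the reserved characters alone, apart from the square brackets.
const encodedReserved = encodeURI(reserved);
console.log(encodedReserved); // !#$&'()*+,/:;=?@%5B%5D

// Unreserved characters are never encoded.
const encodedUnreserved = encodeURI("Az09" + unreservedMarks);
console.log(encodedUnreserved); // Az09-._~

// Anything outside both sets is percent-encoded, as UTF-8 bytes where needed.
const encodedOther = encodeURI(' "<>%\u00e9');
console.log(encodedOther); // %20%22%3C%3E%25%C3%A9
```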

The unreserved characters can always be left unencoded, so we just need to encode some subset of the reserved characters. This subset depends on the URI scheme being used and where in the URI the characters are. The encodeURIComponent() function encodes all of the reserved characters except for !'()* which probably don't need to be encoded as they weren't yet reserved in RFC 2396. We can encode a still smaller subset of the reserved characters in the following cases.
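For comparison, the sketch below runs the same reserved set through encodeURIComponent(). It also includes a stricter variant for the rare case where !'()* do need encoding. The name encodeAllReserved is hypothetical, not a built-in.

```javascript
const reserved = "!#$&'()*+,/:;=?@[]";

// encodeURIComponent() encodes every reserved character except !'()*
const encodedComponent = encodeURIComponent(reserved);
console.log(encodedComponent); // !%23%24%26'()*%2B%2C%2F%3A%3B%3D%3F%40%5B%5D

// Hypothetical helper: additionally encode the five characters that were left alone.
function encodeAllReserved(str) {
	return encodeURIComponent(str).replace(/[!'()*]/g, function (char) {
		return "%" + char.charCodeAt(0).toString(16).toUpperCase();
	});
}

const encodedStrict = encodeAllReserved("!'()*");
console.log(encodedStrict); // %21%27%28%29%2A
```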

Data URIs

RFC 2397 states that the main content section of a data URI will consist of some number of 'uric' characters, and that these characters are defined in RFC 2396. It turns out that any percent-encoded, reserved, or unreserved character is allowed, except for #[] as these three are not 'uric' characters. The code below shows how an SVG data URI might be constructed.

const rectSVG = String.raw`<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100"><rect fill="#69E" x="20" y="8" width="15.2" height="87"/></svg>`;
const rectDataURI = "data:image/svg+xml;charset=UTF-8," + encodeURI(rectSVG).replaceAll("#", "%23");

The resulting string is data:image/svg+xml;charset=UTF-8,%3Csvg%20xmlns=%22http://www.w3.org/2000/svg%22%20viewBox=%220%200%20100%20100%22%3E%3Crect%20fill=%22%2369E%22%20x=%2220%22%20y=%228%22%20width=%2215.2%22%20height=%2287%22/%3E%3C/svg%3E.
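The same steps could be wrapped in a reusable function, with a second function to check that nothing is lost on the way back. This is only a sketch: toDataURI and dataURIPayload are names I made up, and it assumes the data is plain text rather than base64. Note that decodeURI() would leave %23 untouched, so decodeURIComponent() is used for decoding.

```javascript
// Hypothetical helper: build a data URI from a media type and some text.
function toDataURI(mediaType, text) {
	// encodeURI() handles [] and everything outside the two sets; # is done by hand.
	return "data:" + mediaType + "," + encodeURI(text).replaceAll("#", "%23");
}

// Hypothetical helper: recover the text that follows the first comma.
function dataURIPayload(dataURI) {
	const commaIndex = dataURI.indexOf(",");
	return decodeURIComponent(dataURI.slice(commaIndex + 1));
}

const sampleSVG = '<svg xmlns="http://www.w3.org/2000/svg"><rect fill="#69E"/></svg>';
const sampleDataURI = toDataURI("image/svg+xml;charset=UTF-8", sampleSVG);

console.log(sampleDataURI.includes("#")); // false
console.log(dataURIPayload(sampleDataURI) === sampleSVG); // true
```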

Note that String.raw() is helpful if the data contains backslashes, but if it contains backticks or the substring ${ then you can no longer simply paste the data into a template literal.

String.raw`_\`_\${_${"`"}_` === "_\\`_\\${_`_"

Since the square brackets were more recently reserved, I thought they might be allowed in data URIs, but as of now a link like <a href="data:,%23[]">#[]</a> is flagged for an illegal character error by the W3C markup validator. I saw a few GitHub issues like this one in support of unescaped square brackets, so they may be allowed in the future.

Query Strings

The query part of a URI begins after the first question mark and ends at the first # or at the end of the URI. It is composed of the 'query' characters defined in RFC 3986, and these are exactly the same as the 'uric' characters. However, the characters &+= have special purposes under the common form-encoding convention (application/x-www-form-urlencoded): the query string is a set of 'key=value' pairs, separated by ampersands, in which plus signs represent spaces. The equals sign needs to be encoded in the 'key' portion, but not in the 'value' portion, as only the first equals sign separates the two parts. Encode the data as you would for a data URI, then handle these three special characters, and as a final step change the encoding of spaces from %20 to +. This data URI contains two links which compare the encoding of the reserved characters and the space character in a data URI and in a query string.

const inlineStyle = `background-image:url("${rectDataURI}");color-scheme:light dark`;
const vertices = "[[1,0],[0.58,0.58],[0,1],[-0.58,0.58],[-1,0],[-0.58,-0.58],[0,-1],[0.58,-0.58]]";

function encodeQueryValue(val) {
	return encodeURI(val).replace(/[#&'+]|%20/g, function (char) {
		return { "#": "%23", "&": "%26", "'": "%27", "+": "%2B", "%20": "+" }[char];
	});
}

const mirrorPolygonURL = `https://home.6t.lt/66c/mirror_polygon.svg?h=6&v=${encodeQueryValue(vertices)}&i=${encodeQueryValue(inlineStyle)}`;

The above code also encodes the single quote character, as this GitHub issue suggests doing so in some cases. The code generates the URL https://home.6t.lt/66c/mirror_polygon.svg?h=6&v=%5B%5B1,0%5D,%5B0.58,0.58%5D,%5B0,1%5D,%5B-0.58,0.58%5D,%5B-1,0%5D,%5B-0.58,-0.58%5D,%5B0,-1%5D,%5B0.58,-0.58%5D%5D&i=background-image:url(%22data:image/svg%2Bxml;charset=UTF-8,%253Csvg%2520xmlns=%2522http://www.w3.org/2000/svg%2522%2520viewBox=%25220%25200%2520100%2520100%2522%253E%253Crect%2520fill=%2522%252369E%2522%2520x=%252220%2522%2520y=%25228%2522%2520width=%252215.2%2522%2520height=%252287%2522/%253E%253C/svg%253E%22);color-scheme:light+dark.
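One way to check an encoding like this is to parse the result with the standard URLSearchParams class and compare it with the input. The sketch below repeats encodeQueryValue() so that it runs on its own, and adds encodeQueryKey(), a hypothetical companion that also encodes the equals sign for the 'key' portion.

```javascript
function encodeQueryValue(val) {
	return encodeURI(val).replace(/[#&'+]|%20/g, function (char) {
		return { "#": "%23", "&": "%26", "'": "%27", "+": "%2B", "%20": "+" }[char];
	});
}

// Hypothetical companion: a key must not contain an unencoded equals sign.
function encodeQueryKey(key) {
	return encodeQueryValue(key).replaceAll("=", "%3D");
}

const key = "a=b c";
const value = "x=y&z #1 + it's [ok]";
const query = encodeQueryKey(key) + "=" + encodeQueryValue(value);
console.log(query); // a%3Db+c=x=y%26z+%231+%2B+it%27s+%5Bok%5D

// URLSearchParams splits on & and on the first =, then decodes + and %XX.
const parsed = new URLSearchParams(query);
console.log(parsed.get(key) === value); // true
```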

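If you want to encode the whole query string at once, then the reserved characters to be encoded are #'+[] (encodeURI() already takes care of the square brackets), and any ampersands or equals signs that belong to a key or value, rather than acting as separators, have to be encoded by hand. Just remember that all extra encoding has to be done after using encodeURI(), otherwise the percent signs you introduce would themselves be encoded as %25.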
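As a rough sketch of the whole-string approach, the function below (encodeQueryString is a name I invented) assumes every & and = in the input is meant as a separator. The last lines show the double encoding that results from replacing # before calling encodeURI().

```javascript
// Hypothetical helper: encode a complete 'key=value&key=value' string in one pass.
function encodeQueryString(query) {
	// encodeURI() already encodes [], so only #'+ and the space remain.
	return encodeURI(query).replace(/[#'+]|%20/g, function (char) {
		return { "#": "%23", "'": "%27", "+": "%2B", "%20": "+" }[char];
	});
}

const rawQuery = "h=6&v=[[1,0],[0,1]]&note=it's #1 + more";
const encodedQuery = encodeQueryString(rawQuery);
console.log(encodedQuery);
// h=6&v=%5B%5B1,0%5D,%5B0,1%5D%5D&note=it%27s+%231+%2B+more

const params = new URLSearchParams(encodedQuery);
console.log(params.get("note")); // it's #1 + more

// Wrong order: the % added by hand is encoded again by encodeURI().
const doubleEncoded = encodeURI("#1".replaceAll("#", "%23"));
console.log(doubleEncoded); // %25231
```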

Matthew Richardson
