URI scheme

A URI scheme is the top level of the uniform resource identifier (URI) naming structure in computer networking. All URIs and absolute URI references consist of a scheme name, followed by a colon character (":"), followed by the remainder of the URI. The outdated RFCs 1738 and 2396 call this remainder the scheme-specific part; the current STD 66/RFC 3986 does not use that term. The syntax and semantics of the scheme-specific part are left largely to the specifications governing individual schemes, subject to certain constraints such as reserved characters and how to "escape" them.

URI schemes are frequently and incorrectly referred to as "protocols", or specifically as URI protocols or URL protocols, since most were originally designed to be used with a particular protocol and often have the same name. The http scheme, for instance, is generally used for interacting with web resources using the Hypertext Transfer Protocol (HTTP). Today, URIs with that scheme are also used for purposes unrelated to the protocol, such as RDF resource identifiers and XML namespaces. Furthermore, some URI schemes are not associated with any specific protocol (e.g. "file"), and many others do not use the name of a protocol as their prefix (e.g. "news").

URI schemes should be registered with IANA, although non-registered schemes are used in practice. RFC 7595 describes the procedures for registering new URI schemes.

Generic syntax

Internet standard STD 66 (also RFC 3986) defines the generic syntax to be used in all URI schemes. Every URI is defined as consisting of four parts, as follows:

<scheme name> : <hierarchical part> [ ? <query> ] [ # <fragment> ]

The scheme name consists of a sequence of characters beginning with a letter and followed by any combination of letters, digits, plus ("+"), period ("."), or hyphen ("-"). Although schemes are case-insensitive, the canonical form is lowercase and documents that specify schemes must do so with lowercase letters. The scheme name is followed by a colon (":").
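The scheme rules above can be checked mechanically. The following is a minimal Python sketch, not a full URI validator; the names `SCHEME_RE` and `split_scheme` are illustrative and do not come from any specification.

```python
import re

# Scheme grammar as described above: a letter, followed by any
# combination of letters, digits, "+", "." or "-".
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def split_scheme(uri):
    """Return (scheme, remainder), or raise ValueError if the text
    before the first colon is not a valid scheme name."""
    scheme, colon, rest = uri.partition(":")
    if not colon or not SCHEME_RE.fullmatch(scheme):
        raise ValueError(f"no valid scheme in {uri!r}")
    # Schemes are case-insensitive; the canonical form is lowercase.
    return scheme.lower(), rest


if __name__ == "__main__":
    print(split_scheme("HTTP://example.com/"))
    print(split_scheme("urn:example:animal:ferret:nose"))
```

Lowercasing the result reflects the canonical form only; the input is still accepted in any letter case.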

The hierarchical part of the URI is intended to hold identification information that is hierarchical in nature. If this part begins with a double forward slash ("//"), the slashes are followed by an authority part and a path. If it does not begin with "//", it contains only a path.

The authority part holds an optional user-information part, terminated with "@" (e.g. username:password@); a hostname (e.g. a domain name or IP address); and an optional port number, preceded by a colon (":").

The path part, if present, may optionally begin with a single forward slash ("/"). It may not begin with two slash characters ("//"). The path is a sequence of segments (conceptually similar to directories, though not necessarily representing them) separated by a forward slash ("/"). Historically, each segment was specified to contain parameters separated from it using a semicolon (";"), though this was rarely used in practice and current specifications allow but no longer specify such semantics.
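As an illustration of the authority and path parts, the sketch below uses `urlsplit` from Python's standard `urllib.parse` module. The helper name `describe_hierarchical_part` is hypothetical, and the sample URI reuses this article's placeholder user, host and port values.

```python
from urllib.parse import urlsplit


def describe_hierarchical_part(uri):
    """Break the authority and path of a URI into their pieces."""
    parts = urlsplit(uri)
    return {
        "userinfo": (parts.username, parts.password),  # optional, ends with "@"
        "host": parts.hostname,                        # domain name or IP address
        "port": parts.port,                            # optional, follows ":"
        # Segments are separated by "/"; a leading "/" yields an empty first item.
        "segments": parts.path.split("/"),
    }


if __name__ == "__main__":
    print(describe_hierarchical_part(
        "abc://username:password@example.com:123/path/file"))
```

When the URI has no "//", `urlsplit` reports no authority and the attributes for user information, host and port are `None`.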

The query is an optional part, separated from the preceding part by a question mark ("?"), that contains additional identification information that is not hierarchical in nature. The generic syntax does not define the internal structure of the query string; by convention, however, it is most often a sequence of <key>=<value> pairs separated by a semicolon [1][2][3] or an ampersand. For example:

Semicolon: key1=value1;key2=value2;key3=value3
Ampersand: key1=value1&key2=value2&key3=value3
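Because the pair syntax is only a convention, a parser has to choose which separators to accept. The sketch below, with the hypothetical name `parse_query`, accepts both the semicolon and the ampersand and deliberately skips percent-decoding to stay short.

```python
import re


def parse_query(query):
    """Split a query string into (key, value) pairs.

    Accepts ";" and "&" as separators. Percent-decoding is omitted
    here; a real parser would decode keys and values afterwards.
    """
    pairs = []
    for field in re.split(r"[;&]", query):
        if not field:
            continue  # ignore empty fields such as a trailing separator
        key, _, value = field.partition("=")
        pairs.append((key, value))
    return pairs


if __name__ == "__main__":
    print(parse_query("key1=value1;key2=value2;key3=value3"))
    print(parse_query("key1=value1&key2=value2&key3=value3"))
```

Both sample strings yield the same list of three pairs, which is the point of supporting either separator.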

The fragment is an optional part separated from the preceding parts by a hash ("#"). It holds additional identifying information that points to a secondary resource within the primary resource identified by the rest of the URI, e.g. a section heading in an article. When the primary resource is an HTML document, the fragment is often the id attribute of a specific element, and web browsers will scroll that element into view.
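Separating the fragment from the rest of a URI can be sketched with `urldefrag` from Python's standard `urllib.parse` module. The wrapper name `split_fragment` is illustrative only.

```python
from urllib.parse import urldefrag


def split_fragment(uri):
    """Return (uri_without_fragment, fragment).

    The fragment is an empty string when the URI has no "#".
    """
    primary, fragment = urldefrag(uri)
    return primary, fragment


if __name__ == "__main__":
    print(split_fragment(
        "abc://username:password@example.com:123/path/file?key=value#fragid1"))
```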

Examples

The following figure displays two example URIs and their component parts.

                        hierarchical part
            ┌───────────────────┴─────────────────────┐
                        authority               path
            ┌───────────────┴───────────────┐┌───┴────┐
      abc://username:password@example.com:123/path/file?key=value#fragid1
      └┬┘   └───────┬───────┘ └────┬────┘ └┬┘           └───┬───┘ └──┬──┘
scheme name  user information     host    port            query   fragment

      urn:example:animal:ferret:nose
      └┬┘ └───────────┬────────────┘
scheme name          path
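The decomposition in the figure can be reproduced with the regular expression given in Appendix B of RFC 3986. The sketch below applies it to both example URIs; the names `URI_RE` and `split_uri` are illustrative, and components that are absent come back as `None`.

```python
import re

# Regular expression from RFC 3986, Appendix B. Groups 2, 4, 5, 7 and 9
# hold the scheme, authority, path, query and fragment respectively.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(uri):
    """Split a URI reference into its five generic components."""
    m = URI_RE.match(uri)  # every group is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


if __name__ == "__main__":
    for example in (
        "abc://username:password@example.com:123/path/file?key=value#fragid1",
        "urn:example:animal:ferret:nose",
    ):
        print(split_uri(example))
```

This expression only separates the components; it does not validate them, so a check of the scheme or authority syntax would be a separate step.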

References

[1] RFC 1866, section 8.2.1, by Tim Berners-Lee (1995): encourages CGI authors to support ';' in addition to '&'.
[2] HTML 4.01 Specification, Implementation and Design Notes: "CGI implementors support the use of ";" in place of "&" to save authors the trouble of escaping "&" characters in this manner."
[3] Hypertext Markup Language - 2.0: "CGI implementors are encouraged to support the use of ';' in place of '&'".
