JSON

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 98.111.252.155 (talk) at 01:42, 28 April 2010 (Script Tag Injection). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
JSON
Filename extension: .json
Internet media type: application/json
Type of format: Data interchange
Extended from: JavaScript
Standard: RFC 4627
Website: http://json.org

JSON (an acronym for JavaScript Object Notation) is a lightweight text-based open standard designed for human-readable data interchange. It is derived from the JavaScript programming language for representing simple data structures and associative arrays, called objects (the “O” in “JSON”). Despite its relationship to JavaScript, it is language-independent, with parsers available for virtually every programming language.

The JSON format was originally specified in RFC 4627 by Douglas Crockford. The official Internet media type for JSON is application/json. The JSON filename extension is .json.

The JSON format is often used for serializing and transmitting structured data over a network connection. It is primarily used to transmit data between a server and web application, serving as an alternative to XML.

History

Although JSON was based on a subset of the JavaScript programming language (specifically, Standard ECMA-262 3rd Edition—December 1999[1]) and is commonly used with that language, it is considered to be a language-independent data format. Code for parsing and generating JSON data is readily available for a large variety of programming languages. json.org provides a comprehensive listing of existing JSON libraries, organized by language.

JSON was used at State Software in 2001. The JSON.org website was launched in 2002. In December 2005, Yahoo! began offering some of its web services in JSON.[2] Google started offering JSON feeds for its GData web protocol in December 2006.[3]

Data types, syntax and example

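JSON's basic types are number, string, Boolean (true or false), array (an ordered sequence of values enclosed in square brackets), object (a collection of key/value pairs enclosed in curly braces), and null.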

The following example shows the JSON representation of an object that describes a person. The object has string fields for first name and last name, contains an object representing the person's address, and contains a list (an array) of phone number objects.

{
     "firstName": "John",
     "lastName": "Smith",
     "age": 25,
     "address": {
         "streetAddress": "21 2nd Street",
         "city": "New York",
         "state": "NY",
         "postalCode": "10021"
     },
     "phoneNumber": [
         { "type": "home", "number": "212 555-1234" },
         { "type": "fax", "number": "646 555-4567" }
     ]
 }

A possible equivalent for the above in XML could be:

<Person>
  <firstName>John</firstName>
  <lastName>Smith</lastName>
  <age>25</age>
  <address>
    <streetAddress>21 2nd Street</streetAddress>
    <city>New York</city>
    <state>NY</state>
    <postalCode>10021</postalCode>
  </address>
  <phoneNumber type="home">212 555-1234</phoneNumber>
  <phoneNumber type="fax">646 555-4567</phoneNumber>
</Person>

Because JSON is a subset of JavaScript, it is possible (but not recommended) to parse JSON text into an object by invoking JavaScript's eval() function. For example, if the above JSON data is contained in a JavaScript string variable contact, the JavaScript object p can be created from it as follows:

 var p = eval("(" + contact + ")");

The contact variable must be wrapped in parentheses to avoid an ambiguity in JavaScript's syntax.[4]

The recommended way, however, is to use a JSON parser. Unless a client absolutely trusts the source of the text, or must parse and accept text which is not strictly JSON-compliant, one should avoid eval(). A correctly implemented JSON parser will accept only valid JSON, preventing potentially malicious code from running.

Modern browsers, such as Firefox 3.5 and Internet Explorer 8, include native support for parsing JSON. Because native browser support is more efficient and secure than eval(), native JSON support was included in the Fifth Edition of the ECMAScript standard.[1]
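
As a minimal sketch of the parser-based approach, the person example above can be read with JSON.parse(), which is assumed here to be available natively or through a library. The names contact and p follow the text; the abbreviated JSON string and the other variable names are for illustration only.

```javascript
// The JSON text from the person example, abbreviated, held in a string.
const contact = '{"firstName": "John", "lastName": "Smith", "age": 25,' +
    ' "address": {"city": "New York", "state": "NY"},' +
    ' "phoneNumber": [{"type": "home", "number": "212 555-1234"},' +
    ' {"type": "fax", "number": "646 555-4567"}]}';

// JSON.parse builds the object without executing the text as code.
const p = JSON.parse(contact);

const city = p.address.city;          // member of a nested object
const fax = p.phoneNumber[1].number;  // array element, then member

// JSON.stringify performs the reverse conversion.
const roundTrip = JSON.parse(JSON.stringify(p));
```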

JSON schema

There are several ways to verify the structure and data types inside a JSON object, much like an XML schema.

JSON Schema[5] is a specification for a JSON-based format for defining the structure of JSON data. JSON Schema provides a contract for what JSON data is required for a given application and how it can be modified, much like what XML Schema provides for XML. It is intended to provide validation, documentation, and interaction control of JSON data. JSON Schema is based on concepts from XML Schema, RelaxNG, and Kwalify, but is itself JSON-based. As a result, JSON data in the form of a schema can be used to validate JSON data, the same serialization and deserialization tools can be used for both the schema and the data, and a schema can be self-descriptive.
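
The idea of a schema that is itself JSON data can be sketched as follows. This is a toy validator written for illustration, not an implementation of the JSON Schema specification: it understands only the "type", "properties", and "items" keywords, treats every property as optional, and the names personSchema, jsonTypeOf, and validate are assumptions.

```javascript
// A schema expressed as ordinary JSON-compatible data.
const personSchema = {
    "type": "object",
    "properties": {
        "firstName": { "type": "string" },
        "age": { "type": "number" },
        "phoneNumber": {
            "type": "array",
            "items": { "type": "object", "properties": { "number": { "type": "string" } } }
        }
    }
};

// Hypothetical helper: name the JSON type of a JavaScript value.
function jsonTypeOf(value) {
    if (value === null) return "null";
    if (Array.isArray(value)) return "array";
    return typeof value; // "object", "string", "number", or "boolean"
}

// Toy validator: returns a list of error messages (empty when valid).
function validate(value, schema, path) {
    path = path || "$";
    const errors = [];
    if (schema.type && jsonTypeOf(value) !== schema.type) {
        errors.push(path + ": expected " + schema.type + ", got " + jsonTypeOf(value));
        return errors;
    }
    if (schema.properties && jsonTypeOf(value) === "object") {
        Object.keys(schema.properties).forEach(function (key) {
            // Properties absent from the data are treated as optional here.
            if (Object.prototype.hasOwnProperty.call(value, key)) {
                errors.push.apply(errors,
                    validate(value[key], schema.properties[key], path + "." + key));
            }
        });
    }
    if (schema.items && Array.isArray(value)) {
        value.forEach(function (item, i) {
            errors.push.apply(errors, validate(item, schema.items, path + "[" + i + "]"));
        });
    }
    return errors;
}
```

Calling validate({"firstName": "John", "age": "25"}, personSchema) reports the mistyped age member, because the schema requires a number and the data supplies a string.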

Using JSON in Ajax

The following JavaScript code shows how the client can use an XMLHttpRequest to request an object in JSON format from the server. (The server-side programming is omitted; it has to be set up to respond to requests at url with a JSON-formatted string.)

var the_object = {};
var http_request = new XMLHttpRequest();
// url is assumed to hold the address of the JSON resource on the server.
http_request.open( "GET", url, true );
http_request.onreadystatechange = function () {
    // readyState 4 means the response is complete; status 200 is HTTP "OK".
    if ( http_request.readyState === 4 && http_request.status === 200 ) {
        the_object = JSON.parse( http_request.responseText );
    }
};
http_request.send(null);

Note that the use of XMLHttpRequest in this example is not cross-browser compatible; syntactic variations are available for Internet Explorer, Opera, Safari, and Mozilla-based browsers. The usefulness of XMLHttpRequest is limited by the same origin policy: the URL replying to the request must reside within the same DNS domain as the server that hosts the page containing the request. Alternatively, the JSONP approach passes the name of a callback function from the client to the server, which allows the client to load JSON-encoded data from third-party domains and to be notified upon completion. This approach imposes some security risks and additional requirements upon the server.
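
The response handler in the XMLHttpRequest example assumes that the body is always valid JSON. A slightly more defensive sketch is shown below; the helper name parseJsonResponse and the shape of its result object are assumptions made for illustration.

```javascript
// Hypothetical helper: turn an HTTP status and response body into a result
// object instead of letting a malformed body raise an uncaught exception.
function parseJsonResponse(status, responseText) {
    if (status !== 200) {
        return { ok: false, error: "HTTP status " + status };
    }
    try {
        return { ok: true, value: JSON.parse(responseText) };
    } catch (e) {
        // JSON.parse throws a SyntaxError on text that is not valid JSON.
        return { ok: false, error: "invalid JSON: " + e.message };
    }
}
```

Inside the onreadystatechange handler, the helper would be called as parseJsonResponse(http_request.status, http_request.responseText) once readyState reaches 4.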

Browsers can also use <iframe> elements to asynchronously request JSON data in a cross-browser fashion, or use simple <form action="url_to_cgi_script" target="name_of_hidden_iframe"> submissions. These approaches were prevalent prior to the advent of widespread support for XMLHttpRequest.

Dynamic <script> tags can also be used to transport JSON data. This technique makes it possible to get around the same origin policy, but it is insecure. JSONRequest has been proposed as a safer alternative.

Security issues

Although JSON is intended as a data serialization format, its design as a subset of the JavaScript programming language poses several security concerns. These concerns center on the use of a JavaScript interpreter to execute JSON text dynamically as JavaScript, which exposes a program to errant or malicious script contained in that text. This is often a chief concern when dealing with data retrieved from the Internet. Executing the text is not the only way to process JSON, but it is an easy and popular technique that stems from JSON's compatibility with JavaScript's eval() function, as the following code examples illustrate.

JavaScript eval()

Because all JSON-formatted text is also syntactically legal JavaScript code, an easy way for a JavaScript program to parse JSON-formatted data is to use the built-in JavaScript eval() function, which was designed to evaluate JavaScript expressions. Rather than using a JSON-specific parser, the JavaScript interpreter itself is used to execute the JSON data to produce native JavaScript objects.

Unless precautions are taken to validate the data first, the eval technique is subject to security vulnerabilities if the data and the entire JavaScript environment are not within the control of a single trusted source. If the data is itself not trusted, for example, it may be subject to malicious JavaScript code injection attacks. Such breaches of trust may also create vulnerabilities for data theft, authentication forgery, and other potential misuse of data and resources. Regular expressions can be used to validate the data prior to invoking eval. For example, the RFC that defines JSON (RFC 4627) suggests using the following code to validate JSON before evaluating it (the variable 'text' is the input JSON):[6]

var my_JSON_object = !(/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(
text.replace(/"(\\.|[^"\\])*"/g, ''))) &&
eval('(' + text + ')');
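
The same two regular expressions can be wrapped in a function, sketched below. The function name validatedEval and the decision to throw on failure are assumptions; throwing makes a rejected input distinguishable from valid JSON text such as false or 0, which the single-expression form cannot do. The check only filters characters that cannot occur outside string literals, so it should not be read as a full grammar validation.

```javascript
// Returns the evaluated value, or throws if the text contains characters
// that cannot occur in JSON outside of string literals.
function validatedEval(text) {
    // Remove string literals first, so their contents are not inspected.
    const stripped = text.replace(/"(\\.|[^"\\])*"/g, '');
    // Anything other than JSON punctuation, number characters, whitespace,
    // or the letters of true, false, and null causes rejection.
    if (/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(stripped)) {
        throw new SyntaxError("text is not JSON");
    }
    return eval('(' + text + ')');
}
```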

A new function, JSON.parse(), was proposed as a safer alternative to eval(), as it is specifically intended to process JSON data and not JavaScript. It was to be included in the Fourth Edition of the ECMAScript standard;[7] it is part of the Fifth Edition, and it is also available as a JavaScript library at http://www.JSON.org/json2.js.
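
The difference between the two functions can be seen with input that is legal JavaScript but not JSON. The following sketch assumes an environment that provides JSON.parse(); the variable names are illustrative.

```javascript
// Text that is legal JavaScript but not JSON: an immediately invoked function.
const hostile = '(function () { return "code ran"; })()';

// eval() executes the text and yields whatever the code returns.
const evalResult = eval('(' + hostile + ')');

// JSON.parse() rejects the text with a SyntaxError and executes nothing.
let parseError = null;
try {
    JSON.parse(hostile);
} catch (e) {
    parseError = e;
}
```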

Native JSON

Recent web browsers either have or are working on native JSON encoding and decoding. Native support removes the eval() security problem described above, because the parser accepts only data and never executes functions, and it is generally faster than the JavaScript libraries commonly used before. As of June 2009 the following browsers have or will have native JSON support:

At least 5 popular JavaScript libraries have committed to use native JSON if available:

Comparison with other formats

XML

XML is often used to describe structured data and to serialize objects. Various XML-based protocols exist to represent the same kind of data structures as JSON for the same kind of data interchange purposes. However, because XML is a general-purpose markup language, these protocols are syntactically more complex and produce larger files than JSON, which is specifically designed for data interchange.

Both lack an explicit mechanism for representing large binary data types such as image data (although binary data can be serialized in either case by applying a general-purpose binary-to-text encoding scheme). JSON lacks references (something XML has via extensions like XLink and XPointer) and has no standard path notation comparable to XPath.
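
As a rough sketch of the binary-to-text approach, the following code carries a few bytes inside a JSON string using hexadecimal encoding. Hexadecimal is chosen only because it is short to implement; Base64 is a more compact and more common choice. The function names and the message layout are assumptions.

```javascript
// Encode an array of byte values (0-255) as a hexadecimal string.
function bytesToHex(bytes) {
    return bytes.map(function (b) {
        return (b < 16 ? "0" : "") + b.toString(16);
    }).join("");
}

// Decode a hexadecimal string back into an array of byte values.
function hexToBytes(hex) {
    const bytes = [];
    for (let i = 0; i < hex.length; i += 2) {
        bytes.push(parseInt(hex.slice(i, i + 2), 16));
    }
    return bytes;
}

// Hypothetical message: the first four bytes of a PNG file carried as text.
const message = JSON.stringify({ name: "icon.png", data: bytesToHex([137, 80, 78, 71]) });
const decoded = hexToBytes(JSON.parse(message).data);
```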

YAML

Both functionally and syntactically, YAML is effectively a superset of JSON.[17] The common YAML library (Syck) also parses JSON.[18] Prior to YAML version 1.2, YAML was not quite a perfect superset of JSON, primarily because it lacked native handling of UTF-32 and required comma separators to be followed by a space.

The most distinguishing point of comparison is that YAML offers the following syntax enrichments which have no corresponding expression in JSON:

Relational:
YAML offers syntax for relational data: rather than repeating identical data later in a document, a YAML document can refer to an anchor earlier in the file or stream. Recursive structures (for example, an array containing itself) can be expressed this way. For example, a film database might list actors (and their attributes) under a movie's cast, and also list movies (and their attributes) under an actor's portfolio.
Extensible:
YAML also offers extensible data types beyond primitives (i.e., strings, floats, ints, bools) which can include class-type declarations.
Blocks:
YAML uses a block-indent syntax to allow formatting of structured data without the use of additional characters (i.e., braces, brackets, quotation marks, etc.). Besides giving YAML a different appearance from JSON, this block-indent device permits text from other markup languages, or even JSON, to be embedded in the other language's native literal style without escaping colliding sigils.

Efficiency

JSON is primarily used for communicating data over the Internet, but it has certain inherent characteristics that may limit its efficiency for this purpose. Most of these are general limitations of textual data formats and also apply to XML and YAML. For example, although JSON is typically generated by machine, it must still be parsed character by character. Additionally, the standard has no provision for data compression, interning of strings, or object references. Compression can be applied to JSON-formatted data, but the decompressed output still requires full parsing to recognize keywords and delimiters.

In practice, performance can be comparable to that of similar binary data formats, and often depends more on implementation quality than on the theoretical limitations of the formats.

JSONP

JSONP or "JSON with padding" is a complement to the base JSON data format, a usage pattern that allows a page to request and more meaningfully use JSON from a server other than the primary server.

Under the same origin policy, a web page served from domain1.com cannot normally connect to or communicate with a server other than domain1.com. An exception is HTML <script> tags. Taking advantage of the open policy for <script> tags, some pages use them to retrieve JSON from other origins.

To see how this works, consider a URL that, when requested, returns a JSON document. A browser requesting the URL would receive something like:

   {"Name": "Cheeso", "Rank": 7}

The Basic Idea: Retrieving JSON via Script Tags

Any URL, including one that returns JSON, can be specified as the src attribute of a <script> tag.

A URL that returns plain JSON is of little use as a script source, however. The browser interprets the response as JavaScript rather than as data: a bare array literal is evaluated and its value discarded, while a bare object literal such as the one above is rejected as a syntax error, because a leading brace begins a block statement. Either way, the response has no effect that the page can observe.

The response does have an effect if the data appears as the argument of a function call. For example, invoke({"Name": "Cheeso", "Rank": 7}) passes the object to invoke(), provided that invoke() is a function defined in the page.

This is how JSONP works. The browser supplies a JavaScript "prefix" to the server in the src URL of the script tag; by convention, the prefix is passed as a named query string argument in the request, e.g.,

 <script type="text/javascript" 
         src="http://domain2.com/getjson?jsonp=parseResponse">
 </script>

The server then wraps its JSON response with this prefix, or "padding", before sending it to the browser. When the browser receives the wrapped response from the server it is now a script, rather than simply a data declaration. In this example, what is received is

   parseResponse({"Name": "Cheeso", "Rank": 7})

...which can cause a change of state within the browser's execution context, because it invokes a function.
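
The exchange can be simulated without a network, as in the sketch below. The function name wrapJsonp is hypothetical, and the restriction of the callback name to a plain identifier is an added precaution rather than part of the description above. The indirect eval call stands in for the browser executing the downloaded script in the global scope.

```javascript
// Server side (hypothetical): wrap the JSON text in the callback name that
// the client supplied in the query string.
function wrapJsonp(callbackName, data) {
    if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
        throw new Error("invalid callback name");
    }
    return callbackName + "(" + JSON.stringify(data) + ")";
}

// Client side: the page defines the callback globally before the script loads.
let received = null;
globalThis.parseResponse = function (data) {
    received = data;
};

// What the server would send back for ...?jsonp=parseResponse
const responseBody = wrapJsonp("parseResponse", { Name: "Cheeso", Rank: 7 });

// Stand-in for the browser running the script element's content.
(0, eval)(responseBody);
```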

The Padding

While the padding (prefix) is typically the name of a callback function that is defined within the execution context of the browser, it may also be a variable assignment, an if statement, or any other JavaScript statement prefix.

Script Tag Injection

A JSONP call requires a script tag. For each new JSONP request, the browser must therefore add a new <script> tag, with the desired value for the src attribute, to the HTML DOM (in other words, inject the tag). The browser then evaluates the element, retrieves the src URL, and executes the response.

In that way, the use of JSONP can be said to allow browser pages to work around the same origin policy via script tag injection.
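
A minimal sketch of script tag injection follows. The helper name requestJsonp is hypothetical, and jsonp is the query parameter name used in the example above, although servers differ on this. The document object is passed in as a parameter so that the function does not depend on a global.

```javascript
// Hypothetical helper: add a <script> element whose src names the callback.
// "doc" is the page's document object.
function requestJsonp(doc, url, callbackName) {
    const script = doc.createElement("script");
    const separator = url.indexOf("?") === -1 ? "?" : "&";
    script.src = url + separator + "jsonp=" + encodeURIComponent(callbackName);
    // Appending the element is what makes the browser fetch and run the URL.
    doc.getElementsByTagName("head")[0].appendChild(script);
    return script;
}
```

In a browser, the call requestJsonp(document, "http://domain2.com/getjson", "parseResponse") would produce the same request as the static script tag shown earlier.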

Basic Security concerns

Because JSONP makes use of script tags, calls are essentially open to the world. For that reason, JSONP may be inappropriate for carrying sensitive data.[19]

Including script tags from remote sites allows the remote sites to inject any content into a website. If the remote sites have vulnerabilities that allow JavaScript injection, the original site can also be affected.

Cross-site request forgery

Naïve deployments of JSONP are subject to cross-site request forgery attacks (CSRF or XSRF).[20] Because the HTML <script> tag does not respect the same origin policy in web browser implementations, a malicious page can request and obtain JSON data belonging to another site. This will allow the JSON-encoded data to be evaluated in the context of the malicious page, possibly divulging passwords or other sensitive data if the user is currently logged into the other site.

This is only a problem if the JSON-encoded data contains sensitive information that should not be disclosed to a third party, and the server depends on the browser's Same Origin Policy to block the delivery of the data in the case of an improper request. There is no problem if the server determines the propriety of the request itself, only putting the data on the wire if the request is proper. Cookies are not by themselves adequate for determining if a request was authorized. Exclusive use of cookies is subject to cross-site request forgery.
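
One way for a server to determine the propriety of a request itself is to require a secret value in addition to the cookie. The sketch below assumes a hypothetical request object with cookies and query members and a session object holding a token issued to the legitimate page; none of these names come from a particular framework.

```javascript
// Hypothetical server-side check. The session cookie identifies the user,
// but browsers attach cookies to cross-site requests automatically, so the
// request must also carry a token that a third-party page cannot know.
function isRequestAuthorized(request, session) {
    if (!session || request.cookies.sessionId !== session.id) {
        return false; // no matching logged-in session
    }
    // The token was issued to the legitimate page, which echoes it back.
    return typeof request.query.token === "string" &&
        request.query.token === session.csrfToken;
}
```

A forged request from another site would arrive with the user's cookie but without the token, and would therefore be refused before any data is put on the wire.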

History

The original proposal for JSONP appears to have been made by Bob Ippolito in 2005.[21] It is now used by many Web 2.0 applications, such as Dojo Toolkit applications, Google Web Toolkit applications,[22] and web services. Further extensions of this protocol have been proposed that take additional input arguments, for example JSONPP,[23] which is supported by S3DB web services.

Object references

The JSON standard does not support object references, but the Dojo Toolkit illustrates how conventions can be adopted to support such references using standard JSON. Specifically, the dojox.json.ref module provides support for several forms of referencing including circular, multiple, inter-message, and lazy referencing.[24]
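
The following sketch shows why a convention is needed and what one might look like. The "id" and "$ref" member names and the resolveRefs function are assumptions for illustration; the code is not claimed to match the behavior of dojox.json.ref.

```javascript
// A structure that refers to itself cannot be serialized directly.
const movie = { title: "Example Film", cast: [] };
const actor = { name: "Example Actor", films: [movie] };
movie.cast.push(actor);

let circularError = null;
try {
    JSON.stringify(movie);
} catch (e) {
    circularError = e; // a TypeError reporting the circular structure
}

// Hypothetical convention: objects carry an "id", and a repeated occurrence
// is written as {"$ref": id} in otherwise standard JSON.
const encoded = '{"id": "m1", "title": "Example Film", "cast": ' +
    '[{"id": "a1", "name": "Example Actor", "films": [{"$ref": "m1"}]}]}';

function resolveRefs(root) {
    const byId = {};
    // First pass: index every object that declares an id.
    (function index(node) {
        if (node && typeof node === "object") {
            if (typeof node.id === "string") byId[node.id] = node;
            Object.keys(node).forEach(function (k) { index(node[k]); });
        }
    })(root);
    // Second pass: swap each {"$ref": id} placeholder for the indexed object.
    (function link(node) {
        if (node && typeof node === "object") {
            Object.keys(node).forEach(function (k) {
                const child = node[k];
                if (child && typeof child === "object" && typeof child.$ref === "string") {
                    node[k] = byId[child.$ref];
                } else {
                    link(child);
                }
            });
        }
    })(root);
    return root;
}

const restored = resolveRefs(JSON.parse(encoded));
```

After resolution, restored.cast[0].films[0] is the same object as restored, so the circular relationship is rebuilt in memory even though the transmitted text was plain JSON.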

References

  1. ^ Crockford, Douglas (May 28, 2009). "Introducing JSON". json.org. Retrieved July 3, 2009.
  2. ^ Yahoo!. "Using JSON with Yahoo! Web services". Retrieved July 3, 2009.
  3. ^ Google. "Using JSON with Google Data APIs". Retrieved July 3, 2009.
  4. ^ Crockford, Douglas (July 9, 2008). "JSON in JavaScript". json.org. Retrieved September 8, 2008.
  5. ^ http://json-schema.org
  6. ^ Douglas Crockford (July 2006). "IANA Considerations". The application/json Media Type for JavaScript Object Notation (JSON). IETF. sec. 6. doi:10.17487/RFC4627. RFC 4627. Retrieved October 21, 2009.
  7. ^ Crockford, Douglas (December 6, 2006). "JSON: The Fat-Free Alternative to XML". Retrieved July 3, 2009.
  8. ^ "Using Native JSON". June 30, 2009. Retrieved July 3, 2009.
  9. ^ Barsan, Corneliu (September 10, 2008). "Native JSON in IE8". Retrieved July 3, 2009.
  10. ^ "Web specifications supported in Opera Presto 2.5". March 10, 2010. Retrieved March 29, 2010.
  11. ^ Hunt, Oliver (June 22, 2009). "Implement ES 3.1 JSON object". Retrieved July 3, 2009.
  12. ^ "YUI 2: JSON utility". September 1, 2009. Retrieved October 22, 2009.
  13. ^ "Learn JSON". April 7, 2010. Retrieved April 7, 2010.
  14. ^ "Ticket #4429". May 22, 2009. Retrieved July 3, 2009.
  15. ^ "Ticket #8111". June 15, 2009. Retrieved July 3, 2009.
  16. ^ "Ticket 419". October 11, 2008. Retrieved July 3, 2009.
  17. ^ Ben-Kiki, Oren; Evans, Clark; döt Net, Ingy (May 13, 2008). "YAML Ain't Markup Language (YAML) Version 1.2". Retrieved July 3, 2009. YAML can therefore be viewed as a natural superset of JSON, offering improved human readability and a more complete information model. This is also the case in practice; every JSON file is also a valid YAML file. This makes it easy to migrate from JSON to YAML if/when the additional features are required.
  18. ^ RedHanded (April 7, 2005). "YAML is JSON". Retrieved July 3, 2009.
  19. ^ RIAspot. "JSON P for Cross Site XHR".[dead link]
  20. ^ Grossman, Jeremiah (January 27, 2006). "Advanced Web Attack Techniques using GMail". Retrieved July 3, 2009.
  21. ^ "Remote JSON - JSONP". from __future__ import *. Bob.pythonmac.org. December 5, 2005. Retrieved September 8, 2008.
  22. ^ "GWT Tutorial: How to Read Web Services Client-Side with JSONP". Google Web Toolkit Applications. February 6, 2008. Retrieved July 3, 2009.
  23. ^ Almeida, Jonas (June 11, 2008). "JSON, JSONP, JSONPP?". S3DB. Retrieved April 26, 2009.
  24. ^ Zyp, Kris (June 17, 2008). "JSON referencing in Dojo". Retrieved July 3, 2009.