Jump to content

User:Dcoetzee/Wikicode/Specification

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Dcoetzee (talk | contribs) at 16:27, 5 May 2004 (My initial pseudocode proposal!). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
(diff) ← Previous revision | Latest revision (diff) | Newer revision → (diff)

A while ago, on Talk: Binary tree, dze27 said:

Also, perhaps we need some sort of consistent pseudo-language for specifying pseudocode. Something like they do in Introduction to Algorithms, or smilar dze27

The pseudocode in Intro to Algorithms requires some rather complex typesetting (I've had to duplicate it before). This would be better for readers, but not conducive to editing by writers. I agree there needs to be consistency; I've written pseudocode on a number of pages, but I don't claim mine is ideal. I try to focus on a few aspects I consider important:
  • Clarity: the most important; it should be clear what the code does
  • Brevity: the code should not occupy much article space, and should consist mainly of algorithm content as opposed to unnecessary syntax
  • Ease of writing: the code should be easy to write and edit
  • Universality: the code should be clear to programmers familiar with nearly any common language, and not too similar to any one
To facilitate these I've adopted a few conventions, such as Wirth-style := for assignment, use of indention in place of blocks and end marks, and so on, but I appreciate any suggestions for improvements.
Derrick Coetzee 18:30, 4 May 2004 (UTC)

Here's my (that is, DCoetzee's) somewhat incomplete suggestion for a standard pseudocode, intended to meet the above general goals, please criticize, expand, and help wikify and finalize it:

The Wikicode Standard

This is an informal standard for a pseudocode which I call wikicode. The purpose of wikicode is to easily facilitate both writing and reading of clear and concise descriptions of algorithms and data structures. All wikicode will be typeset in a fixed-width font, for example by prefixing each line with spaces. Do not wrap the code in HTML <pre> tags; this disallows additional markup inside the code block. If wikicode occurs in the middle of encyclopedic text, this will be indicated by wrapping it in HTML <code> tags, which also use a fixed-width font.

Functions

A possibly recursive function is defined using the emboldened function keyword. It is given a list of arguments; types may or may not be specified for the arguments, but if they are they will be indicated by the argument name, a colon (:), and then the type name (see Types). Unabbreviated names should be preferred, unless the name is explained in the text or comments:

 function sort(list, comparisonPredicate)
     indented body of function
 function addModK(left : int, right : int, modulus : int)
     indented body of function

Functions are called by simply specifying the name, followed by a parenthesized list of arguments. If there is only one argument, the parentheses may be omitted, if this is clearer:

 addModK(5, 3, 7)
 sort(myList, func(x,y) x < y)
 sin 5

As shown above, anonymous functions may be created with func. You can also generally assume all sorts of useful math, string, etc. functions are available, as long as you explain them, in comments or in the text. However, when manipulating the built-in lists, sets, and maps, please use their standard interface (see their sections).

Comments

Comments are indicated by the use of italics occupying an entire line or a suffix of a line. In the former case, they should form complete (possible imperative) sentences; in the latter case, they should be preceded by //, to delineate them:

  Do something silly.
  i := i + 0       // add nothing to i

Variables

Variables are mutable, are referenced by name, and are assigned new values using the := operator. See Names for naming guidelines. Local variables need not be declared, but if you wish to you can use the var form for this:

 var w               // number of free munchkins
 var x := 2
 var y : int := 5
 var z : int
 var a := 2, b : int, c, d : int := 5

Variables need not be initialized when declared, but may not be used before being initialized.

Conditionals and Loops

Conditional tests can be done with the if-else statement, with if and else emboldened:

 if x = 5
     x is 5 here
 else
     x is not 5 here

There is no then or endif, and there need not be an else branch. The condition must have have bool type (see section Types). The equality comparison operator is simply =. For less-than or equal to (&le;) will be used, for greater-than or equal to (&ge;) will be used, and for not-equal (&ne;) will be used, as in mathematical texts.

There are only four types of loops, the while loop, and the do-while loop, the for-to loop, and the for-in loop. The while loop iterates while the condition holds; the until keyword may also be used to iterate until the condition does not hold. No extra parentheses or keywords are needed:

 while x ≠ 0
      body of while loop

The do-while loop, unlike the while loop, always executes at least once before testing its condition:

 do
     body
 while done = false

The for-to loop steps an integer variables along a range of integer values in increasing order. For decreasing order, for-downto can be used.

 for x := 1 to 10
     body, executed 10 times
 for x := 10 downto 1
     body, executed 10 times

Finally, the for-in loop is used with lists and sets (see section Sets) and iterates over the elements of the list in order or set in some arbitrary order:

 for n in numberList
     do something with n
 for n in numberSet
     do something with n

The keyword end loop can be used to break out of any loop at any time.

Types

There are only a few built-in types available, as well as mechanisms for creating user-defined types. Types (like comments) are always typeset in italics; no ambiguity arises because types should not occur on a line by themselves or be preceded by //.

The simplest built-in type bool has one of two built-in values: true or false. Any bool value is a valid condition for a conditional or loop construct. The built-in type int is an infinite-precision integer type including positive and negative values and zero. The type character includes all conceivable writable characters and is written by surrounding the character with single quotes ('), and string is a string of such characters, surrounded by double-quotes ("). Double-quotes may be used inside a string, as long as it's clear from context where the string ends; the outer double-quotes may be emphasized with bold ("hel"lo") if this is not clear.

There is also a primitive list type, a primitive set type, and a primitive map type. See their respective sections for more details.

A new record type (called a struct in C) may be defined using the record keyword, may be recursive, and must include field names but not necessarily types:

 record employee
     name : string
     age : int
     mentor : employee
     stuff

Records, as well as tagged unions below, are only referred to by reference, and references may contain the special value null to indicate they refer to nothing. Record fields are accessed using ., as in e.name, and such values may be assigned to. Multiple fields may be listed on a line, if commas are used:

 record employee
     name : string, age : int
     mentor : employee, stuff

Types may also be constructed using the tagged union keyword, which defines a tagged union, much like ML's datatype. Tagged unions may be recursive, and assign a tag name to each branch:

 tagged union tree
     Leaf:
     Node:  left : tree, right : tree

Fields are specified after each tag; only one set of these fields is available at a time, depending on the tag. The tag of a tagged union object may be determined using the cases keyword, and within the cases, fields of the correct branch of the tagged union may be accessed:

 cases t
     Leaf: // it's a leaf
     Node: // it's a node, do stuff with t.left and t.right

If a type is not specified for an entity, such as an argument, local variable, or record field, it can assume any value. Operations such as + may "fail" (make the pseudocode invalid) if they may be applied to a value they do not apply to.

List

A list stores a sequenced list of values. The primitive list operations are:

 list(1,2,3)               // the list containing 1, 2, 3 in that order

 emptyList                 // the list containing no elements

 insertFront list value    // inserts new value at front of list; push
 insertEnd list value      // inserts new value at end of list; enqueue
 removeFront list          // removes value from front of list; pop/dequeue
 removeEnd list            // removes value from end of list; undo enqueue
 isEmpty list              // returns true if and only if list is empty
 getFront list             // gets the value at the front of the list
 getEnd list               // gets the value at the end of the list
 list[i]                   // gets ith element of the list, zero for front
 size list                 // gets number of elements in the list
 listToSet list            // get a set containing all elements of the list

You can also use for-in to iterate over a list from front to back.

Set

A set stores an unordered sequence of values with no duplicates. The primitive set operations are:

 set(1,1,2,3)              // the set containing 1, 1, 2, 3
 emptySet                  // the empty set; don't use the math symbol
 insert set value          // insert a new value into the set
 remove set value          // remove a value from the set
 x := extractElement set   // removes a value from the set and returns it
 x ∈ set                   // x is in the set (written &isin;)
 x ∉ set                   // x is not in the set (written &notin;)
 size list                 // gets number of elements in the set
 setToList set             // get a list containing all elements of the set
 

You can also use for-in to iterate over a set in some arbitrary order.

Map

Maps are associative arrays. They take keys and produce associated values. All keys which have not been assigned values map to the special value unassigned, but it may be clearer to readers if you use the unassigned operation below. The primitive operations are:

 map(1 -> "a", 2 -> false) // the map mapping 1 to "a" and 2 to false
 emptyMap                  // the empty map
 map[x]                    // get value associated with x
 map[x] := y               // make x map to y from now on
 map[x] := unassigned      // unassign x
 size list                 // gets number of assigned keys in the map

You cannot iterate over a map.

Names

No two named entities, including functions, variables, and types, may share the same name in a single code listing (although duplicate names are possible in a single article). Valid names may include letters, numbers, greek letters, and mathematical markup, such as subscripts, superscripts, and primes (typed as apostrophes), but names may not start with a number. Names are case-sensitive, but names that differ only by case are strongly discouraged.

Multiword names will use mixed case, always beginning with lowercase. Underscores (_) or other word separators will not be used.

Line length

If possible, no line should exceed 78 characters, but there may be up to 100 characters on a line. Also, if lines can be made shorter without adding additional lines or sacrificing clarity, this should be done.

Indention

The entire code listing will be indented an initial 2 spaces. A function body or the body of a conditional or loop construct must always be indented exactly 4 additional spaces. If a function call's arguments are large enough that they cannot fit on one line, a line break will either follow an operator, a comma, or an opening parenthesis. If it follows a comma, the next line is indented to the column following the initial opening parenthesis of the call. If it follows an operator or parenthesis, it is indented 2 additional spaces. For example (don't really use names like the following):

 function foo(x, y)
     if x > y
         otherFunctionWithGreatBigName(thisArgumentFillsUpTheRestOfTheLine,
                                       argument1ToThePlusOperator +
                                         argument2ToThePlusOperator,
                                       callYetAnotherFunctionWithBigName(
                                         argToYetAnotherFunction))

Naturally, since this is less clear, clarity of naming and line length should be traded off intelligently.

Links may be used anywhere they are useful in pseudocode, particularly in comments and around calls to functions not specified in the current listing. However, if used on an emboldened or italic word, it should be used in addition to the existing markup. It may also be desirable to link tagged union or record the first time they are used conspicuously in an article. Also, use common link sense: link only strongly relevent articles.

{{msg:wikicode}}

A page will be created describing wikicode for readers, and each page which uses wikicode must immediately precede its first conspicuous use with {{msg:wikicode}}, a message which will provide the reader a link to this page.

Please add your own comments here