Wikipedia:Manual of Style/Glossaries/DD bug test cases

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by SMcCandlish (talk | contribs) at 06:33, 13 December 2017 (Examination of the main problem: markup tweaks). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Test cases and explanation of the definition list definition (<dd>) bug and how to avoid it.

Overview

Due to MediaWiki's very brittle handling of HTML list structures generally (in particular its inability to handle multiple paragraphs except under a very specific circumstance, outlined below), the ; and : wikimarkup features can produce only simplistic and easily broken definition lists, unlike real HTML. Some of the interrelated bug reports include: 1584 (unresolved as of September 2014), 5178 (unresolved as of September 2014), 6200, etc.

The main problem is detailed below, but there are also other problems with the wikimarkup version of definition lists. Bug reports include: 6776, 11894, and many others.

It is unlikely that all these problems will be fixed within the next several years, because some of the behaviors, especially the termination of a list item by a line break, are seen as "features" by some. (See detailed comments in the bug reports for more information.)

Examination of the main problem

In ; and : markup, if a single definition uses multiple paragraphs, it is necessary to specify the paragraphs explicitly with a <p> or the full <p>...</p> paragraph markup, and to do so without line breaks. We cannot rationally just create a second definition line for a second paragraph, as this indicates two separate definitions, not a multi-paragraph definition, and doing this causes all sorts of problems.

Due to bugs in (or alleged features of) the MediaWiki software (see above), the paragraphs must not be separated by a newline either inside or outside of the paragraph, in wikimarkup-based list items such as definitions:

;term 1
:<p>Part of definition of term 1.</p><p>More of definition of term 1.</p>

and it is highly unlikely that any regularly edited text would retain such precious formatting for long before someone broke it. The same holds true of the use (again without linebreaks) of <br /><br />, as suggested (along with <p>...</p>) at Help:List, which leads to accessibility problems anyway: text visually broken with simply <br /> line breaks this way will not be treated as separate paragraphs by screen readers.
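For illustration, the <br /><br /> variant mentioned above would be written on a single line, roughly as in the following sketch (the sample wording is reused from the earlier example). It has the same fragility: one inserted newline ends the definition. Even when intact, it produces one run of text with a visual gap, not two paragraphs:

```
;term 1
:Part of definition of term 1.<br /><br />More of definition of term 1.
```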

To be clear, this example does not work:

;term 1
:<p>
Part of the definition of term 1.
</p><p>
More of definition of term 1.
</p>

and neither does this one:

;term 1
:<p>Part of definition of term 1.</p>
<p>More of definition of term 1.</p>

nor anything like them (see test cases below).

For the same reason, if a :-initiated definition requires an indented segment, one has to use <blockquote>...</blockquote> around it to get valid code as the result, and must butt those tags against any preceding or following paragraph tags in the same definition.
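A minimal sketch of that rule, with made-up sample wording: the whole definition, including the indented segment, stays on one line, and the <blockquote> tags sit directly against the neighboring paragraph tags. This only illustrates the constraint described above; it has not been checked against any particular MediaWiki version:

```
;term 1
:<p>Beginning of definition.</p><blockquote>Indented text in definition.</blockquote><p>Conclusion of definition.</p>
```

As with the multi-paragraph case, a line break inserted anywhere in the definition line would end the definition early.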

Help:List's suggestion to use : as an indent produces semantically invalid markup. While this is not an enormous concern with quick-and-dirty wikimarkup lists, it would be a major problem for structured glossaries:

Sloppy code:

{{glossary}}
{{term|term 1}}
{{defn|1=Beginning of definition.
:Indented text in definition.
Conclusion of definition.
}}
{{glossend}}

Looks okay...

term 1
Beginning of definition.
Indented text in definition.
Conclusion of definition.

But output is poor:

<dl>
<dt><dfn>term 1</dfn></dt>
<dd>Beginning of definition.
<dl>
<dd>Indented text in definition.</dd>
</dl>
Conclusion of definition.</dd>
</dl>

The rendered output of this looks correct to the sighted human reader, but in reality, the MediaWiki parser has made the indented passage into an entirely new definition list (glossary)! Wikimarkup definition list structures cannot be mixed with structured markup.

The brittle list handling bug/feature is quite general, and also affects ordered (#) and unordered (*) lists.
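As a minimal illustration of the same behavior in an ordered list (the item text is made up for this sketch), a blank line between two # items ends the first list and starts a second one, so both items end up numbered 1:

```
# First item

# Second item
```

Removing the blank line keeps both items in a single list, numbered 1 and 2.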

Simple tricks can be used to make the code more readable for non-technical editors. One such kluge is to put linebreaks between definitions, as already illustrated, giving the false appearance of a well-spaced list, when in reality MediaWiki creates a whole slew of pointless micro-lists, ruining the semantic value of bothering to use definition list markup in the first place. Another is to code a multi-paragraph definition as multiple definitions, but write the prose as if it were a single definition in two paragraphs. And, as already noted, this just blatantly falsifies the semantic markup, resulting in pretty-looking but technically awful output.

Such hacks will lead to confusing Wikipedia code (that other editors are likely to correct anyway), redundant MediaWiki output, and blatantly invalid XHTML, among other problems, resulting in accessibility and usability issues.

To repeat: It just doesn't work, due to MediaWiki bugs/features, but there is an easy solution, as shown below.

NB: Replacing : with a real <dd>...</dd> structure has no effect if ; is used, or vice-versa with <dt>...</dt> and :. The entire structure must be entirely [X]HTML in order to function properly. Which brings us to...

Workaround

The workaround, as illustrated below, is to abandon ; and : entirely for any case in which one intends to produce rich definition lists, including glossaries, and instead use pure XHTML markup: <dl>/<dt>/<dd>. And there is no reason to use HTML manually to produce a glossary when the easy-to-use structured glossary templates will do this for you, and do it consistently with other glossary articles. For non-structured glossaries, use bullet lists or use subheadings and plain-text entries, as recommended at WP:Manual of Style (glossaries).

Test cases

<dl>
;term 1
:<p>
This is part of the definition.
</p><p>
This is more of the definition.
</p>
</dl>
term 1

This is part of the definition.

This is more of the definition.


Failure: Definitions not indented.

<dl>
;term 2
:<p>This is part of the definition.</p>
<p>This is more of the definition.</p>
</dl>
term 2

This is part of the definition.

This is more of the definition.


Failure: Only one definition indented.

<dl>
;term 3
<dd>
<p>
This is part of the definition.
</p><p>
This is more of the definition.
</p>
</dd>
</dl>
term 3

This is part of the definition.

This is more of the definition.


Failure: Definitions not indented.

<dl>
;term 4
<dd>
<p>This is part of the definition.</p>
<p>This is more of the definition.</p>
</dd>
</dl>
term 4

This is part of the definition.

This is more of the definition.


Failure: Definitions not indented.

<dl>
;term 5
<dd><p>
This is part of the definition.
</p><p>
This is more of the definition.
</p></dd>
</dl>
term 5

This is part of the definition.

This is more of the definition.


Failure: Definitions not indented.

<dl>
;term 6
<dd><p>This is part of the definition.</p>
<p>This is more of the definition.</p></dd>
</dl>
term 6

This is part of the definition.

This is more of the definition.


Failure: Definitions not indented.

<dl>
;term 7
<dd>This is the entire definition.</dd>
</dl>
term 7
This is the entire definition.

Failure: Definition not indented.

<dl>
;term 8
<dd><p>This is the entire definition.</p></dd>
</dl>
term 8

This is the entire definition.


Failure: Definition not indented.

<dl>
;term 9
:<p>This is part of the definition.</p><p>This is more of the definition.</p>
</dl>
term 9

This is part of the definition.

This is more of the definition.


Poor: Indentation is typical WP style for this markup, but undesirable for proper use of the tags to create a glossary, because even the term is indented.

<dl>
;term 10
<dd><p>This is part of the definition.</p><p>This is more of the definition.</p></dd>
</dl>
term 10

This is part of the definition.

This is more of the definition.


Failure: Definitions not indented.

<dl>
<dt>term 11</dt>
<dd><p>This is part of the definition.</p><p>This is more of the definition.</p></dd>
</dl>
term 11

This is part of the definition.

This is more of the definition.


Poor: Undesirable indentation of the term is gone, so it looks right, but it requires </p><p> on the same line, which is easily broken.

<dl>
<dt>term 12</dt>
<dd>1. This is the first definition.</dd>
<dd><p>2. This is part of the second definition.</p>
<p>This is more of the second definition.</p></dd>
</dl>

term 12
1. This is the first definition.

2. This is part of the second definition.

This is more of the second definition.


Success: Perfect, and supports <p> properly, which means it will also support <blockquote>, nested lists, etc.

<dl>
<dt>term 13</dt>
:<p>This is part of the definition.</p>
<p>This is more of the definition.</p></dd>
</dl>
term 13

This is part of the definition.

This is more of the definition.


Failure: Only one definition indented.

<dl>
<dt>term 14</dt>

<dd>1. This is the first definition.</dd>

<dd>
2. This is part of the second definition.

This is more of the second definition.
</dd>
</dl>
term 14
1. This is the first definition.
2. This is part of the second definition. This is more of the second definition.

Failure: Lack of auto-markup of paragraphs. This used to work perfectly, because MediaWiki would auto-generate paragraph markup for plain text entered on isolated lines, even inside a <dd>...</dd> wrapper, and didn't have any problem with definitions spaced apart from terms, either. This broke in late 2013.

<dl>
<dt>term 15</dt>
<dd>1. This is the first definition.
<dd><p>2. This is part of the second definition.</p>
<p>This is more of the second definition.</p></dd>
</dl>
term 15
1. This is the first definition.

2. This is part of the second definition.

This is more of the second definition.


Success: Perfect, even with a two-fold coding error (missing </p></dd> on the first definition)

<dl>
<dt>term 16</dt>
<dd>1. This is the first definition.</dd>
<dd>2. This is part of the second definition.
<p>This is more of the second definition.</p></dd>
</dl>
term 16
1. This is the first definition.
2. This is part of the second definition.

This is more of the second definition.


Acceptable: Vertical spacing isn't quite right, but with real paragraphs of content in place, no one would really notice or care.

<dl>
<dt>term 17</dt>
<dd>1. This is the first definition.</dd>
<dd><p>2. This is part of the second definition.</p>
<p>This is more of the second definition.</p></dd>
</dl>
term 17
1. This is the first definition.

2. This is part of the second definition.

This is more of the second definition.


Success: Perfect; no <p> markup required on single-paragraph entries.

<dl>
<dt>term 18</dt>
<dd>1. This is the first definition.</dd>
<dd><p>2. This is part of the second definition.</p>
<p>This is more of the second definition.</p></dd>
<dd>3. This is a third, complex definition:
{{gbq|With a block quotation}}
Another paragraph, and
* An embedded
* List here.
Conclusion.</dd>
<dd>4. Fourth definition, with blank line

to cause paragraph break.</dd>
</dl>
term 18
1. This is the first definition.

2. This is part of the second definition.

This is more of the second definition.

3. This is a third, complex definition:

With a block quotation

Another paragraph, and

  • An embedded
  • List here.
Conclusion.
4. Fourth definition, with blank line to cause paragraph break.

Success: Perfect – everything works as expected.

{{glossary}}
{{term|term A}}
{{defn|This is the definition.}}
{{term|term B}}
{{defn|1. This is the first definition.}}
{{defn|<p>2. This is part of the second definition.</p>
<p>This is more of the second definition.</p>
}}
{{defn|3. This is a third, complex definition:
{{gbq|With a block quotation}}
Another paragraph, and
* An embedded
* List here.
Conclusion.
}}
{{defn|4. Fourth definition, with blank line

to cause paragraph break.}}
{{term|term C}}
{{defn|This is the definition.}}
{{glossary end}}
term A
This is the definition.
term B
1. This is the first definition.

2. This is part of the second definition.

This is more of the second definition.

3. This is a third, complex definition:

With a block quotation

Another paragraph, and

  • An embedded
  • List here.

Conclusion.

4. Fourth definition, with blank line to cause paragraph break.
term C
This is the definition.

Success: Perfect – everything works as expected, using templated version of code.

Developer views

In the process of working on the now-closed MediaWiki bug report Phabricator: T3584 (formerly tracked in Bugzilla), a developer commented:

"Real HTML <li>[...]</li> tags are very rarely used, and then only by HTML-savvy editors who want to apply a class or style attribute, or by editors who cut-and-paste some existing HTML code. Since block or inline content is allowed in list items, this would be the way to allow correct and more complex formatting of the contents of list items, but at the cost of complex and unusual markup in the edit field, incompatible with wikitext lists."