Help:WordToWiki
This help page is a how-to guide. It explains concepts or processes used by the Wikipedia community. It is not one of Wikipedia's policies or guidelines, and may reflect varying levels of consensus.
There are various methods to transfer content from word processor software into the MediaWiki format used on Wikipedia.
Microsoft Word
VisualEditor
VisualEditor allows for the copying/pasting of content from Word documents directly into a wiki page. Most formatting is kept intact – including tables. However, images and advanced formatting may need to be cleaned up upon import.
Word2MediaWikiPlus
The Word2MediaWikiPlus Visual Basic macros date from 2007 and were unmaintained as of 2017, but may still work. In a test with Word from Office 365, the conversion worked despite showing a warning several times. Note: this apparently only works with 32-bit Office installations.
Download from: https://sourceforge.net/projects/word2mediawikip/
Microsoft Office Word Add-in For MediaWiki
Microsoft released an add-in that allows you to save documents from Microsoft Office Word 2007 or later straight into MediaWiki format.
- Download the "Microsoft Office Word Add-in For MediaWiki" from Microsoft Download Center, and install it.
- Save the document as "MediaWiki (*.txt)" file type.
- Copy the text from the (*.txt) file into your wiki page.
Note that this extension does not work with Word 2013 by default; however, it can be made to work with a registry change. See this page.
Possible issues with alternative solution
- This add-in requires Windows as an operating system; it won't work with macOS
- This Microsoft add-in does not handle images. A placeholder is emitted.
- End notes and footnotes can't be converted. Including them in a document will throw an error.
- If you attempt to resolve the previous issue by inserting <ref> tags, Word will replace the angle brackets with &lt; and &gt; upon conversion.
- Some text will be enclosed by <nowiki> and </nowiki> tags.
- Not supported for Office/Word 2013, see Word Add-in For MediaWiki not supported in Word 2013?
Nevertheless, for those who are unfamiliar with MediaWiki Markup Language and who are working on simple articles, the Microsoft Office Word Add-in For MediaWiki can be a useful tool.
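The escaped <ref> tags described in the list above can be repaired after export with a small text filter. The following is a minimal sketch, assuming the add-in writes the brackets literally as &lt; and &gt;; the function name fix_addin_output is hypothetical, and the filter deliberately restores only ref tags rather than every escaped bracket.

```shell
#!/bin/bash
# fix_addin_output (hypothetical helper): read wikitext exported by the
# add-in on standard input and restore <ref> tags that were escaped.
# Assumption: the escaped tags appear literally as &lt;ref ...&gt; and &lt;/ref&gt;.
fix_addin_output() {
  # First expression: plain <ref> and </ref>.
  # Second expression: <ref ...> with attributes (up to the next "&").
  sed -E \
    -e 's#&lt;(/?ref)&gt;#<\1>#g' \
    -e 's#&lt;(ref [^&]*)&gt;#<\1>#g'
}
```

A possible invocation is `fix_addin_output < exported.txt > cleaned.txt`. Other escaped markup is left untouched so that it can be reviewed by hand.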
Two-stage conversion from Word to MediaWiki
The following methods both perform: Word → HTML → MediaWiki.
Automated scripts
The conversion can also be done using a combination of two scripts and two software packages.
- The following two software packages must be installed:
- wvHtml Word to HTML converter – part of the "wvWare" word viewing library. (Note: wvHtml is deprecated and the site recommends using AbiWord --to=html instead. AbiWord can be obtained at abisource.com.)
- HTML::WikiConverter – a Perl module to convert HTML to wiki markup language.
- Write the bash script "doc2mw" and the Perl script "html2mw", both shown below.
- Call doc2mw, passing the Word document as its parameter, e.g.
> doc2mw my_word.doc
- doc2mw
- a bash script taking a single parameter, which calls wvHtml followed by html2mw.
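#!/bin/bash
# doc2mw - Word to MediaWiki converter
# Usage: doc2mw my_word.doc
if [ $# -ne 1 ]; then
  echo "Usage: $0 file.doc" >&2
  exit 1
fi
FILE="$1"
# Use only the base name so that input paths containing directories still work
TMP="$$-$(basename "${FILE}")"
# Prefer an html2mw script in the current directory, otherwise rely on the PATH
if [ -x "./html2mw" ]; then
  HTML2MW='./html2mw'
else
  HTML2MW='html2mw'
fi
# Stage 1: Word -> HTML, written to /tmp/${TMP}
wvHtml --targetdir=/tmp "${FILE}" "${TMP}"
# but see also AbiWord: http://www.abisource.com/help/en-US/howto/howtoexporthtml.html
# Remove extra divs (both opening and closing tags)
perl -pi -e 's/<\/?div[^>]*>//gi;' "/tmp/${TMP}"
# Stage 2: HTML -> MediaWiki markup on standard output
"${HTML2MW}" "/tmp/${TMP}"
rm "/tmp/${TMP}"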
- html2mw
- a Perl script called by doc2mw, which uses HTML::WikiConverter to convert HTML → MediaWiki.
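#!/usr/bin/perl
# html2mw - HTML to MediaWiki converter
use strict;
use warnings;
# The MediaWiki dialect is provided by the HTML::WikiConverter::MediaWiki module
use HTML::WikiConverter;
# Read all HTML from the files named on the command line (or standard input)
my $html = '';
while (<>) { $html .= $_; }
my $wc = HTML::WikiConverter->new( dialect => 'MediaWiki' );
my $wiki = $wc->html2wiki($html);
# Substitutions to get rid of nasty things we don't need
$wiki =~ s/<br \/>//g;
$wiki =~ s/&nbsp;//g;
print $wiki;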
Disclaimer: These scripts are probably not the best way to do this, only one possible way. Please feel free to improve them.
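Because wvHtml is deprecated in favour of AbiWord, the first stage of doc2mw can be swapped out. The following is a minimal sketch, not a tested replacement: it assumes the AbiWord binary is installed as abiword and accepts --to-name to choose the output file, and that the html2mw script above is on the PATH. The function name doc2mw_abiword is hypothetical.

```shell
#!/bin/bash
# doc2mw_abiword (hypothetical name): variant of doc2mw that uses AbiWord
# instead of the deprecated wvHtml for the Word -> HTML stage.
# Assumptions: "abiword" and the html2mw script above are on the PATH.
doc2mw_abiword() {
  local file="$1"
  local tmp status
  # Work in a private temporary directory that is removed afterwards
  tmp="$(mktemp -d)" || return 1
  # Stage 1: Word -> HTML
  abiword --to=html --to-name="${tmp}/out.html" "${file}" || { rm -rf "${tmp}"; return 1; }
  # Stage 2: HTML -> MediaWiki markup on standard output
  html2mw "${tmp}/out.html"
  status=$?
  rm -rf "${tmp}"
  return "${status}"
}
```

A possible invocation is `doc2mw_abiword my_word.doc > my_word.wiki`. Unlike the original script, this sketch does not strip div tags before the second stage.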
OpenOffice or LibreOffice
LibreOffice Writer can save Word documents directly as wikitext: go to File → Export → Save as type: MediaWiki. (Linux users may need to install the libreoffice-wiki-publisher package.) Alternatively, use the command-line utility like this:
soffice --headless --convert-to txt:MediaWiki mydocument.doc
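The same command can be applied to a whole folder of documents. The following is a minimal sketch, assuming soffice is on the PATH; the function name convert_docs_to_wiki and the directory arguments are hypothetical, and --outdir is the soffice option that selects where the converted files are written.

```shell
#!/bin/bash
# convert_docs_to_wiki (hypothetical helper): convert every .doc and .docx
# file in a directory to MediaWiki text using the soffice command above.
# Usage: convert_docs_to_wiki INPUT_DIR OUTPUT_DIR
convert_docs_to_wiki() {
  local in_dir="$1"
  local out_dir="$2"
  local doc
  mkdir -p "${out_dir}" || return 1
  for doc in "${in_dir}"/*.doc "${in_dir}"/*.docx; do
    # Skip glob patterns that matched no file
    [ -e "${doc}" ] || continue
    soffice --headless --convert-to txt:MediaWiki --outdir "${out_dir}" "${doc}"
  done
}
```

For example, `convert_docs_to_wiki ./word_files ./wiki_files` would leave one .txt file per input document in ./wiki_files.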
OpenOffice versions 3.3 and later can send documents in any format it supports (including Microsoft Word) directly to a MediaWiki site, but this does not seem to work under Windows 7. (At least for the German version of OpenOffice 3.3.0, the 'Sun Wiki Publisher' extension must be installed first. Server URL: http://en.wikipedia.org/w/ ) Once you have added the MediaWiki server of your choice, future submissions can happen automatically.
- Open the document in OpenOffice or LibreOffice Writer.
- Go to File → Send-To → To MediaWiki or File → Export → Save file as: MediaWiki
- Select your MediaWiki-server (or click on the button "Add..." to add a new site).
- Select a title and summary for your article, check the box if it's a minor revision.
- Click the send button.
Alternatively, the manual export function can be used: File → Export → choose the 'MediaWiki (.txt)' format. LibreOffice Writer 5 can export a MediaWiki .txt file under Windows 10 if the appropriate 32- or 64-bit Java Runtime Environment (JRE) has been installed and enabled in LibreOffice. The document to be converted has to use styles; for example, headings must be in the Heading 2 style to be bracketed by "==" when converted.
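Since heading conversion depends on the document's styles, it can help to check the exported file before pasting it into a wiki page. The following is a minimal sketch that prints the lines that look like wikitext headings; the function name list_wiki_headings is hypothetical.

```shell
#!/bin/bash
# list_wiki_headings (hypothetical helper): print the lines of an exported
# MediaWiki .txt file that start and end with a run of "=" characters,
# such as "== Title ==".
list_wiki_headings() {
  grep -E '^=+[^=].*=+[[:space:]]*$' "$1"
}
```

If `list_wiki_headings mydocument.txt` prints nothing, the headings in the source document were probably formatted by hand rather than with Heading styles.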
Pandoc
Pandoc is a command-line utility that can convert between many document formats. Once it is installed, converting from Word to MediaWiki looks like this:
$ pandoc -t mediawiki mydocument.docx > mydocument.wiki
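The command above can be wrapped so that embedded images are saved alongside the wikitext. The following is a minimal sketch, assuming pandoc is on the PATH; the function name docx_to_wiki and the NAME_media directory naming are hypothetical choices, while -f, -t, -o and --extract-media are standard pandoc options.

```shell
#!/bin/bash
# docx_to_wiki (hypothetical helper): convert one .docx file to MediaWiki
# markup with pandoc, writing NAME.wiki next to the input and saving any
# embedded images under NAME_media/.
docx_to_wiki() {
  local docx="$1"
  # Strip the .docx extension to build the output names
  local base="${docx%.docx}"
  pandoc -f docx -t mediawiki \
    --extract-media="${base}_media" \
    -o "${base}.wiki" "${docx}"
}
```

For example, `docx_to_wiki mydocument.docx` would write mydocument.wiki and, if the document contains images, a mydocument_media folder. The extracted images still have to be uploaded to the wiki separately.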
See also the online Pandoc tool which can convert an HTML-export of the Word document to MediaWiki format.