Merge with the latest pywikipedia framework and test the new functions. The main goal is to update wikipedia.py, because I had to apply corrections in lines 725-728 there so that the bot runs reliably and stably.
report; block-wise or individually (whether each entry is signed individually or everything is signed as one block)
tolerance; do not add an entry if a new one follows after a short time (high activity)
19
multi-language support, so that the bot works in general and is usable for other wikis too; possibly contribute it to the pywikipedia bot framework, but first merge/compare with the latest framework version
DONE (engl. only!)
Please do not change!
Version 0.1.0013
Id
What
Script
15
it would be a good idea to use the Toolserver issue tracker (https://jira.toolserver.org/secure/BrowseProjects.jspa) once the number of reported bugs to handle exceeds a certain level, but for the moment it is not needed. the problem is that a normal user cannot add a new project there, so the system here has to provide this service.
had issues with the creation of a BOT-MESSAGE containing some special chars (e.g. ':'). the algorithm should now be more stable, but it only estimates the correct heading, so it could still cause some problems.
Dogma: keep bot maintenance message texts as simple as possible.
if such a message appears again, the bot now reads its discussion page Benutzer Diskussion:DrTrigonBot and prints the content to the output (log file). that informs you about the message and should suppress further display of the same old message.
if the bot is executed by a CRON job on the toolserver, the preprocessing/filtering of the log file output does not work properly. another regex has been added to the output handler; hopefully that helps.
runbotrun.py
10.3
purge page cache through the wiki API, following the example at http://toolserver.org/~drtrigon/wiki-purge.html (which now has to be deleted from the toolserver); implemented as wikipediaAPI.Page(...).purgePageCache(). the issues that occurred here earlier have disappeared, or were due to my own mis-programming or misunderstanding of the API.
wikipediaAPI.py, sum_disc.py
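A minimal sketch of what a purge request to the wiki API can look like, assuming the standard MediaWiki `action=purge` module. The helper name `build_purge_request` and the `API_URL` endpoint are hypothetical and chosen for illustration only; this is not the code of `purgePageCache()`. The sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode

# Assumption: the api.php endpoint of the target wiki (not named in this log).
API_URL = "https://de.wikipedia.org/w/api.php"


def build_purge_request(title, api_url=API_URL):
    """Return (url, body) for a POST request that purges one page's cache.

    Hypothetical helper for illustration, not the bot's own implementation.
    """
    params = {
        "action": "purge",  # MediaWiki API module that purges page caches
        "titles": title,    # several titles could be joined with '|'
        "format": "json",
    }
    return api_url, urlencode(params).encode("utf-8")


url, body = build_purge_request("Benutzer Diskussion:DrTrigonBot")
```

The resulting pair can be passed to any HTTP client as a POST request; current MediaWiki versions expect POST for this module.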
10.2
the renderer given in http://svn.wikimedia.org/viewvc/pywikipedia/trunk/pywikiparser/ looks (very) old; the newest contributions are marked "Modified Mon Oct 29 06:41:05 2007 UTC (19 months, 1 week ago)", and the wiki API seems to do quite a good job. so the bot is using the API at the moment.
wikipediaAPI.py, sum_disc.py
11
because of the improvements in 10.1, the bot is now able to detect level 1 headings (through a workaround, but it is possible); with the old code this was nearly impossible. level 1 headings are therefore taken into account unless the bot has to fall back to the old mode, so the wrong display of discussions in front of level 1 headings (on problematic pages such as Löschdiskussionen bei Benutzerseiten, WP:FzW, WP:Auskunft, Portale and others) should stop now.
wikipediaAPI.py, sum_disc.py
10.1
wikipediaAPI.Page(...).getSections() improved; now it is on a good level and able to find and synchronize the subsection headings in pages
wikipediaAPI.Page(...).getFull() introduced; to help ...getSections() it uses 'expandtemplates' API and therefore should be able to give a low level output of wikitext on pages, however it seems to hide HTML comments for example...
_checkRelevancyTh(...) improved with help of ...getSections() but it still has full fall-back capabilities.
wikipediaAPI.Page(...)._getParsedString(...) introduced and wikipediaAPI.Page(...).getParsedContent() improved; they should work now.
_parseNews improved; thanks to the two methods mentioned before, it was possible to replace _renderWiki(...) (and with it _renderWikiTempl(...)) with _getParsedString(...). the two old functions are useless now and can be deleted.
wikipediaAPI.py, sum_disc.py
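To illustrate what finding headings of every level (including level 1) means, here is a small sketch that scans raw wikitext with a regular expression. It is an assumption-laden stand-in: the bot itself relies on the wiki API, and the names `HEADING_RE` and `find_headings` are hypothetical.

```python
import re

# A heading line is N '=' signs, the title, then the same N '=' signs.
HEADING_RE = re.compile(r"^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def find_headings(wikitext):
    """Return a list of (level, title) tuples in page order.

    Hypothetical helper; not the logic of getSections().
    """
    return [(len(m.group(1)), m.group(2)) for m in HEADING_RE.finditer(wikitext)]


sample = "= Top =\ntext\n== Sub ==\nmore\n== Sub ==\nend\n"
headings = find_headings(sample)
```

Note that the two "Sub" headings come back as separate entries, which is why a position-preserving list is more useful here than a mapping keyed by title.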
7
'_setUser(...)' and '_getUsers()' both used two separate parameter/option handling mechanisms. an entry in Benutzer:DrTrigonBot/Diene Mir! without any parameters in brackets '{...}' already offers a config option: you can specify any target page within your userspace. this option and the "real" options in brackets were handled in 'self.userListInfo' and 'self.param' respectively. those two mechanisms have now been merged, and the target page from 'self.userListInfo' is added to 'self.param' as 'userResultPage'.
Dogma: 'self.param' is the only option/parameter handling mechanism (especially for params changeable by all users through the wiki)
sum_disc.py
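A rough sketch of the merged mechanism, assuming the options are kept in a plain dict. The function `build_param`, its arguments and the default values are hypothetical; only the key 'userResultPage' and the two option names are taken from this log.

```python
# Assumption: made-up default values, for illustration only.
DEFAULTS = {"checkedit_count": 500, "reportchanged_switch": True}


def build_param(defaults, target_page, bracket_options):
    """Merge all option sources into one dict (the single 'param' mechanism).

    Hypothetical helper; the real data layout in sum_disc.py may differ.
    """
    param = dict(defaults)          # start from the bot-wide defaults
    param.update(bracket_options)   # options given in brackets '{...}'
    if target_page:
        # The target page is no longer tracked separately.
        param["userResultPage"] = target_page
    return param


param = build_param(DEFAULTS, "Benutzer:Example/Result", {"checkedit_count": 100})
```

With a single dict, every later lookup goes through one place, whichever way the user supplied the value.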
6
config management/system for running the bot moved to a separate file; all options (global vars) moved there. the variables could be named more nicely (more self-explanatory), but the external option names (parameters that can be set within Benutzer:DrTrigonBot/Diene Mir!) must not be changed! never!
Dogma: the external option names (parameters that can be set within Benutzer:DrTrigonBot/Diene Mir!) must not be changed! never!
checkedit_count: how many recent edits should be checked (mainly for use with BLUbot)
reportchanged_switch: whether discussions that were already reported as new should be listed again when they change (mainly for use with BLUbot)
sum_disc.py
12
the format/system of saved checksums had to be changed once again (3rd or 4th time) because the dict that used subheadings as keys was not unique (all subheadings on a page may have the same name); now the bot is using a list again, but a list of tuples (there are still some open issues with subheadings, which may make another change necessary...)
sum_disc.py
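The uniqueness problem can be shown with a short sketch. It assumes MD5 as the checksum and a (heading, body) input format; both are assumptions, and `section_checksums` is a hypothetical name, not the bot's stored format.

```python
import hashlib


def section_checksums(sections):
    """Map a list of (heading, body) pairs to a list of (heading, checksum).

    Hypothetical helper; MD5 is an assumed choice of checksum.
    """
    return [
        (heading, hashlib.md5(body.encode("utf-8")).hexdigest())
        for heading, body in sections
    ]


# Two subsections sharing one heading, as can happen on discussion pages.
sections = [("Antwort", "first thread"), ("Antwort", "second thread")]
as_list = section_checksums(sections)
as_dict = dict(as_list)  # the second "Antwort" overwrites the first
```

The list of tuples keeps both entries in page order, while the dict silently drops one of them.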
4
add interface for history compression; this was done initially but does not make sense, since the permissions of user 'apache' do not allow it to rewrite the history. the good options are to do it by hand (manually) with a toolserver login from time to time, or to set up a CRON job, e.g. monthly or weekly
Dogma: history compression should be done by hand or CRON job (if not by hand, do auto backup)
panel.py
3
improvement of admin log interface: 'filelist' and 'checkfiles' are not necessary, the whole path can be generated only once just before the call of 'os.remove'!
panel.py
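A minimal sketch of the simplified removal step, assuming the interface receives a log directory and a file name. The function `remove_log` is hypothetical, and the file name check is an added precaution that this log does not mention.

```python
import os
import tempfile


def remove_log(log_dir, name):
    """Delete one log file and return the path that was removed.

    Hypothetical helper; not the code of panel.py.
    """
    # Assumption: accept plain file names only, never paths.
    if os.path.basename(name) != name:
        raise ValueError("not a plain file name: %r" % name)
    path = os.path.join(log_dir, name)  # built once, right before os.remove
    os.remove(path)
    return path


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "bot.log"), "w").close()
    removed = remove_log(tmp, "bot.log")
    still_there = os.path.exists(removed)
```

Building the path only at the point of use means no separate list of files or list of paths has to be kept in sync.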
2
instead of using 'runbotrun.py -auto', a CRON job has now been created on the toolserver; this should take care of the bot timing and be able to deal with server reboots. with this change all related issues are solved. the bot is called with 'runbotrun.py -cron'
runbotrun.py
1
admin interface added for removal of old log files instead of using cyclic logs (e.g. 5 files)
Dogma: the admin interface is used for log file removal; its link is not published