User:HastyBot

This is the page for HastyBot. A bot is a piece of software that performs edits to the wiki automatically. Kevin is this bot's wrangler!

If you want to see what it is doing, you might need to choose 'show bots' on Special:RecentChanges. Schedule:Duration indicates how often you think the task should run. Currently, running the bot is a fairly manual process.

HastyBot is actually several scripts developed from a common core of freely available code. HastyBot started out written in Python, using the comprehensive Python Wikipedia Bot framework (pywikipedia). However, I have recently been learning Perl and found the MediaWiki::Bot framework more comfortable and to my taste, so new HastyBot scripts are written in Perl.

Tasks HastyBot is currently doing

 * 1) Add RatingBar to the top of a page - Schedule:Weekly [kbnewpages.pl run from kevin/bin of merode3, kevin's crontab]
 * 2) Monthly stats from the statistics page and active users - Schedule:Monthly, 1st of month [kbstats.pl run from kevin/bin of merode3, kevin's crontab]
 * 3) Invalidates the statistics templates so that the main page cache is refreshed and the stats appear on the main page; see Template:PagesAndUsers - Schedule:Daily [kb_main_page_template_invalidate.pl and kb_TopTenPages_template_invalidate.pl run from kevin/bin of merode3, kevin's crontab]
 * 4) Check and mark broken external links - Schedule:Monthly; broken links are reported on Talk pages [2am, 1st of month; Kevin@merode using weblinkchecker.py from the u1 dir, kevin's crontab]
 * This is the current working of the bot. Now that I have messed around with the bots some more, I think we can get the behaviour you want happening... It will take a little work... --Kev-The-Hasty 15:14, 20 November 2009 (UTC)
 * I will elaborate my "dream behavior" on my whiteboard. --Pitpat 08:00, 23 November 2009 (UTC)
 ** For now, could you just add those talk pages to a meta-category, as !advert, stubs etc. are? That way we have a list of pages that need updating - perfect for a sprint when somebody feels uninspired to write something. Thanks. --Pitpat 17:30, 18 April 2010 (UTC) Done - see Category:!Dead Link
 * Why is it timing out after 10 seconds of seemingly no activity?
 * Can you please move the script to kevin/bin and schedule it there? It doesn't seem to have run since June.
 ** Good point - will do. I have run the script now (20/11/2010) and found lots of new bad links - especially ywam.org links, thanks to the site redesign, which breaks more than a few links on our site. I will try to list them...
 ** The pywikipedia bot has now been moved to the merode3 server and runs monthly - first to re-check the links already marked bad, then a second pass to scan for fresh bad links. --Kev-The-Hasty 21:43, 23 November 2010 (UTC)
 ** I still don't see the links added to the Dead Link category. Maybe tomorrow?
 ** Dead links are dead? --Pitpat 21:07, 2 February 2011 (UTC)
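
For reference, the four scheduled tasks above might look something like this in kevin's crontab on merode3. Only the 2am, 1st-of-month slot for the link checker is stated above; the other times, and the exact paths, are illustrative guesses:

```
# illustrative crontab sketch - only the 2am slot for task 4 is documented
0 3 * * 0  $HOME/bin/kbnewpages.pl                          # 1) RatingBar - weekly
0 3 1 * *  $HOME/bin/kbstats.pl                             # 2) stats - monthly, 1st
0 4 * * *  $HOME/bin/kb_main_page_template_invalidate.pl    # 3) daily cache purge
5 4 * * *  $HOME/bin/kb_TopTenPages_template_invalidate.pl  # 3) daily cache purge
0 2 1 * *  python $HOME/u1/weblinkchecker.py                # 4) link check - 2am, 1st
```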

Requests for HastyBot

 * Heading Beautifier: On any page with level 1 headings (bad wiki style), e.g. = My Title =, increase all headings by one level. All titles in capitals should be turned to initial caps, watching out for abbreviations. Schedule:Monthly? - Kevin (PS: needs some Python wrangling to make it do this - might be beyond my skills!)
 * Wikipedia has a policy that spelling changes should not be made fully automatically (by a bot); they should at least be semi-automatic (the bot proposing changes). So I think capitalization should be handled with care here. +1 for level 1 headings (that may apply to == == as well, when no second == == section exists), but what if the heading in the article differs from the article title? --Pitpat 15:33, 19 November 2009 (UTC)
 * Well, we can make our own policies here if we want! I may not have been clear, but there are some style problems. H1 is reserved for article titles (not editable). Users are supposed to use == (H2), not = (H1), as the top heading in articles. I propose the bot find articles with incorrect headings and shift everything up by one: = to ==, == to ===, etc. I do this manually often enough, and it is a tedious task well suited to a bot.
 * The other thing is that many articles have headings in UPPER CAPS. This is ugly, a style left over from typewriter days. My proposal is that headings be transformed to Initial Case Headings, excluding some common abbreviations: YWAM, DTS, SOE, UofN, etc. Of course, bot edits can be reviewed manually afterwards, or at the command line, change by change...
 * OK, those are tasks for a robot, true. I thought of something else: a heading for the whole article, often created when importing docs. Most of the time this information is redundant, so the first heading could be removed and all other headings moved one level... down. (For me it is up - promoting a heading to a more important one - but anyway.) --Pitpat 08:30, 20 November 2009 (UTC)
 * Yes, so in this case we want the bot to lift headings that start at level 3 or deeper up to level 2, as well as shift headings that start at level 1 down to level 2... That could easily be coded into one bot. --Kev-The-Hasty 15:14, 20 November 2009 (UTC)
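
A minimal Python sketch of the two transformations discussed above - shifting heading levels and de-capitalizing headings. The regex, function names, and abbreviation list are illustrative, not the actual bot code:

```python
import re

# Abbreviations to leave alone when down-casing headings (from the list above)
ABBREVIATIONS = {"YWAM", "DTS", "SOE", "UofN"}

def shift_headings(text, delta):
    """Shift every wikitext heading by delta levels (positive demotes,
    e.g. = Title = becomes == Title ==); levels are clamped to 2..6."""
    def repl(m):
        level = max(2, min(6, len(m.group(1)) + delta))
        return "=" * level + m.group(2) + "=" * level
    return re.sub(r"^(=+)([^=\n]+?)(=+) *$", repl, text, flags=re.MULTILINE)

def initial_caps(heading):
    """Turn an ALL CAPS heading into Initial Caps, keeping known abbreviations."""
    return " ".join(w if w in ABBREVIATIONS else w.capitalize()
                    for w in heading.split())
```

shift_headings(page_text, 1) handles the = to == case; a delta of -1 would promote headings on pages that start too deep, with the clamp keeping everything at H2 or below.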


 * Inter-Linker
 * All words in [ ] or { } should not have any replacements made - this is to avoid nasty recursion problems.
 * All references to SOE should link to SOE.
 * The list of abbreviations could be User:HastyBot/AbbrForLinking
 * Format suggestions:
 ** One abbreviation per line
 ** Optional target separated by |
 ** Support for regular expressions? (if easily implemented)
 ** Otherwise whole-word matching
 * Catching general spelling mistakes or issues such as ywam -> YWAM
 * The parsing of the suggestions needs to be strict - HastyBot should fail noisily if the page makes no sense.

Example:
 SOE|School of Evangelism
 Kevin|User:Kevin
 /YWAMers?/i|YWAMer
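
A sketch of how the parser and replacement pass could work, in Python (the function names are made up, and a real implementation would be Perl, per the note at the top). It fails noisily on a malformed rule page, skips anything already inside [ ] or { }, and treats plain abbreviations as whole-word matches:

```python
import re

def parse_abbr_page(text):
    """Parse rules in the format sketched above: one rule per line,
    'pattern|target' with the target optional, and /regex/flags patterns.
    Fails noisily (ValueError) when a line makes no sense."""
    rules = []
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line:
            continue
        pattern, _, target = (p.strip() for p in line.partition("|"))
        if not pattern:
            raise ValueError("line %d: no pattern before '|'" % lineno)
        m = re.fullmatch(r"/(.+)/([a-z]*)", pattern)
        if m:  # /regex/flags form
            flags = re.IGNORECASE if "i" in m.group(2) else 0
            rx = re.compile(m.group(1), flags)
        else:  # plain abbreviation: whole-word match only
            rx = re.compile(r"\b%s\b" % re.escape(pattern))
        rules.append((rx, target or pattern))
    return rules

def apply_rules(text, rules):
    """Link each match as [[target|match]], skipping spans already inside
    [ ] or { } so existing links and templates are never rewritten."""
    parts = re.split(r"(\[[^\]]*\]|\{[^}]*\})", text)
    for i, part in enumerate(parts):
        if part.startswith(("[", "{")):
            continue  # already bracketed - leave alone
        for rx, target in rules:
            part = rx.sub(lambda m, t=target: "[[%s|%s]]" % (t, m.group(0)), part)
        parts[i] = part
    return "".join(parts)
```

One caveat a real version would need to handle: each rule runs on the output of the previous one, so a target containing another abbreviation could still be linked recursively.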


 * Check Category consistency
 * Does an article have both a subcategory and its parent (or grandparent) category? Remove the more general one.
 * If the template is used, does it link to a category page that links back to this page?
 * If the template is used, are all linked articles linking to the same languages with the same articles (i.e. is the text in ... the same?)
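
The first check - an article carrying both a category and one of its ancestors - is pure set logic and can be sketched without touching the wiki API. The data structures and category names below are made up for illustration:

```python
def ancestors(cat, parents):
    """All transitive parent categories of cat, where parents maps a
    category name to the set of categories it directly belongs to."""
    seen, stack = set(), list(parents.get(cat, ()))
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen

def redundant_categories(page_cats, parents):
    """Categories on a page that are ancestors of another category on the
    same page - the 'more general' ones the bot would propose removing."""
    page_cats = set(page_cats)
    return {c for c in page_cats
            if any(c in ancestors(other, parents)
                   for other in page_cats if other != c)}
```

For example, a page filed under both a hypothetical Category:DTS and its parent Category:Training would have Training flagged for removal.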


 * Calculate number of active users (e.g. min. 5 edits) - DONE, see User:Pitpat/Extension:ActiveUsers

For Reference: Tasks the Python HastyBot is capable of doing
From the CONTENTS file:

Utilities

 * basic.py              : Is a template from which simple bots can be made.
 * checkusage.py         : Provides a way for users of the Wikimedia toolserver to check the use of images from Commons on other Wikimedia wikis.
 * extract_wikilinks.py  : Two bots to get all linked-to wiki pages from an HTML file. They differ in their output: extract_names gives bare names (usable by solve_disambiguation.py, table2wiki.py or windows-chars.py), while extract_wikilinks gives them in interwiki-link format (usable by interwiki.py)
 * followlive.py         : Periodically grab the list of new articles and analyze them. If the article is too short, a menu will let you easily add a template.
 * get.py                : Script to get a page and write its contents to standard output.
 * login.py              : Log in to an account on your "home" wiki.
 * splitwarning.py       : Split an interwiki.log file into warning files for each separate language. Suggestion: zip the created files, put them somewhere on the internet, and announce their location on the robot mailing list.
 * test.py               : Check whether you are logged in.
 * testfamily.py         : Check whether you are logged in all known languages in a family.
 * xmltest.py            : Read an XML file (e.g. the sax_parse_bug.txt sometimes created by interwiki.py), and if it contains an error, show a stacktrace with the location of the error.
 * editarticle.py        : Edit an article with your favourite editor. Run the script with the "--help" option to get detailed information on possibilities.
 * sqldump.py            : Extract information from local cur SQL dump files, like the ones at http://download.wikimedia.org
 * rcsort.py             : A tool to see the recentchanges ordered by user instead of by date.
 * threadpool.py         :
 * xmlreader.py          :
 * watchlist.py          : Allows access to the bot account's watchlist.
 * wikicomserver.py      : This library allows the use of the pywikipediabot directly from COM-aware applications.

Robots

 * capitalize_redirects.py: Script to create redirects from the capitalized forms of article titles.
 * casechecker.py        : Script to enumerate all pages in the wikipedia and find all titles with mixed Latin and Cyrillic alphabets.
 * category.py           : add a category link to all pages mentioned on a page, change or remove category tags
 * category_redirect.py  : Maintain category redirects and replace links to redirected categories.
 * catall.py             : Add or change categories on a number of pages.
 * catmove.pl            : Need Perl programming language for this; takes a list of category moves or removes to make and uses category.py.
 * clean_sandbox.py      : This bot cleans test edits out of the sandbox page.
 * commons_link.py       : This robot adds the commons template to pages, linking your wiki project to Commons.
 * copyright.py          : This robot checks for copyrighted text using Google, Yahoo! and Live Search.
 * cosmetic_changes.py   : Can do slight modifications to a wiki page source code such that the code looks cleaner.
 * delete.py             : This script can be used to delete pages en masse.
 * disambredir.py        : Changing redirect names in disambiguation pages.
 * featured.py           : A robot to check featured articles.
 * fixes.py              : This is not a bot; it defines the predefined replacement tasks used via "replace.py -fix:replacement".
 * image.py              : This script can be used to change one image to another or remove an image entirely.
 * imagetransfer.py      : Given a wiki page, check the interwiki links for images, and let the user choose among them for images to upload.
 * inline_images.py      : This bot looks for images that are linked inline (i.e., they are hosted from an external server and hotlinked).
 * interwiki.py          : A robot to check interwiki links on all pages (or a range of pages) of a wiki.
 * interwiki_graph.py    : Can create a graph of the interwiki links found by interwiki.py.
 * imageharvest.py       : Bot for getting multiple images from an external site.
 * isbn.py               : Bot to convert all ISBN-10 codes to the ISBN-13 format.
 * makecat.py            : Given an existing or new category, find pages for that category.
 * movepages.py          : Bot to move pages to another title.
 * nowcommons.py         : This bot can delete images with NowCommons template.
 * pagefromfile.py       : This bot takes its input from a file that contains a number of pages to be put on the wiki.
 * piper.py              : Pipes article text through external program(s) on STDIN and collects its STDOUT which is used as the new article text if it differs from the original.
 * redirect.py           : Fix double redirects and broken redirects. Note: solve_disambiguation also has functions which treat redirects.
 * refcheck.py           : This script checks references to see if they are properly formatted.
 * replace.py            : Searches articles for a text and replaces it with another text. Both texts are defined in two configurable text files. The bot can either work on a set of given pages or crawl an SQL dump.
 * saveHTML.py           : Downloads the HTML-pages of articles and images.
 * selflink.py           : This bot goes over multiple pages of the home wiki, searches for selflinks, and allows removing them.
 * solve_disambiguation.py: Interactive robot doing disambiguation.
 * speedy_delete.py      : This bot loads a list of pages from the category of candidates for speedy deletion and gives the user an interactive prompt to decide whether each should be deleted or not.
 * spellcheck.py         : This bot spellchecks wiki pages.
 * standardize_interwiki.py:A robot that downloads a page, and reformats the interwiki links in a standard way (i.e. move all of them to the bottom or the top, with the same separator, in the right order).
 * standardize_notes.py  : Converts external links and notes/references to Footnote3 ref/note format. Rewrites references.
 * table2wiki.py         : Semi-automatically converts HTML tables to wiki tables.
 * templatecount.py      : Display the list of pages transcluding a given list of templates.
 * template.py           : Change one template into another.
 * touch.py              : Bot goes over all pages of the home wiki and edits them without making any change (a null edit).
 * unlink.py             : This bot unlinks a page on every page that links to it.
 * unusedfiles.py        : Bot appends some text to all unused images and other text to the respective uploaders.
 * upload.py             : upload an image to a wiki.
 * us-states.py          : A robot to add redirects to cities for US state abbreviations.
 * warnfile.py           : A robot that parses a warning file created by interwiki.py on another language wiki, and implements the suggested changes without verifying them.
 * weblinkchecker.py     : Check if external links are still working.
 * welcome.py            : Script to welcome new users.
 * windows_chars.py      : Change characters that are not part of Latin-1 into something harmless. It is advisable to do this on Latin-1 wikis before switching to UTF-8.