User:Gdr/Yearbot


Yearbot is a Wikipedia bot that helps update the "Births" and "Deaths" sections of year articles (see Wikipedia:WikiProject Years) in the English Wikipedia. It uses the Python Wikipedia Robot Framework.

Yearbot is currently in development. It is operated from time to time by User:Gdrbot.

Operator instructions

Run python yearbot.py year (where year is the year whose article you want to update), then follow the prompts. The script first presents the people born in that year (in batches of 20), then the people who died in that year. Each person is shown like this:

1* [[January 1]] - [[Person A]], description (died [[year]])
2- [[August 22]] - [[Person B]], description (died [[year]])

The asterisk in "1*" means that Person A will be included in the page; the minus in "2-" means that Person B will be excluded, because the text "c." was found next to the date of birth/death in their article.
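The include/exclude marker can be illustrated with a minimal sketch. The names APPROX, is_approximate and marker are hypothetical, and the pattern is an assumption: the bot's actual list of terms is not given here, so this only checks for "c.", "ca." and "circa".

```python
import re

# Assumed pattern: the real bot may look for other terms as well.
# \b stops words such as "etc." or "Inc." from matching.
APPROX = re.compile(r"\b(?:c\.|ca\.|circa)(?=\s|\[|\d|$)", re.IGNORECASE)

def is_approximate(intro):
    """Return True if the introductory paragraph hints at an uncertain date."""
    return bool(APPROX.search(intro))

def marker(index, intro):
    """Build the '1*' (included) or '2-' (excluded) prefix shown to the operator."""
    return f"{index}{'-' if is_approximate(intro) else '*'}"
```

With this sketch, an introduction containing "(c. 1500" yields a minus marker, while one with plain dates yields an asterisk. The operator can still override the default with the x command.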

At the prompt, you can view and edit entries by typing <number><command>, as follows:

  • <number>x — Toggle whether person <number> is excluded/included.
  • <number>p — Print the current details for person <number>.
  • <number>i — Print the introductory paragraph for person <number>.
  • <number>t — Print the text of the whole article for person <number>.
  • <number>d:<desc> — Enter a new description for person <number>.
  • <number>d<n> — Truncate the current description for person <number> to <n> words.
  • <number>P:<prefix> — Enter new Prefix text for person <number>.
  • <number>/<from>/<to>/ — Apply regular expression search and replace to the description for person <number>.
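As an illustration of the <number><command> syntax above, here is a minimal parsing sketch. It is not taken from yearbot.py: the function names parse_command and truncate and the returned tuple layout are assumptions made for this example.

```python
import re

def parse_command(line):
    """Split '<number><command>' into (number, command, argument), or None."""
    m = re.match(r"(\d+)(.*)$", line.strip())
    if not m:
        return None
    number, rest = int(m.group(1)), m.group(2)
    if rest in ("x", "p", "i", "t"):            # commands without an argument
        return (number, rest, None)
    if rest.startswith("d:"):                   # new description
        return (number, "d:", rest[2:].strip())
    if rest.startswith("P:"):                   # new prefix text
        return (number, "P:", rest[2:].strip())
    m = re.fullmatch(r"d(\d+)", rest)           # truncate to <n> words
    if m:
        return (number, "d", int(m.group(1)))
    m = re.fullmatch(r"/([^/]*)/([^/]*)/", rest)  # search and replace
    if m:
        return (number, "/", (m.group(1), m.group(2)))
    return None

def truncate(description, n):
    """Keep only the first n whitespace-separated words of a description."""
    return " ".join(description.split()[:n])
```

For example, parse_command("12d4") gives (12, "d", 4), and truncate("English poet and playwright", 2) gives "English poet". This sketch does not handle patterns that themselves contain a slash.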

These commands view or operate on the whole list:

  • h — Help.
  • l — List all entries.
  • s — Save these entries (but don't update the page yet), and move on to the next batch.
  • q — Quit.
  • /<from>/<to>/ — Apply regular expression search and replace to the descriptions for all people; also add pattern to yearbot-patterns to apply to future runs of the program.
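The list-wide search and replace can be sketched as follows. The helper names are hypothetical, and the format of the yearbot-patterns file is not described here, so this example keeps the patterns in memory as (pattern, replacement) pairs instead of reading or writing that file.

```python
import re

def parse_global_command(line):
    """Parse '/<from>/<to>/' into a (pattern, replacement) pair, or None."""
    m = re.fullmatch(r"/([^/]*)/([^/]*)/", line.strip())
    return (m.group(1), m.group(2)) if m else None

def apply_patterns(descriptions, patterns):
    """Apply every (pattern, replacement) pair, in order, to each description."""
    result = []
    for desc in descriptions:
        for pattern, replacement in patterns:
            desc = re.sub(pattern, replacement, desc)
        result.append(desc)
    return result
```

For instance, the command /^an? // would strip a leading article from every description, turning "an English poet" into "English poet".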

When you've saved both the births and the deaths, you get a final chance to confirm the changes to the page before it is updated.

What it does

  1. Gets the [[year]] article; parses lines in the "Births" and "Deaths" sections to get birth/death dates, article names, brief descriptions, and death/birth year.
  2. Gets [[Category:year births]] and [[Category:year deaths]].
  3. For each article in the [[year]] article, and for each article mentioned in these categories, gets that article (using Special:Export to reduce the server load), then parses it, trying to discover:
    1. A sort key for the person (parsed from the person's category entries).
    2. Their birth/death date (parsed from the introductory paragraph if it's in the standard format specified at Wikipedia:Manual of Style (dates and numbers)).
    3. A brief description of them (parsed from the introductory paragraph by looking for the first clause after the person's dates).
    4. Whether the birth/death dates are certain or merely approximate (by looking for "c." and similar terms in the introductory paragraph); people with approximate dates are excluded by default.
  4. Loads regular expressions from the file yearbot-patterns and applies them to all descriptions.
  5. Presents the accumulated information as described above, allowing the operator to edit the descriptions.
  6. Sorts the entries into order: first the people whose dates are known, ordered by date; then the rest of the people, ordered by their sort key if any, otherwise by the last word of their article name.
  7. Updates the "Births" and "Deaths" sections of the [[year]] article with the entries produced by the above process.
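The ordering rule in step 6 can be sketched as below. The entry layout is an assumption made for this example: each entry is a dict with a "title", an optional "date" given as a (month, day) tuple, and an optional "sort_key". The real script may represent entries differently.

```python
def sort_entries(entries):
    """Dated entries first, by date; then the rest by sort key or last word of title."""
    def key(entry):
        date = entry.get("date")
        if date is not None:
            return (0, date, "")
        # No known date: fall back to the sort key, else the last word of the title.
        fallback = entry.get("sort_key") or entry["title"].split()[-1]
        return (1, (0, 0), fallback.lower())
    return sorted(entries, key=key)
```

Usage, with made-up entries:

```python
entries = [
    {"title": "Person C"},
    {"title": "Person B", "date": (8, 22)},
    {"title": "Person A", "date": (1, 1)},
    {"title": "John Doe", "sort_key": "Abel, John"},
]
print([e["title"] for e in sort_entries(entries)])
# ['Person A', 'Person B', 'John Doe', 'Person C']
```

Comparing the fallback keys in lower case is a choice made for this sketch, not something the description above specifies.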

Source code

See User:Gdr/yearbot.py and User:Gdr/history.py