Utilities
This module implements a set of utilities for extracting topic labels from English Wikipedia using the WikiProject taxonomy.
Draft Topic CLI
drafttopic
$ drafttopic -h
This script provides access to a set of utilities for extracting features
and building draft topic predictors.
* add_central_africa -- Adds "Geography.Regions.Africa.Central Africa" to
the labels manually.
* balance_sample -- Generates an approximately balanced sample of each
label
* extract_from_text -- Extracts features from raw text
* fetch_article_text -- Gathers current article text for each labeling
observation from a MediaWiki API
* fetch_draft_text -- Gathers first revision article text for each labeling
observation from a MediaWiki API
* taxo_label -- Labels a set of observations based on their
WikiProject templates
* write_labels -- Extracts all labels from a wikiprojects labeled dataset
and writes them out to config
Usage:
    drafttopic (-h | --help)
    drafttopic <utility> [-h | --help]
Options:
    -h | --help  Prints this documentation
    <utility>    The name of the utility to run
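The usage pattern above takes a utility name as the first argument and passes any remaining arguments on to that utility. The following is a minimal sketch of that dispatch step, not the actual drafttopic implementation: the utility names are copied from the help text, while `parse_args` and its return convention are hypothetical and introduced only for illustration.

```python
# Utility names as listed in the help text above.
UTILITIES = (
    "add_central_africa",
    "balance_sample",
    "extract_from_text",
    "fetch_article_text",
    "fetch_draft_text",
    "taxo_label",
    "write_labels",
)


def parse_args(argv):
    """Hypothetical helper: split argv into (utility, remaining_args).

    Returns (None, []) when top-level help is requested or no
    arguments are given, mirroring `drafttopic (-h | --help)`.
    """
    if not argv or argv[0] in ("-h", "--help"):
        return None, []
    utility, rest = argv[0], list(argv[1:])
    if utility not in UTILITIES:
        raise ValueError("unknown utility: %s" % utility)
    # Remaining arguments (including a utility-level -h) belong to the utility.
    return utility, rest
```

Under these assumptions, `parse_args(["taxo_label", "--help"])` yields the utility name together with `["--help"]`, which corresponds to running `drafttopic taxo_label --help` to print that utility's own documentation.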