### Detector Example Usage

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/detector.html

Demonstrates how to use the Detector class to process revisions and detect reverts. Revisions are processed in chronological order.

```python
>>> import mwreverts
>>> detector = mwreverts.Detector()
>>>
>>> detector.process("aaa", {'rev_id': 1})
>>> detector.process("bbb", {'rev_id': 2})
>>> detector.process("aaa", {'rev_id': 3})
Revert(reverting={'rev_id': 3}, reverteds=[{'rev_id': 2}], reverted_to={'rev_id': 1})
>>> detector.process("ccc", {'rev_id': 4})
```

--------------------------------

### Get Archived Namespace, Title, and Timestamp

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/db.html

Retrieves the namespace, title, and timestamp for a given archived revision ID. Essential for fetching revision details when only the ID is known.

```python
def get_archived_namespace_title_and_timestamp(schema, rev_id):
    with schema.transaction() as session:
        row = session.query(
            schema.archive.c.ar_namespace,
            schema.archive.c.ar_title,
            schema.archive.c.ar_timestamp).filter(
                schema.archive.c.ar_rev_id == rev_id).first()

        return row[0], row[1], Timestamp(row[2])
```

--------------------------------

### Get Revisions Before a Specific Revision

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/db.html

Retrieves a specified number of revisions that occurred before a given revision ID on a particular page. The results are yielded in chronological order.
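The "newest `n` before a revision, yielded oldest-first" pattern described above can be sketched without a database. This is only an illustration: `last_n_before` and the plain list of integer revision IDs are hypothetical stand-ins and are not part of mwreverts.

```python
def last_n_before(rev_ids, rev_id, n):
    """Return up to n revision IDs smaller than rev_id, oldest first."""
    # Mirror the query shape: filter, sort newest-first, keep n rows ...
    newest_first = sorted((r for r in rev_ids if r < rev_id), reverse=True)[:n]
    # ... then reverse so callers see chronological order.
    return list(reversed(newest_first))


history = [101, 105, 110, 120, 133, 140]  # hypothetical revision IDs
print(last_n_before(history, 133, 3))  # [105, 110, 120]
```

Sorting newest-first before applying the limit is what guarantees that the `n` revisions closest to `rev_id` are kept; the final reversal only restores reading order.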
```python
def n_edits_before(schema, rev_id, page_id, n, rvprop=None):
    with schema.transaction() as session:
        result = session.query(schema.revision).filter(
            and_(schema.revision.c.rev_page == page_id,
                 schema.revision.c.rev_id < rev_id)).order_by(
            schema.revision.c.rev_id.desc()).limit(n)

        # Reverse order because of the query pattern
        rows = reversed(list(result))

    for row in rows:
        yield row
```

--------------------------------

### Get Revisions Before a Specific Revision

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/api.html

Fetches up to `n` revisions of a page, ending at the given revision ID. Because the query starts at `rvstartid=rev_id`, the given revision itself is included. The API is queried newest-first (`rvdir='older'`) and the result is reversed, so revisions are yielded in chronological order.

```python
def n_edits_before(session, rev_id, page_id, n, timestamp=None, rvprop=None):
    doc = session.get(action='query', prop='revisions', pageids=page_id,
                      rvstartid=rev_id, rvend=timestamp, rvdir='older',
                      rvlimit=n, rvprop=rvprop)

    page_doc = list(doc['query']['pages'].values())[0]

    # Reverse order because of the query pattern
    revisions = reversed(page_doc.get('revisions', []))
    if 'revisions' in page_doc:
        del page_doc['revisions']

    for revision_doc in revisions:
        revision_doc['page'] = page_doc
        yield revision_doc
```

--------------------------------

### Get Page ID from Revision ID

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/db.html

Fetches the page ID associated with a given revision ID from the database.

```python
def get_page_id(schema, rev_id):
    with schema.transaction() as session:
        row = session.query(schema.revision.c.rev_page).filter(
            schema.revision.c.rev_id == rev_id).first()

        return row[0]
```

--------------------------------

### Get Revisions After a Specific Revision

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/api.html

Fetches a specified number of revisions that occurred after a given revision ID on a specific page. Useful for analyzing subsequent edits.
```python
def n_edits_after(session, rev_id, page_id, n, timestamp=None, rvprop=None):
    doc = session.get(action='query', prop='revisions', pageids=page_id,
                      rvstartid=rev_id, rvend=timestamp, rvdir='newer',
                      rvlimit=n, rvprop=rvprop)

    page_doc = list(doc['query']['pages'].values())[0]

    revisions = page_doc.get('revisions', [])
    if 'revisions' in page_doc:
        del page_doc['revisions']

    for revision_doc in revisions:
        revision_doc['page'] = page_doc
        yield revision_doc
```

--------------------------------

### Get Deleted Revision Title and Timestamp

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/api.html

Fetches the title and timestamp for a given deleted revision ID. This is a prerequisite for other functions that require page title and timestamp information. Raises a KeyError if the revision is not found.

```python
def get_deleted_title_and_timestamp(session, rev_id):
    doc = session.get(action='query', prop='deletedrevisions', revids=rev_id,
                      drvprop=['ids', 'timestamp'])

    if 'badrevids' in doc['query']:
        raise KeyError("Archived revision {0} not found.".format(rev_id))

    page_doc = list(doc['query']['pages'].values())[0]

    return (
        page_doc['title'],
        Timestamp(page_doc['deletedrevisions'][0]['timestamp']))
```

--------------------------------

### Get SHA1 Hash from Row

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/db.html

Retrieves the SHA1 hash from a database row object, checking for 'rev_sha1' or 'ar_sha1' attributes.

```python
def get_sha1(row):
    if hasattr(row, 'rev_sha1'):
        return row.rev_sha1
    else:
        return row.ar_sha1
```

--------------------------------

### Get Page ID from Revision ID

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/api.html

Retrieves the page ID associated with a given revision ID by querying the MediaWiki API. Raises a KeyError if the revision is not found.
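To illustrate the lookup-or-`KeyError` behaviour described above without a network call, here is a minimal sketch that applies the same checks to hand-written dictionaries. The helper name `page_id_from_response` and both sample responses are assumptions made for this example; they are only loosely shaped like a MediaWiki query response.

```python
def page_id_from_response(doc, rev_id):
    """Extract the page ID from a MediaWiki-style query response dict."""
    if 'badrevids' in doc['query']:
        raise KeyError("Revision {0} not found.".format(rev_id))
    # 'pages' is keyed by page ID; a single-revision query yields one page.
    page_doc = next(iter(doc['query']['pages'].values()))
    return page_doc['pageid']


# Hypothetical responses, hand-written for illustration only
found = {'query': {'pages': {'42': {'pageid': 42, 'title': 'Example'}}}}
missing = {'query': {'badrevids': {'999': {'revid': 999}}}}

print(page_id_from_response(found, 7))  # 42
try:
    page_id_from_response(missing, 999)
except KeyError as error:
    print("lookup failed:", error)
```

Callers that cannot guarantee the revision exists would wrap the lookup in `try`/`except KeyError`, as the last lines show.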
```python
def get_page_id(session, rev_id):
    doc = session.get(action='query', prop='revisions', revids=rev_id,
                      rvprop=['ids'])

    if 'badrevids' in doc['query']:
        raise KeyError("Revision {0} not found.".format(rev_id))

    page_doc = list(doc['query']['pages'].values())[0]

    return page_doc['pageid']
```

--------------------------------

### Get Revision ID from Row

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/db.html

Retrieves the revision ID from a database row object, checking for 'rev_id' or 'ar_rev_id' attributes.

```python
def get_rev_id(row):
    if hasattr(row, 'rev_id'):
        return row.rev_id
    else:
        return row.ar_rev_id
```

--------------------------------

### Get Revisions After a Specific Revision

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/db.html

Retrieves a specified number of revisions that occurred after a given revision ID on a particular page. It can optionally filter revisions saved before a certain timestamp.

```python
def n_edits_after(schema, rev_id, page_id, n, before=None):
    if before is not None:
        before_fmt = bytes(before.short_format(), 'utf8')
    else:
        before_fmt = bytes(Timestamp(time.time()).short_format(), 'utf8')

    with schema.transaction() as session:
        result = session.query(schema.revision).filter(
            and_(schema.revision.c.rev_page == page_id,
                 schema.revision.c.rev_id > rev_id,
                 schema.revision.c.rev_timestamp <= before_fmt)).order_by(
            schema.revision.c.rev_id.asc()).limit(n)

        for row in result:
            yield row
```

--------------------------------

### Detector.initialize

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/detector.html

Initializes the Detector instance with a specified radius for the history.

```APIDOC
## Method initialize

### Description
Initializes the Detector with a given radius. The radius determines the maximum revision distance for detecting reverts.

### Parameters
* **radius** (int) - A positive integer indicating the maximum revision distance that a revert can span. Defaults to `defaults.RADIUS`.

### Raises
* **TypeError** - If the provided radius is not a positive integer.
```

--------------------------------

### Handling Unknown Checksums - mwreverts.DummyChecksum

Source: https://pythonhosted.org/mwreverts/detection.html

Use `mwreverts.DummyChecksum` when the checksum of a revision is unknown. DummyChecksums are hashable and will only match themselves, not other DummyChecksums or any other values.

```python
from mwreverts import DummyChecksum

dummy1 = DummyChecksum()
dummy1 == dummy1  # True: a DummyChecksum matches itself

dummy2 = DummyChecksum()
dummy1 == dummy2  # False: two DummyChecksums never match each other

# Hashable, so usable in sets; the duplicate dummy1 collapses to one member
{"foo", "bar", dummy1, dummy1, dummy2}
```

--------------------------------

### Process Revisions One-at-a-Time - mwreverts.Detector

Source: https://pythonhosted.org/mwreverts/detection.html

Instantiate `mwreverts.Detector` and use its `process()` method to detect reverts as revisions arrive chronologically. This is useful for streaming data. The `radius` parameter can be configured during initialization.

```python
import mwreverts

detector = mwreverts.Detector()

detector.process("aaa", {'rev_id': 1})
detector.process("bbb", {'rev_id': 2})
detector.process("aaa", {'rev_id': 3})  # returns a Revert: rev 3 restores rev 1
detector.process("ccc", {'rev_id': 4})
```

--------------------------------

### Initialize Detector with Radius

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/detector.html

Initializes the Detector with a maximum revision distance. Raises a TypeError if the radius is not a positive integer.

```python
def initialize(self, radius=defaults.RADIUS):
    if radius < 1:
        raise TypeError("invalid radius. Expected a positive integer.")
    super().initialize(maxsize=radius + 1)
```

--------------------------------

### mwreverts.Detector.process()

Source: https://pythonhosted.org/mwreverts/detection.html

Process a new revision and detect a revert if it occurred. Note that you can pass whatever you like as revision and it will be returned in the case that a revert occurs.

```APIDOC
## process(checksum, revision=None)

### Description
Process a new revision and detect a revert if it occurred.
Note that you can pass whatever you like as `revision` and it will be returned in the case that a revert occurs.

### Parameters
- **checksum** (str) - Required - Any identity-matchable string-based hash of revision content
- **revision** (mixed) - Optional - Revision metadata. Note that any data will just be returned in the case of a revert.

### Returns
a `Revert` if one occurred or `None`
```

--------------------------------

### Build Revert Tuple

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/api.html

Builds the `(reverting, reverted, reverted_to)` triple for a revision. It chains the past, current, and future revisions into (checksum, revision) pairs, substituting a `DummyChecksum` where no `sha1` is available, and runs `detect` over them to find the reverts that involve `rev_id`.

```python
    # Excerpt from a function body; the enclosing definition is not shown.
    checksum_revisions = chain(
        ((rev['sha1'] if 'sha1' in rev else DummyChecksum(), rev)
         for rev in past_revs),
        [(current_rev.get('sha1', DummyChecksum()), current_rev)],
        ((rev['sha1'] if 'sha1' in rev else DummyChecksum(), rev)
         for rev in future_revs),
    )

    reverting, reverted, reverted_to = None, None, None
    for revert in detect(checksum_revisions, radius=radius):
        if reverting is None and revert.reverting['revid'] == rev_id:
            reverting = revert

        if reverted is None and \
           rev_id in {rev['revid'] for rev in revert.reverteds}:
            reverted = revert

        if reverted_to is None and revert.reverted_to['revid'] == rev_id:
            reverted_to = revert

    return reverting, reverted, reverted_to
```

--------------------------------

### Build Revert Tuple

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/db.html

Constructs a tuple containing information about reverting actions, including the reverting revision, reverted revisions, and the revision it was reverted to. It processes a list of checksum-revision pairs and identifies revert patterns using the 'detect' function.
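The classification step described above (deciding whether a revision is the reverting edit, one of the reverted edits, or the reverted-to edit) can be sketched on plain tuples. This is a simplified illustration under stated assumptions: `classify` is a hypothetical name, and each revert is a bare `(reverting, reverteds, reverted_to)` tuple of integer IDs rather than a `mwreverts.Revert`.

```python
def classify(rev_id, reverts):
    """Return (reverting, reverted, reverted_to) for rev_id.

    Each slot holds the first revert in which rev_id plays that role,
    or None if it never does.
    """
    reverting = reverted = reverted_to = None
    for revert in reverts:
        r_ing, r_eds, r_to = revert
        if reverting is None and r_ing == rev_id:
            reverting = revert
        if reverted is None and rev_id in r_eds:
            reverted = revert
        if reverted_to is None and r_to == rev_id:
            reverted_to = revert
    return reverting, reverted, reverted_to


# Hypothetical history: rev 4 undoes revs 2-3, then rev 6 undoes rev 5
reverts = [(4, [2, 3], 1), (6, [5], 4)]
print(classify(4, reverts))  # ((4, [2, 3], 1), None, (6, [5], 4))
```

A single revision can fill more than one slot: here revision 4 is both a reverting edit and, later, a reverted-to edit.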
```python
def build_revert_tuple(rev_id, past_revs, current_rev, future_revs, radius):
    # Convert to an iterable of (checksum, rev) pairs for detect() to consume
    checksum_revisions = chain(
        ((get_sha1(rev) or DummyChecksum(), rev) for rev in past_revs),
        [(get_sha1(current_rev) or DummyChecksum(), current_rev)],
        ((get_sha1(rev) or DummyChecksum(), rev) for rev in future_revs),
    )

    reverting, reverted, reverted_to = None, None, None
    for revert in detect(checksum_revisions, radius=radius):
        if reverting is None and get_rev_id(revert.reverting) == rev_id:
            reverting = revert

        if reverted is None and \
           rev_id in {get_rev_id(rev) for rev in revert.reverteds}:
            reverted = revert

        if reverted_to is None and get_rev_id(revert.reverted_to) == rev_id:
            reverted_to = revert

    return reverting, reverted, reverted_to
```

--------------------------------

### Detector.process

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/detector.html

Processes a new revision to detect if a revert has occurred. It compares the checksum of the new revision with previous ones.

```APIDOC
## Method process

### Description
Process a new revision and detect a revert if it occurred. Note that you can pass whatever you like as `revision` and it will be returned in the case that a revert occurs.

### Parameters
* **checksum** (str) - Any identity-matchable string-based hash of revision content.
* **revision** (mixed) - Revision metadata. Any data passed will be returned in the case of a revert.

### Returns
* a :class:`~mwreverts.Revert` if one occurred or `None`.
```

--------------------------------

### Extract Reverts from XML Dump

Source: https://pythonhosted.org/mwreverts/utilities.html

Use this command to extract revert information from MediaWiki XML dumps. It supports specifying input files, radius for revert references, SHA1 usage, threading, output path, compression type, and verbosity.

```bash
mwreverts dump2reverts -h
```

```bash
Extracts reverts from an XML dump.
Usage:
    dump2reverts (-h|--help)
    dump2reverts [...] [--radius=] [--use-sha1]
                 [--threads=] [--output=] [--compress=]
                 [--verbose] [--debug]

Options:
    -h|--help     Print this documentation
                  The path to file containing MediaWiki XML
                  [default: ]
    --radius=     The maximum number of revisions that a revert can
                  reference. [default: 15]
    --use-sha1    Use the sha1 field even if a text field is available.
    --threads=    If a collection of files are provided, how many
                  processor threads? [default: ]
    --output=     Write output to a directory with one output file per
                  input path. [default: ]
    --compress=   If set, output written to the output-dir will be
                  compressed in this format. [default: bz2]
    --verbose     Print dots and stuff to stderr
    --debug       Print debug logs.
```

--------------------------------

### Process Revision and Detect Revert

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/detector.html

Processes a new revision, comparing its checksum to previously seen revisions to detect a revert. Returns a Revert object if a revert is detected, otherwise returns None. The `revision` parameter can be any metadata that will be returned if a revert occurs.

```python
def process(self, checksum, revision=None):
    """
    Process a new revision and detect a revert if it occurred.  Note that
    you can pass whatever you like as `revision` and it will be returned in
    the case that a revert occurs.

    :Parameters:
        checksum : str
            Any identity-matchable string-based hash of revision content
        revision : `mixed`
            Revision metadata.  Note that any data will just be returned
            in the case of a revert.

    :Returns:
        a :class:`~mwreverts.Revert` if one occurred or `None`
    """
    revert = None

    if checksum in self:  # potential revert

        reverteds = list(self.up_to(checksum))

        if len(reverteds) > 0:  # If no reverted revisions, this is a noop
            revert = Revert(revision, reverteds, self[checksum])

    self.insert(checksum, revision)
    return revert
```

--------------------------------

### mwreverts.detect

Source: https://pythonhosted.org/mwreverts/_sources/detection.txt

Takes an iterable of (checksum, revision_data) pairs and returns an iterator of mwreverts.Revert objects.

```APIDOC
## mwreverts.detect

### Description
Takes an iterable of (checksum, revision_data) pairs and returns an iterator of :class:`mwreverts.Revert` objects.

### Parameters
* **checksum** (any) - First element of each pair: a checksum of the revision content, used to match identical revisions
* **revision_data** (any) - Second element of each pair: revision metadata, returned inside any `Revert` that involves the revision

### Returns
An iterator of :class:`mwreverts.Revert` objects.
```

--------------------------------

### Revert Class Definition

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/revert.html

Defines the Revert class, which represents a revert event. It includes initialization, iteration, equality comparison, and item access methods. The datatypes of reverting, reverteds, and reverted_to are not specified as they depend on the revision data provided.

```python
import jsonable


class Revert(jsonable.Type):
    """
    Represents a revert event.  This class behaves like
    :class:`collections.namedtuple`.  Note that the datatypes of
    `reverting`, `reverteds` and `reverted_to` are not specified since
    those types will depend on the revision data provided during revert
    detection.

    :Attributes:
        **reverting**
            The reverting revision data : `mixed`
        **reverteds**
            The reverted revision data (ordered chronologically) : list( `mixed` )
        **reverted_to**
            The reverted-to revision data : `mixed`
    """
    __slots__ = ('reverting', 'reverteds', 'reverted_to')

    def initialize(self, reverting=None, reverteds=None, reverted_to=None):
        self.reverting = reverting
        self.reverteds = list(reverteds or [])
        self.reverted_to = reverted_to

    def __iter__(self):
        yield self.reverting
        yield self.reverteds
        yield self.reverted_to

    def __eq__(self, other):
        if isinstance(other, tuple):
            return tuple(self) == other

    def __getitem__(self, index):
        if index == 0:
            return self.reverting
        elif index == 1:
            return self.reverteds
        else:
            return self.reverted_to
```

--------------------------------

### Extract Reverts from Revision Documents

Source: https://pythonhosted.org/mwreverts/utilities.html

This command extracts revert information from page-partitioned JSON revision documents. Similar to `dump2reverts`, it allows configuration of input files, revert radius, SHA1 usage, threading, output, compression, and logging.

```bash
mwreverts revdocs2reverts -h
```

```bash
Extracts reverts from a page-partitioned sequence of JSON revision documents.

Usage:
    revdocs2reverts (-h|--help)
    revdocs2reverts [...] [--radius=] [--use-sha1]
                    [--threads=] [--output=] [--compress=]
                    [--verbose] [--debug]

Options:
    -h|--help     Print this documentation
                  The path to file containing page-partitioned JSON
                  revision documents. [default: ]
    --radius=     The maximum number of revisions that a revert can
                  reference. [default: 15]
    --use-sha1    Use the sha1 field even if a text field is available.
    --threads=    If a collection of files are provided, how many
                  processor threads? [default: ]
    --output=     Write output to a directory with one output file per
                  input path. [default: ]
    --compress=   If set, output written to the output-dir will be
                  compressed in this format. [default: bz2]
    --verbose     Print progress information to stderr.
    --debug       Print debug logs.
```

--------------------------------

### mwreverts.Detector

Source: https://pythonhosted.org/mwreverts/detection.html

Detects revert events in a stream of revisions (to the same page) based on matching checksums. To detect reverts, construct an instance of this class and call `process()` in chronological order.

```APIDOC
## class mwreverts.Detector(radius=15)

### Description
Detects revert events in a stream of revisions (to the same page) based on matching checksums. To detect reverts, construct an instance of this class and call `process()` in chronological order.

See https://meta.wikimedia.org/wiki/R:Identity_revert

### Parameters
- **radius** (int) - Optional - a positive integer indicating the maximum revision distance that a revert can span.

### Example
```python
>>> import mwreverts
>>> detector = mwreverts.Detector()
>>>
>>> detector.process("aaa", {'rev_id': 1})
>>> detector.process("bbb", {'rev_id': 2})
>>> detector.process("aaa", {'rev_id': 3})
Revert(reverting={'rev_id': 3}, reverteds=[{'rev_id': 2}], reverted_to={'rev_id': 1})
>>> detector.process("ccc", {'rev_id': 4})
```
```

--------------------------------

### DummyChecksum Class Definition

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/dummy_checksum.html

Defines the DummyChecksum class, which provides a unique, non-matching checksum for unknown revisions. It is hashable and matches only itself.

```python
class DummyChecksum():
    """
    Used when checking for reverts when the checksum of the revision of
    interest is unknown.

    DummyChecksums won't match each other or anything else, but they will
    match themselves and they are hashable.

    >>> dummy1 = DummyChecksum()
    >>> dummy1
    <#140687347334280>
    >>> dummy1 == dummy1
    True
    >>>
    >>> dummy2 = DummyChecksum()
    >>> dummy2
    <#140687347334504>
    >>> dummy1 == dummy2
    False
    >>>
    >>> {"foo", "bar", dummy1, dummy1, dummy2}
    {<#140687347334280>, 'foo', <#140687347334504>, 'bar'}
    """

    def __str__(self):
        # __str__ must return a string; delegate to __repr__
        return repr(self)

    def __repr__(self):
        return "<#" + str(id(self)) + ">"
```

--------------------------------

### mwreverts.Detector

Source: https://pythonhosted.org/mwreverts/_sources/detection.txt

Provides a process method that allows you to process revisions one-at-a-time.

```APIDOC
## mwreverts.Detector

### Description
A class that provides a `process` method for handling revisions individually.

### Methods
#### process(checksum, revision_data)
Processes a single revision.

* **checksum** (any) - The checksum of the revision.
* **revision_data** (any) - The data associated with the revision.

### Returns
A `mwreverts.Revert` if the processed revision is a revert, otherwise `None`.
```

--------------------------------

### Load Future Revisions

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/db.html

Loads future revisions for a given revision ID and page. Used within the context of checking revert status.

```python
    # Excerpt from a function body; the enclosing definition is not shown.
    if window is not None and before is None:
        before = Timestamp(current_rev.rev_timestamp) + window

    # Load future revisions
    future_revs = list(n_edits_after(
        schema, rev_id, page_id, n=radius, before=before))

    return build_revert_tuple(
        rev_id, past_revs, current_rev, future_revs, radius)
```

--------------------------------

### Representing a Revert Event - mwreverts.Revert

Source: https://pythonhosted.org/mwreverts/detection.html

The `mwreverts.Revert` class, behaving like a `collections.namedtuple`, represents a detected revert event. It stores the reverting revision, a list of reverted revisions (chronologically ordered), and the revision it reverted to.
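What "behaving like a `collections.namedtuple`" means in practice can be shown with an actual namedtuple. The `RevertLike` type below is a stand-in created for this sketch with the same three field names; it is not the mwreverts class, and the revision dictionaries are made up.

```python
from collections import namedtuple

# Stand-in with the same three fields; NOT the mwreverts.Revert class itself.
RevertLike = namedtuple('RevertLike', ['reverting', 'reverteds', 'reverted_to'])

revert = RevertLike(reverting={'rev_id': 3},
                    reverteds=[{'rev_id': 2}],
                    reverted_to={'rev_id': 1})

# Tuple-style unpacking in field order
reverting, reverteds, reverted_to = revert
print(reverting['rev_id'], [r['rev_id'] for r in reverteds], reverted_to['rev_id'])
# 3 [2] 1

# Positional and attribute access refer to the same object
print(revert[1] is revert.reverteds)  # True
```

The unpacking form is the one used by the `check` examples in this document, which bind the three elements to `reverting`, `reverteds`-style names in one statement.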
```python
Revert(reverting={'rev_id': 3}, reverteds=[{'rev_id': 2}], reverted_to={'rev_id': 1})
```

--------------------------------

### detect

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/functions.html

Detects reverts that occur in a sequence of revisions. This function serves as a convenience wrapper around calls to `mwreverts.Detector`'s `process` method.

```APIDOC
## detect

### Description
Detects reverts that occur in a sequence of revisions. Note that `revision` metadata will simply be returned in the case of a revert.

This function serves as a convenience wrapper around calls to :class:`mwreverts.Detector`'s :func:`~mwreverts.Detector.process` method.

### Parameters
- **checksum_revisions** (`iterable` ( (checksum, revision) )) - Required - an iterable over tuples of checksum and revision metadata
- **radius** (`int`) - Optional - a positive integer indicating the maximum revision distance that a revert can span.

### Return
an iterator over :class:`mwreverts.Revert`

### Example
```python
>>> import mwreverts
>>>
>>> checksum_revisions = [
...     ("aaa", {'rev_id': 1}),
...     ("bbb", {'rev_id': 2}),
...     ("aaa", {'rev_id': 3}),
...     ("ccc", {'rev_id': 4})
... ]
>>>
>>> list(mwreverts.detect(checksum_revisions))
[Revert(reverting={'rev_id': 3}, reverteds=[{'rev_id': 2}], reverted_to={'rev_id': 1})]
```
```

--------------------------------

### Detect Reverts in an Iterable - mwreverts.detect()

Source: https://pythonhosted.org/mwreverts/detection.html

Use `mwreverts.detect()` to find revert events within a sequence of (checksum, revision_data) pairs. It returns an iterator of `mwreverts.Revert` objects. The `radius` parameter controls the maximum revision distance for a revert.
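The effect of `radius` can be seen in a small, self-contained sketch of identity-revert detection over a sliding window. This is not the mwreverts implementation: `detect_identity_reverts` is a hypothetical name, it yields plain tuples instead of `Revert` objects, and the `radius + 1` window size is borrowed from the `maxsize=radius + 1` call in the `initialize` source quoted in this document.

```python
from collections import deque


def detect_identity_reverts(checksum_revisions, radius=15):
    """Yield (reverting, reverteds, reverted_to) tuples for identity reverts."""
    # Keep only the most recent radius + 1 (checksum, revision) pairs.
    window = deque(maxlen=radius + 1)
    for checksum, revision in checksum_revisions:
        # Walk backwards to find the most recent revision with this checksum.
        for offset in range(len(window) - 1, -1, -1):
            if window[offset][0] == checksum:
                reverteds = [rev for _, rev in list(window)[offset + 1:]]
                if reverteds:  # identical consecutive revisions revert nothing
                    yield revision, reverteds, window[offset][1]
                break
        window.append((checksum, revision))


# Hypothetical history: revision 4 restores the content of revision 1
revisions = [("aaa", 1), ("bbb", 2), ("ccc", 3), ("aaa", 4)]
print(list(detect_identity_reverts(revisions, radius=2)))  # [(4, [2, 3], 1)]
print(list(detect_identity_reverts(revisions, radius=1)))  # []
```

With `radius=1` the window has already dropped revision 1 by the time revision 4 arrives, so no revert is reported; a larger radius finds reverts that span more intermediate edits at the cost of a longer history to search.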
```python
import mwreverts

checksum_revisions = [
    ("aaa", {'rev_id': 1}),
    ("bbb", {'rev_id': 2}),
    ("aaa", {'rev_id': 3}),
    ("ccc", {'rev_id': 4})
]

# Yields one Revert: rev 3 reverts rev 2 back to rev 1
list(mwreverts.detect(checksum_revisions))
```

--------------------------------

### Detect Reverts in Revision Sequence

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/functions.html

Use this function to detect reverts within a given sequence of revisions. It processes checksum-revision pairs and yields Revert objects for detected reverts. Ensure the input is an iterable of (checksum, revision) tuples.

```python
from . import defaults
from .detector import Detector


def detect(checksum_revisions, radius=defaults.RADIUS):
    """
    Detects reverts that occur in a sequence of revisions.  Note that
    `revision` metadata will simply be returned in the case of a revert.

    This function serves as a convenience wrapper around calls to
    :class:`mwreverts.Detector`'s :func:`~mwreverts.Detector.process`
    method.

    :Parameters:
        checksum_revisions : `iterable` ( (checksum, revision) )
            an iterable over tuples of checksum and revision metadata
        radius : int
            a positive integer indicating the maximum revision distance that
            a revert can span.

    :Return:
        an iterator over :class:`mwreverts.Revert`

    :Example:
        >>> import mwreverts
        >>>
        >>> checksum_revisions = [
        ...     ("aaa", {'rev_id': 1}),
        ...     ("bbb", {'rev_id': 2}),
        ...     ("aaa", {'rev_id': 3}),
        ...     ("ccc", {'rev_id': 4})
        ... ]
        >>>
        >>> list(mwreverts.detect(checksum_revisions))
        [Revert(reverting={'rev_id': 3}, reverteds=[{'rev_id': 2}], reverted_to={'rev_id': 1})]
    """

    revert_detector = Detector(radius)

    for checksum, revision in checksum_revisions:
        revert = revert_detector.process(checksum, revision)
        if revert is not None:
            yield revert
```

--------------------------------

### Detector Class

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/detector.html

The Detector class is used to detect revert events in a stream of revisions.
It maintains a history of revisions and identifies reverts based on matching checksums within a defined radius.

```APIDOC
## Class Detector

### Description
Detects revert events in a stream of revisions (to the same page) based on matching checksums. To detect reverts, construct an instance of this class and call :func:`~mwreverts.Detector.process` in chronological order.

See https://meta.wikimedia.org/wiki/R:Identity_revert

### Parameters
* **radius** (int) - a positive integer indicating the maximum revision distance that a revert can span.

### Example
```python
import mwreverts

detector = mwreverts.Detector()

detector.process("aaa", {'rev_id': 1})
detector.process("bbb", {'rev_id': 2})
detector.process("aaa", {'rev_id': 3})
# Returns: Revert(reverting={'rev_id': 3}, reverteds=[{'rev_id': 2}], reverted_to={'rev_id': 1})
detector.process("ccc", {'rev_id': 4})
```
```

--------------------------------

### Fetch Deleted Revisions After a Specific Revision

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/api.html

Yields up to `n` deleted revisions of a page, querying forward in time from a given timestamp and keeping only revisions whose ID is greater than or equal to `rev_id`. Useful for analyzing changes that occurred after a specific point in time. Requires a MediaWiki API session.

```python
def n_deleted_edits_after(session, rev_id, title, timestamp, n, before=None,
                          rvprop=None):
    doc = session.get(action='query', prop='deletedrevisions',
                      titles=title, drvstart=timestamp, drvend=before,
                      drvdir='newer', drvlimit=n, drvprop=rvprop)

    page_doc = list(doc['query']['pages'].values())[0]

    revisions = page_doc.get('deletedrevisions', [])
    revisions = [r for r in revisions if r['revid'] >= rev_id]
    if 'revisions' in page_doc:
        del page_doc['revisions']

    for revision_doc in revisions:
        revision_doc['page'] = page_doc
        yield revision_doc
```

--------------------------------

### Fetch Deleted Revisions Before a Specific Revision

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/api.html

Retrieves a list of deleted revisions before a given revision ID.
This function is useful for analyzing the history leading up to a specific revision. The API is queried newest-first (`drvdir='older'`) and the result is reversed, so the revisions are yielded in chronological order; only revisions whose ID is less than or equal to `rev_id` are kept.

```python
def n_deleted_edits_before(session, rev_id, title, timestamp, n, rvprop=None):
    doc = session.get(action='query', prop='deletedrevisions',
                      titles=title, drvstart=timestamp, drvdir='older',
                      drvlimit=n, drvprop=rvprop)

    page_doc = list(doc['query']['pages'].values())[0]

    # Reverse order because of the query pattern
    revisions = list(reversed(page_doc.get('deletedrevisions', [])))
    revisions = [r for r in revisions if r['revid'] <= rev_id]
    if 'revisions' in page_doc:
        del page_doc['revisions']

    for revision_doc in revisions:
        revision_doc['page'] = page_doc
        yield revision_doc
```

--------------------------------

### Check Revert Status of a Revision

Source: https://pythonhosted.org/mwreverts/db.html

Use this function to determine if a revision is a 'reverting' edit, was 'reverted' by another edit, or was 'reverted_to' by another edit. Requires a database schema object and revision ID.

```python
import mwdb
import mwreverts.db  # the example calls mwreverts.db.check, so import the db module

schema = mwdb.Schema("mysql+pymysql://enwiki.labsdb/enwiki_p" +
                     "?read_default_file=~/replica.my.cnf")

def print_revert(revert):
    if revert is None:
        print(None)
    else:
        print(revert.reverting['rev_id'],
              [r['rev_id'] for r in revert.reverteds],
              revert.reverted_to['rev_id'])

reverting, reverted, reverted_to = \
    mwreverts.db.check(schema, 679778587)
print_revert(reverting)
print_revert(reverted)
print_revert(reverted_to)
```

--------------------------------

### Check Revert Status of a Revision

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/api.html

Determines the revert status of a given revision, indicating if it reverted other edits, was reverted, or was reverted to. Requires an API session and revision ID. Page ID is optional but improves performance.
```python
def check(session, rev_id, page_id=None, radius=defaults.RADIUS,
          before=None, window=None, rvprop=None):
    """
    Checks the revert status of a revision.  With this method, you can
    determine whether an edit is a 'reverting' edit, was 'reverted' by another
    edit and/or was 'reverted_to' by another edit.

    :Parameters:
        session : :class:`mwapi.Session`
            An API session to make use of
        rev_id : int
            the ID of the revision to check
        page_id : int
            the ID of the page the revision occupies (slower if not provided)
        radius : int
            a positive integer indicating the maximum number of revisions
            that can be reverted
        before : :class:`mwtypes.Timestamp`
            if set, limits the search for *reverting* revisions to those which
            were saved before this timestamp
        window : int
            if set, limits the search for *reverting* revisions to those which
            were saved within `window` seconds after the reverted edit
        rvprop : set( str )
            a set of properties to include in revisions

    :Returns:
        A triple :class:`mwreverts.Revert` | `None`

        * reverting -- If this edit reverted other edit(s)
        * reverted -- If this edit was reverted by another edit
        * reverted_to -- If this edit was reverted to by another edit

    :Example:
        >>> import mwapi
        >>> import mwreverts.api
        >>>
        >>> session = mwapi.Session("https://en.wikipedia.org")
        >>>
        >>> def print_revert(revert):
        ...     if revert is None:
        ...         print(None)
        ...     else:
        ...         print(revert.reverting['revid'],
        ...               [r['revid'] for r in revert.reverteds],
        ...               revert.reverted_to['revid'])
        ...
        >>> reverting, reverted, reverted_to = \
        ...     mwreverts.api.check(session, 679778587)
        >>> print_revert(reverting)
        None
        >>> print_revert(reverted)
        679778743 [679778587] 679742862
        >>> print_revert(reverted_to)
        None

    """

    rev_id = int(rev_id)
    radius = int(radius)
    if radius < 1:
        raise TypeError("invalid radius. Expected a positive integer.")

    page_id = int(page_id) if page_id is not None else None
    before = Timestamp(before) if before is not None else None
    rvprop = set(rvprop) if rvprop is not None else set()

    # If we don't have the page_id, we're going to need to look them up
    if page_id is None:
        page_id = get_page_id(session, rev_id)

    # Load history and current rev
    current_and_past_revs = list(n_edits_before(
        session, rev_id, page_id, n=radius + 1,
        rvprop={'ids', 'timestamp', 'sha1'} | rvprop
    ))

    if len(current_and_past_revs) < 1:
        raise KeyError("Revision {0} not found in page {1}."
                       .format(rev_id, page_id))

    current_rev, past_revs = (
        current_and_past_revs[-1],  # Current
        current_and_past_revs[:-1]  # Past revisions
    )
```

--------------------------------

### check

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/api.html

Checks the revert status of a revision. Determines if an edit is a 'reverting' edit, was 'reverted' by another edit, and/or was 'reverted_to' by another edit.

```APIDOC
## check

### Description
Checks the revert status of a revision. With this method, you can determine whether an edit is a 'reverting' edit, was 'reverted' by another edit and/or was 'reverted_to' by another edit.

### Parameters
* **session** (`mwapi.Session`) - An API session to make use of
* **rev_id** (int) - the ID of the revision to check
* **page_id** (int) - the ID of the page the revision occupies (slower if not provided)
* **radius** (int) - a positive integer indicating the maximum number of revisions that can be reverted
* **before** (`mwtypes.Timestamp`) - if set, limits the search for *reverting* revisions to those which were saved before this timestamp
* **window** (int) - if set, limits the search for *reverting* revisions to those which were saved within `window` seconds after the reverted edit
* **rvprop** (set( str )) - a set of properties to include in revisions

### Returns
A triple of `mwreverts.Revert` | `None`
* reverting -- If this edit reverted other edit(s)
* reverted -- If this edit was reverted by another edit
* reverted_to -- If this edit was reverted to by another edit

### Example
```python
import mwapi
import mwreverts.api

session = mwapi.Session("https://en.wikipedia.org")

def print_revert(revert):
    if revert is None:
        print(None)
    else:
        print(revert.reverting['revid'],
              [r['revid'] for r in revert.reverteds],
              revert.reverted_to['revid'])

reverting, reverted, reverted_to = \
    mwreverts.api.check(session, 679778587)
print_revert(reverting)
print_revert(reverted)
print_revert(reverted_to)
```
```

--------------------------------

### Check Revert Status of a Deleted Revision

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/api.html

Determines if a deleted revision is a reverting edit, was reverted by another edit, or was reverted to by another edit. Requires a MediaWiki API session and the revision ID. Optionally accepts page title, timestamp, radius, and time window for the search.

```python
def check_deleted(session, rev_id, title=None, timestamp=None,
                  radius=defaults.RADIUS, before=None, window=None,
                  rvprop=None):
    """
    Checks the revert status of a deleted revision.
    With this method, you can determine whether an edit is a 'reverting'
    edit, was 'reverted' by another edit and/or was 'reverted_to' by
    another edit.

    :Parameters:
        session : :class:`mwapi.Session`
            An API session to make use of
        rev_id : int
            the ID of the revision to check
        title : str
            the title of the page the revision occupies (slower if not
            provided). Note that the MediaWiki API expects the title to
            include the namespace prefix (e.g. "User_talk:EpochFail")
        radius : int
            a positive integer indicating the maximum number of revisions
            that can be reverted
        before : :class:`mwtypes.Timestamp`
            if set, limits the search for *reverting* revisions to those which
            were saved before this timestamp
        window : int
            if set, limits the search for *reverting* revisions to those which
            were saved within `window` seconds after the reverted edit
        rvprop : set( str )
            a set of properties to include in revisions

    :Returns:
        A triple :class:`mwreverts.Revert` | `None`

        * reverting -- If this edit reverted other edit(s)
        * reverted -- If this edit was reverted by another edit
        * reverted_to -- If this edit was reverted to by another edit
    """

    rev_id = int(rev_id)
    radius = int(radius)
    if radius < 1:
        raise TypeError("invalid radius. Expected a positive integer.")

    title = str(title) if title is not None else None
    before = Timestamp(before) if before is not None else None

    rvprop = set(rvprop) if rvprop is not None else set()

    # If we don't have the title, we're going to need to look it up
    if title is None or timestamp is None:
        title, timestamp = get_deleted_title_and_timestamp(session, rev_id)

    # Load history and current rev
    current_and_past_revs = list(n_deleted_edits_before(
        session, rev_id, title, timestamp, n=radius + 1,
        rvprop={'ids', 'timestamp', 'sha1'} | rvprop
    ))

    if len(current_and_past_revs) < 1:
        raise KeyError("Revision {0} not found in page {1}."
                       .format(rev_id, title))

    current_rev, past_revs = (
        current_and_past_revs[-1],  # Current
        current_and_past_revs[:-1]  # Past revisions
    )

    if window is not None and before is None:
        before = Timestamp(current_rev['timestamp']) + window
```

--------------------------------

### Query Archived Edits Before Revision

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/db.html

Queries archived revisions before a specific revision ID within a given namespace and title. Results are reversed to maintain chronological order.

```python
def n_archived_edits_before(schema, rev_id, namespace, title, timestamp, n,
                            rvprop=None):
    with schema.transaction() as session:
        result = session.query(schema.archive).filter(
            and_(schema.archive.c.ar_namespace == namespace,
                 schema.archive.c.ar_title == title,
                 schema.archive.c.ar_timestamp <
                 bytes(timestamp.short_format(), 'utf8'),
                 schema.archive.c.ar_rev_id < rev_id)).order_by(
            schema.archive.c.ar_rev_id.desc()).limit(n)

        # Reverse order because of the query pattern
        rows = reversed(list(result))

        for row in rows:
            yield row
```

--------------------------------

### check

Source: https://pythonhosted.org/mwreverts/_modules/mwreverts/db.html

Checks the revert status of a revision, determining if it's a reverting edit, was reverted, or was reverted to. It analyzes a specified number of preceding revisions within a given page and radius.

```APIDOC
## check

### Description
Checks the revert status of a revision. With this method, you can determine whether an edit is a 'reverting' edit, was 'reverted' by another edit and/or was 'reverted_to' by another edit.

### Parameters
- **schema** (object) - Required - The database schema object.
- **rev_id** (int) - Required - The ID of the revision to check.
- **page_id** (int) - Optional - The ID of the page the revision occupies (slower if not provided).
- **radius** (int) - Optional - A positive integer indicating the maximum number of revisions that can be reverted.
  Defaults to `defaults.RADIUS`.
- **before** (Timestamp) - Optional - If set, limits the search for *reverting* revisions to those which were saved before this timestamp.
- **window** (int) - Optional - If set, limits the search for *reverting* revisions to those which were saved within `window` seconds after the reverted edit.

### Returns
A triple of `mwreverts.Revert` | `None`:
* reverting -- If this edit reverted other edit(s)
* reverted -- If this edit was reverted by another edit
* reverted_to -- If this edit was reverted to by another edit

### Example
```python
import mwdb
import mwreverts.db

schema = mwdb.Schema("mysql+pymysql://enwiki.labsdb/enwiki_p" +
                     "?read_default_file=~/replica.my.cnf")

def print_revert(revert):
    if revert is None:
        print(None)
    else:
        print(revert.reverting['rev_id'],
              [r['rev_id'] for r in revert.reverteds],
              revert.reverted_to['rev_id'])

reverting, reverted, reverted_to = \
    mwreverts.db.check(schema, 679778587)
print_revert(reverting)
print_revert(reverted)
print_revert(reverted_to)
```
```

--------------------------------

### mwreverts.db.check

Source: https://pythonhosted.org/mwreverts/db.html

Checks the revert status of a revision. Determines if an edit is a reverting edit, was reverted by another edit, or was reverted_to by another edit.

```APIDOC
## mwreverts.db.check

### Description
Checks the revert status of a revision. With this method, you can determine whether an edit is a ‘reverting’ edit, was ‘reverted’ by another edit and/or was ‘reverted_to’ by another edit.

### Parameters
- **schema** (mwdb.Schema) - Required - The database schema to use.
- **rev_id** (int) - Required - The ID of the revision to check.
- **page_id** (int) - Optional - The ID of the page the revision occupies (slower if not provided).
- **radius** (int) - Optional - A positive integer indicating the maximum number of revisions that can be reverted (default: 15).
- **before** (mwtypes.Timestamp) - Optional - If set, limits the search for _reverting_ revisions to those which were saved before this timestamp.
- **window** (int) - Optional - If set, limits the search for _reverting_ revisions to those which were saved within `window` seconds after the reverted edit.
- **rvprop** (set(str)) - Optional - A set of properties to include in revisions.

### Returns
A triple of `mwreverts.Revert` or `None`:
* reverting – If this edit reverted other edit(s)
* reverted – If this edit was reverted by another edit
* reverted_to – If this edit was reverted to by another edit

### Example
```python
import mwdb
import mwreverts.db  # the example calls mwreverts.db.check, so import the db module

schema = mwdb.Schema("mysql+pymysql://enwiki.labsdb/enwiki_p" +
                     "?read_default_file=~/replica.my.cnf")

def print_revert(revert):
    if revert is None:
        print(None)
    else:
        print(revert.reverting['rev_id'],
              [r['rev_id'] for r in revert.reverteds],
              revert.reverted_to['rev_id'])

reverting, reverted, reverted_to = \
    mwreverts.db.check(schema, 679778587)
print_revert(reverting)
print_revert(reverted)
print_revert(reverted_to)
```
```
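
--------------------------------

### Sketch: How `radius` Bounds a Revert Search

The `check` functions above request each revision's `sha1` and load `radius + 1` revisions of history. The following is a minimal, self-contained sketch of what that implies, assuming a revert is identified by a revision whose content checksum matches an earlier revision. It is an illustration and not the library's implementation: `find_reverts`, `sha1`, and the local `Revert` namedtuple are hypothetical names that only mirror the `reverting`, `reverteds`, and `reverted_to` fields documented above.

```python
import hashlib
from collections import namedtuple

# Hypothetical stand-in for mwreverts.Revert (same three field names).
Revert = namedtuple("Revert", ["reverting", "reverteds", "reverted_to"])


def sha1(text):
    """Return the hex SHA-1 checksum used to compare revision content."""
    return hashlib.sha1(text.encode("utf8")).hexdigest()


def find_reverts(revisions, radius=15):
    """
    Yield a Revert for each revision whose checksum matches an earlier
    revision, with at most `radius` reverted revisions in between.

    `revisions` is an iterable of (checksum, metadata) pairs in
    chronological order. Illustrative only; not the library's code.
    """
    history = []  # at most radius + 1 most recent (checksum, metadata) pairs
    for checksum, meta in revisions:
        # Scan from newest to oldest for identical content.
        for i in range(len(history) - 1, -1, -1):
            if history[i][0] == checksum:
                reverteds = [m for _, m in history[i + 1:]]
                if reverteds:  # an identical consecutive edit reverts nothing
                    yield Revert(meta, reverteds, history[i][1])
                break
        history.append((checksum, meta))
        del history[:-(radius + 1)]  # forget revisions outside the radius


edits = [(sha1(text), {"rev_id": i})
         for i, text in enumerate(["aaa", "bbb", "aaa", "ccc"], start=1)]
for revert in find_reverts(edits):
    print(revert)
# prints: Revert(reverting={'rev_id': 3}, reverteds=[{'rev_id': 2}], reverted_to={'rev_id': 1})
```

Keeping `radius + 1` entries lets the oldest remembered revision act as the `reverted_to` target while up to `radius` newer revisions are reported as reverted. With a smaller radius, a restoration that skips back over more edits goes undetected.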