fabian flöck.

wikiwho: authorship attribution and more.


introduction to the functionality of wikiwho.

wikiwho source code.

the python code of the original wikiwho publication plus some extensions we have made since then.

wikiwho api.

An API for change, authorship, and conflict information on live Wikipedia data. Still under development.

wikiwho core algorithm description.

the research paper about the wikiwho algorithm for mining authorship (plus evaluation material used in the paper).


the core functionality of wikiwho is to parse the complete set of historical revisions (versions) of a wikipedia article in order to find out who wrote and/or removed which exact text at what point in time. given a specific revision of an article (e.g., the current one), wikiwho can determine, for each word and special character, which user first introduced it and whether and how it was deleted or reintroduced afterwards. wikipedia itself does not offer this functionality. wikiwho was shown to perform this task very efficiently and with very high accuracy (~95%), and it is the only tool that has been scientifically shown to perform the task that well (cf. the paper).
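
to make the idea of word-level attribution concrete, here is a minimal sketch in python. it is not the wikiwho algorithm: it uses the standard library's difflib to diff consecutive revisions and carries authorship forward for unchanged words. the function name and the (editor, text) input format are assumptions made for illustration, and unlike wikiwho this toy version does not recognize reintroductions (a word that is deleted and later restored is credited to the editor who restored it).

```python
from difflib import SequenceMatcher


def attribute_tokens(revisions):
    """Toy word-level attribution (NOT the WikiWho algorithm).

    revisions: list of (editor, text) pairs in chronological order
    (hypothetical input format). Returns a list of (token, editor)
    pairs for the last revision.
    """
    prev_tokens, prev_authors = [], []
    for editor, text in revisions:
        tokens = text.split()
        # By default, every token is credited to the current editor.
        authors = [editor] * len(tokens)
        matcher = SequenceMatcher(a=prev_tokens, b=tokens, autojunk=False)
        # Tokens that also appear in the previous revision keep their author.
        for block in matcher.get_matching_blocks():
            for k in range(block.size):
                authors[block.b + k] = prev_authors[block.a + k]
        prev_tokens, prev_authors = tokens, authors
    return list(zip(prev_tokens, prev_authors))


history = [
    ("alice", "the cat sat"),
    ("bob", "the black cat sat down"),
]
for token, editor in attribute_tokens(history):
    print(token, editor)
```

in this example "black" and "down" are credited to bob, and the remaining words stay with alice.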

on top of the generated authorship and change data, other data can be mined and other tools can be built. we have extended the original model to also provide relationships between editors in an article, such as "delete" or "reintroduce", based on the words they delete or add. we are currently working on a visualization of these networks, as well as other visualizations of metrics and word authorship for end users who are interested in exploring the collaborative writing dynamics of wikipedia.
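
as a rough illustration of how such editor relations could be derived from per-word change data, here is a sketch in python. the token record layout (an "editor" field plus "out" and "in" lists of revision ids), the revision-to-editor mapping, and the rule for choosing the target of each relation are all assumptions made for this example; they are one plausible reading of "delete" and "reintroduce", not the definitions used in the wikiwho code.

```python
from collections import Counter


def editor_relations(tokens, rev_editor):
    """Count directed relations between editors (illustrative sketch).

    tokens: list of dicts with hypothetical keys
        "editor" - editor who originally wrote the token
        "out"    - revision ids in which the token was deleted
        "in"     - revision ids in which it was reintroduced
    rev_editor: dict mapping revision id -> editor of that revision
    Returns a Counter keyed by (source_editor, relation, target_editor).
    """
    edges = Counter()
    for tok in tokens:
        author = tok["editor"]
        for i, out_rev in enumerate(tok["out"]):
            deleter = rev_editor[out_rev]
            # Assumption: a deletion targets the token's original author.
            if deleter != author:
                edges[(deleter, "delete", author)] += 1
            # Assumption: the i-th reintroduction undoes the i-th deletion.
            if i < len(tok["in"]):
                restorer = rev_editor[tok["in"][i]]
                if restorer != deleter:
                    edges[(restorer, "reintroduce", deleter)] += 1
    return edges


rev_editor = {1: "alice", 2: "bob", 3: "alice"}
tokens = [
    {"str": "cat", "editor": "alice", "out": [2], "in": [3]},
    {"str": "dog", "editor": "bob", "out": [3], "in": []},
]
for edge, weight in editor_relations(tokens, rev_editor).items():
    print(edge, weight)
```

actions by an editor on their own words are skipped here so that the result only contains relations between different editors.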

wikiwho api.

We offer an API for word provenance/authorship for the English Wikipedia.
It returns word/token-wise information on which revision each piece of content originated in (and thereby which editor originally authored the word), as well as all changes a token was ever subject to. Try the graphical interface at:


Or make an example call: https://api.wikiwho.net/api/v1.0.0-beta/rev_content/Cologne/?o_rev_id=true&editor=true&token_id=true&out=true&in=true
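
The same call can be made from a script. The following Python sketch rebuilds the example URL above from its parts and fetches it with the standard library. The helper names are hypothetical, and the response layout assumed in `count_tokens_per_editor` (a "revisions" list whose entries map a revision id to an object with a "tokens" list, each token carrying an "editor" field) is an assumption about the beta API that may not match what the server actually returns.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE = "https://api.wikiwho.net/api/v1.0.0-beta/rev_content/"
# Query flags taken from the example call above.
FIELDS = ("o_rev_id", "editor", "token_id", "out", "in")


def build_url(title, fields=FIELDS):
    """Build the rev_content URL for an article title."""
    query = urlencode({name: "true" for name in fields})
    return f"{BASE}{quote(title, safe='')}/?{query}"


def fetch(title, timeout=30):
    """Request the token data for an article and parse the JSON body."""
    with urlopen(build_url(title), timeout=timeout) as resp:
        return json.load(resp)


def count_tokens_per_editor(payload):
    """Count tokens per original editor (ASSUMED response layout)."""
    counts = {}
    for revision in payload.get("revisions", []):
        for rev in revision.values():
            for tok in rev.get("tokens", []):
                counts[tok["editor"]] = counts.get(tok["editor"], 0) + 1
    return counts


if __name__ == "__main__":
    data = fetch("Cologne")  # performs a live network request
    print(count_tokens_per_editor(data))
```

Since the API is still under development, it is worth inspecting the raw JSON once before relying on any particular field name.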

IF YOU CAN: Let me know if you use it / like it / don't like it / find any specific errors / want any specific features. Email: f.floeck-youknowwhat-gmail.com

CREDIT: Kenan Erdogan, Maribel Acosta, Philipp Singer, Pavan Kumar Pandappa.

wikiwho source code.

the original code plus some variants that contain extensions, especially a new function for extracting relations between editors. note that the extended versions might include additional computational steps that can lead to longer runtimes than the original. all available under the MIT license at:


wikiwho paper: detecting authorship of revisioned content.