You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
wikiteam/README.md

57 lines
4.5 KiB
Markdown

10 years ago
# WikiTeam
### We archive wikis, from Wikipedia to tiniest wikis
10 years ago
10 years ago
**WikiTeam software is a set of tools for archiving wikis.** They work on MediaWiki wikis, but we want to expand to other wiki engines. As of June 2014, WikiTeam has preserved more than [13,000 stand-alone wikis](https://github.com/WikiTeam/wikiteam/wiki/Available-Backups), several wikifarms, regular Wikipedia dumps and [24TB of Wikimedia Commons images](https://archive.org/details/wikimediacommons).
10 years ago
10 years ago
There are [thousands](http://wikiindex.org) of [wikis](https://wikiapiary.com) in the Internet. Everyday some of them are no longer publicly available and, due to lack of backups, lost forever. Millions of people download tons of media files (movies, music, books, etc) from the Internet, implementing a kind of distributed backup. Wikis, most of them under free licenses, disappear from time to time because nobody grabbed a copy of them. That is a shame that we would like to solve.
10 years ago
10 years ago
**WikiTeam** is the [Archive Team](http://www.archiveteam.org) ([GitHub](https://github.com/ArchiveTeam)) subcommittee on wikis. It was founded and originally developed by [Emilio J. Rodríguez-Posada](https://github.com/emijrp), a Wikipedia veteran editor and amateur archivist. Many people have help sending suggestions, [reporting bugs](https://github.com/WikiTeam/wikiteam/issues), writing [documentation](https://github.com/WikiTeam/wikiteam/wiki), providing help in the [mailing list](http://groups.google.com/group/wikiteam-discuss) and making [wiki backups](https://github.com/WikiTeam/wikiteam/wiki/Available-Backups). Thanks to all, especially to: [Federico Leva](https://github.com/nemobis), [Alex Buie](https://github.com/ab2525), [Scott Boyd](http://www.sdboyd56.com), [Hydriz](https://github.com/Hydriz), Platonides, Ian McEwen and [Mike Dupont](https://github.com/h4ck3rm1k3).
10 years ago
<table border=0 cellpadding=5px>
<tr><td>
10 years ago
<a href="https://github.com/WikiTeam/wikiteam/wiki/New-Tutorial"><img src="https://upload.wikimedia.org/wikipedia/commons/f/f3/Nuvola_apps_Wild.png" width=100px alt="Documentation" title="Documentation"/></a>
10 years ago
</td><td>
10 years ago
<a href="https://raw.githubusercontent.com/WikiTeam/wikiteam/master/dumpgenerator.py"><img src="http://upload.wikimedia.org/wikipedia/commons/2/2a/Nuvola_apps_kservices.png" width=100px alt="Source code" title="Source code"/></a>
10 years ago
</td><td>
10 years ago
<a href="https://github.com/WikiTeam/wikiteam/wiki/Available-Backups"><img src="https://upload.wikimedia.org/wikipedia/commons/3/37/Nuvola_devices_3floppy_mount.png" width=100px alt="Download available backups" title="Download available backups"/></a>
10 years ago
</td><td>
10 years ago
<a href="https://groups.google.com/group/wikiteam-discuss"><img src="https://upload.wikimedia.org/wikipedia/commons/0/0f/Nuvola_apps_kuser.png" width=100px alt="Community" title="Community"/></a>
10 years ago
</td><td>
10 years ago
<a href="https://twitter.com/_WikiTeam"><img src="https://upload.wikimedia.org/wikipedia/commons/e/eb/Twitter_logo_initial.png" width=90px alt="Follow us on Twitter" title="Follow us on Twitter"/></a>
10 years ago
</td></tr>
</table>
10 years ago
## Quick guide
10 years ago
This is a very quick guide for the most used features of WikiTeam tools. For further information, read the [tutorial](https://github.com/WikiTeam/wikiteam/wiki/New-Tutorial) and the rest of the [documentation](https://github.com/WikiTeam/wikiteam/wiki). You can also ask in the [mailing list](http://groups.google.com/group/wikiteam-discuss).
10 years ago
10 years ago
### Download any wiki
10 years ago
10 years ago
For downloading any wiki, use one of the following options:
10 years ago
10 years ago
`python dumpgenerator.py --api=http://wiki.domain.org/w/api.php --xml --images` (complete XML histories and images)
`python dumpgenerator.py --api=http://wiki.domain.org/w/api.php --xml` (complete XML histories)
`python dumpgenerator.py --api=http://wiki.domain.org/w/api.php --xml --curonly` (only current version of every page)
You can resume an aborted download:
`python dumpgenerator.py --api=http://wiki.domain.org/w/api.php --xml --images --resume --path=/path/to/incomplete-dump`
See more options:
`python dumpgenerator.py --help`
10 years ago
### Download Wikimedia dumps
10 years ago
For downloading [Wikimedia XML dumps](http://dumps.wikimedia.org/backup-index.html) (Wikipedia, Wikibooks, Wikinews, etc), use:
10 years ago
`python wikipediadownloader.py` (download all projects)
10 years ago
See more options with: `python wikipediadownloader.py --help`
10 years ago
### Download Wikimedia Commons images
There is a script for this, but we have [uploaded the tarballs](https://archive.org/details/wikimediacommons) to Internet Archive, so perhaps it is a better option download them from IA instead re-generating them with the script.