Federico Leva
54d9d8051e
Remove dead Miraheze wikis per checkalive.py
Closes issue #465
11 months ago
Federico Leva
c09db669c9
Update checkalive.pl documentation
12 months ago
Federico Leva
1b02cee1d5
Revert "Update miraheze.org list with checkalive.py"
Some 70% of the removed wikis still return an HTTP 200, although they
may be frozen or closed.
Tested with:
git show | grep ^- | cut -f3 -d/ | sed --regexp-extended 's,(.+),https://\1/wiki/,g ' | sort | shuf -n 100 | xargs -I§ -P10 sh -c "curl -Is -w '%{stderr}%{http_code}\n' § > /dev/null" 2>&1 | sort | uniq -c
This reverts commit 0a3dc23f98.
12 months ago
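The sampling check described in the revert above can also be sketched in Python. This is only an illustration under stated assumptions, not the script that was actually used: `head_status` and `tally_statuses` are hypothetical names, the requests run sequentially rather than ten at a time, and `urlopen` follows redirects, whereas `curl -I` reports the first response.

```python
import random
from collections import Counter
from typing import Callable, Iterable, Optional
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def head_status(url: str, timeout: float = 10.0) -> str:
    """Send a HEAD request and return the status code as a string.

    Returns "000" when no HTTP response is received, mirroring what
    curl prints for %{http_code} in that case.
    """
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return str(resp.status)
    except HTTPError as err:
        # 4xx/5xx responses still carry a status code.
        return str(err.code)
    except (URLError, OSError, ValueError):
        return "000"


def tally_statuses(
    domains: Iterable[str],
    sample_size: int = 100,
    fetch: Callable[[str], str] = head_status,
    rng: Optional[random.Random] = None,
) -> Counter:
    """Count status codes for a random sample of https://<domain>/wiki/ URLs."""
    rng = rng or random.Random()
    pool = list(domains)
    # Like `shuf -n 100`: take at most sample_size entries at random.
    sample = rng.sample(pool, min(sample_size, len(pool)))
    return Counter(fetch(f"https://{domain}/wiki/") for domain in sample)
```

Passing a stub as `fetch` lets the tallying be tried without network access; with the default, each sampled domain costs one real request.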
Federico Leva
0a3dc23f98
Update miraheze.org list with checkalive.py
Addresses issue #465
12 months ago
Federico Leva
40a1f35dae
Update miraheze.org list of wikis
12 months ago
Liu
d9885e0845
Update shoutwiki-spider to remove duplicates
2 years ago
Liu
fcc4080b23
Update neoseeker.com.info instructions
2 years ago
Liu
e7f7266550
Update fandom.com spider and remove duplicates
2 years ago
Liu
9c5c55342d
Update miraheze.org spider and remove duplicates
2 years ago
Liu
4c970e358d
Remove duplicates from wiki-site.com
2 years ago
Liu
74a8e9609f
Update wiki-site.com spider and list
2 years ago
Liu
ba7fab2e96
Add fandom-spider and update metadata and lists
2 years ago
Liu
49e41ee75d
Update neoseeker.com spider and list
2 years ago
Liu
6346fd6553
Update shoutwiki.com spider and list
2 years ago
Liu
f93988e9c6
Update fandom.com to HTTPS
2 years ago
Liu
91faa34529
Update shoutwiki.com list
2 years ago
Liu
d6fe1d9ff8
Update battlestarwiki.org list
2 years ago
Liu
6f8f160d75
Update fandom.com list
2 years ago
Liu
6b39402ebf
Update miraheze.org list
2 years ago
Liu
f755153de9
Update neoseeker.com list
2 years ago
Federico Leva
10ee80ca3b
Rename wikia list to fandom
2 years ago
RhinosF1
3b28efab80
Update miraheze.org list
Using https://gist.github.com/RhinosF1/18c83dfbfadb84e28ee083628c029b41
4 years ago
Federico Leva
8fb2b44fdb
Update list of Wikia wikis with today's list from the API
4 years ago
Federico Leva
ed46725a89
Sort list of Wikia wikis again
No change in content.
4 years ago
Federico Leva
7dad9a44cd
Give up on Wikia-made dumps
There are fewer than 500 available right now, out of 400k active wikis.
4 years ago
Federico Leva
accc7db019
Update list of MediaWikis
* Ran checkalive.py on the "originalurl" URLs from existing items in the
WikiTeam collection on the Internet Archive, minus dead wiki farms.
* Downloaded the list of unarchived wikis from WikiApiary.
4 years ago
Federico Leva
aa0b133c1d
Minimal update to list of Wikia wikis
* Change API URL to HTTPS and fandom.com.
* New output of the script (403k wikis), changed to wikia.com for diff purposes.
4 years ago
Federico Leva
baae839a38
Complete update of the Wikia lists
* Reduce the offset to 100, the new limit for non-bots.
* Continue listing even when we get an empty response because all
the wikis in a batch have become inactive and are filtered out.
* Print less from curl's requests.
* Automatically write the domain names to the files here.
6 years ago
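The listing behaviour described in the commit above (batches of 100, and no early stop on an empty batch) might look roughly like the sketch below. It is not the code of wikia.py: `list_wikis`, `fetch_batch` and `max_id` are hypothetical, and stopping at a known upper bound is an assumption, since the commit message does not say how the loop ends.

```python
from typing import Callable, Iterator, List

BATCH_SIZE = 100  # limit for non-bot clients, per the commit message


def list_wikis(
    fetch_batch: Callable[[int, int], List[str]],
    max_id: int,
    batch_size: int = BATCH_SIZE,
) -> Iterator[str]:
    """Yield wiki domains batch by batch up to max_id.

    fetch_batch(start, size) is a hypothetical callable that returns the
    active wikis in the range [start, start + size).
    """
    for start in range(0, max_id, batch_size):
        # An empty batch only means every wiki in this range is inactive
        # and was filtered out, so the loop moves on instead of stopping.
        for domain in fetch_batch(start, batch_size):
            yield domain
```

Terminating on the first empty batch would cut the list short as soon as one full range of wikis had gone inactive, which is the failure the commit message describes fixing.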
Federico Leva
b8909baa3d
Update Wikia list with wikia.py
6 years ago
Federico Leva
293da80da9
Add alive MediaWikis from the WikiTeam archive.org collection
6 years ago
Federico Leva
6a34bf65ea
Wikia dumps now use 7z, not gz
Note that existence doesn't mean the dump is usable.
6 years ago
emijrp
0e20be9a6e
sort
7 years ago
emijrp
bbdaf7723b
update neoseeker
7 years ago
emijrp
fc48c895ae
update info
7 years ago
emijrp
c7d5f9bb2e
update, 2244 wikis
7 years ago
emijrp
75e7628a11
now get ALL wikis, even closed ones
7 years ago
Hydriz
a8270a7769
Update Miraheze wiki farm
7 years ago
Hydriz
9fd6df7a3c
Scan for closed wikis as well
7 years ago
Hydriz Scholz
9f97e21503
Update Miraheze wiki farm
8 years ago
Alexia E. Smith
cb766de5ff
Update gamepedia.com wikis.
This is current as of 2016-04-07 and is correct at 1,120 wikis.
8 years ago
emijrp
dde7eb90ba
wiki.wiki info
9 years ago
emijrp
8048b92029
adding wiki.wiki wikifarm list
9 years ago
emijrp
e30cd44384
new wikifarm list of wikis
9 years ago
emijrp
d44db951c2
update date
9 years ago
emijrp
64c30f2b50
updating neoseeker list and sorting, +1 new wiki
9 years ago
Southparkfan
ebffb99f48
Add Miraheze wiki farm
9 years ago
Hydriz Scholz
1550d3755d
Update orain.org wiki list
9 years ago
Federico Leva
a1921f0919
Update list of wikia.com unarchived wikis
The list of unarchived wikis was compared to the list of wikis that we
managed to download with dumpgenerator.py:
https://archive.org/details/wikia_dump_20141219
To allow the comparison, the naming format was aligned to the format
used by dumpgenerator.py for 7z files.
9 years ago
Federico Leva
ce6fbfee55
Use curl --fail instead and other fixes; add list
Now tested and used to produce the list of some 300k Wikia wikis
which don't yet have a public dump. Will soon be archived.
10 years ago
Federico Leva
7471900e56
It's easier if the list has the actual domains
10 years ago