Commit Graph

678 Commits (362309a2daf80edfc652bf344d9426a8f25f4d1b)
 

Author SHA1 Message Date
Antoine Musso 362309a2da Add tox env for flake8 linter
Most people know about pep8, which enforces coding style.  pyflakes
goes a step beyond by analyzing the code.

flake8 is basically a wrapper around both pep8 and pyflakes and comes
with some additional checks.  I find it very useful since you only need
to require one package to have a lot of code issues reported to you.

This patch provides a 'flake8' tox environment to easily install
and run the utility on the code base.  One simply has to:

     tox -eflake8

The repository in its current state does not pass the checks. We can later
easily ensure there is no regression by adjusting Travis configuration
to run this env.

The env has NOT been added to the default list of environments.

More information about flake8: https://pypi.python.org/pypi/flake8
10 years ago
Federico Leva 8cf4d4e6ea Add 30k domains from another crawler
11011 were found alive by checkalive.py (though there could be more
if one checks more subdomains and subdirectories), some thousands
more by checklive.pl (but mostly or all false positives).

Of the alive ones, about 6245 were new to WikiApiary!
https://wikiapiary.com/wiki/Category:Oct_2014_Import
10 years ago
Federico Leva 7e0071ae7f Add some UseModWiki-looking domains 10 years ago
nemobis 6b11cef9dc A few thousand more doku.php URLs from own scraping 10 years ago
nemobis 0624d0303b Merge pull request #198 from Southparkfan/patch-1
Update list of Orain wikis
10 years ago
Southparkfan 8ca9eb8757 Update date of Orain wikilist 10 years ago
Southparkfan 2e2fe9b818 Update list of Orain wikis 10 years ago
Marek Šuppa 8c44cff165 readme: Small wording fixes
* Small fixes in `Download Wikimedia dumps` section.
10 years ago
nemobis 6f74781e78 Merge pull request #197 from mrshu/mrshu/autopep8fied-wikiadownloader
wikiadownloader: Autopep8fied
10 years ago
mr.Shu f022b02e47 wikiadownloader: Autopep8fied
* Made the source look a bit better, though this script might not be
  used anymore.

Signed-off-by: mr.Shu <mr@shu.io>
10 years ago
nemobis b3ef165529 Merge pull request #194 from mrshu/mrshu/dumpgenerator-pep8fied
dumpgenerator: AutoPEP8-fied
10 years ago
mr.Shu 04446a40a5 dumpgenerator: AutoPEP8-fied
* Used autopep8 to make sure the code looks nice and is actually PEP8
  compliant.

Signed-off-by: mr.Shu <mr@shu.io>
10 years ago
nemobis 23a60fa850 MediaWiki CamelCase 10 years ago
nemobis 31112b3a80 checkalive.py: more checks before accessing stuff 10 years ago
nemobis 225c3eb478 A thousand more doku.php URLs from search 10 years ago
nemobis e0f8e36bf4 Merge pull request #190 from PiRSquared17/api-allpages-disabled
Fallback to getPageTitlesScraper() if API allpages disabled
10 years ago
nemobis a7e1b13304 Merge pull request #193 from mrshu/mrshu/readme-fix-wording
readme: Fix wording
10 years ago
nemobis 3fc7dcb5de Add some more doku.php URLs 10 years ago
mr.Shu 54c373e9a0 readme: Fix wording
* Made a few wording changes to make the README.md more clear.

Signed-off-by: mr.Shu <mr@shu.io>
10 years ago
Marek Šuppa 40d863fb99 README: update wording
* Updated wording to make the README more clear.
10 years ago
Emilio J. Rodríguez-Posada 87ce2d4540 Merge pull request #192 from mrshu/mrshu/add-travis-image
update: Add TravisCI image to README
10 years ago
mr.Shu 7b0b54b6e5 update: Add TravisCI image to README
* Added TravisCI image which specifies whether the tests are passing or
  not to Developers section.

Signed-off-by: mr.Shu <mr@shu.io>
10 years ago
Emilio J. Rodríguez-Posada 5c8e316e67 Merge pull request #189 from PiRSquared17/get-wiki-engine
Improve getWikiEngine()
10 years ago
Emilio J. Rodríguez-Posada 086415bc00 Merge pull request #191 from mrshu/mrshu/setup-travis
tests: Add .travis.yml and Travis CI
10 years ago
mr.Shu 14c62c6587 tests: Add .travis.yml and Travis CI
* Added .travis.yml to enable Travis CI

Signed-off-by: mr.Shu <mr@shu.io>
10 years ago
PiRSquared17 757019521a Fallback to scraper if API allpages disabled 10 years ago
PiRSquared17 4b3c862a58 Comment debugging print, fix test 10 years ago
PiRSquared17 7a1db0525b Add more wiki engines to getWikiEngine 10 years ago
nemobis 40c406cd00 Merge pull request #188 from PiRSquared17/wikiengine-lists
Add subdirectories to listsofwikis for different wiki engines
10 years ago
PiRSquared17 56c2177106 Add (incomplete) list of dokuwikis 10 years ago
PiRSquared17 03ddde3702 Move wiki lists to mediawiki subdirectory 10 years ago
Emilio J. Rodríguez-Posada 43a105335b Merge pull request #185 from PiRSquared17/fix-tests
Relax delay() test by 10 ms, add test for allpages
10 years ago
PiRSquared17 d7e43f92c7 Relax delay() test by 10 ms, add test for allpages 10 years ago
nemobis f52051f8ae Merge pull request #184 from PiRSquared17/fix-tests
Fix tox.ini and clean up/update tests, avoid a loop to make tests pass
10 years ago
PiRSquared17 b4818d2985 Avoid infinite loop in getImageNamesScraper 10 years ago
PiRSquared17 f2b7716e72 Fix tox.ini and clean up/update tests 10 years ago
nemobis 8a9b50b51d Merge pull request #183 from PiRSquared17/patch-7
Retry on ConnectionError in getXMLPageCore
10 years ago
nemobis 9828cbec3c Add PiRSquared17 to credits 10 years ago
nemobis 19c48d3dd0 Merge pull request #180 from PiRSquared17/patch-2
Get as much information from siteinfo as possible
10 years ago
nemobis d8360393da Merge pull request #182 from PiRSquared17/patch-6
AllPages API fix for old MediaWiki versions
10 years ago
Pi R. Squared f7187b7048 Retry on ConnectionError in getXMLPageCore
Previously it just gave a fatal error.
10 years ago
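
The commit above replaces a fatal error with a retry. As a rough illustration only (this is not the code from getXMLPageCore), retrying on a connection error might look like the sketch below. The helper name `retry_on_connection_error` and its parameters `fetch`, `max_retries`, `base_delay`, `exceptions` and `sleep` are all hypothetical. The default exception is Python's built-in `ConnectionError`; with the requests library one would presumably pass `requests.exceptions.ConnectionError` instead.

```python
import time


def retry_on_connection_error(fetch, max_retries=5, base_delay=1.0,
                              exceptions=(ConnectionError,), sleep=time.sleep):
    """Call fetch() until it succeeds or max_retries attempts have failed.

    Hypothetical helper: every name and default here is an assumption.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return fetch()
        except exceptions:
            if attempt == max_retries:
                # Out of attempts: let the last error propagate to the caller.
                raise
            # Wait a little longer after each failed attempt.
            sleep(base_delay * attempt)
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays.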
Pi R. Squared f31e4e6451 Dict not hashable, also not needed
Quick fix.
10 years ago
Pi R. Squared 399f609d70 AllPages API hack for old versions of MediaWiki
New API format: http://www.mediawiki.org/w/api.php?action=query&list=allpages&apnamespace=0&apfrom=!&format=json&aplimit=500
Old API format: http://wiki.damirsystems.com/api.php?action=query&list=allpages&apnamespace=0&apfrom=!&format=json
10 years ago
nemobis b3e77fe006 Merge pull request #181 from PiRSquared17/patch-4
Try getting index.php from siteinfo API
10 years ago
nemobis 9beda42385 Merge pull request #137 from hashar/tests-with-tox-and-nose
Easily run tests in a virtualenv with tox and nose
10 years ago
Pi R. Squared 498b64da3f Try getting index.php from siteinfo API
Fixes #49
10 years ago
Pi R. Squared ff0d230d08 Get as much information from siteinfo as possible
Properly fixes #74.

Algorithm:
1. Try all siteinfo props. If this gives an error, continue. Otherwise, stop.
2. Try MediaWiki 1.11-1.12 siteinfo props. If this gives an error, continue. Otherwise, stop.
3. Try minimal siteinfo props. Stop.
Not using sishowalldb=1 to avoid possible error (by default), since this data is of little use anyway.
10 years ago
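
The three-step algorithm in the commit above can be sketched roughly as follows. This is an illustration, not the repository's code: `fetch_siteinfo`, `query_api` and the contents of the property lists are hypothetical, and the sketch assumes that a failed request shows up as an "error" key in the decoded JSON response.

```python
def fetch_siteinfo(query_api, prop_sets):
    """Try each siprop set in order and return the first error-free response.

    query_api is a hypothetical callable that takes a dict of API parameters
    and returns the decoded JSON response as a dict. The result of the last
    set is returned even if it contains an error, matching step 3
    ("Try minimal siteinfo props. Stop.").
    """
    result = None
    for props in prop_sets:
        result = query_api({
            "action": "query",
            "meta": "siteinfo",
            "siprop": "|".join(props),
            "format": "json",
        })
        if "error" not in result:
            return result
    return result


# Illustrative property sets, from most to least demanding (assumed values,
# not necessarily the lists used in the commit).
PROP_SETS = [
    ["general", "namespaces", "statistics", "interwikimap", "extensions"],
    ["general", "namespaces", "statistics"],
    ["general", "namespaces"],
]
```

Ordering the sets from richest to most minimal means a modern wiki answers on the first request, while older installations cost at most two extra requests.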
nemobis ac1a7defae Merge pull request #178 from PiRSquared17/patch-1
Encode title using UTF-8 before printing
10 years ago
Pi R. Squared 322604cc23 Encode title using UTF-8 before printing
This fixes #170 and closes #174.
10 years ago
nemobis 11368310ee Merge pull request #173 from nemobis/issue/131
Fix #131: ValueError: No JSON object could be decoded
10 years ago