Mišo Belica
e42cfbe487
Cleanups
11 years ago
Mišo Belica
d40a89a683
Use nose collector for tests
11 years ago
Mišo Belica
e6b3567417
Be ready for wheel binary packaging
More info at http://pythonwheels.com/
11 years ago
Mišo Belica
687d2ecfdf
Merge branch 'master' of https://github.com/bookieio/breadability into upstream-sync
Conflicts:
CHANGELOG.rst
README.rst
breadability/document.py
breadability/scoring.py
breadability/scripts/client.py
setup.py
tests/test_articles/test_sweetshark/article.html
tests/test_articles/test_sweetshark/test.py
11 years ago
Mišo Belica
6549a6c307
Added alternative "newspaper" into README
11 years ago
Richard Harding
6906f3b2fa
Update logging to drop WARN to INFO
11 years ago
Richard Harding
347f3ea0b5
Lint
11 years ago
Richard Harding
17270db5f0
Add test for title
11 years ago
Richard Harding
19d3ee634c
Update readme to note py3 ready
11 years ago
Richard Harding
ca8bee0a7b
Update to 0.1.15
11 years ago
Richard Harding
1fc153d850
Rename it back. Respect others
11 years ago
Richard Harding
4cbde9cb5a
Don't need the old versions any more
11 years ago
Richard Harding
f4fa0c1040
Working on merging/updating changelog, news, and makefile
11 years ago
Richard Harding
dc0493f99b
Update to catch back up to craig's image helper
11 years ago
Richard Harding
433195e122
Update syncing with the other branch
11 years ago
Richard Harding
e9485b6fdf
Tests working, makefile back into play
11 years ago
Richard Harding
d6317cd2ce
Sync up with the fork
11 years ago
Mišo Belica
5f1b39fe0b
Cleanups [ci skip]
11 years ago
Mišo Belica
09b4040578
Append sibling node only when it doesn't already exist
11 years ago
Mišo Belica
3746ee5bb5
Treat images a little differently so they get more inclusion
- When the body of the article contains screenshots/etc we want
to try to keep those images around.
- Added test for Business Insider article.
- Added sweetshark test.
- Added craig to the credits.
11 years ago
Mišo Belica
02160fe2ae
Cleanup
11 years ago
Mišo Belica
573a05f940
Added alternative "python-goose" into README
11 years ago
Mišo Belica
d138b6394e
Cleanups
11 years ago
Mišo Belica
e5401d7ab2
Added URL into User-Agent string
11 years ago
Mišo Belica
d530acb8c6
I discovered maintainer meta-data parameter
11 years ago
Mišo Belica
c091249162
Changed execution of nosetests
11 years ago
Richard Harding
042779bd12
Update version to 0.1.14
11 years ago
Richard Harding
05e13a4834
Update to only append sibling if we don't already have it
11 years ago
Richard Harding
952ea273c5
Update to version 0.1.13
11 years ago
Craig Maloney
9b9ec5b0e6
Treat images a little differently so they get more inclusion.
- When the body of the article contains screenshots/etc we want to try to keep
those images around.
- Added test for Business Insider article
- Adding sweetshark test from issue #1
- Add craig to the credits
11 years ago
Mišo Belica
471db19a43
Added BTE tool into similar tools to readme
11 years ago
Mišo Belica
43cc38dc7b
Cleanup
11 years ago
Richard Harding
37c6c41d29
Update versions for 0.1.12
11 years ago
macmenot
4f2b744a3a
Set urllib useragent string.
- Use a custom string to help with identifying traffic
- Update version to 0.1.12
- Small linting
Adjust the user agent string, lint
11 years ago
Mišo Belica
81ba7aec3c
Create console scripts with python version suffix
11 years ago
Mišo Belica
51df29f05d
Write readable content into temp file in binary mode
11 years ago
Mišo Belica
42530d4af7
Use py3k compatible urllib with own User-Agent header
11 years ago
Mišo Belica
9ed02047dd
Added string representation for empty scored node
11 years ago
Mišo Belica
7630237b86
Added missing empty line
11 years ago
Mišo Belica
c34bc53d9e
Updated list of similar tools
11 years ago
Mišo Belica
bf6cfef556
Renamed '_py3k.py' -> '_compat.py'
11 years ago
Mišo Belica
bd084a8e28
Fixed named argument name 'fragment'
11 years ago
Mišo Belica
8f3ebf0950
Removed file with version number
11 years ago
Mišo Belica
8c775fee7f
Added new test article
11 years ago
Mišo Belica
c9afc38c49
Cleanups for function 'clean_document'
11 years ago
Mišo Belica
5c20673d45
Don't remove h1/h2 elements from readable article
11 years ago
Mišo Belica
c9e087d077
Cleanups
11 years ago
Mišo Belica
e0c87223ae
Better log messages while scoring candidates
11 years ago
Mišo Belica
df5cb8c8f6
Added scored nodes into candidates
11 years ago
Mišo Belica
f858f0dbb0
1 pt for 100 inner text chars is computed as float
11 years ago