Mišo Belica
aa83825334
Tests migrated into pytest style
6 years ago
Mišo Belica
e2f3391dc3
Better decoding page into unicode
- Fixes #22
- Fixes #23
Prepare for release
10 years ago
Mišo Belica
5cb028ec93
Tests are executable with pytest framework
Pytest ignores files with name "test.py" for me :(
10 years ago
Mišo Belica
d40a89a683
Use nose collector for tests
10 years ago
Mišo Belica
687d2ecfdf
Merge branch 'master' of https://github.com/bookieio/breadability into upstream-sync
Conflicts:
CHANGELOG.rst
README.rst
breadability/document.py
breadability/scoring.py
breadability/scripts/client.py
setup.py
tests/test_articles/test_sweetshark/article.html
tests/test_articles/test_sweetshark/test.py
10 years ago
Richard Harding
17270db5f0
Add test for title
11 years ago
Richard Harding
1fc153d850
Rename it back. Respect others
11 years ago
Richard Harding
dc0493f99b
Update to catch back up to craig's image helper
11 years ago
Richard Harding
433195e122
Update syncing with the other branch
11 years ago
Richard Harding
d6317cd2ce
Sync up with the fork
11 years ago
Mišo Belica
5f1b39fe0b
Cleanups [ci skip]
11 years ago
Mišo Belica
3746ee5bb5
Treat images a little differently so they get more inclusion
- When the body of the article contains screenshots/etc we want
to try to keep those images around.
- Added test for Business Insider article.
- Added sweetshark test.
- Added craig to the credits.
11 years ago
Mišo Belica
02160fe2ae
Cleanup
11 years ago
Mišo Belica
bf6cfef556
Renamed '_py3k.py' -> '_compat.py'
11 years ago
Mišo Belica
8c775fee7f
Added new test article
11 years ago
Mišo Belica
5c20673d45
Don't remove h1/h2 elements from readable article
11 years ago
Mišo Belica
df5cb8c8f6
Added scored nodes into candidates
11 years ago
Mišo Belica
f858f0dbb0
1 pt for 100 inner text chars is computed as float
11 years ago
Mišo Belica
d054823958
Added simple test for parser of annotated text
11 years ago
Mišo Belica
05d2230015
Load articles/snippets as binary strings
11 years ago
Mišo Belica
e6191fe0d1
Link density is computed with normalized whitespace
HTML code contains a lot of whitespace, and if there is a
large number of indentation characters, the link density
is small even if there are only links with useful
text.
11 years ago
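The effect described in the commit above can be illustrated with a minimal sketch. This is not the project's actual code: the helper names `normalize_whitespace` and `link_density`, their signatures, and the sample text are hypothetical and chosen only for illustration. It assumes link density means the share of a node's text characters that belong to link text.

```python
import re

# Hypothetical helpers for illustration only; not the project's real API.
_WHITESPACE = re.compile(r"\s+")


def normalize_whitespace(text):
    """Collapse every run of whitespace into one space and trim the ends."""
    return _WHITESPACE.sub(" ", text).strip()


def link_density(node_text, link_texts, normalize=True):
    """Return the fraction of the node's characters that sit inside links."""
    if normalize:
        node_text = normalize_whitespace(node_text)
        link_texts = [normalize_whitespace(text) for text in link_texts]
    if not node_text:
        return 0.0
    return sum(len(text) for text in link_texts) / len(node_text)


# A node that contains nothing but two links, surrounded by indentation.
links = ["First link", "Second link"]
node_text = "\n        First link\n        Second link\n    "

raw = link_density(node_text, links, normalize=False)
normalized = link_density(node_text, links, normalize=True)
print(raw < normalized)
```

Without normalization the indentation characters inflate the node's length, so a node made up only of links still gets a low density. With normalization the density comes out close to 1.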
Mišo Belica
c2a5b74230
Changed representation of annotated text
11 years ago
Mišo Belica
e366721873
Convert <hr> tag into paragraphs
11 years ago
Mišo Belica
3449a33d87
Test for changing multiple <br> into <p>
11 years ago
Mišo Belica
7bd7231e25
Renamed property of 'OriginalDocument': 'html' -> 'dom'
11 years ago
Mišo Belica
69dd9ef4fd
Changed 'readable_annotated_text' -> 'main_text'
11 years ago
Mišo Belica
0df3a95c1e
Property of ``Article`` with annotated text
11 years ago
Mišo Belica
f5939f4608
Skip unused tests instead of useless passing
11 years ago
Mišo Belica
6b87ac5e07
Use unicode literals from future, not 'to_string'
11 years ago
Mišo Belica
eb8a8c5248
Replaced deprecated method 'getiterator' by 'iter'
11 years ago
Mišo Belica
5abe69d917
Added new test article
11 years ago
Mišo Belica
0178cfff5c
Added compatibility file with unittest2 import
11 years ago
Mišo Belica
26fe24789c
Made packages from all tests
11 years ago
Mišo Belica
ee483a7f91
Changed location of test HTML files
11 years ago
Mišo Belica
3b5b2b1522
Renamed to readability
11 years ago
Mišo Belica
1a5970b238
Better names and positions for variables
11 years ago
Mišo Belica
930b6ced12
Fixed transformation of leaf <div> into <p>
11 years ago
Mišo Belica
18b5c9b447
Refactored file 'scoring.py'
11 years ago
Mišo Belica
dcb7c18fd5
Refactored file 'document.py'
Removed non-intuitive parts and dead code
not covered by tests. Better names for objects.
Better coverage by tests.
11 years ago
Mišo Belica
b3b987440d
Added test runner via nosetests
11 years ago
Mišo Belica
3f71e1b7d4
Refactored checking of node's attribute
11 years ago
Mišo Belica
636a38d705
Refactored generating of hash ID
11 years ago
Mišo Belica
9a613317c0
Make package from tests
11 years ago
Mišo Belica
cc00976533
Replace implementation of 'cached_property'
Parameter 'ttl' isn't needed.
11 years ago
Mišo Belica
e3b6ee2fd6
Suppress warning "ResourceWarning: unclosed file"
11 years ago
Mišo Belica
101950478e
Simplify logging
11 years ago
Mišo Belica
3322681166
Use 'charade' for detecting encoding
11 years ago
Mišo Belica
544220e9a3
Replaced u"" literal with function 'to_unicode'
Literal u"" is not supported by Python v3.2.
11 years ago
Mišo Belica
94f6b0a84e
Tests pass for both Python v2.7 and v3.3
11 years ago
Mišo Belica
912bb50b76
Skip failing test that I don't know how to fix
11 years ago