Commit Graph

227 Commits (master)
 

Author SHA1 Message Date
Richard Harding 84f6a079f9 Try to adjust the travis command to test py2.6 12 years ago
Richard Harding b18589ced8 Use the right package doh 12 years ago
Richard Harding 316c550709 Add python 2.6 to the travis ci 12 years ago
Richard Harding fee5c37b39 Add argparse as an install req for py <2.7 12 years ago
Richard Harding 3dea2f349b Update ignore file 12 years ago
Nathan Nifong 920094c81a Add a penalty for double quote chars in paragraphs.
- They are far more common in random commented code and proprietary metadata
  that keeps slipping by the filter as actual content.
- Downgraded the score value of commas for the same reason.
- Prep for 0.1.10 release with these changes.

Add credits and tweak the " and , scoring

Update version and update the scoring code
12 years ago
Richard Harding 60da675da5 Reprocess without candidate in case of errors using one
- Fixes #10
12 years ago
Richard Harding 3984e04668 Add better handling around xml parsing issues
- Fixes #9 with empty/non parsable docs
- Fixes #8 and removes kwargs for the decode statements.
- Fixes #7 by checking if the node has a parent before dropping.
12 years ago
Richard Harding fe9364295f prep for 0.1.7 release 12 years ago
Richard Harding ae355e9f2f Update kwarg for older python 12 years ago
Richard Harding 0de17a7b81 Update readme 12 years ago
Richard Harding e592f5322e Prep for 0.1.6 12 years ago
Richard Harding bf35e3410e Do some link filtering to drop stupid permalinks from the content. 12 years ago
Richard Harding 9cf19d9970 Prep for 0.1.5 12 years ago
Richard Harding ff37f3169f Add checks to links to remove really bad links from the scripting site 12 years ago
Richard Harding 5157b4570d Prep for the 0.1.4 release 12 years ago
Richard Harding 5704eb4c15 Start process of adding a newtest script for generating test cases
- Adds new breadability_newtest tool for generating test cases.
- Add fixes for the scripting.com test failure.
12 years ago
Richard Harding 3b00d33ad3 Prep for 0.1.3 release 12 years ago
Richard Harding c2f935bf51 Remove code we didn't need 12 years ago
Richard Harding 326fbfe107 Fix the processing and clean up the antipope article 12 years ago
Richard Harding 3ae64f165e Update and merge 12 years ago
Richard Harding edca1c74ba Add in test files for antipope blog post 12 years ago
Richard Harding d3c83b7255 Update scoring and tests for the antipope article 12 years ago
Richard Harding 3f70a49a22 Update to fix client, add head to the css downgrade weights 12 years ago
Richard Harding 46ede7ccfb Prep for 0.1.2 release 12 years ago
Richard Harding 811921775c Started to do some testing, but really not happy with it 12 years ago
Richard Harding 7c220535df Complete upstream merge 12 years ago
Greg Jastrab c8c53b304b Bonus per 100 chars logic was incorrect
Number of characters was being mod'd by 100 instead of divided,
so a paragraph with a character length of 103 would have
incorrectly gotten 3 bonus points added to the content score.

Add Greg to credits
12 years ago
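
The difference between the two operators in the fix above can be sketched in a few lines of Python. The function names are hypothetical and are not taken from the breadability source; the real scoring code may apply caps or other terms that are not shown here.

```python
def length_bonus_modulo(text):
    """Buggy variant described in the commit: remainder instead of quotient."""
    return len(text) % 100


def length_bonus_divided(text):
    """Corrected variant: one point per full 100 characters."""
    return len(text) // 100


# The 103-character case from the commit message.
paragraph = "x" * 103
print(length_bonus_modulo(paragraph))   # 3
print(length_bonus_divided(paragraph))  # 1
```

With the modulo, a 199-character paragraph would score 99 points while a 200-character one would score 0, which is why the bug distorted candidate ranking so badly.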
Richard Harding be77f99be1 Add doc and candidates properties to the article 12 years ago
Richard Harding 2e3f416e3b Garden 12 years ago
Richard Harding e83a753b82 Garden and lint 12 years ago
Richard Harding 6d380712c5 Start process of testing full candidate scoring 12 years ago
Richard Harding ae9208374b Add some ScoredNode tests as well 12 years ago
Richard Harding e57f8f02ce Adding tests for the id/css weights and link density 12 years ago
Richard Harding 90a02569ca Prep for 0.1.1 release 12 years ago
Richard Harding e168484126 Garden readme 12 years ago
Richard Harding 645838c66c Update readme with ci and other important links 12 years ago
Richard Harding 1553eda145 Fix typo in travis config 12 years ago
Richard Harding ad3685d4f4 Start to add items to get travis ci builds working 12 years ago
Richard Harding 56f29a8585 Mark true so we can start sending tests to travisci 12 years ago
Richard Harding 32350fc3a1 Create LNODE and update bugs in parsing
- Add concept of a LNODE logger that outputs information about scoring, node,
    and generates a hash_id for the node content so we can track it.
- Add `-d` flag to the cmd line client to output the LNODE logging
- Update reading in of http content in the client to be unicode
- Wrap stdout with a unicode happy stream so we can pipe unicode to less/grep,
    etc
- Add html article to the scorable tags we work with
- Make sure we drop iframe along with noscript
- Fix scoring bugs around length points
- Add the hash_id as a scored node @property
12 years ago
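
One way a content-derived hash_id could be exposed as a property is sketched below. This is an illustration only: the commit does not say which hash function or identifier length is used, so MD5 and the 8-character truncation are assumptions, and the ScoredNode class here is a toy stand-in rather than the project's class.

```python
import hashlib


def generate_hash_id(content):
    """Return a short, stable identifier for a node's text content.

    Assumption: MD5 truncated to 8 hex characters; the project may differ.
    """
    digest = hashlib.md5(content.encode("utf-8")).hexdigest()
    return digest[:8]


class ScoredNode(object):
    """Toy stand-in for a scored node; only the hash_id property is shown."""

    def __init__(self, content, score=0):
        self.content = content
        self.score = score

    @property
    def hash_id(self):
        # Derived from content, so the same node text always maps to the same id.
        return generate_hash_id(self.content)
```

Because the identifier depends only on the content, the same node can be followed through separate log lines when piping the debug output to grep.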
Richard Harding f1623fc3e3 Redo the candidate logging to help us locate the best candidate 12 years ago
Richard Harding 278d695614 Update readme for the new cmd line flags 12 years ago
Richard Harding 6b92dd2f83 Add -f and -b flags to client
- Added a -f flag that overrides the default of returning only a <div>
fragment and returns a fully constructed document instead.
- Added a -b flag that, rather than just parsing, writes the result to a temp
file and opens it in a browser, which is great for testing.
- Updated the Article to support fragment=False so that you can get back a
fully wrapped <html> document with a header (especially with the utf-8 content
type set, yay).
12 years ago
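
A minimal sketch of how a command-line client could wire up the two flags described above, using argparse (which the log lists as an install requirement for older Pythons). Everything apart from the flag letters -f and -b is an assumption: the helper names, the dest names, and the exact HTML wrapper are hypothetical and not taken from the breadability client.

```python
import argparse
import pathlib
import tempfile
import webbrowser


def build_parser():
    """Build a parser with hypothetical dest names for the -f and -b flags."""
    parser = argparse.ArgumentParser(description="Sketch of a readable-article client")
    # -f switches off fragment mode, so a full document is returned.
    parser.add_argument("-f", dest="fragment", action="store_false", default=True,
                        help="return a full document instead of a <div> fragment")
    # -b writes the result to a temp file and opens it in a browser.
    parser.add_argument("-b", dest="browser", action="store_true", default=False,
                        help="open the parsed result in a browser")
    parser.add_argument("path", help="file or URL to process")
    return parser


def wrap_content(content, fragment=True):
    """Return the content as-is, or wrapped in a full utf-8 HTML document."""
    if fragment:
        return content
    return ('<html><head><meta http-equiv="Content-Type" '
            'content="text/html; charset=utf-8"></head>'
            '<body>' + content + '</body></html>')


def open_in_browser(html):
    """Write html to a temporary file, open it in a browser, return the path."""
    with tempfile.NamedTemporaryFile("w", suffix=".html", delete=False,
                                     encoding="utf-8") as handle:
        handle.write(html)
    webbrowser.open(pathlib.Path(handle.name).as_uri())
    return handle.name


# Parsing an explicit argument list keeps the example independent of sys.argv.
args = build_parser().parse_args(["-f", "article.html"])
print(wrap_content("<div>body</div>", fragment=args.fragment))
```

Using store_false for -f keeps fragment output as the default, which matches the commit's description of -f as an override.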
Richard Harding 8b77675ab2 Fix up some tests since we should have run them before tagging 0.1...need to get into build server 12 years ago
Richard Harding 745598dff9 Update news file with initial release 12 years ago
Richard Harding 279788c003 Update the readme for install info 12 years ago
Richard Harding 9e6835bd92 Work on tweaking our parser algorithm to help find the right candidate: fixes #2 12 years ago
Richard Harding b78ea49c5a Update readme so people don't misunderstand 12 years ago
Richard Harding 454e283850 Add link to readability 12 years ago