Refs #209 - Increase score for elements containing large amount of text.

pull/212/head
Nicolas Perriault 9 years ago
parent 8510106638
commit 2c5ba594dd

@@ -110,7 +110,7 @@ Readability.prototype = {
unlikelyCandidates: /banner|combx|comment|community|disqus|extra|foot|header|menu|related|remark|rss|share|shoutbox|sidebar|skyscraper|sponsor|ad-break|agegate|pagination|pager|popup/i,
okMaybeItsACandidate: /and|article|body|column|main|shadow/i,
positive: /article|body|content|entry|hentry|main|page|pagination|post|text|blog|story/i,
-negative: /hidden|banner|combx|comment|com-|contact|foot|footer|footnote|masthead|media|meta|outbrain|promo|related|scroll|share|shoutbox|sidebar|skyscraper|sponsor|shopping|tags|tool|widget/i,
+negative: /hidden|banner|combx|comment|com-|contact|control-?bar|foot|footer|footnote|masthead|media|meta|outbrain|promo|related|scroll|share|shoutbox|sidebar|skyscraper|sponsor|shopping|tags|tool|widget/i,
extraneous: /print|archive|comment|discuss|e[\-]?mail|share|reply|all|login|sign|single|utility/i,
byline: /byline|author|dateline|writtenby/i,
replaceFonts: /<(\/?)font[^>]*>/gi,
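
The only change in this hunk is the new `control-?bar` alternative in `negative`. As a minimal sketch of its effect, the snippet below copies the old and new patterns from the hunk and tests a few class names against them. The helper `matchesNegative` and the sample class names are hypothetical and not part of Readability.

```javascript
// Patterns copied from the hunk above: before and after the change.
const negativeBefore = /hidden|banner|combx|comment|com-|contact|foot|footer|footnote|masthead|media|meta|outbrain|promo|related|scroll|share|shoutbox|sidebar|skyscraper|sponsor|shopping|tags|tool|widget/i;
const negativeAfter = /hidden|banner|combx|comment|com-|contact|control-?bar|foot|footer|footnote|masthead|media|meta|outbrain|promo|related|scroll|share|shoutbox|sidebar|skyscraper|sponsor|shopping|tags|tool|widget/i;

// Hypothetical helper: does a class/id string match the given pattern?
function matchesNegative(pattern, classAndId) {
  return pattern.test(classAndId);
}

// "control-bar" and "ControlBar" only match after the change;
// "control-panel" matches neither pattern.
for (const name of ["control-bar", "ControlBar", "control-panel"]) {
  console.log(name, matchesNegative(negativeBefore, name), matchesNegative(negativeAfter, name));
}
```

The optional hyphen (`-?`) lets one alternative cover both the hyphenated and the run-together spelling, and the `i` flag covers case variants.
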
@@ -722,6 +722,10 @@ Readability.prototype = {
// For every 100 characters in this paragraph, add another point. Up to 3 points.
contentScore += Math.min(Math.floor(innerText.length / 100), 3);
+if (elementToScore.tagName !== "section" && innerText.length > 300) {
+contentScore *= 1.5;
+}
// Initialize and score ancestors.
this._forEachNode(ancestors, function(ancestor, level) {
if (!ancestor.tagName)
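
To make the arithmetic in this hunk concrete, here is a rough standalone sketch of the scoring step. The function `scoreText` and its `baseScore` parameter are hypothetical stand-ins for the surrounding Readability code, which is not shown in the diff. One deliberate difference, labeled as an assumption: the sketch lower-cases the tag name before comparing, because `tagName` is normally upper-case for HTML elements, whereas the hunk compares against `"section"` directly.

```javascript
// Hypothetical standalone version of the paragraph scoring step.
// `baseScore` stands in for whatever contentScore holds before this point.
function scoreText(tagName, innerText, baseScore) {
  let contentScore = baseScore;

  // For every 100 characters, add another point, up to 3 points.
  contentScore += Math.min(Math.floor(innerText.length / 100), 3);

  // Added by this commit: a 1.5x bonus for text longer than 300 characters
  // in elements other than <section>.
  // Assumption: compare case-insensitively (the hunk itself does not).
  if (tagName.toLowerCase() !== "section" && innerText.length > 300) {
    contentScore *= 1.5;
  }
  return contentScore;
}

console.log(scoreText("P", "x".repeat(250), 1));       // 1 + 2 = 3
console.log(scoreText("P", "x".repeat(400), 1));       // (1 + 3) * 1.5 = 6
console.log(scoreText("SECTION", "x".repeat(400), 1)); // 1 + 3 = 4, no bonus
```

Note that the multiplier applies to the whole running score, not only to the length-based points, so elements that already scored well gain the most from the bonus.
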

@@ -1,4 +1,7 @@
<div id="readability-page-1" class="page">
<article itemscope="" itemtype="http://schema.org/NewsArticle" class="standalone">
<header> </header>
<section id="article-guts">
<div itemprop="articleBody" class="article-content clearfix">
<figure class="intro-image image center full-width"> <img src="http://cdn.arstechnica.net/wp-content/uploads/2015/04/server-crash-640x426.jpg" width="640" height="331">
<figcaption class="caption"> </figcaption>
@@ -45,4 +48,14 @@
</blockquote>
<p>Ars is asking Mojang for comment and will update this post if company officials respond.</p>
</div>
<p><a href="http://arstechnica.com/security/2015/04/16/just-released-minecraft-exploit-makes-it-easy-to-crash-game-servers/">Expand full story</a></p>
</section>
<div id="article-footer-wrap">
<aside class="thin-divide-bottom"> </aside>
<section class="article-author clearfix-redux">
<a href="http://fakehost/author/dan-goodin"><img width="47" height="47" src="http://cdn.arstechnica.net/wp-content/uploads/authors/Dan-Goodin-sq.jpg"></a>
<p><a href="http://fakehost/author/dan-goodin" class="author-name">Dan Goodin</a> / Dan is the Security Editor at Ars Technica, which he joined in 2012 after working for The Register, the Associated Press, Bloomberg News, and other publications.</p>
</section>
</div>
</article>
</div>

@@ -0,0 +1,6 @@
{
"title": "Nob Hill one-bedroom sells for $2.3 million",
"byline": "By Emily Landes",
"excerpt": "In early March, we told you about a one-bedroom in the prestigious Comstock in Nob Hill, which came to market at $2.495 million, making it the most expensive one-bedroom on the market at the time. The northwest facing unit—with amazing panoramic views of Twin Peaks, the Golden Gate Bridge, the",
"readerable": true
}

@@ -0,0 +1,292 @@
<div id="readability-page-1" class="page">
<div class="entry">
<div class="gallery clearfix">
<div class="galleria asset_gallery" id="galleria_31011101">
<div id="hst-resgallery-31011101" class="hst-resgallery-container clearfix"><span gallery_title="1333 Jones St. #806" gallery_id="31011" class="gallery-metadata"></span>
<div class="control-panel"><span data="" id="gallerySection"></span>
<div class="control-panel-inner">
<div class="caption ">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">1</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">The one-bedroom just came to market at $2.495 million</p>
<div class="caption_480">
<p> The one-bedroom just came to market at $2.495 million </p>
</div>
</div>
<p class="MsoNormal">The one-bedroom just came to market at $2.495 million</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">2</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Before the massive renovation, the living area still had great views, but was dated.</p>
<div class="caption_480">
<p> Before the massive renovation, the living area still had great views, but was dated. </p>
</div>
</div>
<p class="MsoNormal">Before the massive renovation, the living area still had great views,...but was dated.</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">3</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Beautiful views from the floor-to-ceiling windows.</p>
<div class="caption_480">
<p> Beautiful views from the floor-to-ceiling windows. </p>
</div>
</div>
<p class="MsoNormal">Beautiful views from the floor-to-ceiling windows.</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">4</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">The open concept living room</p>
<div class="caption_480">
<p> The open concept living room </p>
</div>
</div>
<p class="MsoNormal">The open concept living room</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">5</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">The dining area</p>
</div>
<p class="MsoNormal">The dining area</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">6</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">The open entertaining space</p>
<div class="caption_480">
<p> The open entertaining space </p>
</div>
</div>
<p class="MsoNormal">The open entertaining space</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">7</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">The dining area before</p>
</div>
<p class="MsoNormal">The dining area before</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">8</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">The den</p>
</div>
<p class="MsoNormal">The den</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">9</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Another view of the den</p>
</div>
<p class="MsoNormal">Another view of the den</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">10</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">The kitchen</p>
</div>
<p class="MsoNormal">The kitchen</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">11</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">The kitchen</p>
</div>
<p class="MsoNormal">The kitchen</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">12</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Custom cabinets and granite counters</p>
<div class="caption_480">
<p> Custom cabinets and granite counters </p>
</div>
</div>
<p class="MsoNormal">Custom cabinets and granite counters</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">13</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Another shot of the kitchen</p>
<div class="caption_480">
<p> Another shot of the kitchen </p>
</div>
</div>
<p class="MsoNormal">Another shot of the kitchen</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">14</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">The old, enclosed kitchen</p>
<div class="caption_480">
<p> The old, enclosed kitchen </p>
</div>
</div>
<p class="MsoNormal">The old, enclosed kitchen</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">15</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Views from the kitchen</p>
</div>
<p class="MsoNormal">Views from the kitchen</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">16</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Built-in office nook</p>
</div>
<p class="MsoNormal">Built-in office nook</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">17</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">The only bedroom also takes advantage of the views. </p>
<div class="caption_480">
<p> The only bedroom also takes advantage of the views. </p>
</div>
</div>
<p class="MsoNormal">The only bedroom also takes advantage of the views. </p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">18</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Another shot of the bedroom</p>
<div class="caption_480">
<p> Another shot of the bedroom </p>
</div>
</div>
<p class="MsoNormal">Another shot of the bedroom</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">19</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">The bedroom before</p>
</div>
<p class="MsoNormal">The bedroom before</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">20</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Master closet</p>
</div>
<p class="MsoNormal">Master closet</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">21</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Master bath</p>
</div>
<p class="MsoNormal">Master bath</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">22</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Another shot of the master bath</p>
<div class="caption_480">
<p> Another shot of the master bath </p>
</div>
</div>
<p class="MsoNormal">Another shot of the master bath</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">23</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Soaking tub</p>
</div>
<p class="MsoNormal">Soaking tub</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">24</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Entry</p>
</div>
<p class="MsoNormal">Entry</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">25</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Standing on the balcony</p>
</div>
<p class="MsoNormal">Standing on the balcony</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">26</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Views from the northwest facing condo</p>
<div class="caption_480">
<p> Views from the northwest facing condo </p>
</div>
</div>
<p class="MsoNormal">Views from the northwest facing condo</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">27</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">More views</p>
</div>
<p class="MsoNormal">More views</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">28</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Golden Gate Bridge views</p>
</div>
<p class="MsoNormal">Golden Gate Bridge views</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">29</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">From inside the condo</p>
</div>
<p class="MsoNormal">From inside the condo</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">30</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">This corner window before</p>
<div class="caption_480">
<p> This corner window before </p>
</div>
</div>
<p class="MsoNormal">This corner window before</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">31</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">The Comstock</p>
</div>
<p class="MsoNormal">The Comstock</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">32</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">The lobby</p>
</div>
<p class="MsoNormal">The lobby</p>
<div class="caption staged">
<div class="slide-count">
<p class="nav-stats"> <span class="label">Image </span> <span class="item-count-current">33</span> <span> of </span><span class="item-count-total">33</span> <span class="gallery-title-divider">|</span><span class="main-gallery-title">1333 Jones St. #806</span> </p>
</div><span class="credit">MLS </span>
<p class="MsoNormal">Another view of the lobby</p>
<div class="caption_480">
<p> Another view of the lobby </p>
</div>
</div>
<p class="MsoNormal">Another view of the lobby</p>
</div>
</div>
</div>
</div>
</div>
<p>In early March, we told you about <a href="http://blog.sfgate.com/ontheblock/2015/03/05/2-495-million-for-s-f-s-most-expensive-one-bedroom/">a one-bedroom in the prestigious Comstock </a>in Nob Hill, which came to market at $2.495 million, making it the most expensive one-bedroom on the market at the time. The northwest facing unit—with amazing panoramic views of Twin Peaks, the Golden Gate Bridge, the North Bay, Alcatraz and Coit Tower in almost every room—has now sold for $2.3 million, or 8% under the asking price.</p>
<p>At just over 1,800 square feet, the sale works out to about $1,250 a square foot for the completely gutted and remodeled unit. That price is also about $700K higher than when the unit last sold at the end of 2006. Of course, it was also in a very different state at the time, with old carpeting; a closed-off, very outdated kitchen; and two bedrooms. (<strong>Before and afters are in the gallery above.</strong>)</p>
<p>Almost all the walls came down during the remodel and the carpeting was changed out for hardwoods, not to mention the addition of a completely new open concept kitchen and modern bathrooms. Also, the second bedroom was changed into a den/office with a glass wall separating it from the rest of the large entertaining space. Even the windows were updated to take better advantage of the unbelievable views.</p>
<p>By the way, if you're looking for the same great views with a (slightly) smaller price tag, <a href="http://www.comstock1402.com">a 1,200-square-foot one-bedroom</a> just came to market in the same building. You'll get less indoor space but what appears to be a larger balcony to take in those incredible San Francisco sights, all for the bargain price of $1.75 million.</p>
<p><em>Emily Landes is a writer and editor who is obsessed with all things real estate.</em></p>
</div>
</div>

File diff suppressed because one or more lines are too long

@@ -18,7 +18,7 @@ function reformatError(err) {
function runTestsWithItems(label, beforeFn, expectedContent, expectedMetadata) {
describe(label, function() {
-this.timeout(5000);
+this.timeout(10000);
var result;
