Intermediate Rails: Understanding Models, Views and Controllers

BetterExplained.com better explains the MVC pattern.  The examples use Ruby on Rails, but that is incidental: many other web frameworks implement MVC in much the same way.  If you are unfamiliar or not yet comfortable with MVC, read the article.  It will make things clearer.

boilerpipe – Boilerplate Removal and Fulltext Extraction from HTML pages

The boilerpipe library provides algorithms to detect and remove the surplus “clutter” (boilerplate, templates) around the main textual content of a web page.

The library already provides specific strategies for common tasks (for example: news article extraction) and may also be easily extended for individual problem settings.

Extraction is very fast (on the order of milliseconds), needs only the input document (no global or site-level information), and is usually quite accurate.
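
To give a feel for what "boilerplate removal from a single document" can look like, here is a toy sketch in Python. It is not boilerpipe's API or its actual algorithm (boilerpipe is a Java library); it only illustrates the general idea of scoring text blocks by shallow features such as word count and link density. The names `BlockCollector` and `extract_main_text` and the two thresholds are made up for this illustration.

```python
from html.parser import HTMLParser

# Tags treated as block boundaries in this toy example (an assumption, not a standard list).
BLOCK_TAGS = {"p", "div", "li", "td", "h1", "h2", "h3", "ul", "ol",
              "nav", "header", "footer", "article", "section"}
# Tags whose text content is never shown to the reader.
SKIP_TAGS = {"script", "style"}


class BlockCollector(HTMLParser):
    """Splits a page into text blocks and counts words and linked words per block."""

    def __init__(self):
        super().__init__()
        self.blocks = []      # list of (text, total_words, link_words)
        self._words = []
        self._link_words = 0
        self._in_link = 0
        self._skip = 0

    def flush(self):
        # Close the current block, if it contains any text.
        if self._words:
            self.blocks.append((" ".join(self._words), len(self._words), self._link_words))
        self._words = []
        self._link_words = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag == "a":
            self._in_link += 1
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip = max(0, self._skip - 1)
        elif tag == "a":
            self._in_link = max(0, self._in_link - 1)
        elif tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        if self._skip:
            return
        words = data.split()
        self._words.extend(words)
        if self._in_link:
            self._link_words += len(words)


def extract_main_text(html, min_words=10, max_link_density=0.33):
    """Keep blocks that are long enough and not dominated by link text."""
    collector = BlockCollector()
    collector.feed(html)
    collector.close()
    collector.flush()
    kept = [text for text, total, linked in collector.blocks
            if total >= min_words and linked / total <= max_link_density]
    return "\n".join(kept)
```

Calling `extract_main_text(page_html)` on a page returns the surviving blocks joined by newlines. Short blocks (menus, copyright lines) and link-heavy blocks (navigation bars) are dropped. Like the library described above, the sketch needs nothing but the input document, though a real extractor uses far more careful features and tuning.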