You are on page 1of 4

FiveFilters.

org: Full-Text RSS


http://fivefilters.org/content-only/
CHANGELOG
-----------------------------------3.2 (2013-05-14)
- A short excerpt from the first few lines of the extracted content can now be
included in the output (pass &summary=1 in querystring, see $options->summary in
config file for more info)
- Full content can now be excluded from the output (pass &content=0 in querystr
ing, see $options->content in config file for more info)
- Site config files can now be automatically updated from our GitHub repository
(URL to call visible in admin area)
- Site config files updated for better extraction
- PHP Readability updated to be more lenient when pruning HTML
- Language detection library updated
- HTML meta refresh redirects now also followed
- APC stats (if APC is available on your server) now visible in admin area
- Bug fix: Duplicate find_string and replace_string values in site config files
no longer removed (thanks Fabrizio!)
- Bug fix: MIME type actions now applied when following single page URLs
- Other minor fixes/improvements
3.1 (2013-03-06)
- PHP Readability updated to preserve more images/videos
- Site config files updated for better extraction
- SimplePie updated
- New config option favour_feed_titles and request parameter use_extracted_titl
e to allow extracted titles to be used in generated feed
- Remove image lazy loading (looks for markup used by http://wordpress.org/exte
nd/plugins/lazy-load/)
- <category> elements appearing inside <item> elements are now preserved in gen
erated feed
- <media:thumbnail> elements now preserved
- Allow multiple <media:content> elements (previously only one was preserved)
- Bug fix: No more self-closing iframe elements
- Bug fix: Fixed manifest.yml to prevent error message when deploying to AppFog
- Other minor fixes/improvements
3.0 (2012-09-04)
- Multi-page support - next_page_link now supported in site config (enable/disa
ble with $options->multipage)
- HTML5 parser available - use parser: html5lib in site config, also see $optio
ns->allowed_parsers
- Updated site patterns for better extraction
- New global site config to be applied to all sites (global.txt)
- APC caching of site config files to improve performance, if APC available - s
ee $options->apc
- Site config editor in admin/ - easily find, edit, test, and test site config
files, or add new ones
- Debug mode to see what's happening behind the scenes - see $options->debug
- Removed deprecated config options: restrict, message_to_prepend_with_key, mes
sage_to_append_with_key, error_message_with_key
- Removed extraction with CSS via querystring
- Removed config option: $options->alternative_url
- Bug fix: allow extraction of a single element
- Bug fix: redirect handling improved
- Strip 'http://' prefix when API key is supplied
- Site config merging (custom + standard + fingerprint + global)
- Site config command replace_string(find): replace can now be split over two l
ines: find_string: find, replace_string: replace

- YouTube and Vimeo URLs now return iframe embed code


- We now look for OpenGraph title and date elements
- Improved extraction from AJAX pages - we now look for AJAX triggers embedded
in HTML, per Google spec
- JSONP support - use &format=json&callback=functionName in querystring
- New config option to enable Cross-Origin Resource Sharing (CORS): $option->co
rs
- New config option to enable XSS filtering, if required: $option->xss_filter
- Zend_Cache updated
- Smart caching - experimental feature to store cache IDs in APC first, and wri
te output to disk on subsequent request (see $options->smart_cache)
- Easier cloud deploy - manifest.yml added for AppFog
- Override most config options with environment variables, e.g. ftr_max_entries
: 3
2.9.5 (2012-04-29)
- Language detection using Text_LanguageDetect or PHP-CLD (dc:language field in
output and $options->detect_language in config)
- New site patterns added and old ones updated
- Experimental tool for simpler site pattern updates (access admin/ folder)
- Plus other fixes/improvements
2.9.1 (2011-11-02)
- Fix: Character encoding issue affecting some non-English articles (makefullte
xtfeed.php and SimplePie/Misc.php changed)
2.9 (2011-11-01)
- New site patterns added and old ones updated
- New config option: require_key - restrict access to those with password/key
- New config option: rewrite_url - URL rewrite rules to be applied before HTTP
request
- New site config options to extract author(s) and publication date (matches in
cluded in feed item as <dc:creator> and <pubDate>)
- New site config option: replace_string([string to find]): [replacement string
]
- New site identification method: site fingerprints (HTML fragments linked to s
ite config)
- Update check now also checks for new site patterns
- Effective URL (URL after redirects/rewrites) now included in feed item as <dc
:identifier>
- Prevent indexing of generated feeds by search engines
- Enclosure support (enclosures preserved as <media:content> elements)
- Better handling of non-HTML content types
- Sending custom User-Agent HTTP header for matching sites now supported
- CSS extraction deprecated in favour of site patterns (still works, but form f
ield removed and feature may disappear in 3.0)
- Fix: Improved character-encoding detection
- Fix: URL parsing issues for certain URLs (SimplePie updated)
- Fix: Author and other Dublin Core (<dc:..>) elements now appear in JSON outpu
t
- Fix: Minor fixes for PHP Readability
- Plus other minor fixes/improvements
2.8 (2011-05-30)
- Tidy no longer stripping HTML5 elements
- JSON output (pass &format=json in querystring)
- New site patterns added and old ones updated
- New site config option to force full-page retrieval on multi-page articles: s
ingle_page_link
- User Guide (PDF) now included (although still a work in progress)

- URL placeholders now accepted in message_to_prepend/append config options


- Plus minor fixes...
2.7 (2011-03-21)
- Site patterns for better control over extraction (see site_config/README.txt)
- hNews support (improves content extraction for sites using hNews microformatt
ing)
- Cookie Jar now used to store and sends cookies when following HTTP redirects
- Better handling of certain cases where HTML Tidy fails to clean up properly
- Bug fix: curl_multi_select() timing out in certain environments (fixed in Hum
bleHttpAgent.php)
- Bug fix: broken HTTP header parsing in some environments (fixed in SimplePie_
HumbleHttpAgent.php)
- Bug fix: invalid API URL shown (fixed in index.php)
- Plus other minor fixes...
2.6 (2011-03-02)
- Rewriting of hash-bang (#!) URLs (see http://www.tbray.org/ongoing/When/201x/
2011/02/09/Hash-Blecch for an explanation)
- Improved parallel fetching support (HumbleHttpAgent uses curl_multi_* functio
ns if PECL HTTP extension is not present)
- Improved HTTP redirect support (now handled in HumbleHttpAgent, no longer rel
ies on PHP)
- Improved performance for single page (non-feed) requests: (SimplePie connecte
d to HumbleHttpAgent)
- Improved memory use for processing large feeds (HumbleHttpAgent's stored resp
onses cleared as they're retrieved)
- Bug fix: exclude on fail option no longer requires valid key
- Bug fix: workaround for PHP bug http://bugs.php.net/51192 (fixed in makefullt
extfeed.php)
- Plus other minor changes...
2.5 (2011-01-08)
- New option: custom extraction pattern (CSS selectors)
- New option: allowed URLs (restrict service to pre-defined feeds/domains)
- New option: exclude items on fail (remove items from feed if content extracti
on fails)
- Remove 'http://' from URL before form submission (prevents errors on hosts wh
ich have overly vigilant security software)
- Allow overriding of index.php with custom_index.php
- config.php now required (override with custom_config.php)
- index.php now uses config.php to determine what to display
- Bug fix: occasional fatal error in IRI::__toString() (IRI updated)
- Bug fix: workaround for PHP bug http://bugs.php.net/51192 (fixed in HumbleHtt
pAgent.php)
2.2 (2010-10-30)
- Character-encoding detection improved (minor change)
- Rewriting of relative URLs improved (tracks redirect URLs)
- Minor changes to prevent errors in certain hosting environments
- Compatibility test file updated with more tests
2.1 (2010-09-13)
- Better content extraction (using PHP Readability 1.7.1)
- Parallel HTTP requests (using Humble HTTP Agent)
- Auto loading of necessary classes
- Rewriting of relative URLs (using IRI)
- Added compatibility test file (to check if server meets requirements)
- Character-encoding support improved (using SimplePie)

1.5 (2010-05-30)
- Support for PHP 5.3 (thanks Murilo!)
- Character-encoding support improved (favours iconv over mb_convert_encoding)
1.0 (2010-03-05)
- Better support for different character-encodings
- Auto-cleanup of cache files
- Very basic option for load distribution (if you're planning on installing the
code on multiple servers)
- Separate config file (see config-sample.php)

You might also like