markmorgan wrote:Firstly every time Awasu refreshes the results of a Google News Search RSS feed it identifies every single item as updated.
Yes, if they're changing the text from "1 hour ago" to "2 hours ago" and so on, Awasu will flag each one as a new revision. In fact, Awasu can be configured to determine if an item has been revised in two ways: (1) if *anything* in the feed item's XML has changed, including stuff that you don't see or that isn't even part of RSS (e.g. an extension) or (2) if only key things like the title, description, URL, etc. have changed.
However, this isn't going to help you in this particular case since Google is changing the description, which will trigger a revision under either setting.
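To make the two modes concrete, here is a rough sketch of how such a check might work. This is not Awasu's actual code: the field names, the hashing approach and every function name below are assumptions made up for illustration.

```python
import hashlib

# Hypothetical list of "key" fields; Awasu's real list is not given in this thread.
KEY_FIELDS = ("title", "description", "link")

def make_item(title, description, link, extension=""):
    """Build a toy feed item: parsed key fields plus the raw XML they came from."""
    xml = (f"<item><title>{title}</title><description>{description}</description>"
           f"<link>{link}</link>{extension}</item>")
    return {"title": title, "description": description, "link": link, "xml": xml}

def full_fingerprint(item):
    """Mode 1: hash the whole raw XML, so any change at all counts."""
    return hashlib.sha256(item["xml"].encode("utf-8")).hexdigest()

def key_fingerprint(item):
    """Mode 2: hash only the key fields, ignoring everything else."""
    joined = "\x00".join(item.get(field, "") for field in KEY_FIELDS)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def is_revised(old, new, key_fields_only):
    """Return True if 'new' should be treated as a revision of 'old'."""
    if key_fields_only:
        return key_fingerprint(old) != key_fingerprint(new)
    return full_fingerprint(old) != full_fingerprint(new)

# Same story, but the description's age text has changed.
a = make_item("Story", "1 hour ago", "http://example.com/story")
b = make_item("Story", "2 hours ago", "http://example.com/story")
# Same story, only an (invented) extension element was added.
c = make_item("Story", "1 hour ago", "http://example.com/story", "<ext:rank>3</ext:rank>")

print(is_revised(a, b, key_fields_only=True))   # description changed
print(is_revised(a, c, key_fields_only=True))   # only the extension changed
```

In this sketch the change from a to b counts as a revision in both modes because the description is one of the key fields, while the change from a to c only counts in the "anything changed" mode.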
markmorgan wrote:So with a 2 hour update frequency and a 10 hour blackout window I shouldn't get more than around 7 revisions. Yet looking at a news article from 05/04/2006 it has 95!
Actually, 95 is about right: you'll get around 7 revisions per day, but with 14 days archived that adds up to 14 * 7 = 98.
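The arithmetic can be spelled out as a quick calculation. The function name is made up for illustration, and it assumes that every update outside the blackout window produces exactly one new revision.

```python
def revisions_per_day(update_interval_hours, blackout_hours):
    """Updates per day outside the blackout window (assumes one revision per update)."""
    active_hours = 24 - blackout_hours
    return int(active_hours // update_interval_hours)

per_day = revisions_per_day(update_interval_hours=2, blackout_hours=10)
archive_days = 14
total = per_day * archive_days

print(per_day, total)
```

With a 2 hour interval and a 10 hour blackout there are 14 active hours, hence 7 updates per day and an upper bound of 98 revisions over 14 days.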
markmorgan wrote:Is there any way of switching off the revision storing and only store the latest?
This is a good idea. I'll mull over it...
markmorgan wrote:The 'disable item revisions' option on the properties page implies that if you switch it off then every time the feed is read it will treat every item as new... Not the desired result. I want only the latest revision.
Turning this off causes Awasu to show each revision as a separate entry in the item pane. If it's on, they get collapsed down into a single entry with the "has-revisions" indicator.
markmorgan wrote:BTW are the revisions stored as deltas or full feed items?
Full items. However, everything is compressed so it's probably not going to be too bad.
markmorgan wrote:This may account for the degradation in my Awasu performance
I've got some strong suspicions that it's something else. I'll be doing some work on it for 2.2.2.