ArchLUG Kwiki - www.archlug.org

FeedAbuse


ArchLUG Kwiki - RSS Abuse Feed

This feed is served in place of the actual ArchLUG Kwiki RSS feed to aggregators and readers that repeatedly download our feed without checking whether it has changed. RSS feed abusers get this special feed instead of the real one.

Below is the content of that feed:


IP address blocked for RSS feed abuse

Your IP address has been blocked from retrieving RSS feeds from the ArchLUG web site because your RSS aggregator does not properly check whether our RSS feeds have changed before downloading them again. Repeatedly downloading unchanged feeds every hour, 24 hours a day, 7 days a week, uses excessive bandwidth on this web site and constitutes RSS feed and bandwidth abuse.

If you wish to be unblocked, you will need to correct the problem with your aggregator, or use a different aggregator, then contact us and let us know what you've done to address the problem.
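For aggregator authors, here is a minimal sketch of a fetch that checks for changes first. It assumes the server honors HTTP conditional requests (If-Modified-Since) and that curl is available; the function names apply_fetch_result and fetch_feed, the cache path, and the way the result is reported are made up for illustration and are not part of any particular aggregator.

```shell
#!/bin/bash
# Hypothetical helpers: names, paths and messages are illustrative only.

# Decide what to do with a finished request.
#   $1 = HTTP status, $2 = temp file holding the response body, $3 = cache file
apply_fetch_result()
{
  case "$1" in
    200) mv -f "$2" "$3"; echo "updated" ;;      # feed changed: replace the cache
    304) rm -f "$2"; echo "unchanged" ;;         # not modified: keep the cache
    *)   rm -f "$2"; echo "error $1"; return 1 ;;
  esac
}

# Fetch URL $1 into cache file $2, asking the server to skip the body
# when the feed is no newer than the cached copy.
fetch_feed()
{
  local url=$1 cache=$2 tmp status
  tmp=$(mktemp) || return 1
  if [ -f "$cache" ]; then
    # -z FILE sends FILE's mtime as If-Modified-Since;
    # -R stamps the download with the server's modification time.
    status=$(curl -sS -R -z "$cache" -o "$tmp" -w '%{http_code}' "$url")
  else
    status=$(curl -sS -R -o "$tmp" -w '%{http_code}' "$url")
  fi
  apply_fetch_result "$status" "$tmp" "$cache"
}
```

A call such as fetch_feed http://www.archlug.org/kwiki/feed.rss /path/to/cache/feed.rss would then transfer the feed body only when the server reports a change.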


Here's the script I use. It runs once a day from the web server user's crontab:

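#!/bin/bash
#
# Usage: rss-abuse.sh [access_log] [date] [abuse_file] [meta_file]
log_file=${1:-/path/to/access_log}
today=${2:-$(date '+%d/%b/%Y')}
abuse_file=${3:-/path/to/kwiki/database/IP-FeedAbuse}
meta_file=${4:-/path/to/kwiki/metabase/metadata/IP-FeedAbuse}
#
touch "$abuse_file"
# Append "key -" to the abuse file unless that exact line is already there.
function add_entry()
{
  grep -Fxq -- "$1 -" "$abuse_file" || printf '%s -\n' "$1" >> "$abuse_file"
}
# Block every client with more than $threshold feed requests in $list.
function abuse_ips()
{
  list=${list:?}
  threshold=${threshold:?}
  echo "$list" | sort | uniq -c | while read -r count host
  do
    [ "$count" -le "$threshold" ] && continue
    # The log may hold either an IP address or a hostname.
    if echo "$host" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}$'; then
      add_entry "$host"
      continue
    fi
    # Hostname: record its first IPv4 address, or the name itself
    # when it does not resolve.
    ip=$(host "$host" | awk '/has address/ { print $4; exit }')
    add_entry "${ip:-$host}"
  done
}
threshold=50
# Client field (everything before the first " -") of today's feed requests.
list=$(grep "$today" "$log_file" | grep 'GET /kwiki/feed.rss' | sed -e 's/ -.*//')
[ -n "$list" ] && abuse_ips
# Rewrite the Kwiki metadata so the page shows as edited by FeedAbuse.
(
echo "edit_by: FeedAbuse"
echo "edit_ip: 127.0.0.1"
echo edit_time: $(date -u | sed -e 's/UTC //')
) >"$meta_file"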
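To try a threshold before blocking anyone, the counting step can be run on its own. The sketch below repeats the script's grep, sed, sort and uniq pipeline in a standalone function (feed_abusers is a name invented here) that reads access-log lines on standard input and only prints the offenders; it assumes the client address is the first field of each log line.

```shell
#!/bin/bash
# Print each client that requested the feed more than $1 times.
# Reads access-log lines on stdin; writes nothing to the abuse file.
feed_abusers()
{
  grep 'GET /kwiki/feed.rss' |   # keep feed requests only
  sed -e 's/ -.*//' |            # keep the client field
  sort | uniq -c |               # count requests per client
  while read -r count host; do
    [ "$count" -gt "$1" ] && echo "$host"
  done
  return 0
}
```

For a dry run against today's entries: grep "$(date '+%d/%b/%Y')" /path/to/access_log | feed_abusers 50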

And here's the crontab entry for the above script:

05 23 * * * /path/to/rss-abuse.sh

Here are the rewrite rules in my Apache configuration file:

# FeedAbuse IP addresses from a WikiPage
RewriteMap    feed-abuse  txt:/path/to/kwiki/database/IP-FeedAbuse
RewriteCond   ${feed-abuse:%{REMOTE_HOST}|NOT-FOUND} !=NOT-FOUND [OR]
RewriteCond   ${feed-abuse:%{REMOTE_ADDR}|NOT-FOUND} !=NOT-FOUND
RewriteRule   ^/kwiki/feed\.rss(.*) /rss-abuse.xml [PT,L]
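To check what those two RewriteCond lookups would return for a given client, the map file can be queried by hand. This is a rough sketch that assumes the plain-text RewriteMap layout of one whitespace-separated key and value per line, with # starting a comment; map_lookup is a hypothetical helper, not an Apache tool.

```shell
#!/bin/bash
# Look up key $2 in text map file $1.
# Prints the mapped value, or NOT-FOUND when the key is absent.
map_lookup()
{
  awk -v key="$2" '
    /^[[:space:]]*(#|$)/ { next }                 # skip comments and blank lines
    $1 == key { print $2; found = 1; exit }       # first matching key wins
    END { if (!found) print "NOT-FOUND" }
  ' "$1"
}
```

For example, map_lookup /path/to/kwiki/database/IP-FeedAbuse 192.0.2.10 prints "-" for a blocked address and NOT-FOUND otherwise, which is the comparison the conditions above make.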
