rss2mail

Previously I wrote lj2mail – a script which gets fresh posts from LiveJournal and emails them to a list of recipients. I tried to avoid sending the same items over and over again, but failed. The script was implemented with the help of the LiveJournal API (the LJ::Simple Perl module).

I got annoyed by that script repeating some items (a mess with publishing dates), so I wrote a different one. rss2mail simply fetches the RSS feed, parses it, and emails the items as individual messages to a list of recipients. Caching the RSS item link is much more reliable than LiveJournal’s publishing date. rss2mail is also much more flexible: it can be used with any RSS feed, not only LiveJournal’s. I have tried to make it as generic as possible. If it doesn’t work with some other feed, just check which fields of the RSS feed it uses.

rss2mail.pl
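The link-caching idea described above can be sketched as follows. This is a minimal illustration, not the script’s actual internals: the cache file name `.sent` and the helper `maybe_send` are assumptions, and the mail-sending step is elided.

```perl
#!/usr/bin/perl
# Sketch of deduplication by item link: remember every link we have
# already mailed in a cache file, and skip items whose link is known.
use strict;
use warnings;

my $cache_file = '.sent';    # assumed cache file name, one URL per line

# Load previously sent links into a hash for exact-string lookups.
my %sent;
if (open my $fh, '<', $cache_file) {
    chomp(my @links = <$fh>);
    $sent{$_} = 1 for @links;
    close $fh;
}

# Mail a feed item only if its link is new, then record the link.
sub maybe_send {
    my ($link) = @_;
    return 0 if exists $sent{$link};
    # ... build and send the email for this item here ...
    $sent{$link} = 1;
    open my $out, '>>', $cache_file or die "cannot append $cache_file: $!";
    print {$out} "$link\n";
    close $out;
    return 1;
}
```

The hash lookup compares literal strings, so it is safe for any URL the feed may contain.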

11 thoughts on “rss2mail”


  1. Seraphim,

    Yes, it is possible to use Sendmail instead of SMTP. Just change the line:

    my $mailer = Mail::Mailer->new('smtp',Server=>$smtp_server);

    to

    my $mailer = Mail::Mailer->new('sendmail');

    and it should work. Refer to the documentation of the Mail::Mailer Perl module if you run into problems.


  2. This is a great script. On my server I set it up as a cron job, but the script keeps sending email even though the URLs are in the .sent file. After receiving 6000 emails I looked at the .sent file and saw hundreds of duplicate entries. What am I doing wrong?


  3. Hello, just thought you might like to know I took your rss2mail script and converted it to an rss2ascii script.

    It reformats the HTML into a sort of plain-text report and sends its output to stdout, where it can be fed into the standard input of other programs (like inews or sendmail). I’m using it to read plain-text versions of RSS on a local news system.


    RSS plaintext converter


  4. Leonid, I know there haven’t been any comments on this in a while, but I was having the same problem as Seraphim -- it was sending mail for every entry on every run. The problem is with blogs that use a “?” in the post URL (e.g. WordPress when fancy permalinks haven’t been set up). Since you’re using grep with an unescaped string, the “?” in $url is treated as a regex metacharacter. At least I think that’s what the problem is. Either way, I changed @cached_urls to a hash (%cached_urls) using the URL as both the key and the value, so duplicates are not possible, and I simply changed the grep test to unless ( exists $cached_urls{$url} ). Everything seems to work fine now, since it’s testing literal scalars/strings.


    1. Hi Jason. Thanks for the comment. It’s been a while indeed since I wrote this. :)

      Can you post the fixed source code anywhere? GitHub/Gist or blog or something. I think it will be easier for all the people who are coming over to just have the full working source than to apply patches and fixes from all the comments. ;)

      Thanks.
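The bug and the fix Jason describes can be sketched as follows. The variable names follow his comment and the URL is illustrative; the original script’s code may differ in detail.

```perl
# A "?" in a post URL is a regex quantifier, so an unescaped pattern
# match against the cache can fail to match the URL itself.
use strict;
use warnings;

my $url         = 'http://example.com/?p=42';
my @cached_urls = ('http://example.com/?p=42');

# Buggy: "/?" in the interpolated pattern means "optional slash",
# so the cached URL does not match itself and the item is re-sent.
my $seen_buggy = grep { /$url/ } @cached_urls;    # 0: false negative

# Fixed: a hash keyed by the literal URL string, tested with exists.
my %cached_urls = map { $_ => $_ } @cached_urls;
my $seen_fixed  = exists $cached_urls{$url};      # true
```

An alternative one-line fix with the same effect would be `grep { $_ eq $url } @cached_urls`, or escaping the pattern with `\Q$url\E`; the hash has the added benefit of deduplicating the cache itself.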
