
Archived ⇒ Coat Hangers, Duct Tape and auto-posting RSS feeds as News


By request, I've made a tutorial on how I grab RSS feeds from other sites and stuff them into news articles.

Note well: Part of this process requires the CaRP Evolution commercial software. I am in no way affiliated with CaRP except that I use it. There may be a free way to do this, but this is how I've done it. There are also doubtless more elegant ways. This is what it is.

1. Installing CaRP Evolution

Unpack, upload and install per the directions in the included README.HTML. As recommended, put the carp directory outside of public_html (or whatever your web root is called). Be sure to delete the installer files after installation.

Now edit mysql_setup.php. Lines 23-26 are where your database connection info goes. Use the same database where your Dragonfly is installed. Upload this file and point your browser to it to create the CaRP tables. Be sure to delete it from your web space afterwards.

2. Configuring CaRP Evolution

Edit /carp/plugins/mysql.php. Lines 15-18 contain the database connection info. Change these to the real values for connecting to your database. Use the same database where your Dragonfly is installed.

3. First cron job

You could combine this into one cron job, but I have it running as two at staggered times.

In the first one, we tell CaRP to read the RSS feeds of your choice and stuff them into its own database tables.

<?php
require_once '/home/xxx/carp/carp.php'; //replace with full path to your carp.php file
CarpLoadPlugin('mysql.php');
CarpLoadPlugin('replacetext.php');

/*Database Configuration*/
CarpConf('cache-method','mysql');
CarpConf('mysql-connect',1);
CarpConf('mysql-database','xxx');
CarpConf('mysql-username','xxx');
CarpConf('mysql-password','xxx');
CarpConf('mysql-host','localhost');

/*RSS Feeds to get*/
CarpCacheShow('http://rss.cnn.com/rss/cnn_topstories.rss');
CarpCacheShow('http://newsrss.bbc.co.uk/rss/newsonline_world_edition/front_page/rss.xml');
?>

Edit this so it has the full path to your carp.php and your database connection info. The two CarpCacheShow lines at the end are examples; replace the URLs with the URLs of the feeds you want to get. You can have as many of those lines as you want.

Name this file something like rss.php and upload it outside of public_html.

To schedule this job in cPanel, click on Cron Jobs and your choice of Standard or Advanced. The command is going to look like this:

/usr/bin/php /home/xxx/rss.php >/dev/null

If the path to PHP on your host is not /usr/bin/php, change it to what it should be. The next part is the full path to the file you just created. You can remove " >/dev/null" if you have cron output email enabled and you want to check that your job is working. (With the redirect in place, the script's output goes to /dev/null instead of being mailed, even if cron output email is enabled.)

Schedule it as often as you want. The host might not like it if it's once a minute, though. Hourly seems to be sufficient for most needs.

At this point, you should have CaRP grabbing the feeds you've specified, and inserting them into CaRP's own tables, which by default are called rssfeeds and rssitems. All that is left is to stick them in the queue for the News module.

4. Second cron job/submitting feed items to the News module

First, we need a way to determine whether a given row in rssitems has already been submitted to the News module. So go ahead and run this in phpMyAdmin:

ALTER TABLE `rssitems` ADD `news_submitted` TINYINT( 4 ) NOT NULL DEFAULT '0';

This assumes, of course, that you kept the default name for CaRP's rssitems table. The statement adds a field which we will use to keep track of whether the item has already been submitted as a News article or not. Without it, you'll get the same stuff posted over and over again.

Now for the file to cron:
<?php
$dbuser = "";
$dbpwd = "";
$dbname = "";
$dbhost = "localhost";

mysql_connect($dbhost,$dbuser,$dbpwd);
@mysql_select_db($dbname) or die( "Unable to select database");
$query = "update rssitems set descr = concat('News from <a href=\"http://cnn.com\" rel=\"nofollow\">CNN</a>!<br /><br />', descr, '<br /><br />Get the full story at CNN <a href=\"', url, '\" rel=\"nofollow\">here</a>!') where feed_id = '1' and news_submitted = '0'";
$result = $command;
mysql_query($query);
$query2 = "update rssitems set descr = concat('News from <a href=\"http://bbc.co.uk\" rel=\"nofollow\">BBC</a>!<br /><br />', descr, '<br /><br />Get the full story at BBC <a href=\"', url, '\" rel=\"nofollow\">here</a>!') where feed_id = '2' and news_submitted = '0'";
$result = $command;
mysql_query($query2);
$query3 = "insert into cms_queue (uid, uname, subject, story, timestamp, topic) select '7927','RSS', title, descr, posted, '7' from rssitems where news_submitted = '0';";
$result = $command;
mysql_query($query3);
$query4 = "update rssitems set news_submitted = '1'";
$result = $command;
mysql_query($query4);
mysql_close();
?>

All right, this is ugly but it works. Lines 2-5 are for your database connection info. If you've made it this far, you're an old hand at that by now.

Lines 9 and 12 rewrite the descr field in the CaRP table, making it a little friendlier. They also add rel="nofollow" to the links, to discourage spiders from following them.

feed_id is hardcoded; replace it with whatever the real corresponding feed_id is from your rssitems table.

You can add more feeds, just keep changing the $query_ number, or use freeresult. I just never worried about it too much since it's 20 lines of code that run once an hour.
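
As one possible way to avoid numbering the query variables by hand, the per-feed UPDATE statements could be generated from a list. This is only a sketch: the function name build_feed_update_sql and the $feeds array are made up for illustration, and it assumes the labels and URLs are constants you type yourself, not user input. It only builds the SQL strings; running them is left to whatever database calls you already use.

```php
<?php
// Hypothetical helper (not part of CaRP or Dragonfly): builds the UPDATE
// statement that wraps one feed's descr field in the intro and "full story" links.
function build_feed_update_sql($feedId, $label, $homeUrl)
{
    $feedId  = (int) $feedId;
    // Escape for HTML first, then guard against stray backslashes in the SQL literal.
    $label   = addslashes(htmlspecialchars($label, ENT_QUOTES));
    $homeUrl = addslashes(htmlspecialchars($homeUrl, ENT_QUOTES));

    return "UPDATE rssitems SET descr = CONCAT("
        . "'News from <a href=\"{$homeUrl}\" rel=\"nofollow\">{$label}</a>!<br /><br />', "
        . "descr, "
        . "'<br /><br />Get the full story at {$label} <a href=\"', url, '\" rel=\"nofollow\">here</a>!') "
        . "WHERE feed_id = {$feedId} AND news_submitted = 0";
}

// feed_id => (label, home page). Replace with the feed_id values from your own tables.
$feeds = array(
    1 => array('CNN', 'http://cnn.com'),
    2 => array('BBC', 'http://bbc.co.uk'),
);

$statements = array();
foreach ($feeds as $feedId => $feed) {
    $statements[] = build_feed_update_sql($feedId, $feed[0], $feed[1]);
}
```

Adding a feed then means adding one line to $feeds rather than another numbered query.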

$query3 = "insert into cms_queue (uid, uname, subject, story, timestamp, topic) select '7927','RSS', title, descr, posted, '7' from rssitems where news_submitted = '0';";

See where it says 'RSS' there? That is the name that the news articles will appear to be posted by. You can change this to anything you want, like your own username. If you keep it as RSS or change it to a different dummy account, bear in mind that the News module will turn it into a link to that user's profile. If you use a dummy account, it's a good idea to actually add that user to Dragonfly, and put something in the profile like "Hi, I'm only a script that automatically adds news! I never check my Private Messages!"

7927 is the userid of the user RSS on my site. Replace that with the userid of whatever user you decide to show your articles posted by.

7 is the topicid that I'm using. Replace that with the topicid you'd rather use.

If you have a different table prefix for Dragonfly than cms, replace that as well.
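
The four values discussed above (user id, poster name, topic id and table prefix) could also be kept as settings at the top of the file, so the query itself never needs editing. A minimal sketch, assuming the same cms_queue columns as in the script; build_queue_insert_sql and the variable names are hypothetical, and the function only returns the SQL string.

```php
<?php
// Hypothetical settings; replace with your own values.
$prefix  = 'cms';   // Dragonfly table prefix
$uid     = 7927;    // user id of the account the articles are posted by
$uname   = 'RSS';   // name shown as the poster
$topicId = 7;       // News topic id

function build_queue_insert_sql($prefix, $uid, $uname, $topicId)
{
    // The prefix becomes part of a table name, so allow only safe characters.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $prefix)) {
        return false;
    }

    return sprintf(
        "INSERT INTO %s_queue (uid, uname, subject, story, timestamp, topic) "
        . "SELECT %d, '%s', title, descr, posted, %d FROM rssitems WHERE news_submitted = 0",
        $prefix,
        (int) $uid,
        addslashes($uname),
        (int) $topicId
    );
}

$queueSql = build_queue_insert_sql($prefix, $uid, $uname, $topicId);
```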

Name this file something like rssnews.php, upload it somewhere outside the public_html directory, and schedule it as a cron job using the same procedure as in Step 3 above. Do not schedule it at the exact same time as your other script, to avoid conflicts, duplicates, and scripts passing like ships in the night. If you scheduled rss.php at the top of the hour, schedule rssnews.php at five minutes after.
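
One thing to be aware of: the script above runs every statement whether or not the previous one succeeded, so a failed insert is still followed by the update that flags the items as submitted. Also, the mysql_* functions it uses were removed in PHP 7.0. The following is a hedged sketch of one way to stop at the first failure. run_in_order is a made-up name, and the mysqli usage in the trailing comment is an assumption about what your host provides. Closures need PHP 5.3 or later.

```php
<?php
// Runs each statement in order and stops at the first failure, so a later
// "mark as submitted" UPDATE never runs if an earlier statement failed.
// $execute is any callable that takes an SQL string and returns false on failure.
// Returns the number of statements that succeeded.
function run_in_order(array $statements, $execute)
{
    $done = 0;
    foreach ($statements as $sql) {
        if (call_user_func($execute, $sql) === false) {
            break;
        }
        $done++;
    }
    return $done;
}

// Possible real use, assuming the mysqli extension is available:
// $db = new mysqli('localhost', 'xxx', 'xxx', 'xxx');
// run_in_order(
//     array($query, $query2, $query3, $query4),
//     function ($sql) use ($db) { return $db->query($sql); }
// );
```

Passing the executor in as a callable keeps the function independent of the database library, so the same loop works with whichever extension your PHP version has.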

5. But this adds everything to the News submission queue! I don't care what is in the RSS feeds, I just want them to magically appear in News without having to lift a finger.

No problemo mi amigo.

In the above file, replace:
insert into cms_queue (uid, uname, subject, story, timestamp, topic) select '7927','RSS', title, descr, posted, '7' from rssitems where news_submitted = '0'

With:
insert into cms_stories (informant, title, hometext, time, topic) select 'RSS', title, descr, posted, '7' from rssitems where news_submitted = '0'

6. But what if I have to change hosts? Won't this cause a huge pain?

Provided you:
a) Remembered to download from old host/upload to new host all files, including ones not in public_html
b) Exported from old host/imported to new host your complete database including CaRP's tables

Then the only thing you have to do is reschedule your cron jobs on the new host, which is about 30-60 seconds of your time.

Any questions/problems let me know, I'm subscribed to this thread.

Diagon Alley - Top Design

Server specs (Server OS / Apache / MySQL / PHP / DragonflyCMS):
Linux/1.3.37/4.1.21-standard/4.4.4/9.1.1

Last edited by sarah on Mon Mar 26, 2007 8:33 am; edited 1 time in total


Thank you very much for this great tutorial sarah!
I'm already trying to get CaRP running 😉


I think you must load the plugin /carp/plugins/newerthan.php though.

Did you mean
CarpLoadPlugin('newerthan.php');
instead of
CarpLoadPlugin('replacetext.php');?

Server specs (Server OS / Apache / MySQL / PHP / DragonflyCMS):
Linux/Apache/MySQL 5.0.45/PHP 5.2.2/DF 9.2.1


I think you must load the plugin /carp/plugins/newerthan.php though.


You know something, I just looked and apparently at no time am I using newerthan.php.

When I wrote this, I ran a file compare to see what files I edited from the package. At some point I thought it must have been a good idea to edit that, but it doesn't seem to be called by any files that I'm using.

Sorry, I will edit the original post. I spotted one more error too in the second cron file.



I'm working on this as well.

Server specs (Server OS / Apache / MySQL / PHP / DragonflyCMS):
Unix / 2.0.46 (Red Hat) / 0.9.7a / 4.1.9-standard / 4.3.2 / 9.0.6.1


You mean you're making your own script to do this without the use of a third-party one like CaRP? Sweet, dude.



I'm not sure what $result = $command; does.

Also if you want to insert directly into the cms_stories table, you should fill in more fields...

$query3a = "insert into cms_stories (informant, title, hometext, time, topic, catid, comments, counter, ihome, alanguage, acomm, haspoll, poll_id, score, ratings, display_order) select 'Name', title, descr, posted, '6', '16', '0','0','1','english','1','0','0','0','0','0' from rssitems where feed_id = '6' and news_submitted = '0'";
//$result = $command;
mysql_query($query3a);



Well, I've configured the script and it works fine, I had to fix some code in rss.php though.
First problem was the strange output of accents because my site uses several languages. I changed the encoding from ISO-8859-1 to UTF-8 like this:
<?php
require_once '/home/xxx/carp/carp.php'; //replace with full path to your carp.php file
CarpLoadPlugin('mysql.php');
CarpLoadPlugin('replacetext.php');

/*Database Configuration*/
CarpConf('cache-method','mysql');
CarpConf('mysql-connect',1);
CarpConf('mysql-database','xxx');
CarpConf('mysql-username','xxx');
CarpConf('mysql-password','xxx');
CarpConf('mysql-host','localhost');

/*Text Configuration*/
CarpConf('encodingin','ISO-8859-1');
CarpConf('encodingout','UTF-8');

/*RSS Feeds to get*/
CarpCacheShow('http://rss.cnn.com/rss/cnn_topstories.rss');
CarpCacheShow('http://newsrss.bbc.co.uk/rss/newsonline_world_edition/front_page/rss.xml');
?>

And then "news_submitted" must be used (instead of "news_posted") in rssnews.php...
...and I changed "xy" to 'xy' (double quotes to single quotes) for $dbuser, $dbpwd, $dbname and $dbhost, because some characters in my password were being interpreted as code inside the double quotes.

<?php
$dbuser = '';
$dbpwd = '';
$dbname = '';
$dbhost = 'localhost';

mysql_connect($dbhost,$dbuser,$dbpwd);
@mysql_select_db($dbname) or die( "Unable to select database");
$query = "update rssitems set descr = concat('News from <a href=\"http://cnn.com\" rel=\"nofollow\">CNN</a>!<br /><br />', descr, '<br /><br />Get the full story at CNN <a href=\"', url, '\" rel=\"nofollow\">here</a>!') where feed_id = '1' and news_submitted = '0'";
$result = $command;
mysql_query($query);
$query2 = "update rssitems set descr = concat('News from <a href=\"http://bbc.co.uk\" rel=\"nofollow\">BBC</a>!<br /><br />', descr, '<br /><br />Get the full story at BBC <a href=\"', url, '\" rel=\"nofollow\">here</a>!') where feed_id = '2' and news_submitted = '0'";
$result = $command;
mysql_query($query2);
$query3 = "insert into cms_queue (uid, uname, subject, story, timestamp, topic) select '7927','RSS', title, descr, posted, '7' from rssitems where news_submitted = '0'";
$result = $command;
mysql_query($query3);
$query4 = "update rssitems set news_submitted = '1'";
$result = $command;
mysql_query($query4);
mysql_close()
?>
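
The quoting problem mentioned above can be reproduced in isolation. In PHP, a double-quoted string replaces anything that looks like a variable with that variable's value, while a single-quoted string is kept as typed. A small illustration with a made-up value (the variable $word and the sample text are only for demonstration, not a real password):

```php
<?php
$word = 'XYZ';              // a variable that happens to exist in the script

$double = "pa$word!";       // double quotes: $word is replaced by its value
$single = 'pa$word!';       // single quotes: the text is kept exactly as typed

echo $double, "\n";         // prints paXYZ!
echo $single, "\n";         // prints pa$word!
```

So a password containing a dollar sign is safer in single quotes.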


Last edited by albiurs on Tue Mar 27, 2007 7:14 pm; edited 2 times in total


sarah wrote
You mean you're making your own script to do this without the use of a third-party one like CaRP? Sweet, dude.


Sort of. I have it working with email. If you send me an email with certain words in the subject it will automatically insert into my news queue.

Also I have certain email lists that I send everything into the queue for just by watching for the from address.

Last night I implemented your version as well, but I have that info going directly into the stories table. A few variable typos as mentioned above, but other than that it works great.



I forgot to mention it before. I changed the field subject in cms_queue from varchar(100) to varchar(255), and the field title in cms_stories from varchar(80) to varchar(255), because the field title in rssitems uses varchar(255) as well and some RSS titles were too long to fit into the news queue.
Does anyone know if the changes will affect other parts of DF or any modules?



A few variable typos as mentioned above, but other than that it works great.


Oops. 🙂 And good.

Does anyone know if the changes will affect other parts of DF or any modules?


It should be fine, BUT: it looks like the installer will alter them back to the length it wants them to have during an upgrade. (I think.) If you just remember to remind yourself of this before upgrading, you can export that data, then upgrade, then set the lengths back how you want them, then update those fields back how they were before. (I am certain that of course you make a backup before upgrading, but it'll be more convenient to also export that particular data in isolation rather than wade through a dump of the whole site.)

Edit: or even easier, hack the installer, it's install/sql/tables/news.php and on the 9.1.2.1 one it was lines 78 and 127.



Sort of. I have it working with email. If you send me an email with certain words in the subject it will automatically insert into my news queue.


How about emailing images into Coppermine? That would be soooo cool...

Pro_News CM™ - Content Management for Dragonfly CMS™

Server specs (Server OS / Apache / MySQL / PHP / DragonflyCMS):
Linux / 1.3.39 - 2.4.9 / 5.5.42 - 5.6.16 / 5.4.37 - 5.5.11 / 9.4


sarah wrote
or even easier, hack the installer, it's install/sql/tables/news.php and on the 9.1.2.1 one it was lines 78 and 127.

Thanks for important thoughts sarah! I already hacked it.



Thanks for important thoughts sarah! I already hacked it.


Make a note to do it again when you upgrade.



Yes, I will do so 😉



I noticed just now that the title will be truncated while posting the story, even though the whole title appears correctly in the submissions queue and the title field in cms_stories is varchar(255)...
Any idea where the related code is to prevent the title being truncated?


All times are UTC