Plagiarism Guard’s footer has a ‘Blog Feed’ section which lists the five most recent posts from its WordPress-powered blog. This works by parsing the blog’s RSS feed. However, I found that simply parsing the feed and finding the latest 5 posts added anything from 500 ms to 1000 ms onto a page load, which was too much. To overcome this I could have used Django’s caching framework (which supports template-level caching). However, since I potentially needed to access the blog post data elsewhere in my app, I decided to periodically save the blog post data to the database and read it from there instead. This approach has two benefits:
- The process which updates the blog post data isn’t dependent on a user loading a page (as mentioned below, it’s handled via a Django management command triggered via a cron job)
- Reading the top N results from the database is a lot quicker (<5 ms)
The code I used to implement this feature is listed below.
This article was written in late 2014, and whilst a lot of the commands and advice in this article are still relevant, a couple of links might point to an old ‘end of life’ version of software.
In models.py I added the following:
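A minimal sketch of such a model is shown here. The model name BlogPost and the title and link fields are assumptions chosen for illustration, not necessarily the names used on Plagiarism Guard:

```python
from django.db import models


class BlogPost(models.Model):
    """One entry from the blog's RSS feed (hypothetical model name)."""

    # Assumed fields: just enough to render a linked title in the footer.
    title = models.CharField(max_length=255)
    link = models.URLField(max_length=500)

    class Meta:
        # Assumes rows are inserted in feed order (newest first),
        # so ordering by id returns the most recent post first.
        ordering = ["id"]

    def __str__(self):
        return self.title
```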
This could obviously be extended to save more data e.g. the blog post/description, but I didn’t personally need this extra data.
Django Custom Management Command
So that I could make the parsing of the blog feed (and saving to the DB) flexible, I went down the route of adding a custom management command within my Django app. I added a ‘management’ folder within my app, and a blank __init__.py file within this folder. Then I added a ‘commands’ subfolder and another blank __init__.py file within this subfolder. The actual feed parsing uses Feedparser, which can be installed via pip as normal:
pip install feedparser
Finally I added recent_blog_posts.py within ‘commands’:
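A sketch of what such a command might look like on a recent Django version (add_arguments needs Django 1.8 or later). The feed URL is a placeholder, and the app name ‘myapp’ and the BlogPost model with title and link fields are assumptions:

```python
import feedparser
from django.core.management.base import BaseCommand
from django.db import transaction

from myapp.models import BlogPost  # hypothetical app and model names

FEED_URL = "https://example.com/blog/feed/"  # placeholder: use your blog's feed URL


class Command(BaseCommand):
    help = "Save the N most recent blog posts from the RSS feed to the database."

    def add_arguments(self, parser):
        # Positional argument, e.g. `manage.py recent_blog_posts 5`.
        parser.add_argument("count", type=int, help="Number of posts to keep")

    def handle(self, *args, **options):
        feed = feedparser.parse(FEED_URL)
        entries = feed.entries[: options["count"]]
        if not entries:
            # Feed unreachable or empty: keep whatever is already stored.
            self.stderr.write("No entries found; leaving existing posts untouched.")
            return

        # Replace the stored posts in one transaction so the footer never
        # sees a half-updated table.
        with transaction.atomic():
            BlogPost.objects.all().delete()
            BlogPost.objects.bulk_create(
                [BlogPost(title=entry.title, link=entry.link) for entry in entries]
            )
        self.stdout.write("Saved %d blog posts." % len(entries))
```

Deleting and re-inserting the rows keeps the table small and in feed order. Updating rows in place would work just as well if other tables needed to reference the posts.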
Change the URL as needed, but the code should be fairly self-explanatory: it uses Feedparser to parse the URL, then iterates over the top N results (N being the argument passed to recent_blog_posts), saving them to the database model we added earlier.
Showing the ‘Blog Feed’
I added a custom template tag so that my template(s) simply had to reference the tag to retrieve the blog data (instead of, e.g., adding a global context or template variable). Inside the app I added a ‘templatetags’ folder containing a blank __init__.py file and a custom_tags.py file, which contains the following:
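A minimal sketch of such a tag, assuming Django 1.9 or later (where simple_tag results can be stored with ‘as’). The tag name get_recent_blog_posts, the app name ‘myapp’ and the BlogPost model are illustrative assumptions:

```python
from django import template

from myapp.models import BlogPost  # hypothetical app and model names

register = template.Library()


@register.simple_tag
def get_recent_blog_posts(count=5):
    """Return up to `count` stored blog posts, newest first.

    Assumes rows were saved in feed order, so the lowest id is the newest post.
    """
    return list(BlogPost.objects.order_by("id")[:count])
```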
Then within the template where the blog feed should be displayed, you simply need to load the custom tags at the beginning:
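Given that the tag module is named custom_tags.py, the load line would be:

```django
{% load custom_tags %}
```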
And then the following code displays the blog feed:
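A sketch of the display markup, assuming a tag named get_recent_blog_posts that returns objects with title and link attributes (both names are illustrative) and Django 1.9 or later for the ‘as’ syntax:

```django
{% get_recent_blog_posts 5 as blog_posts %}
<ul>
  {% for post in blog_posts %}
    <li><a href="{{ post.link }}">{{ post.title }}</a></li>
  {% endfor %}
</ul>
```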
Scheduling the Custom Management Command
To schedule the recent_blog_posts Django management command (which refreshes the recent blog data), I added the following cron job, which runs every 30 minutes:
*/30 * * * * /usr/bin/python3 /home/plagiarismguard/manage.py recent_blog_posts 5
This retrieves the first 5 results, but this can be tweaked as needed. Since the template tag also supports fetching a certain number of results, there’s no specific reason for recent_blog_posts to also have this restriction. I knew I would only need the top N results so it made sense to add this argument, but you can easily change recent_blog_posts.py to subclass NoArgsCommand instead, and then the crontab argument can be omitted. Note that NoArgsCommand was removed in Django 1.10; on newer versions, subclass BaseCommand and simply don’t define the argument.