Introducing Total ReCal – the Web Application

Total ReCal interface

Building on our experience from developing the Common Web Design (CWD), we present CWDx 1.0, designed specifically for awesome web-based applications.

CWDx is 100% HTML5 and CSS3, includes extensive WAI-ARIA mark-up for accessibility, is Ajax-driven and is wonderfully minimalistic.

The application is built using the CodeIgniter 2.0 PHP framework, and uses MongoDB as the database. The front end uses jQuery, jQuery UI and a number of functions from PHP.js.

The calendar itself is an amazing jQuery plugin called FullCalendar, which emulates a lot of Google Calendar’s functionality.

Currently, students can view their academic timetable and, if they have any, their Blackboard assignment deadlines. Both staff and students also have their own calendar which they can write to. Everyone can subscribe to their calendar on a mobile device or in another application such as Google Calendar.

In the next few weeks users will have access to their library book return dates and will be able to create new calendars and share events. We’re also currently developing a CalDAV system that will allow true event syncing across multiple devices and applications.

We need faster interwebz!

Now that we’ve got live data being produced from Blackboard and CEMIS, we can start writing scheduled jobs to insert this data into the Total ReCal database. In the case of CEMIS, however, we’re having a few problems.

Every day a CSV file is created of all of the timetable events for each student. This file is (currently) 157 MB in size and has approximately 1.7 million rows. In his last post, Nick explained that we have now developed an events API for our Nucleus metadata service, which is going to be the repository for all of this time and space data. Currently we’re able to parse each row in this CSV file and insert it into Nucleus over the API at about 0.9 seconds per row. This isn’t good enough. As my tweet the other day shows, we need to significantly speed this up:

So our timetable import into #totalrecal (1.7m records) will currently take 19 days. Our target is 90 mins. Hmmm (Tue Oct 12 17:24:31 via web)
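To make the size of the gap concrete, here is a rough back-of-the-envelope sketch in Python (used purely for illustration; our actual script is PHP). It uses only the approximate figures quoted above, and the function names are made up for this example. At exactly 0.9 seconds per row the total comes out a little under 18 days; the 19 days in the tweet would correspond to a slightly slower rate of just under 1 second per row.

```python
# Approximate figures quoted in this post.
ROWS = 1_700_000         # rows in the daily CSV file
SECONDS_PER_ROW = 0.9    # current insert rate over the API
TARGET_MINUTES = 90      # target for the whole import


def import_days(rows, seconds_per_row):
    """Total import time in days at a fixed per-row cost."""
    return rows * seconds_per_row / 86_400


def required_seconds_per_row(rows, target_minutes):
    """Per-row time budget needed to finish within the target."""
    return target_minutes * 60 / rows


current_days = import_days(ROWS, SECONDS_PER_ROW)
budget = required_seconds_per_row(ROWS, TARGET_MINUTES)
speedup = SECONDS_PER_ROW / budget

print(f"Current rate: about {current_days:.1f} days")
print(f"Budget per row: about {budget * 1000:.1f} ms")
print(f"Speed-up needed: roughly {speedup:.0f}x")
```

In other words, hitting the 90 minute target means getting the average cost per row down from most of a second to a few milliseconds.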

At the moment we’re simply streaming data out of the CSV file line by line (using PHP’s fgets function) and then sending it to Nucleus over cURL. We have two main problems. The first is that the CSV file is generated one student at a time, so the same event appears once for every student who attends it. Ideally the file needs to be re-ordered to group rows by the unique event ID, so that we can send each event once with all of its associated students and cut the number of calls to Nucleus.

Our second problem is that parsing and building large arrays results in high memory usage, and currently our server only has 2 GB of memory to play with. We’ve capped PHP at 1 GB for now; however, that is going to affect Apache performance and other processes running on the server. We don’t want to just stick more memory into the machine, because that isn’t really going to encourage us to fine-tune our code, so that isn’t an option at the moment.
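The grouping idea can be sketched roughly as follows. This is Python for illustration rather than our PHP script, and the column names (student_id, event_id, title) are assumptions invented for the example, not the real CEMIS export format. The file is still read one row at a time, but each event’s details are stored only once, along with a list of student IDs.

```python
import csv
import io


def group_students_by_event(lines):
    """Stream CSV rows and collect the student IDs for each event ID.

    Column names here are hypothetical; the real export may differ.
    """
    events = {}
    for row in csv.DictReader(lines):
        event_id = row["event_id"]
        student_id = row.pop("student_id")
        if event_id not in events:
            # First time we see this event: keep its details once.
            row["students"] = []
            events[event_id] = row
        events[event_id]["students"].append(student_id)
    return events


# Tiny in-memory sample standing in for the real file.
sample = io.StringIO(
    "student_id,event_id,title\n"
    "s1,e1,Maths lecture\n"
    "s1,e2,Physics lab\n"
    "s2,e1,Maths lecture\n"
)
grouped = group_students_by_event(sample)

for event_id, event in grouped.items():
    print(event_id, event["title"], event["students"])
```

Note that this still holds every distinct event in memory at once. If that turned out to be too much, sorting the file by event ID on disk first would allow each event to be sent as soon as its last row has been read.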

Over the next few days we’re going to explore a number of options. The first is altering the current script to send batched data using asynchronous cURL requests. The second is re-writing that script in a lower-level language, although that is going to take a bit of time as one of us learns a new language. Both should hopefully result in a significantly shorter execution time.
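As a rough illustration of the first option, the sketch below splits events into fixed-size batches and sends several batches concurrently. It is written in Python with a thread pool rather than PHP’s asynchronous cURL, and send_batch is a hypothetical stand-in that only counts items; it assumes the API could accept many events in one request, which is not something we have built yet.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def send_batch(batch):
    """Hypothetical stand-in for one HTTP request carrying a whole batch.

    A real version would POST the batch to the events API; this one
    just reports how many items it was given.
    """
    return len(batch)


def import_events(events, batch_size=500, workers=4):
    """Send batches concurrently and return the number of events sent."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map submits every batch up front, so all batches
        # are built before the first result is read.
        return sum(pool.map(send_batch, batches(events, batch_size)))


sent = import_events(range(1050), batch_size=500)
print(sent, "events sent")
```

The batch size and number of workers here are arbitrary; the right values would depend on how much load the API and the server’s memory can tolerate.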

I’ll write another post soon that explains our final solution.