Updates to CodeIgniter MongoDB library

May 5, 2011 (updated August 29, 2011) Alex Bilbie Code, CodeIgniter, MongoDB

This library can now be found at https://github.com/alexbilbie/codeigniter-mongodb-library and has been updated with numerous bug fixes since this post was written. If you find a new bug, please add it to the issue tracker. Thanks!

I’ve spent some time this evening updating my CodeIgniter MongoDB library. You can get the latest release (4.0.1 at the time of writing) at https://bitbucket.org/alexbilbie/codeigniter-mongo-library/

hg clone https://bitbucket.org/alexbilbie/codeigniter-mongo-library

Or, if you’re one of the cool kids and using Sparks, run:

php tools/spark install -v4.0.1 mongodb

So what’s new?

You can now pass a Mongo ID string to the where function and it will automatically be converted to the correct MongoId object type. You can also pass a field and a value to where instead of an array. Thanks to Phil Sturgeon for this.

$this->mongo_db->where('_id', 'ced141265b96c037a3cab9dee0f3b4fa')->get('post');

I’ve also added a lot of new update functions which should make expressing updates much easier. Now you can write queries like:

$this->mongo_db
    ->where('_id', 'ced141265b96c037a3cab9dee0f3b4fa')
    ->set('title', 'My new blog post')
    ->inc('comment_count', 1)
    ->push('comments', array('id' => 1, 'name' => 'Alex', 'text' => 'Hello, world!'))
    ->update('post');

The new functions you can use are:

inc – increment the (integer) value of a field
dec – decrement the (integer) value of a field
set – sets the field to a new value
unset – unsets a field
push – pushes a new element into an array
pop – pops the last element from an array
pull – removes all occurrences of a value from an array field
rename_field – renames a field (the value remains intact)
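
To make the chaining idea concrete, here is a minimal sketch of how calls like these could accumulate into a single MongoDB update document. The UpdateSketch class is hypothetical and is not the library's code; the only thing taken as given is MongoDB's own operator names ($set, $inc, $push). It also assumes PHP 5.4 or later for the (new Foo())->method() syntax.

```php
<?php
// Hypothetical stand-in, NOT the library's implementation: it only
// collects the MongoDB update operators a fluent chain could map to.
class UpdateSketch
{
    private $updates = array();

    public function set($field, $value)
    {
        $this->updates['$set'][$field] = $value;
        return $this; // returning $this is what makes chaining possible
    }

    public function inc($field, $by = 1)
    {
        $this->updates['$inc'][$field] = $by;
        return $this;
    }

    public function dec($field, $by = 1)
    {
        // MongoDB has no $dec operator; a decrement is a negative $inc.
        $this->updates['$inc'][$field] = -$by;
        return $this;
    }

    public function push($field, $value)
    {
        $this->updates['$push'][$field] = $value;
        return $this;
    }

    public function toArray()
    {
        return $this->updates;
    }
}

$update = (new UpdateSketch())
    ->set('title', 'My new blog post')
    ->inc('comment_count', 1)
    ->push('comments', array('id' => 1, 'name' => 'Alex', 'text' => 'Hello, world!'))
    ->toArray();

print_r($update);
```

Because every call writes into the same array, one round trip to the database can carry all three changes, which is presumably why a chained update ends in a single update('post') call.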

There are a few missing functions that I couldn’t get to work this evening but I’ll add them shortly:

push_all – appends each value (where value is an array) to field
pull_all – removes all occurrences of value (where value is an array) in field
bit – does a bitwise update of a field
add_to_set – adds value to the array only if it’s not already in the array (when field is an existing array); if field is not present, sets field to the array value

There are a number of other small enhancements, and I’ve updated the licence to the MIT License.

Coda syntax styles

January 21, 2011 Alex Bilbie Code Coda, syntax

A number of people have asked me in the past if they can have a copy of the syntax stylesheets I use in Coda. Well, I’ve put them up on my Bitbucket so anyone can use them. Included are stylesheets for CSS, PHP, HTML, JavaScript and SQL.

https://bitbucket.org/alexbilbie/coda-styles/src

I’ve found that a dark stylesheet with pastel highlighting is the most comfortable to look at for long periods of time.

To import into Coda, open Preferences > Colors > Import.

#totalrecal: We need faster interwebz!

October 14, 2010 Alex Bilbie Code, Work cURL, PHP, Total ReCal

Now that we’ve got live data being produced from Blackboard and CEMIS, we can start writing scheduled jobs to insert this data into the Total ReCal database. However, in the case of CEMIS, we’re having a few problems.

Every day a CSV file is created containing all of the timetable events for each student. This file is (currently) 157 MB in size and has approximately 1.7 million rows. In his last post, Nick explained that we have now developed an events API for our Nucleus metadata service, which is going to be the repository for all of this time space data. Currently we’re able to parse each row in this CSV file and insert it into Nucleus over the API at about 0.9s per row. This isn’t good enough. As my tweet the other day shows, we need to significantly speed this up:

So our timetable import into #totalrecal (1.7m records) will currently take 19 days. Our target is 90 mins. Hmmm
Alex Bilbie (@alexbilbie), Tue Oct 12 17:24:31 via web
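
As a rough sanity check on those numbers, the arithmetic can be sketched as below. The row count, per-row time and target come from this post; the derived figures are back-of-the-envelope only. At exactly 0.9s per row the import works out to about 17.7 days, the same ballpark as the 19 days in the tweet, and hitting 90 minutes needs roughly 315 rows per second, a speed-up in the region of 280 times.

```php
<?php
// Figures quoted in the post; everything below them is derived.
$rows          = 1700000; // approximate rows in the daily CSV
$secondsPerRow = 0.9;     // current insert time per row
$targetSeconds = 90 * 60; // the 90-minute target

$currentSeconds = $rows * $secondsPerRow;
$currentDays    = $currentSeconds / 86400;        // seconds in a day
$requiredRate   = $rows / $targetSeconds;         // rows per second needed
$speedup        = $currentSeconds / $targetSeconds;

printf("Current run time: %.1f days\n", $currentDays);
printf("Required rate:    %.0f rows/second\n", $requiredRate);
printf("Speed-up needed:  %.0fx\n", $speedup);
```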

At the moment we’re simply streaming data out of the CSV file line by line (using PHP’s fgets function) and then sending it to Nucleus over cURL. We have two main problems. The first is that the CSV file is generated one student at a time, so ideally it needs to be re-ordered to group rows by unique event ID; that would reduce the number of calls to Nucleus, because we could send each event with all of its associated students in a single request. The second is that parsing and building large arrays results in high memory usage, and our server currently has only 2 GB of memory to play with. We’ve capped PHP at 1 GB for now, but that is going to impact Apache and the other processes running on the server. We don’t want to just stick more memory into the machine, because that wouldn’t encourage us to fine-tune our code, so that isn’t an option at the moment.
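
One way to get the grouping without building large arrays is to sort the file by event ID first (with an external sort, for example) and then stream it, emitting one event at a time. The sketch below is an illustration under assumptions, not our actual import script: the two-column layout (student ID, event ID) is invented because the real CEMIS export format isn't described here, group_by_event is a hypothetical helper, and generators need PHP 5.5 or later.

```php
<?php
/**
 * Yields one array per event: event_id plus the list of student IDs.
 * ASSUMPTIONS: rows are "student_id,event_id" and the input is already
 * sorted by event ID, so only one event is held in memory at a time.
 */
function group_by_event($handle)
{
    $currentId = null;
    $students  = array();

    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($row === array(null)) {
            continue; // fgetcsv returns array(null) for a blank line
        }
        list($studentId, $eventId) = $row;

        // A change of event ID means the previous event is complete.
        if ($currentId !== null && $eventId !== $currentId) {
            yield array('event_id' => $currentId, 'students' => $students);
            $students = array();
        }
        $currentId  = $eventId;
        $students[] = $studentId;
    }

    if ($currentId !== null) {
        yield array('event_id' => $currentId, 'students' => $students);
    }
}

// Usage with a small in-memory sample standing in for the sorted file.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "s1,e1\ns2,e1\ns3,e2\n");
rewind($handle);

$events = iterator_to_array(group_by_event($handle), false);
fclose($handle);

print_r($events);
```

In a real run the generator would be consumed with foreach rather than iterator_to_array, so that each event could be sent to Nucleus and then discarded.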

Over the next few days we’re going to explore a number of options, including altering the current script to send batched data using asynchronous cURL requests, and then re-writing that script in a lower-level language. The second option is going to take a bit of time as one of us learns a new language. Both should hopefully result in a significantly shorter execution time.
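
For the first option, a minimal sketch of batching plus concurrent requests with PHP's curl_multi functions might look like the following. It is not a finished solution: build_batches and post_concurrently are hypothetical helpers, JSON is an assumed payload format, and the URL in the usage comment is a placeholder because the Nucleus endpoint isn't given in this post.

```php
<?php
// Split grouped events into JSON payloads of at most $size events each.
// JSON is an assumption; the format the API expects is not stated here.
function build_batches(array $events, $size)
{
    return array_map('json_encode', array_chunk($events, $size));
}

// POST every payload concurrently and return the response bodies.
function post_concurrently($url, array $payloads)
{
    $multi   = curl_multi_init();
    $handles = array();

    foreach ($payloads as $i => $payload) {
        $ch = curl_init($url);
        curl_setopt_array($ch, array(
            CURLOPT_POST           => true,
            CURLOPT_POSTFIELDS     => $payload,
            CURLOPT_HTTPHEADER     => array('Content-Type: application/json'),
            CURLOPT_RETURNTRANSFER => true,
        ));
        curl_multi_add_handle($multi, $ch);
        $handles[$i] = $ch;
    }

    // Drive all transfers until none are still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            curl_multi_select($multi, 1.0); // wait for socket activity
        }
    } while ($running > 0 && $status === CURLM_OK);

    $responses = array();
    foreach ($handles as $i => $ch) {
        $responses[$i] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $responses;
}

$batches = build_batches(array(
    array('event_id' => 'e1', 'students' => array('s1', 's2')),
    array('event_id' => 'e2', 'students' => array('s3')),
    array('event_id' => 'e3', 'students' => array('s4')),
), 2);

// Placeholder URL, for illustration only:
// $responses = post_concurrently('https://example.invalid/events', $batches);
```

In practice the number of simultaneous requests would need capping, since opening thousands of connections at once is likely to hurt the API server more than it helps.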

I’ll write another post soon that explains our final solution.