Indexes on collections that don't exist: one of the mysteries of Mongo
I've been working on a Drupal module which aggregates and reports on some data from Mongo. In order to keep from having to re-do all the aggregations, the aggregated data itself is written off to a Mongo collection which is then retrieved and displayed when the reports are requested. To make the retrieval more efficient, I need an index on the table storing the aggregated data.
In a traditional Drupal environment, this is a simple setup. In the module's .install file, I'd build up the table structure, with the necessary index, of the storage table in a hook_schema() implementation. Then just wire the schema hook into a hook_install() implementation and viola. The problem is that Mongo collections don't exist until you write data into them. So how do you define an index on a table that doesn't exist?
It turns out the answer is freakishly simple: Mongo is perfectly happy to set an index on a field which doesn't exist in a collection which doesn't exist; the collection springs into existence as soon as you declare an index on it. Coming from an RDBMS background, this blew my mind the first time I saw it happen.
The text below shows the call and response from creating a new database, listing the collections in the database (to demonstrate that there are none), building an index on a new collection, and verifying the index.
>use junkdb switched to db junkdb >show collections > db.fnord.ensureIndex({foo:1}) > show collections fnord system.indexes > db.fnord.getIndexes() [ { "name" : "_id_", "ns" : "junkdb.fnord", "key" : { "_id" : 1 }, "v" : 0 }, { "_id" : ObjectId("4ec86ed7ba553b94e5ae3dbc"), "ns" : "junkdb.fnord", "key" : { "foo" : 1 }, "name" : "foo_1", "v" : 0 } ] >
