I'm modelling a MongoDB database in MongoEngine for a web project. I want to store the data in a slightly unusual way to be able to efficiently query it later.
Our data in MongoDB looks something like this:
    // "outer"
    {
        "outer_data": "directors",
        "embed": {
            "some_md5_key": { "name": "P.T. Anderson" },
            "another_md5_key": { "name": "T. Malick" },
            ...
        }
    }
My first instinct was to model it like this in MongoEngine:
    class Inner(EmbeddedDocument):
        name = StringField()

    class Outer(Document):
        outer_data = StringField()
        embed = DictField(EmbeddedDocument(Inner))  # this isn't allowed but you get the point
In other words, what I essentially want is the same as storing an EmbeddedDocument in a ListField, but in a DictField with a dynamic key for each EmbeddedDocument.
Example that is allowed with a ListField for reference:
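    from mongoengine import Document, EmbeddedDocument, EmbeddedDocumentField, ListField, StringField

    class Inner(EmbeddedDocument):
        inner_id = StringField(unique=True)  # this replaces the dict keys
        name = StringField()

    class Outer(Document):
        outer_data = StringField()
        # ListField takes a field instance, so the embedded type is wrapped in EmbeddedDocumentField
        embed = ListField(EmbeddedDocumentField(Inner))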
I would prefer to have MongoEngine objects returned also for the nested "Inner" documents while still using a DictField + EmbeddedDocument (as dict "value"). How can I model this in MongoEngine? Is it even possible or do I have to naively place all data under a generic DictField?
I finally found the answer to my problem. The correct way to achieve this pattern is by making use of a MapField.
The corresponding model in MongoEngine looks like:
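    from mongoengine import Document, EmbeddedDocument, EmbeddedDocumentField, MapField, StringField

    class Inner(EmbeddedDocument):
        name = StringField()

    class Outer(Document):
        outer_data = StringField()
        # Maps arbitrary string keys to Inner documents
        embed = MapField(EmbeddedDocumentField(Inner))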
In MongoDB, all keys need to be strings, so there is no need to specify a "field type" for the keys in the MapField.