Relational databases are not the greatest storage destination for polymorphic data structures (stack overflow). Polymorphism allows you to have hierarchies of objects that perform similar actions. By defining an action once on a base class, you can reuse it in its subclasses, which makes code cleaner and easier to maintain. I experimented with a few different ways of subclassing and came away with an approach involving mixins that I really like so far.
Django supports model subclassing, so you can write code like:
from django.db import models

class Vehicle(models.Model):
    horsepower = models.FloatField()

    def rev_engine(self):
        ...

class Corvette(Vehicle):
    def impress_the_ladies(self):
        ...

class MackTruck(Vehicle):
    def blow_horn(self):
        ...
and create MackTrucks and Corvettes just fine,
>>> truck = MackTruck(horsepower=505.0)
>>> truck.save()
>>> vette = Corvette(horsepower=430.0)
>>> vette.save()
but if you want to pull them out as their superclass, Vehicle, and use polymorphism to call either impress_the_ladies() or blow_horn(), you are out of luck. By default, every object in a Vehicle queryset comes out as a plain Vehicle object:
>>> Vehicle.objects.all()
[<Vehicle: Vehicle object>, <Vehicle: Vehicle object>]
There are a few ways around this, none of which are that appealing to me.
1. One is the Django snippet InheritanceCastModel. You make your base class inherit from this class, and it does some magic that lets you call vehicle.cast() to get your Corvette or MackTruck object back.
What is this magic? InheritanceCastModel uses the ContentType class from django.contrib.contenttypes to keep track of class types in the database. When it stores your Corvette objects, it also records their type in a class look-up table, so that when you call vehicle.cast(), it knows which subclass to give you.
Sounds magical? Well, there are drawbacks. Using InheritanceCastModel can impose serious performance hits on your queries: it has to look up not only the objects but, each time you call .cast(), it must also make an additional query on the class look-up table. Additionally, filtering and searching on these class trees is sketchy at best, and many times you can’t use .filter() on the vehicle queryset.
2. Another way around this is to create a custom queryset manager (stack overflow). Custom managers can be powerful, but the same fundamental problem arises: you have to use the ContentType object to store class types in a look-up table.
Gross.
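Options 1 and 2 rest on the same mechanism: each row records which concrete class it belongs to, and a cast step resolves that record back to a class. Here is a framework-free sketch of that idea in plain Python. It is not the snippet's actual code; TYPE_TABLE, register and the real_type attribute are hypothetical names, and a dict stands in for the ContentType table.

```python
# Plain-Python stand-in for the "store the real type, cast later" idea.
# TYPE_TABLE plays the role of the class look-up table; all names are hypothetical.

TYPE_TABLE = {}  # type name -> concrete class


def register(cls):
    """Record a class so that rows can be cast back to it later."""
    TYPE_TABLE[cls.__name__] = cls
    return cls


@register
class Vehicle:
    def __init__(self, horsepower, real_type=None):
        self.horsepower = horsepower
        # Remember the concrete class name, as a saved row would.
        self.real_type = real_type or type(self).__name__

    def cast(self):
        """Return an instance of the recorded subclass (one extra look-up)."""
        cls = TYPE_TABLE[self.real_type]
        return cls(self.horsepower, real_type=self.real_type)


@register
class Corvette(Vehicle):
    def impress_the_ladies(self):
        return "vroom"


@register
class MackTruck(Vehicle):
    def blow_horn(self):
        return "honk"


# Simulate loading rows through the base class: everything comes back as Vehicle.
rows = [Vehicle(430.0, real_type="Corvette"), Vehicle(505.0, real_type="MackTruck")]

# Casting each row costs one look-up in TYPE_TABLE per object.
concrete = [row.cast() for row in rows]
```

The extra look-up inside cast() is the per-object cost described above; whether it becomes a real database query depends on how the look-up is implemented.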
3. Django also supports mixins and abstract base classes. The code above might be written:
class Vehicle(models.Model):
    horsepower = models.FloatField()

    def rev_engine(self):
        ...

    class Meta:
        abstract = True

class Corvette(Vehicle):
    def impress_the_ladies(self):
        ...

class MackTruck(Vehicle):
    def blow_horn(self):
        ...
Now, when Django creates the tables for each class, the abstract = True option on Vehicle tells it not to create a Vehicle table and instead to give every class that inherits from Vehicle those fields. This produces a macktruck table with a horsepower column and a corvette table with its own horsepower column (Django prefixes each table name with the app label).
I’m sure you can see the potential for very wide tables, but I think those tables are much easier to read and maintain than piecing together class look-up tables.
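To make the resulting layout concrete, here is a rough sketch using Python's built-in sqlite3 module. It approximates the tables abstract inheritance would produce rather than reproducing Django's exact DDL, and the "garage" app label in the table names is an assumption made for illustration.

```python
import sqlite3

# In-memory database; table names assume a hypothetical app label "garage".
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE garage_corvette (
        id INTEGER PRIMARY KEY,
        horsepower REAL NOT NULL
    );
    CREATE TABLE garage_macktruck (
        id INTEGER PRIMARY KEY,
        horsepower REAL NOT NULL
    );
    """
)
conn.execute("INSERT INTO garage_corvette (horsepower) VALUES (430.0)")
conn.execute("INSERT INTO garage_macktruck (horsepower) VALUES (505.0)")

# Each concrete class owns a full copy of the shared column; no vehicle table exists.
table_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Without a shared table, "all vehicles" has to be assembled with an explicit UNION.
all_vehicles = conn.execute(
    """
    SELECT 'corvette' AS kind, id, horsepower FROM garage_corvette
    UNION ALL
    SELECT 'macktruck' AS kind, id, horsepower FROM garage_macktruck
    ORDER BY kind
    """
).fetchall()
```

The trade-off is visible in the last query: each table is simple to read on its own, but nothing in the schema ties the two kinds of vehicle together.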
There are plenty of Django polymorphism solutions, so what have you used for your Django subclassing? Have you had good experiences with InheritanceCastModel?
April 12, 2010 at 2:57 pm |
I have not found any practical use for inheritance (in Django!).
In the application I’m currently building I want to list items per theme, where the items can be books, DVDs or TV documentaries. The problem I run into is that, while all of these items can be listed with an ‘Items.objects’-like query, in my templates they are all still plain items, which means it is impossible to show the authors for books and the director for documentaries.
What I will end up with is an InheritanceCastModel, without casting, but with overridden save definitions for each item type that simply set a media field for each type. In the Admin defs I hide fields as appropriate.
February 17, 2011 at 2:26 pm |
Nice post. One statement that is not quite correct:
“using InheritanceClassModel can impose serious performance hits on your queries because it has to look up not only the objects, but for each time you call .cast(), it must make an additional query on the class look-up table.”
If you look at the ContentTypes code, you’ll see that those queries are cached. The table look-up only happens for the first time when a specific class is called. If implemented properly, dynamic ContentType casting can be very efficient.
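The caching described in this comment can be illustrated with a small memoized look-up. This is only a sketch of the general technique, not the ContentTypes implementation; TypeLookup and its queries counter are hypothetical names used to make the number of simulated database hits visible.

```python
class TypeLookup:
    """Memoized type look-up that counts simulated database queries."""

    def __init__(self, table):
        self._table = table   # stand-in for the class look-up table
        self._cache = {}      # results already fetched
        self.queries = 0      # number of simulated database hits

    def get(self, type_id):
        if type_id not in self._cache:
            # Only a cache miss "hits the database".
            self.queries += 1
            self._cache[type_id] = self._table[type_id]
        return self._cache[type_id]


lookup = TypeLookup({1: "corvette", 2: "macktruck"})

# Five look-ups, but only two distinct types, so only two simulated queries.
resolved = [lookup.get(type_id) for type_id in [1, 2, 1, 1, 2]]
```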
April 14, 2011 at 10:13 am |
The downside of using the abstract model is that you can’t create a ForeignKey to it.
You also can’t have any derived classes of the abstract model if they have their own fields.
For example, I have a generic MetaData object that my Task object owns. I want to be able to have any kind of MetaData and use polymorphic functions on MetaData that operate on the MetaDataSubclass’ fields.
April 14, 2011 at 10:38 am |
David,
Yes, Abstract models are not without limitations, but they are very handy when sharing fields and methods across classes which inherit from them.
In the above example, there is no ‘Vehicle’ table in the database, so it makes sense that there are no ForeignKeys to it, even though it may be desirable.
Let’s say MackTruck had a mudflap=CharField() field. You are correct that a Vehicle instance cannot access the .mudflap attribute (not all vehicles have mudflaps), but that subclass can still use its own .mudflap attribute if instantiated as a MackTruck.
Django is awesome, but it can’t solve ALL problems.
[Edit: fixed indentation error in Vehicle class]
April 14, 2011 at 11:06 am |
Have you looked at django-polymorphic?
http://bserve.webhop.org/django_polymorphic/
It seems to do most of what I need (I’m experimenting with it now), since I need to M2M on a model that is polymorphic.
April 17, 2011 at 6:35 am |
@David: Great link. Thanks.
September 13, 2012 at 4:13 am |
And what if:
class Vehicle(models.Model):
    horsepower = models.FloatField()

    def rev_engine(self):
        ...

    class Meta:
        abstract = True

    def do_something(self):
        ...

class Corvette(Vehicle):
    def impress_the_ladies(self):
        ...

class MackTruck(Vehicle):
    def blow_horn(self):
        ...
and I want Vehicle.do_something() without knowing the child class?
September 13, 2012 at 8:25 am |
do_something() would be accessible from any of the subclasses.
September 13, 2012 at 8:37 am
Thank you for the answer, but that is not what I want.
I’m using multi-table inheritance, and I want this:
v = Vehicle.objects.get(pk=29)
v.do_something()
But I want to execute the do_something() of the object that has vehicle_ptr_id = 29. The object can be in the Corvette table OR in the MackTruck table. AFAIK Django doesn’t know where to look.
Thanks
September 13, 2012 at 8:56 am |
Luis, that is correct. Django would not be able to find the subclass’s version of do_something() if it was overridden in a subclass. One way to do it (though not a great one) is to put a ‘type’ field on Vehicle… but then we’re getting away from the inheritance model we started with.
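As a rough illustration of the ‘type’ field idea, here is a plain-Python sketch with no Django involved. The type attribute stands in for a discriminator column, and HANDLERS and the helper functions are hypothetical names; the point is only that behaviour is chosen from a stored value rather than from the Python class.

```python
from dataclasses import dataclass


def _impress(vehicle):
    return "impress_the_ladies"


def _blow_horn(vehicle):
    return "blow_horn"


def _generic(vehicle):
    return "rev_engine"


# Hypothetical mapping from the stored type value to subclass-specific behaviour.
HANDLERS = {"corvette": _impress, "macktruck": _blow_horn}


@dataclass
class Vehicle:
    horsepower: float
    type: str  # hypothetical discriminator column saved with every row

    def do_something(self):
        # Dispatch on the stored value, since the Python class is always Vehicle.
        return HANDLERS.get(self.type, _generic)(self)


fleet = [
    Vehicle(430.0, "corvette"),
    Vehicle(505.0, "macktruck"),
    Vehicle(90.0, "moped"),  # unknown type falls back to the generic action
]
actions = [vehicle.do_something() for vehicle in fleet]
```

This works, but the dispatch table now does by hand what inheritance was supposed to do, which is the drift away from the original model mentioned above.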
September 13, 2012 at 9:07 am |
I did something a little different (although I’m not happy with it), since my project only has two child classes:
child_class_list = get_all_classes_in_this_module()
for cl in child_class_list:
    try:
        exec("child = " + str(cl) + ".objects.select_for_update().get(vehicle_ptr_id=" + str(vehicle.id) + ")")
        logger.debug("Child found and ready to use.")
        return child
    except ObjectDoesNotExist:
        logger.debug("Object does not exist, moving on…")
        pass
This is likely to have performance issues.
September 13, 2012 at 9:34 am |
I would hesitate to use exec. Can you describe the case you are trying to accomplish in a little more detail? I think I’m lost with this example.
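For what it is worth, the same "try each child class" loop can be written without exec by iterating over the class objects themselves instead of building source strings. The sketch below is plain Python so it can run on its own: ObjectDoesNotExist and FakeManager are simplified stand-ins for Django's exception and model manager, and find_child is a hypothetical helper name.

```python
class ObjectDoesNotExist(Exception):
    """Stand-in for django.core.exceptions.ObjectDoesNotExist."""


class FakeManager:
    """Minimal stand-in for a model manager, backed by a dict of rows."""

    def __init__(self, model, rows):
        self.model = model
        self.rows = rows  # parent id -> field values

    def get(self, vehicle_ptr_id):
        try:
            return self.model(**self.rows[vehicle_ptr_id])
        except KeyError:
            raise ObjectDoesNotExist(vehicle_ptr_id) from None


class Corvette:
    def __init__(self, horsepower):
        self.horsepower = horsepower


class MackTruck:
    def __init__(self, horsepower):
        self.horsepower = horsepower


# Attach the fake managers; the ids and values are made up for the example.
Corvette.objects = FakeManager(Corvette, {29: {"horsepower": 430.0}})
MackTruck.objects = FakeManager(MackTruck, {30: {"horsepower": 505.0}})


def find_child(vehicle_id, child_classes=(Corvette, MackTruck)):
    """Try each child class in turn; no exec or string building required."""
    for cls in child_classes:
        try:
            return cls.objects.get(vehicle_ptr_id=vehicle_id)
        except ObjectDoesNotExist:
            continue
    return None
```

This still costs up to one query per child class, so it addresses the exec concern rather than the performance one.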
September 13, 2012 at 10:16 am
Well... I’m not allowed to talk about my real project. Let’s make an example with bank accounts.
class account(models.Model):
    name = models……

class accounttypeA(account):
    balance = models.float…..

    def addToBalance(self, value):
        self.balance += value

class accounttypeB(account):
    balance = models.float….

    def addToBalance(self, value):
        self.balance += value
Now, I want to add a value to an account type, but all I have is an account object, for instance acc = account.objects.get(pk=29). So, which class is the child of acc?
Django automatically creates an account_ptr_id field in accounttypeA and accounttypeB. So, my solution was:
child_class_list = ['accounttypeA', 'accounttypeB']
for cl in child_class_list:
    try:
        exec("child = " + str(cl) + ".objects.select_for_update().get(account_ptr_id=" + str(acc.id) + ")")
        logger.debug("Child found and ready to use.")
        return child
    except ObjectDoesNotExist:
        logger.debug("Object does not exist, moving on…")
I hope I have been clear in my example. Thanks
September 13, 2012 at 10:28 am |
Yes, that helps. Basically you are working too hard. You only need to write addToBalance and the balance field once, in the superclass. (I also made it a DecimalField, since it holds money, for those who might find this post and use some of it.)
----------
class account(models.Model):
    name = models.CharField(...)
    balance = models.DecimalField(...)  # want to be able to have $.10

    class Meta:
        abstract = True

    def addToBalance(self, value):
        self.balance += value

class accounttypeA(account):
    # Don't need balance or addToBalance here
    pass

class accounttypeB(account):
    # Don't need balance or addToBalance here
    pass
This will create two tables, _accounttypea and _accounttypeb but NOT _account
----------
Because the account model is abstract, both its fields and methods are inherited by its subclasses.
You would be able to say
from decimal import Decimal
gazillion = Decimal("100000000000.10")  # a Decimal, to match the DecimalField
accounttypeB.objects.all()[0].addToBalance(gazillion)
and also
accounttypeA.objects.all()[0].addToBalance(gazillion)
and
accounttypeA.objects.get(pk=2).addToBalance(gazillion)
but not
account.objects.get(pk=2).addToBalance(gazillion)  # fails: account is abstract, so it has no table or manager
Hope this helps
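On the DecimalField choice: the short sketch below, using only Python's decimal module, shows why binary floats are a poor fit for money and why a Decimal balance should be updated with Decimal values. The variable names are illustrative only.

```python
from decimal import Decimal

# Binary floats cannot represent 0.10 or 0.20 exactly, so the sum drifts.
float_total = 0.10 + 0.20
float_is_exact = float_total == 0.30

# Decimals built from strings keep the cents exactly.
decimal_total = Decimal("0.10") + Decimal("0.20")
decimal_is_exact = decimal_total == Decimal("0.30")

# Adding a float to a Decimal is rejected rather than silently rounded.
try:
    Decimal("1.00") + 0.10
    mixing_allowed = True
except TypeError:
    mixing_allowed = False
```

That last case is why the value passed to addToBalance should itself be a Decimal when balance is a DecimalField.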
September 13, 2012 at 10:35 am |
Yes it helps.
Now, how about a more difficult one?
class account(models.Model):
    name = models……

class accounttypeA(account):
    balance = models.FLOATFIELD…

    def addToBalance(self, value):
        self.balance += value

class accounttypeB(account):
    balance = models.CHARFIELD…….

    def addToBalance(self, value):
        value = do_strange_things_whith_value(value)
        self.balance = value
This is where I’m stuck. Maybe I will have to review my models in order to have a good solution.
Thanks again.
September 13, 2012 at 11:46 am |
Yea, looks like a drawing board problem at this point
September 13, 2012 at 11:57 pm |
I have another approach using proxy models and a get_proxy_class function. bit.ly/O2goO2