ORM - data model vs domain model
There’s been quite a lot of discussion lately about whether or not to use an ORM. As someone who’s been both a proponent and an opponent of ORMs over the years, I figured I’d write a blog post about my current opinion on the matter (which may change again a couple of times in the future ;-)).
Why use an ORM
Writing data-access code can be tedious and there’s a lot of boilerplate code involved.
Making a mistake can easily cause serious and hard-to-debug issues (like connection pool exhaustion, …). ORMs have already implemented and tested all this functionality for you (always pick a commonly used ORM).
There’s often a lot of mapping code required to map between the object model and the SQL statements. It’s easy to make mistakes here. Typos, or simply forgetting to include the right columns in your queries, can cause your queries to return incorrect data. To make things worse, every time you change your database schema, you have to manually update your object model and queries, which once again can lead to bugs.
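To make the forgotten-column problem concrete, here is a minimal sketch of hand-written mapping code using Python’s built-in sqlite3 module. The customer table, the Customer class and the load_customer function are made up for illustration; the point is only that nothing complains when a column is left out of the query.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    id: int
    name: str
    email: Optional[str] = None


# Hypothetical schema, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO customer (id, name, email) VALUES (1, 'Ada', 'ada@example.com')")


def load_customer(conn: sqlite3.Connection, customer_id: int) -> Customer:
    # Bug: the email column is missing from the select list.
    # The query runs fine and the mapping silently leaves email empty.
    row = conn.execute(
        "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    return Customer(id=row[0], name=row[1])


customer = load_customer(conn, 1)
print(customer.email)  # None, although the database holds an address
```

The code runs without any error, which is exactly what makes this kind of mistake hard to spot.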
Most ORMs solve these issues by providing automatic mapping and type-safe queries. There’s still room for error (obviously the ORM needs to be configured correctly), but at least the compiler will give you an error if you’re querying a nonexistent column.
Data model vs domain model
ORMs try to solve the object-relational impedance mismatch [1]. The idea is that there’s often no one-to-one relation between your database tables and your domain model. Depending on your application architecture, there can be a variety of reasons for this: normalization, inheritance, …
When I’m talking about a domain model, I don’t necessarily mean it in the typical DDD sense either. It can be any object model that you use in your application code.
The main issue with this type of complicated mapping is that it’s not easy for the ORM to generate optimized queries. If you’re querying for an object that maps to multiple database tables, the resulting SQL statement will contain a bunch of joins. If your object is part of an inheritance chain, the ORM will probably add some additional predicates to those joins. The point is, you have little control over what the eventual query will look like because it’s based on the configured mapping.
I’m a big proponent of avoiding these issues by separating your domain model (or whatever you use) from your data model. By defining a data model that’s a one-to-one reflection of your database tables, you can maintain control of your SQL statements without giving up the advantages of using an ORM.
Because every class corresponds to a single table, you can easily predict what your SQL statements will look like and which joins will be produced. There won’t be any hidden magic involved since the ORM doesn’t have to do any complex mapping.
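As a rough illustration of that predictability, the sketch below uses plain Python dataclasses as a data model. The select_sql helper is a toy stand-in for what an ORM does with a one-to-one mapping, not the API of any real ORM, and the table and column names are invented for the example.

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass
class CustomerRow:
    table: ClassVar[str] = "customer"  # ClassVar, so not treated as a column
    id: int
    name: str


@dataclass
class OrderRow:
    table: ClassVar[str] = "orders"
    id: int
    customer_id: int
    total_cents: int


def select_sql(row_type) -> str:
    # One class, one table: the statement follows directly from the fields.
    columns = ", ".join(f.name for f in fields(row_type))
    return f"SELECT {columns} FROM {row_type.table}"


print(select_sql(CustomerRow))  # SELECT id, name FROM customer
print(select_sql(OrderRow))     # SELECT id, customer_id, total_cents FROM orders
```

Since no class spans more than one table, a join only shows up when you explicitly ask for one.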
If you want to use a DDD approach and define aggregates, you can create separate objects to make up your domain model. You can map the classes from the data model to the domain model whenever necessary. When using a strongly typed language, these kinds of mappings are easy to test and refactor because they’re checked by the compiler. A similar approach can be taken if you use another architecture (such as CQRS).
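A minimal sketch of such a mapping, again in Python: CustomerRow and OrderRow stand for the data model, while Customer and Order form a small aggregate. All class names and the to_domain function are assumptions made for this example, and in Python a static type checker such as mypy would take the role of the compiler.

```python
from dataclasses import dataclass
from typing import List


# Data model: one class per table (hypothetical schema).
@dataclass(frozen=True)
class CustomerRow:
    id: int
    name: str


@dataclass(frozen=True)
class OrderRow:
    id: int
    customer_id: int
    total_cents: int


# Domain model: an aggregate shaped for the application code.
@dataclass
class Order:
    id: int
    total_cents: int


@dataclass
class Customer:
    id: int
    name: str
    orders: List[Order]

    def total_spent_cents(self) -> int:
        return sum(order.total_cents for order in self.orders)


def to_domain(customer_row: CustomerRow, order_rows: List[OrderRow]) -> Customer:
    # Plain, explicit mapping: easy to unit test without a database.
    orders = [
        Order(id=row.id, total_cents=row.total_cents)
        for row in order_rows
        if row.customer_id == customer_row.id
    ]
    return Customer(id=customer_row.id, name=customer_row.name, orders=orders)


customer = to_domain(
    CustomerRow(1, "Ada"),
    [OrderRow(10, 1, 2500), OrderRow(11, 1, 1000), OrderRow(12, 2, 999)],
)
print(customer.total_spent_cents())  # 3500
```

The mapping is an ordinary function, so renaming a field in either model surfaces as a type error instead of a runtime surprise.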
[1] https://en.wikipedia.org/wiki/Object%E2%80%93relational_impedance_mismatch