Top 10 Must Have Features in O/R Mapping Tools

Author: Iqbal Khan

O/R mapping tools are becoming more popular each day and people are realizing the productivity gain they provide to developers. Yet, many people don't know enough about O/R mapping to consider using these tools and many others are weary of using any code generators (including O/R mapping tools).

In this article, I will try to educate you about the various important features that a good O/R mapping tool would provide you and how it can be beneficial to you. I am not discussing any particular O/R mapping tool but rather all tools in general.

What is O/R mapping?

If you're developing an object oriented application that requires a relational database, you'll need to develop persistence objects that are modeled against your database design and know how to interact with different tables in the database. You can either develop these objects by hand or use an O/R mapping tool to map and generate them quickly. Hopefully, after reading this article, you'll be convinced that developing by-hand is a bad idea in most situations.

An O/R mapping tool connects to your database and reads its schema, then lets you map persistence objects to database tables and views, specify single-row transactional operations, queries, and stored procedure calls as methods to these objects. And, it also lets you define one-to-one, one-to-many, many-to-one, and many-to-many relationships between objects based on relationships in the database. It then generates fully working persistence objects code for you. Below is a simple example of some persistence objects. Please note that in this persistence design pattern as explained in Domain Objects Persistence Pattern for .NET, the persistence objects are broken up into "Domain Objects" and "Factory Objects". You can read more about this design pattern if you wish.

 /* Note one-to-many self relationship thru "reports_to"*/
 CREATE TABLE t_employees (
     employee_id int IDENTITY (1, 1) NOT NULL,
     name nvarchar (40) NOT NULL,
     birth_date datetime NULL,
     photo image NULL,
     reports_to int NULL,
 )
 // Domain object class for t_employees table

 public class Employee {
     Int32  _employeeId;
     String  _name;
     DateTime  _birthDate;
     Byte[]  _photo;
     Int32  _reportsTo;
     ArrayList  _subordinates;
     Employee  _supervisor;
     Int32 EmployeeId {
         get { return _employeeId; } set { _employeeId = value; }
     }
     String Name {
         get { return _name; } set { _name = value; }
     }
     DateTime BirthDate {
         get { return _birthDate; } set { _birthDate = value; }
     }
     Byte[] Photo {
         get { return _photo; } set { _photo = value; }
     }
     Int32 ReportsTo {
         get { return _reportsTo; } set { _reportsTo = value; }
     }
     ArrayList Subordinates { 
         get { return _subordinates; } set { _subordinates = value; }
     }
     Employee Supervisor { 
         get { return _employee; } set { _employee = value; }
     }
 }

 // Persistence object for Employee class
 public interface IEmployeeFactory {
     void Load (Employee object, int nDepth);
     void Insert (Employee object);
     void Update (Employee object);
     void Delete (Employee object);
     // Query methods
     ArrayList FindSomeEmployees ();
     // Relationship methods
     void LoadSupervisor (Employee emp);
     void LoadSubordinates (Employee emp, int nDepth);
 }
 Below is an example of how a client application will use
 this code:
     public class NorthwindApp {
         static void Main (string[] args) {
             Employee emp = new Employee ();
             EmployeeFactory empFactory = new EmployeeFactory ();

             // Let's load a employee from Northwind database.
             emp.EmployeeId = 10045;
             empFactory.load (emp);

             // empList is a collection of Employee objects 
             ArrayList empList = empFactory.FindSomeEmployees ();
             // subList is a collection of Employee's subordinates objects 
             ArrayList subList = empFactory.LoadSubordinates (emp, 1);

             // supervisor is Employee's supervisor object
             Employee supervisor = empFactory.LoadSupervisor (emp);
         }
     }

Feature 1: Flexible object mapping

Everything in O/R mapping starts with mapping your objects to your relational tables. Here are some specific features in this area that you should know:

Tables & views mapping: The tool should let you map objects to both tables and views in your relational database. Mapping to views is important because many real-life applications prefer to use views instead of tables.
Multi-table mapping: The tool should let you map an object not only to a single table but also to multiple tables and specify a join between these tables. If your application needs to fetch list of rows that span multiple tables (a common occurrence in web applications), you'll need this feature.
Naming convention: The tool should let you use a different naming convention in objects and their attributes than in relational databases. If your database table is named t_employees, your object may need to be named Employee.
Attribute mapping: There are a number of features that the tool should support:
1. Primary key: Your object must distinguish the primary key from other columns. It should also let you use a single-column or multi-column primary key.
2. Auto generated columns: Some columns are auto generated (IDENTITY or SEQUENCE) and your object must have code to handle fetching the generated values after an insert.
3. Read-only columns: Some columns are not meant to be set by the client but instead their values are system generated (e.g. creation_dtime column using getDate() function in SQL Server). Your object must have appropriate code to fetch these system-generated values.
4. Required columns: Your object must do data validation for required columns at the time of insert or update operations. This is much more efficient than wasting a trip to the database just to get an error message back.
5. Validation: In most cases, you have defined various constraints on your database columns. It would be nice to have the same validations done in your persistence objects so you can save an unnecessary trip to the database just to receive an error message.
6. Formula Fields: There are many situations where when you fetch data from the database, you use a regular expression rather than a column (e.g. Annual Salary object attribute might be a formula field monthly_salary * 12).
7. Data type mapping: Sometime, you want to map one data type from the database to another data type in your object. For example, a datetime type might be converted into a string. Your object must have the logic to do this automatically in both directions (read and write).

Feature 2: Use your existing domain objects

As you saw, a popular design pattern separates persistence objects into "domain" and "factory" objects. One important O/R mapping feature is to let you decide whether you want to generate both domain and factory objects or use your existing domain objects and only generate factory objects that know about your domain objects.

Some people do not want to generate "domain" objects and instead develop them by hand and only generate the "factory" objects. The reason behind this is that their domain objects are being used in almost all subsystems of their application and therefore they don't want them changing frequently through subsequent code regenerations. But, they don't mind generating the "factory" objects since their use is localized to a few places (for load and save operations).

Therefore, the O/R mapping tool should let you use your existing domain objects and map and generate only the factory objects. It should use .NET Reflection to read your domain object definition and after you have done the mapping, it should generate the factory objects in such a way that these factory objects use your domain objects to hold all the data.

Feature 3: Transactional operations (CRUD)

A database transaction allows you to group multiple operations as one atomic operation so either all operations succeed or none of them succeed. Transactional operations include create, read, update, and delete (also called insert, update, load, and delete). Each transaction operation is performed only on one row of data in a table.

You'll be working in one of two main transactional environments and your O/R mapping tools needs to know both of them so it can generate code accordingly. They options are:

COM+/MTS: Microsoft Transaction Server (MTS) manages all transactions of an application. Your objects do not start, commit, or rollback a transaction. They only return success or failure from their methods and MTS figures out when to do "BeginTrans", "Commit", or "Rollback". Additionally, all your factory objects are stateless so MTS can do object pooling on them. This is a specific design pattern that your O/R mapping tool must understand and generate your persistence objects to comply with it. Most common applications for this environment are ASP.NET applications and .NET Web Services.
Stand-alone: This is the environment where your application manages all the transactions itself. It needs to know where to go "BeginTrans", "Commit", and "Rollback". And, your O/R mapping tool needs to be aware of this environment and generate code to comply with it. Most common situations for this are Windows Forms based client/server applications that directly talk to the database server.

Feature 4: Relationships and life cycle management

The foundation of a relational database is that tables have relationships with other tables. Similarly, when you map objects to these tables, your objects also need to establish the same relationships with other mapped objects. Therefore, your O/R mapping tool must support this very important feature by letting you determine which relationships you want to keep in your objects. Below are the different types of relationships you must have:

One-to-one relationship: In this, your object must contain a reference to exactly one other object and must handle load and save scenarios for it.
Many-to-one relationship: This is very similar to one-to-one where your object must contain a reference to exactly one other object and must handle load and save scenarios.
One-to-many relationship: In this, your object must contain a collection of the related objects and must handle loading them with load and also adding and removing them with save operations.
Many-to-many relationship: This is the most complex relationship and involves a bridge table in the database to establish the relationship. There are two different situations for the bridge table as described below:
1. Bridge table with only primary key: In this situation, the bridge table contains only the primary key (which is actually composed of multiple foreign keys). So, your object need not have any bridge table attributes and only needs to keep a collection of the related objects (similar to one-to-many). In fact, the public interface of your object is usually identical to one-to-many but the underlying code is different because of the bridge table.
2. Bridge table with additional columns: This is the most complex situation because the bridge table has additional useful columns that your object must cater for. Your object needs to load a collection of composite objects containing both the bridge table and the related table information. The life cycle management feature must include the ability to load a primary (or parent) object and through this object load all the related objects (meaning all the different relationships). You should also be able to add newly created related objects or remove existing related objects from the relationships. And, when you save the primary object, it must save all the relationship information along with it (as one transaction). A good O/R mapping tool would let you define all types of relationships and also handle life cycle management in the generated code.

Feature 5: Object inheritance

As you already know, a very important aspect of object-oriented programming is inheritance. However, relational databases do not automatically provide inheritance in the relational model. But, there are a number of patterns on how to map object inheritance to a relational database. And, a good O/R mapping tool must provide this capability.

Here are a few ways in which object inheritance is mapped to relational databases.

One table per object: This is the most popular and flexible pattern. In this, each object is mapped to its own table in the database. And, there is a one-to-one relationship between every base object and its derived object. The foreign key of this relationship is kept in the derived object. It is the most flexible because without changing the structure of any existing tables, we can keep adding to the inheritance hierarchy. However, it is not the most efficient for loading both base and derived objects because a separate "load" is done for each object.
One table for all objects: In this pattern, the base object and all the derived objects are represented in one table in the database. This table contains columns representing attributes from all the objects. It is the most efficient for loading and saving data but is very limited because adding a new object to the inheritance requires changing the structure of an existing table in the database that is highly undesirable.
Keeping this in mind, the O/R mapping tool must support at least the "One table per object" approach and if it can also support the second approach that is icing on the cake. The generated code for base and derived classes should handle the following situations:
1. Insert and update operations: The derived class must first ask the base class to do Insert or Update and then do its own. But both the base and derived class operations must be performed in one transaction.
2. Delete operation: Unlike the insert and update operations, the delete operation is performed first on the derived object and then on the base object. However, both must be done in one transaction.
3. Load operation: The load operation in the derived class must also call load on the base class and both of these should be done in one transaction.

Feature 6: Static and dynamic queries

The next most common thing that a database application does is to retrieve rows of data from one or more tables. The application does this done by using SQL queries (SELECT statements). However, an object-oriented application wants to fetch a collection of objects and not rows. So, the O/R mapping tool must provide a way for you to create queries that return collections of objects.

Static queries are those that are defined at compile time and the only thing that changes at runtime for them are the parameter values. These queries can be precompiled and run very efficiently. So, the O/R mapping tool must allow you to define static queries as methods of your objects and also specify whether these queries take any run-time parameters or not.

Dynamic queries on the other hand are those where either the query or its criteria is created at runtime. These queries cannot be precompiled and must run as "Dynamic SQL". However, the benefit of these queries is that they allow those situations in your application where you're performing ad hoc search operations and based on the user input you determine what the query should look like. These queries need to also be provided as methods to your objects but with the flexibility that you can specify the "WHERE clause" and "ORDER BY clause" at run-time.

Feature 7: Stored procedure calls

Stored procedures have become very popular in high transaction environments because they allow you to put all your SQL inside the DBMS and in a compiled form. As a result, your SQL does not have to be compiled at runtime because that is a very expensive process. There are two situations that an O/R mapping tool must support when it comes to stored procedures as described below:

Existing Stored Procedures: First is when you already have custom stored procedures in the DBMS and you want to have your persistence objects call them. In this situation, the O/R mapping tool must allow you to define methods in your objects that can call stored procedures. It must also support different parameter types (in, out, in/out) and also whether the stored procedure returns a Recordset or not. If the stored procedure returns a Recordset then the object must return this data to its client.
Generate Stored Procedures: The second situation is where all the SQL (minus the dynamic queries) that is going to be generated as a result of your object-relational mapping is put inside the DBMS as stored procedures and your objects code is generated so it calls these stored procedures. If you didn't generate stored procedures for all the SQL, it would be put inside your object source code as "dynamic SQL".

Feature 8: Object caching

If your application is transaction intensive and supports high traffic, you really cannot live without effective caching built into your application. Microsoft provides ASP.NET Cache object but it is not sufficient for clustered environments where your application is running on multiple servers and needs a cache that is also clustered. However, there are commercial caching solutions available in .NET that cater for clustered environments.

Whichever caching product you use, you'll have to make sure that your persistence objects are making caching calls from appropriate locations. And, your O/R mapping tool should provide the ability to generate code that makes caching calls to one or more leading products.

Ideally, your O/R mapping tool should let you specify which objects you want to cache and which ones you do not want to cache. The most popular situation is where transactional objects (single row objects) are cached. However, you can also cache entire collections and even related objects

Feature 9: Customization of generated code

You'll always have situations where you need to customize generated code. However, if you change the generated code, it will most likely get overwritten the next time you generate code again. And, since software development is an iterative process, you'll have to generate code many times.

Additionally, whatever custom code you write must be called when the generated code is run. And, it must also be able to control the subsequent execution of the generated code. For example, if your custom code is called before doing an "Insert" and you find something wrong, you should be able to prevent the "Insert" from actually happening.

To prevent your code from being overwritten, the O/R mapping tool must allow you to mark your code as "Safe Code" which then does not get overwritten in future code regenerations. And, to ensure that your custom code gets called seamlessly, the O/R mapping tool needs to either support the concept of "Hooks" which are calls made from strategic places in the generated code and the result code returned by these "Hooks" determines what happens next. Or, the O/R mapping tool needs to let you derive the generated code and then use polymorphism to actually run your code instead of the generated code. You can then determine whether to call the "base class code" or not.

Feature 10: Template customization

A good O/R mapping tool is very likely using code templates to determine how to generate the code. The O/R mapping tool combines the templates it has with a combination of your object mapping input and the database schema information to determine exactly how to generate the code.

Since the O/R mapping tool is generating code from templates, it would be great if it let you modify these templates (or add new templates) so you could affect how the generated code should look.

A very simple example would be when you would like to put your own copyright header in each source code file. If you could go an insert this header in the code template file, it would automatically get used next time you generate code. You should also be able to write your own code templates (although this is only for advanced users) and let the O/R mapping tool use your templates but do everything else the same way as it always does.

Conclusion

You should seriously consider using an O/R mapping tool as it will save you a lot of development and testing time. And, when it comes to evaluating which tool is best for you, you should know what to look for. I hope this article helps you gain a better understanding of O/R mapping.

Author: Iqbal Khan works for Alachisoft , a leading software company providing .NET and Java distributed caching, O/R Mapping and SharePoint Storage Optimization solutions. You can reach him at iqbal@alachisoft.com.