Transactions - Friend or Foe?

From WikiContent

Jump to: navigation, search

Transactions are necessary for maintaining data integrity and consistency. However, they are also expensive. They can reduce database concurrency, cause database deadlocks, and significantly affect the performance and scalability of an application. Knowing when not to use transactions is just as important as knowing when to use them.

Whether a transaction is a friend or a foe depends on many factors. Are you doing read or update operations? What persistence framework are you using? Are you using Stored Procedures? What database are you using? All of these factors play into whether transactions are a good idea or bad idea.

For database updates, you must use a transaction to maintain ACID properties. However, what about database read operations? This decision really boils down to the type of database and persistence framework you are using. In general, the only reason you would want a transaction for read operations is to ensure read consistency or to programmatically change the database isolation level (both of which I generally consider edge cases). For a standard DAO or simple JDBC-based persistence framework, you don't want to start a transaction for read operations when using databases such as SQL Server or Sybase because you’ll hold read locks in the database, resulting in concurrency and performance issues. These issues aren’t present in databases like Oracle (and MySQL using the innodb engine) because they don't hold read locks.

However, you shouldn’t use transactions for read operations when using object-relational mapping frameworks such as Hibernate and TopLink, regardless of the database you are using. Suppose the object you are looking for is already in the ORM cache. If you use transactions, the application will open a database connection and start a transaction when in fact it never hit the database in the first place. This wastes database connections that may be needed by another operation. In the event the object isn’t in the cache, then the ORM framework will start it's own more efficient and shorter running low-level transaction to generate the SQL and retrieve the data from the database.

Another consideration is the length of time you hold onto a database connection. Application-managed transactions generally hold onto database connections (and consequently database locks) much longer than is necessary. Starting a transaction too early can result in poor database concurrency, extended database locks, poor application performance, reduced scalability, and increase the probability of database deadlocks. Reducing the transaction scope by using techniques like the "High Concurrency Transaction Strategy" (http://www.ibm.com/developerworks/java/library/j-ts5/index.html) can reduce or eliminate these issues.

Using database Stored Procedures can also cause transaction issues. For example, using an Oracle Stored Procedures containing DDL statements in the context of an application-managed transaction will result in a nested transaction error, as will using Sybase Stored Procedures in unchained mode and transact-SQL Stored Procedures containing a BEGIN TRANS statement. In cases such as these you will either have to suspend the current transaction prior to the stored procedure call or use a NOT_SUPPORTED transaction attribute directive.

Understanding when to use transactions (and when not to) can make all the difference in the world with regards to performance, scalability, and throughput in your application. Always use transactions for update operations, but know the issues when using them for database read operations.

By Mark Richards

This work is licensed under a Creative Commons Attribution 3

Back to the 97_Things_SQL home page

Personal tools