Transactions, Part 1
Preface
This work came out of a lack of consistent, cohesive documentation for beginners on transactions. Similar material does exist, but much of it is hard to find, scattered and spotty, or assumes prerequisite knowledge that a beginner may not have. Over the course of this series of articles, we’ll talk about the concepts you need to know in order to use transactions effectively and correctly in your applications. The series is written as a set of core concepts for understanding the basics of transactions and is intended for junior to intermediate level developers, but my hope is that others will benefit from it as well. As always, feedback is greatly appreciated.
What is a transaction?
Etymology
The word transaction comes from the Latin word transactionem meaning “an agreement, accomplishment,” which itself comes from the past participle of the verb transigere, transactus, meaning “drove or carried through.”
Transactions defined
The word’s definition is typically related to business or economics, meaning a single business deal, or an exchange of goods between two parties. Most introductions to transactions use an example of debiting one bank account and crediting another. When we remove funds from one account, we need to make sure that they are deposited into the other. If something fails (sufficient funds do not exist in the account being debited, or the account being credited has been closed) then both accounts need to be returned to their original state before the exchange began. This is actually a very good example, but how does that apply to software?
Let’s back up for a second. If we focus on what’s going on, we begin to see there are two distinct activities in play:
- Grouping a set of multiple changes together so that either all of them are applied successfully or, in the event of a single failure, all of them are undone.
- Managing (or coordinating) the execution of these changes, when their interactions might interfere with one another.
Examples of Transactional Systems
Like many other terms in the software lexicon such as “class”, “object”, “system” and “code”, the definition of the word “transaction” suffers a bit from the word’s inherent over-abstraction; it is widely applicable to a great many circumstances and situations. But once the pattern is understood, it is easy to spot scenarios where it is used and applied. Transactions are important in software when updating shared resources, like a cache, database, or message queue, but also come into play when reading – especially when access must be coordinated across many separate but concurrent workers reading shared data. We’ll talk more about this in depth later. Some examples of systems that heavily utilize software transactions are:
- eCommerce (Shopping-Cart, Billing, Banking) Applications
- Relational Databases
- Caching Systems
- Messaging Systems (Message-Oriented Middleware)
- Source Code Management Systems
As you can see, these are fundamental building blocks not only of business software, but also of modern distributed systems. Transactions are prevalent in the frameworks and tools we use to build, tune, and scale the software we write.
Specifying Transactions
ACID properties
A technological definition of the word transaction involves specification of qualities (or “properties”) that must be enforced in order for the transaction to be processed successfully. These properties, the ACID properties as they are known, grow out of our business concept of a transaction (an exchange between two parties). They exist as conventions that specify or constrain the software that’s responsible for processing transactions. Enforcing all four of these properties guarantees the transaction is reliable and therefore a viable candidate for successful completion/processing.
- Atomic – When we say transactions need to be atomic, we don’t mean that they need to be small, though often they are. What we mean is that you can’t subdivide a transaction any further – the group of changes must be viewed as a single set. When applying that set of changes, it’s “all or nothing” – that is, each and every individual change processes successfully, or else the entire group is undone completely. There are no “partial” transactions; we can’t subdivide them further.
- Consistent – After a system processes a transaction, all resources involved in the transaction must be in a consistent state, whether the transaction processed successfully or not. For example, if one account is debited and another is credited, the balance of the debited account must be equal to the original balance minus the amount of the transaction (and the credited account’s balance must be equal to its original balance plus that amount). Transactions that do not complete successfully should not have any side effects on either party involved in the transaction.
- Isolated – Transactions should not interfere with one another, even though they may involve shared resources. For example, one transaction should not be able to see the work another transaction is currently performing until that transaction has completed. They should be independent of one another. Typically, transaction isolation becomes an issue as the transactions being processed become larger: that is, the longer they live or the more resources they enlist. Isolation also has performance implications as the amount of load or concurrency increases. Isolation mechanisms are typically built on either locking or multi-version concurrency control (MVCC, “diff-ing” or delta-management). We will discuss these later when we talk about concurrency control and isolation.
- Durable – Systems processing transactions must record successful transactions so that a history of the transactions processed is not lost. Relevant actions taken are logged for auditing purposes. This audit trail allows the state of the transactional system to be recreated if there is a system failure. If the information cannot be recorded in this way, we cannot say we have a durable transaction, because we won’t be able to recreate the state of the system if a failure occurs.
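To make the isolation and durability properties a little more concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The accounts table, the account name, and the temporary file are assumptions made up for this illustration, and SQLite is only one possible transactional system; other databases may behave differently depending on their isolation settings.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file and schema, used only for this illustration.
db_path = os.path.join(tempfile.mkdtemp(), "bank.db")

writer = sqlite3.connect(db_path)
writer.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
writer.execute("INSERT INTO accounts VALUES ('alice', 100)")
writer.commit()

# A second, independent connection to the same database.
reader = sqlite3.connect(db_path)


def balance(conn):
    # fetchall() runs the query to completion so no read lock lingers.
    rows = conn.execute(
        "SELECT balance FROM accounts WHERE name = 'alice'"
    ).fetchall()
    return rows[0][0]


# The writer changes the balance inside a transaction but does not commit yet.
writer.execute("UPDATE accounts SET balance = 40 WHERE name = 'alice'")

seen_by_writer = balance(writer)      # the writer sees its own pending change
seen_before_commit = balance(reader)  # the reader still sees the committed value

writer.commit()
seen_after_commit = balance(reader)   # the change is now visible to others

# Durability: close everything, reopen the file, and the change is still there.
writer.close()
reader.close()
reopened = sqlite3.connect(db_path)
seen_after_reopen = balance(reopened)
reopened.close()
```

In this sketch the reader never observes the writer’s work in progress: it sees either the state before the transaction or the state after it, and the committed result survives closing and reopening the database.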
In a real world business transaction, these conventions are likely enforced by human beings involved in the transaction. If one of these properties were violated during your checkout at a store in the mall, then either you or the clerk would be pretty upset and likely to call in relevant authorities to sort out the matter. For example, if you bought a shirt but were refused a receipt for the transaction, you would most likely be upset and demand a refund, as the receipt is confirmation of the transaction’s durability.
Why Should I Bother Using Transactions?
As you can see, transactions also provide us with some guarantees about what will happen in the event that something goes wrong during the interactions between shared resources. These guarantees are very useful to us as software designers and developers, but their benefits may not be intuitively obvious until we consider what our jobs would be like without them. Transactions give us a safety net of assumptions; when something goes wrong, we can lean on the transaction’s guarantee to roll back, or automatically revert, all changes made since the start of the unit of work. Imagine if we didn’t have this guarantee: we would be forced to write code to clean up the mess we’ve created, and this code is not only hard to maintain but also hard to write and get correct the first time. With transactions, we don’t have to worry about writing that extra cleanup code, and can instead focus on what the code we’re writing is actually supposed to do: the business logic. By using transactions, we can safely assume that certain failure scenarios, such as a half-applied set of changes, are out of the question. In many circumstances, transactions are key to writing code that is functionally correct, and most of the time they make the code cleaner, more robust and more maintainable (especially when considering the alternative of coding cleanup logic by hand).
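As a rough illustration of that safety net, the sketch below uses Python’s built-in sqlite3 module with an in-memory database. The accounts table, the transfer function, and the account names are hypothetical, and the CHECK constraint stands in for the “insufficient funds” failure from the banking example. Note that the function contains no hand-written cleanup logic; the rollback is left to the transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the CHECK constraint rejects negative balances.
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount from source to target as a single unit of work."""
    # 'with conn' commits if the block completes and rolls back if it raises.
    with conn:
        # Credit first, so a failed debit leaves a change that must be undone.
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, target),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, source),
        )


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


transfer(conn, "alice", "bob", 30)
after_success = balances(conn)

try:
    # Alice cannot cover this amount, so the debit violates the constraint.
    transfer(conn, "alice", "bob", 500)
    failed = False
except sqlite3.IntegrityError:
    failed = True

after_failure = balances(conn)
```

After the failed transfer, the credit that had already been applied to the second account is undone along with everything else, so both balances are exactly what they were before the attempt.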
Transactions are fundamentally all about coordinating access to shared resources. A term that can be used interchangeably with the word “transaction” is unit of work. Transactions are important to software systems because they’re how we manage interactions among the different systems that have access to, or expose, shared resources to clients.
Typically when we work with transactional systems, as application developers we:
- Write code that explicitly utilizes a transactional API, like JDBC, that will contain the semantics for how to handle the processing of the transaction, or…
- Configure a transaction management mechanism, like an EJB container, that can do the processing and coordination work on our behalf based on conventions.
The next article in the series will provide examples of both approaches. Either way, the end result provides a way to keep each unit of work’s interactions separate from each other, and guarantee that all changes in the set are applied together or undone completely.