A Developer with a Pencil

SimpleStateMachine - a Simple Enum Based State Machine for Ruby

Following my previous post on Using Enums in Ruby, we have been working on a state machine implementation in Ruby, based on an Enum representation of states.

Why should you use a state machine?

A state machine is mainly used to track the specific status an object is in. A state is a short description, usually a few words, commonly stored as a string in the storage layer. A state machine simplifies moving an object between the states it is allowed to be in by mechanizing the transition itself: the state change, callbacks and persistence included.

You should be using a state machine when your code defines different behaviors for an object based on the status it is currently in. A simple example is a blog post: it can be saved as a “draft”, in which case it won’t be shown in the blog, or be “published”, which makes it available for display in the blog.

You’ll probably want to add some callback functionality for the state change, like sending out a tweet notifying the world of your new post when a post becomes “published”.
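
As a rough illustration of the blog post example, here is a minimal hand-rolled sketch in plain Ruby, with no gem involved. The `Post` class, the `publish!` event and the `announce` callback are hypothetical names chosen for this example; a state machine library would normally generate this plumbing for you.

```ruby
# A hand-rolled sketch of the blog post example; all names are hypothetical.
class Post
  # Allowed transitions: event name => { from:, to: }
  TRANSITIONS = {
    publish: { from: :draft, to: :published }
  }.freeze

  attr_reader :state, :announcements

  def initialize
    @state = :draft       # every post starts out as a draft
    @announcements = []   # stands in for "sending out a tweet"
  end

  # Event: move from draft to published, then run the callback.
  def publish!
    fire(:publish) { announce }
  end

  # Only published posts are shown in the blog.
  def visible?
    state == :published
  end

  private

  # Change state if the event is legal from the current state,
  # then run the callback block (if any).
  def fire(event)
    transition = TRANSITIONS.fetch(event)
    unless transition[:from] == state
      raise ArgumentError, "cannot #{event} from #{state}"
    end
    @state = transition[:to]
    yield if block_given?
    self
  end

  def announce
    @announcements << "New post published!"
  end
end

post = Post.new
post.visible?   # false while the post is a draft
post.publish!
post.visible?   # true once it is published
```

Keeping the allowed transitions in one table means an illegal move (publishing an already published post, for instance) fails loudly instead of silently changing the state.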

The state machine in the Ruby world: AASM

One of the most widely used state machine gems in the Ruby/Rails world is probably the AASM gem. AASM is a great gem that has one basic flaw - it requires the state column / attribute to be of a String type. Using this gem your database will look something like this (a simple user table for example):

varchar mysql column for states, using MySQL
mysql> select id, username, status from users limit 2;
+----+--------------+---------+
| id | username     | status  |
+----+--------------+---------+
| 18 | zidane       | retired |
| 71 | zessi        | active  |
+----+--------------+---------+
2 rows in set (0.00 sec)

and to simplify your queries you may have scopes in your model to match the available states:

a user ActiveRecord::Base model with AASM
class User < ActiveRecord::Base

  include AASM

  initial_state :active

  # more AASM configuration 

  # Scope bodies are wrapped in lambdas so they are evaluated at query time
  scope :active, -> { where(status: "active") }
  scope :retired, -> { where(status: "retired") }

end

Yes, this is simple and clear enough to be used in almost any application that requires state machine functionality, but when your data size increases beyond a certain point the string type dependency can be a real PITA.

Storing states: String vs. Integer vs. Enum

You have 3 options when you choose the storage type for a state column: String (VARCHAR), Integer and Enum (not supported by all databases).

Strings

Storing strings is by far the easiest implementation you can choose. As shown earlier, when you store states as strings the code is readable, the database output is readable, and you can add and remove states pretty easily (as long as the new states do not exceed your string column’s length limit). As for indexing, MySQL string indexes are bigger than integer column indexes, and comparing strings is a little slower than comparing integers, so performance can drop a little when traversing a string index.

Enums

ENUM column type is available out of the box for MySQL:

Enum column in mysql
CREATE TABLE users (
    ...
    status ENUM('active', 'retired')
);

And you can still query the ENUM column based on the string representation of the ENUM:

querying an ENUM column
mysql> SELECT * FROM `users` WHERE status = 'active';

Adding an ENUM field to a Postgres database is a little bit tricky: you have to define a custom ENUM type (with CREATE TYPE) before you can use it as a column type.

While simple to create, maintaining an ENUM column has its own PITA:

  • When you use an ENUM column, you’re technically moving data from where it belongs (in actual database fields), to somewhere it doesn’t (into the database metadata, specifically a column definition). This is different than putting constraints on the data which can achieve the same result.
  • When you create an ENUM column, you basically say “this is never going to change”. When it does change, you’ll need an ALTER TABLE, which can take a serious amount of time on big tables.
  • You’ll have to maintain the same list of available ENUM values for application level usage (e.g. a drop down) since you can’t easily extract a distinct list of available ENUM values from the column itself.
  • Aside from not being reusable on the application layer, ENUM members aren’t reusable even in other tables.
  • Inserting an illegal value into an ENUM does not necessarily cause an error (like a constraint would). MySQL with strict SQL mode disabled simply stores an empty string instead, inserting the faulty data with nothing more than a warning, yay!

Integers

The great ancestor of everything, the integer type is the simplest of them all, but it carries the burden of being completely unreadable and depends on the application code to provide a textual representation for vague number-based values. It is pretty clear that

integer mysql column for states, using MySQL
mysql> select id, username, status from users limit 1;
+----+--------------+---------+
| id | username     | status  |
+----+--------------+---------+
| 18 | zidane       | 1       |
+----+--------------+---------+
1 row in set (0.00 sec)

doesn’t really make it clear which status zidane is in.
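
A common way to keep integer storage readable is to let the application code own the mapping between numbers and names. The snippet below is a minimal sketch of that idea in plain Ruby. The `UserStatus` module and its methods are hypothetical and not part of any gem, and the assignment of 0 to "active" and 1 to "retired" is an assumption made for the example.

```ruby
# Hypothetical mapping between stored integers and readable state names.
module UserStatus
  # Stored integer => readable name (assumed values for this example).
  NAMES = { 0 => "active", 1 => "retired" }.freeze

  # Integer to store in the database for a given name (string or symbol).
  def self.index_of(name)
    NAMES.key(name.to_s) or raise ArgumentError, "unknown status: #{name}"
  end

  # Readable name for an integer read back from the database.
  def self.name_of(index)
    NAMES.fetch(index) { raise ArgumentError, "unknown status index: #{index}" }
  end
end

UserStatus.index_of(:retired)  # => 1
UserStatus.name_of(1)          # => "retired"
```

With a mapping like this the database only ever sees small integers, while logs, views and queries written in application code can keep using the names.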

Introducing: SimpleStateMachine

The SimpleStateMachine gem is now available as an alpha version and includes a very simple state machine implementation:

  • State definition
  • Transition setup
  • Callbacks
  • Enum representation of states

SimpleStateMachine does not require ActiveRecord or any other kind of ORM; it is a plain Ruby implementation. If you choose to use an ORM, state enum indexes will be stored as integers and transformed back to strings.

There is still some work to be done, especially on the query interface, so patches and pull requests are welcome.

Short, short example

SimpleStateMachine example
class User
  include SimpleStateMachine

  state_machine do |sm|
    sm.state_field "status"
    sm.initial_state "happy"
    sm.add_state "sad", :before_enter => :cry
    sm.add_transition :be_sad, :from => :happy, :to => :sad
  end

  def cry
    puts "*sniff sniff*"
  end
end

# Console
> sad_clown = User.new
> sad_clown.status
0
> sad_clown.human_status_name
"happy"
> sad_clown.be_sad
*sniff sniff*
> sad_clown.status
1
> sad_clown.human_status_name
"sad"
