A Developer with a Pencil

No More Fiverr, Please

Due to recent events in the so-called relationship between me and Fiverr.com, my most recent employer, I am forced to cut every connection between me and this company, including online social connections, communications and affiliation.

I choose to do so because I am reluctant to take part in this ongoing saga, which I believe is aimed at dishonoring the good men and women who put in a huge effort, their blood and their sweat, to bring this idea to life and make it what it is today.

I wish all the Fiverr employees and the company itself huge success in the future; I just don’t want to hear about them or know they exist ever again.

Being an Asshole Does Not Make You Awesome

Don’t be an asshole. Period.

Being an asshole comes in all different shapes and sizes. In the tech community it usually comes in the form of bullying newcomers or being abusive towards women.

The latter seems to bother the Rails and Ruby community a lot less than the former. On several occasions I have seen people come to the aid of a newcomer when someone thought it was cool to bring him down and make fun of him online (#rubyonrails on irc.freenode.net), while the phenomenon of degrading women is hardly addressed by the community.

A few years ago, Matt Aimonetti decided that it was a good idea to name his CouchDB and Rails presentation “CouchDB and Ruby - Perform like a porn star”, an act that led to the founding of the Railsbridge group and eventually the RailsGirls movement (which is awesome btw). This specific event was the only time someone reacted to the fact that women are generally treated differently in the Rails / Ruby world, and the first time a community-wide action took place.

The no-asshole policy started to kick in, or so I thought.

Yesterday there was a lively discussion on Twitter regarding a case of sexual harassment that happened during one of the Ruby conferences in January last year (I will not link names or tweets until I know what is going on for real). It led me to think about this whole issue - what are we REALLY doing to make women feel welcome in our community?

The answer is: not much. RailsGirls are doing an awesome job of bringing women into this overall wonderful community, but it is not enough - we need to pay attention to the stuff that makes us such a great community for newcomers and men, and make it appealing for women too.

What is missing, you ask? I think that there is no consideration for women when it comes to gem naming conventions. Here are a few gems that I found in a 5-minute search on Rubygems.org to demonstrate why women and other groups probably feel uncomfortable when trying to get into the Rails community:

While some of you may think this is a righteous callout, I think that as a community we need to strive to be as appealing as possible. There is nothing cool about naming your gem “fuck” or “retarded”, and we as a community need to stop this from happening as much as we can.

In my opinion there is no immediate solution, only ones that are community driven and accepted as standard. Some of the gems I listed above have more than 20,000 downloads; that’s 20,000 people that didn’t care. It should be different.

One suggestion, though maybe not a realistic one, is that Rubygems.org refuse gem pushes that use offensive names. It is a harsh move, but it would show that a major player in the Rails and Ruby world, if not the most important one, cares, and the rest of us would follow its example.

An Update

Obviously, I took the liberty of linking to this post on Facebook and Twitter. Almost all of the responses I got were great, from people that agree with the idea and pain behind this post…

but this guy is something else.

I posted on the Ruby group on Facebook.

This guy, Aaron, is the admin. Take a look at the lowest level of people you can find in this industry.

It Is Time to Stop Using Acts_as_taggable

Have you ever added tagging functionality to your Rails application? Then you have probably used either the acts_as_taggable gem, or its younger brother, the acts_as_taggable_on gem.

These two gems are great, but they have some drawbacks that were unavoidable at the time of their creation. Both of these gems rely on an RDBMS schema that generally looks like this:

Tagging, RDBMS style

ActsAsTaggable, in all of its generations, was based on this schema:

  • a Tags table that holds the information about a specific tag (basically, only the tag name)
  • a Taggings table that holds polymorphic association references to the tagged instance (taggable) and the tagger instance (tagger).

So basically, when you wanted to get a tag list for some kind of a taggable instance or to see all the tags a tagger had made, you’d have to JOIN those 2 tables together. Always.

Now, joining isn’t really bad - it is there for a reason - but this schema can lead to some serious issues in certain circumstances.

1. JOINing tables from different servers

What happens when you have 10M tags and 40M taggings? Your MySQL / Postgres / you-name-it DB needs an extended setup with more than one database server instance, and once you split the data you may end up having to JOIN between 2 database servers.

Yes, it is possible: MySQL has the FEDERATED storage engine that lets one server query tables that live on another, MSSQL has the very similar linked-servers feature, and some of the other databases have equivalents. The problem with this feature is that it is far from easy to set up or maintain, so if you have a lot of tags or taggings and you want to add some sharding to the party, you are in a jam.

2. Indexing polymorphic association columns

Although these gems provide the necessary indices as part of the migration generator template, a polymorphic association in Rails is composed of a string (taggable_type) and an integer (taggable_id). The string column holds only a handful of distinct values, which makes the index’s selectivity rather low - meaning there are too many similar grouped entries in the index.

3. Autocomplete

Back to the example of 10M tags in the table. Providing autocomplete over a table this size is horrific. You’ll have to use some kind of a full text engine like Solr or ElasticSearch to provide matching tags in real time.

4. Uniqueness of tag name

How do you know whether to create a newly provided tag or to add a tagging to an existing one? You first have to find out if the tag already exists. 10M tags? Good luck. Again, a full text search engine will provide a decent solution to this problem.

The solution: Redis

I love Redis. When it fits, it sits. If you use Redis superpowers when you need them, it is an awesome tool. Redis provides several value types, each with its own superpower aimed at a specific problem - SET being the one we chose.

Storing tags in Redis for easy access

Redis sets are basically unordered collections of unique members. For the following example we will use User as the tagger class and Photo as the tagged class. Notice there are no Tag or Tagging tables? We don’t need them anymore.

When the User with ID 10 tags the Photo with ID 9 with the tag “dog”, we simply write to a handful of Redis sets that allow easy access to any slice of data we might need:

Storing tags in redis
# Add specific tagging of photo by a user
$redis.sadd "user:10:photo:9:tags", "dog"

# Add to photo specific tag set
$redis.sadd "photo:9:tags", "dog"

# Add a list of tagged photos to a tag set
$redis.sadd "tag:dog:photos", 9

# a list of photos tagged by a specific user
$redis.sadd "user:10:tagged_photos", 9

# Increase the usage counter for the "dog" tag
$redis.incr "tagged_by:dog"

Now we can have simple accessors to this information, for example:

photo.rb
class Photo

  # Get tags
  def tags
    # Same key that was written with SADD above ("photo:9:tags")
    $redis.smembers "photo:#{self.id}:tags"
  end
end

or for the Tag class:

tag.rb
class Tag
  def tagged_photo_ids
    $redis.smembers "tag:#{self.name}:photos"
  end
end

Generally, this is just an outline with a single rule: denormalize your data. Instead of doing complicated join queries, use simple namespaced key-value access to your data.
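To see the whole write path in one place, here is a minimal sketch that runs without a Redis server. FakeRedis is a hypothetical in-memory stand-in that covers only the three commands used above, and tag_photo is a made-up helper name; neither comes from an existing gem.

```ruby
require "set"

# Hypothetical in-memory stand-in for the Redis commands used above
# (SADD, SMEMBERS, INCR). Swap in a real client in an application.
class FakeRedis
  def initialize
    @sets = Hash.new { |hash, key| hash[key] = Set.new }
    @counters = Hash.new(0)
  end

  # Members are stored as strings, the way Redis hands them back.
  def sadd(key, member)
    @sets[key].add(member.to_s)
  end

  def smembers(key)
    @sets[key].to_a
  end

  def incr(key)
    @counters[key] += 1
  end
end

# Hypothetical helper: writes every set touched by a single tagging
# and returns the new usage count of the tag.
def tag_photo(redis, user_id, photo_id, tag)
  redis.sadd "user:#{user_id}:photo:#{photo_id}:tags", tag
  redis.sadd "photo:#{photo_id}:tags", tag
  redis.sadd "tag:#{tag}:photos", photo_id
  redis.sadd "user:#{user_id}:tagged_photos", photo_id
  redis.incr "tagged_by:#{tag}"
end

redis = FakeRedis.new
tag_photo(redis, 10, 9, "dog")
tag_photo(redis, 10, 9, "puppy")
tag_photo(redis, 11, 12, "dog")

p redis.smembers("photo:9:tags")
p redis.smembers("tag:dog:photos")
```

Every read is a single key lookup; the price is that each tagging writes to several keys, so removing a tag has to clean up the same keys.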

Ok, no joins. What about autocomplete?

Autocomplete is a PITA, but with Redis we can maintain sets keyed by tag prefix, each holding the tags that start with that prefix. For example, the tag “liverpool” will be broken into smaller pieces:

autocomplete.rb
$redis.sadd "tags:start_with:liv", "liverpool"
$redis.sadd "tags:start_with:live", "liverpool"
$redis.sadd "tags:start_with:liver", "liverpool"
$redis.sadd "tags:start_with:liverp", "liverpool"
$redis.sadd "tags:start_with:liverpo", "liverpool"
$redis.sadd "tags:start_with:liverpoo", "liverpool"
$redis.sadd "tags:start_with:liverpool", "liverpool"

This breakdown will allow us to easily access the list of tags (3 letters and up):

tag.rb
class Tag
  ...
  def Tag.tags_starting_with(tag_starts_with = "")
    $redis.smembers "tags:start_with:#{tag_starts_with}"
  end
end

Tag.tags_starting_with("liver") # => ["liver", "liverpool", "liverani",...]
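The prefix keys do not have to be typed by hand. A small sketch of how they could be generated, assuming the 3-letter minimum mentioned above; tag_prefixes and autocomplete_keys are hypothetical helper names, not part of any gem:

```ruby
# Hypothetical helper: every prefix of a tag, from min_length letters up to
# the full tag. Tags shorter than min_length produce no prefixes.
def tag_prefixes(tag, min_length = 3)
  (min_length..tag.length).map { |length| tag[0, length] }
end

# Hypothetical helper: the set keys a tag would be added to.
def autocomplete_keys(tag)
  tag_prefixes(tag.downcase).map { |prefix| "tags:start_with:#{prefix}" }
end

puts autocomplete_keys("liverpool")
```

Each returned key would then receive the full tag with a single set-add call, so a tag of n letters costs n - 2 writes.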

Intersections!

Redis can intersect 2 sets, and it can also diff them, meaning you can find either the identical or the different elements in both sets.

For example, if we would like to know which photos are tagged with both “dog” and “cat”, we intersect those 2 sets.

intersection
$redis.sinter "tag:cat:photos", "tag:dog:photos" # => ["12", "93", "94", ...] (members come back as strings)
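If you want to play with the semantics without a server, Ruby's own Set behaves the same way. A tiny sketch with made-up photo ids, where & plays the role of SINTER and - the role of SDIFF:

```ruby
require "set"

# Made-up photo ids, purely for illustration.
cat_photos = Set["12", "40", "93", "94"]
dog_photos = Set["7", "12", "93", "94"]

both     = cat_photos & dog_photos   # same result SINTER would give
only_cat = cat_photos - dog_photos   # same result SDIFF would give

p both.to_a
p only_cat.to_a
```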

Conclusion

Again, this is just an outline. There are many improvements to be added, but we at ShinobiDevs are working on releasing a gem that does just that - ideas are welcome. Redis is a powerful tool; there is probably no need to store tagging data in an RDBMS structure when a better one, maybe just like the one suggested above, is available.

Rails Bugmash in Israel

Last week, on August 30, we held a Rails hunt & destroy bugmash in Israel with the unbelievable courtesy of the Ebay Innovation Center in Israel.

This bugmash was born out of a discussion in the Israeli Rails group and led to an amazing event. 40+ members of the Israeli Rails developer community, including awesome people from Google/Waze, Simplee, Fiverr, Scoreoid and a lot more freelancers and Rails lovers, gathered at the Ebay Innovation Center in Israel at 9:00am (yes, early as fuck) to hunt down some Rails bugs.

During this day, we used a special application we developed to let people choose and take ownership of Rails GitHub issues, which resulted in 15 pull requests and 1 merge to master so far.

It was an amazing event that showed the power of the Israeli Rails community, and although we weren’t successful in finding a Rails core member willing to help us online, we managed to get something out of this day: a list of solved bugs and a wonderful day for Open Source.

I would like to thank the awesome guys (and girl) at the Ebay Innovation Center for crafting this event, getting us a mean South American buffet and providing us with an amazing office workspace that helped us achieve this wonderful result. I hope it wasn’t the last time we do it.

You can read more on the Ebay innovation center blog post and see us in action in these pictures.

Hoping to see more of you there next time!

Joining AppScrolls!

In the last couple of weeks I needed to boot up some new Rails apps, all of which had the same skeleton structure including Devise, Bootstrap, RSpec and so on.

I remembered that there was an attempt to tackle this repetitive skeleton app generation process by Michael Bleigh’s RailsWizard so I pinged Intridea on Twitter and asked what was the status of it.

It seemed that RailsWizard hadn’t been maintained for a long time, but Dr Nic took it over a while ago and formed AppScrolls.

After a short conversation he happily added me to the contributors list so now I am a very proud co-maintainer of a very awesome gem.

Currently the plans are to add some missing scrolls, such as some mongo adapters and some testing frameworks. Hit me up if you are missing something - or even better, fork and pull-request.

Simple and Easy Getter for Your Models

Here is a little trick we use at ShinobiDevs when we need a quick getter method in our models.

We simply extend our models with a square brackets method, just like a Hash or an Array:

user.rb
class User < ActiveRecord::Base
  def self.[](id)
    self.where(id: id).first
  end
end

And use it:

eladmeidar@Elads-MacBook-Pro:~/projects/dummy (master *)$ rails c
Loading development environment (Rails 4.0.0)
[1] pry(main)> User[1]
  User Load (7.9ms)  SELECT `users`.* FROM `users` WHERE `users`.`id` = 1 LIMIT 1
=> #<User id: 1, email: "elad@testing.com"......>

Simple!

Some more use cases

As many people suggested, there is no real difference between using User.find(id) and User[id] (apart from User.find raising an exception when the record is missing, while the getter returns nil). A much better use is to bind this getter to an attribute that isn’t the primary key for that model, for example username:

user.rb
class User < ActiveRecord::Base
  def self.[](username)
    self.where(username: username).first
  end
end

It is also a reasonable idea to use it with other finder methods:

user.rb
class User < ActiveRecord::Base

  # Grab the latest x users, specified by the limit parameter
  def self.[](limit)
    self.limit(limit).order("id DESC")
  end
end

This concept is aimed at providing a shorter syntax for a quick, common getter - I wouldn’t recommend doing anything overly convoluted in this method (conditionals and switches, for example).
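The trick itself is plain Ruby and does not depend on ActiveRecord. A minimal sketch with a hypothetical Member class backed by an in-memory array (the class, its create method and the sample data are all made up for illustration):

```ruby
# Hypothetical plain-Ruby model backed by an in-memory array instead of
# ActiveRecord, just to show the class-level [] method in isolation.
class Member
  attr_reader :username, :email

  @records = []

  class << self
    attr_reader :records

    def create(username, email)
      record = new(username, email)
      records << record
      record
    end

    # Look a member up by username; returns nil when nothing matches.
    def [](username)
      records.find { |record| record.username == username }
    end
  end

  def initialize(username, email)
    @username = username
    @email = email
  end
end

Member.create("elad", "elad@example.com")
puts Member["elad"].email
p Member["nobody"]
```

Returning nil for a miss mirrors what where(...).first does in the models above.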

Dragonfly on Heroku - the Difference Between the Request Time and the Current Time Is Too Large

Lately we have been experiencing intermittent exceptions on Heroku when uploading images to S3 using “Dragonfly”:

Excon::Errors::Forbidden: Expected(200) <=> Actual(403 Forbidden)

It seems that this exception doesn’t happen on every upload, so we examined the response body from S3 and got this:

<Excon::Response:0x000000062a09d0 @body="<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Error><Code>RequestTimeTooSkewed</Code><Message>The difference between the request time and the current time is too large.</Message><MaxAllowedSkewMilliseconds>900000</MaxAllowedSkewMilliseconds><RequestId>B9CB09E0E0A7054B</RequestId><HostId>K3liRup7BjJoxBXgkCGpD7NSk/0jIUy6+nBY5Y63akNx4MNNLMvj7zSlEadDn87Q</HostId><RequestTime>Mon, 26 Aug 2013 11:39:58 +0000</RequestTime><ServerTime>2013-08-26T11:55:29Z</ServerTime></Error>", @headers={"x-amz-request-id"=>"xxxx", "x-amz-id-2"=>"xxxx", "Content-Type"=>"application/xml", "Transfer-Encoding"=>"chunked", "Date"=>"Mon, 26 Aug 2013 11:55:28 GMT", "nnCoection"=>"close", "Server"=>"AmazonS3"}, @status=403>

and especially this:

<Message>The difference between the request time and the current time is too large.</Message>

After investigating a bit, we found that requests to S3 are timestamped and compared to S3’s own server time in order to assure authenticity, so it seemed that something had gone wrong with our local clock on Heroku. Even though the local time on the Heroku machines (checked using the rails console) looked correct, we monkeypatched the Dragonfly::DataStorage::S3DataStore module to call the sync_clock method on every access to the storage getter.

config/initializers/sync_dragonfly.rb
Dragonfly::DataStorage::S3DataStore.module_eval do

  def storage
    require 'fog'

    @storage ||= Fog::Storage.new(
        :provider => 'AWS',
        :aws_access_key_id => access_key_id,
        :aws_secret_access_key => secret_access_key,
        :region => region
    )
    @storage.sync_clock
    @storage
  end
end

All this patch does is sync the clock before returning the storage object.

Currently, one day later, we seem to have no more exceptions of this kind.
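The S3 error body shown earlier carries both timestamps and the allowed skew, so the rejection can be checked by hand. A small sketch using only Ruby's standard library, with the values copied from that response (the variable names are ours):

```ruby
require "time"

# Values copied from the S3 error body shown earlier in this post.
request_time = Time.rfc2822("Mon, 26 Aug 2013 11:39:58 +0000")
server_time  = Time.iso8601("2013-08-26T11:55:29Z")
max_skew_ms  = 900_000

# Absolute difference between the two clocks, in seconds.
skew_seconds = (server_time - request_time).abs
too_skewed   = skew_seconds * 1000 > max_skew_ms

puts "skew: #{skew_seconds.to_i}s, allowed: #{max_skew_ms / 1000}s, rejected: #{too_skewed}"
```

The request was stamped 15 minutes and 31 seconds behind S3's clock, just past the 15 minute limit, which fits with the error showing up only on some uploads.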

Rails Validates :presence of Boolean - Not What You Think

Rails validates :presence and boolean fields

First, take a look at the following model:

survey_question.rb
class SurveyQuestion < ActiveRecord::Base
  belongs_to :owner, :polymorphic => true
  has_many :dependent_questions, :class_name => "SurveyQuestion", :as => :owner
  has_many :survey_answers
  serialize :options, Hash

  attr_accessible :owner, :owner_id, :owner_type, :question_body, :question_type, :options, :required, :parent_answer_condition, :client_integration_param_name

  validates :required, :question_type, :question_body, presence: true

end

and a short run in the rails console:

Rails console
[3] pry(main)> SurveyQuestion.create!({owner_type: "Survey", owner_id: 1, required: false, question_body: "How old are you?", question_type: "text"})
   (0.1ms)  begin transaction
   (0.1ms)  rollback transaction
ActiveRecord::RecordInvalid: Validation failed: Required can't be blank
from /Users/eladmeidar/.rvm/gems/ruby-1.9.3-p448/gems/activerecord-4.0.0/lib/active_record/validations.rb:57:in `save!'
[4] pry(main)> SurveyQuestion.create!({owner_type: "Survey", owner_id: 1, required: true, question_body: "How old are you?", question_type: "text"})
   (0.1ms)  begin transaction
  SQL (6.6ms)  INSERT INTO "survey_questions" ("options", "owner_id", "owner_type", "question_body", "question_type", "required") VALUES (?, ?, ?, ?, ?, ?)  [["options", "--- {}\n"], ["owner_id", 1], ["owner_type", "Survey"], ["question_body", "How old are you?"], ["question_type", "text"], ["required", true]]
   (2.3ms)  commit transaction
=> #<SurveyQuestion id: 1, owner_id: 1, owner_type: "Survey", question_body: "How old are you?", question_type: "text", options: {}, required: true, parent_answer_condition: nil, client_integration_param_name: nil>

Note that SurveyQuestion has a boolean attribute named required, which we also added to a :presence validation alongside other attributes. When we tried to create an instance of SurveyQuestion in the console with a required: false value, we got the following exception:

ActiveRecord::RecordInvalid: Validation failed: Required can't be blank

Obviously, it is there. What happened?

ActiveModel::Validations::PresenceValidator

ActiveModel::Validations::PresenceValidator is the class responsible for handling :presence validations in ActiveModel. Here it is:

rails/activemodel/lib/active_model/validations/presence.rb
class PresenceValidator < EachValidator # :nodoc:
  def validate_each(record, attr_name, value)
    record.errors.add(attr_name, :blank, options) if value.blank?
  end
end

The PresenceValidator basically adds an error when the value of the validated field returns true for #blank?, but the way Object#blank? treats booleans is what causes this issue:

Object#blank?
[1] pry(main)> false.blank?
=> true

This is why the :presence validation fails for boolean fields with a false value: false.blank? returns true, so false is treated as a missing value.
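To make the difference concrete without loading Rails, here is a sketch with a simplified stand-in for blank? and two hypothetical helpers that boil each validation down to its core check. The stand-in skips the whitespace-only string case that ActiveSupport also treats as blank.

```ruby
# Simplified stand-in for ActiveSupport's Object#blank?, written as a plain
# method so it runs without Rails. Whitespace-only strings are also blank in
# ActiveSupport; that case is left out here.
def blank?(value)
  value.respond_to?(:empty?) ? !!value.empty? : !value
end

# Hypothetical helper: what a :presence check boils down to.
def passes_presence?(value)
  !blank?(value)
end

# Hypothetical helper: what an inclusion check against [true, false] boils down to.
def passes_inclusion?(value)
  [true, false].include?(value)
end

[true, false, nil].each do |value|
  puts "#{value.inspect}: presence=#{passes_presence?(value)} inclusion=#{passes_inclusion?(value)}"
end
```

Only the inclusion check tells false apart from nil, which is exactly what a required boolean column needs.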

Solution

Not really a solution, rather an ugly workaround, but the way to get it done is to use the :inclusion validation instead.

survey_question.rb
class SurveyQuestion < ActiveRecord::Base
  belongs_to :owner, :polymorphic => true
  has_many :dependent_questions, :class_name => "SurveyQuestion", :as => :owner
  has_many :survey_answers
  serialize :options, Hash

  attr_accessible :owner, :owner_id, :owner_type, :question_body, :question_type, :options, :required, :parent_answer_condition, :client_integration_param_name

  validates :question_type, :question_body, presence: true
  validates :required, inclusion: [true, false]
end

I think that a better solution for this problem is to pass the column/field object to the validation as well. It will allow any validator such as PresenceValidator to perform specific validation techniques that match the field/column type - in this case - allowing a boolean field to have a false value.

Service Oriented Architecture Talk - DevConTLV June 2013

It was an awesome day at the “Ozen” bar in Tel-Aviv last week. I met tons of cool people and watched a bunch of great talks, among them Shai Rubin on financial applications going wild and Alexander Fok on Erlang in the instant messaging world.

It was awesome, and many thanks need to go to Raphael Fogel for pulling this one off successfully yet again.

My irregular talk was about what we as developers need to learn from real world organizations about building redundant, scalable and sustainable applications. And yes, by “real world scalable and sustainable organizations” I mean the drug cartel.

Here is the presentation, voice excluded of course.

Don’t do drugs.

SimpleStateMachine - a Simple Enum Based State Machine for Ruby

Following my previous post on Using Enums in Ruby we have been working on a state machine implementation in Ruby, based on an Enum representation of states.

Why should you use a state machine?

A state machine is mainly used to indicate the specific status a certain object is in. A state is some kind of a description - usually a few words, commonly stored as a string in the storage layer. A state machine simplifies moving an object between the states it is allowed to be in by mechanizing the transition itself - transition, callbacks and persistence included.

You should be using a state machine when your code defines different behaviors for an object based on the status the object is currently in. A simple example for a state machine is a blog post: it can be saved as a “draft”, in which case it won’t be shown in the blog, or be in a “published” state, which makes it available for display in the blog.

You’ll probably want to add some callback functionality for the state change, like sending out a tweet notifying the world of your new post when a post becomes “published”.
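The blog post example can be sketched in a few lines of plain Ruby, with no gem involved. The Post class and the after_publish callback below are hypothetical, and the callback just records a message where a real app might send the tweet.

```ruby
# Hypothetical plain-Ruby post with two states and one guarded transition.
class Post
  attr_reader :state, :notifications

  def initialize
    @state = "draft"
    @notifications = []
  end

  # Behavior that depends on the current state.
  def visible?
    state == "published"
  end

  # Transition draft -> published and fire the callback once.
  def publish
    raise "cannot publish from #{state}" unless state == "draft"
    @state = "published"
    after_publish
    self
  end

  private

  # Callback placeholder; a real app might send a tweet here.
  def after_publish
    @notifications << "New post is live"
  end
end

post = Post.new
puts post.visible?
post.publish
puts post.visible?
```

Even at this size the moving parts are visible: a state field, a guarded transition and a callback. A state machine gem packages exactly those so you don't rewrite them for every model.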

The state machine in the Ruby world: AASM

One of the most widely used state machine gems in the Ruby/Rails world is probably the AASM gem. AASM is a great gem that has one basic flaw - it requires the state column / attribute to be of a String type. Using this gem your database will look something like this (a simple user table for example):

varchar mysql column for states, using MySQL
mysql> select id, username, status from users limit 2;
+----+--------------+---------+
| id | username     | status  |
+----+--------------+---------+
| 18 | zidane       | retired |
| 71 | zessi        | active  |
+----+--------------+---------+
2 rows in set (0.00 sec)

and to simplify your queries you may have scopes in your model to match the available states:

a user ActiveRecord::Base model with AASM
class User < ActiveRecord::Base

  include AASM

  initial_state :active

  # more AASM configuration 

  scope :active, where(status: "active")
  scope :retired, where(status: "retired")

end

Yes, this is simple and clear enough to be used on almost any application that requires a state machine functionality, but when your data size increases beyond a certain point the string type dependency can be a real PITA.

Storing states: String vs. Integer vs. Enum

You have 3 options when you choose the storage type for a state column: String (VARCHAR), Integer and Enum (not supported by all databases).

Strings

Storing strings is by far the easiest implementation you can choose. As shown earlier, when you store states as strings the code is readable, the database output is readable and you can add and remove states pretty easily (as long as the new states do not exceed your string column’s length limit). As for indexing, MySQL string indexes are bigger than integer column indexes and comparing strings is a little bit slower than comparing integers, so the performance of traversing a string index can drop a little bit.

Enums

ENUM column type is available out of the box for MySQL:

Enum column in mysql
CREATE TABLE users (
    ...
    status ENUM('active', 'retired')
);

And you can still query the ENUM column based on the string representation of the ENUM:

querying an ENUM column
mysql> SELECT * FROM `users` WHERE status = 'active';

Adding an ENUM field to a Postgres database is a little bit trickier and requires a custom definition for the ENUM column type (created with CREATE TYPE).

While simple to create, maintaining an ENUM column has its own PITA:

  • When you use an ENUM column, you’re technically moving data from where it belongs (in actual database fields), to somewhere it doesn’t (into the database metadata, specifically a column definition). This is different than putting constraints on the data which can achieve the same result.
  • When you create an ENUM column, you basically say “this is never going to change”. But when it does, you’ll need an ALTER TABLE that can take a serious amount of time on big tables.
  • You’ll have to maintain the same list of available ENUM values for application level usage (e.g. a drop down), since you can’t easily extract a distinct list of available ENUM values from the column itself.
  • Aside from not being reusable on the application layer, ENUM members aren’t reusable even in other tables.
  • Inserting an illegal value into an ENUM does not necessarily cause an error (like a constraint would) - MySQL in non-strict SQL mode will simply store an empty string and insert the faulty data silently, yay!

Integers

The great ancestor of everything, the integer type is the simplest of them all, but it carries the burden of being completely unreadable and depends on the application code to provide a textual representation for vague number based values. It is pretty clear that

integer mysql column for states, using MySQL
mysql> select id, username, status from users limit 1;
+----+--------------+---------+
| id | username     | status  |
+----+--------------+---------+
| 18 | zidane       | 1       |
+----+--------------+---------+
1 row in set (0.00 sec)

doesn’t really make it clear which status zidane is in.
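The application-side mapping that an integer column depends on can be as small as an array. A minimal sketch, assuming a hypothetical STATUSES constant whose order defines the stored integer (so “retired” is 1, as in the table above); the helper names are made up for illustration:

```ruby
# Hypothetical mapping; the position in the array is the stored integer.
STATUSES = %w[active retired].freeze

# Name -> integer for writing to the database.
def status_to_i(name)
  STATUSES.index(name.to_s) || raise(ArgumentError, "unknown status: #{name}")
end

# Integer -> name for display; raises IndexError for an unknown integer.
def status_name(index)
  STATUSES.fetch(index)
end

puts status_to_i(:retired)
puts status_name(1)
```

The catch is that the order becomes part of your data: new states can only be appended, never inserted in the middle or reordered.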

Introducing: SimpleStateMachine

The SimpleStateMachine gem is now available in an alpha version and includes a very simple state machine implementation:

  • State definition
  • Transition setup
  • Callbacks
  • Enum representation of states

SimpleStateMachine does not require ActiveRecord or any kind of ORM; it is a plain Ruby implementation. If you choose to use an ORM, state enum indexes will be stored as integers and transformed back to strings when read.

There is still some work to be done, especially on the query interface, so patches and pull requests are welcome.

Short, short example

SimpleStateMachine example
class User
  include SimpleStateMachine

  state_machine do |sm|
    sm.state_field "status"
    sm.initial_state "happy"
    sm.add_state "sad", :before_enter => :cry
    sm.add_transition :be_sad, :from => :happy, :to => :sad
  end

  def cry
    puts "*sniff sniff*"
  end
end

# Console
> sad_clown = User.new
> sad_clown.status
0
> sad_clown.human_status_name
"happy"
> sad_clown.be_sad
*sniff sniff*
> sad_clown.status
1
> sad_clown.human_status_name
"sad"