Two Nifty Bash Profile Aliases

Ever get tired of typing the same commands over and over? Try adding these aliases to your .bash_profile and then never look back.

1. dbreloadall

alias dbreloadall="rake db:migrate:reset && rake db:migrate:reset RAILS_ENV=test && rake db:seed && rake db:seed RAILS_ENV=test"

Reset (drop, create, then migrate) then seed.

Should’ve called this dbnuke. Or dbregen. Or dbreincarnate. Or dbdeebee (personal fav). Name it whatever you’d like.

2. schemareload

alias schemareload="rm db/schema.rb && rake db:migrate"

Use this when you’re doing a git rebase and there’s a conflict in db/schema.rb: deleting the file and re-running the migrations regenerates a clean schema.


Ruby Array Set Intersection: A Rails Use-Case and Algorithm Analysis in C

Here’s some smelly code I wrote for the Flatiron School pre-work progress-tracker app.

Brittle Methods

class Progress < ActiveRecord::Base
  belongs_to :topic
  serialize :completed_lessons, Array

  def topic_complete?
    topic.total_lessons == completed_lessons.length
  end
end

class Topic < ActiveRecord::Base
  has_many :progresses
  serialize :lesson_order, Array

  def add_lesson(lesson)
    lesson_order << lesson.id
    self.total_lessons += 1  # needs self. to hit the attribute writer; still duplicates lesson_order.length
  end
end

add_lesson is brittle. A new coder on this project has to know to add lessons using this method, so the design is NOT transparent. There is also a duplication of data: lesson_order.length already gives you the total number of lessons, so a separate total_lessons attribute violates the DRY principle. It should be a method instead.
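For example, total_lessons could be derived from the serialized array rather than stored. A minimal sketch of that change, keeping the rest of the class as above:

topic_sketch
class Topic < ActiveRecord::Base
  has_many :progresses
  serialize :lesson_order, Array

  def add_lesson(lesson)
    lesson_order << lesson.id
  end

  # Derived from the single source of truth rather than kept in its own column.
  def total_lessons
    lesson_order.length
  end
end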

topic_complete? also felt prone to false positives and negatives. If an admin replaces a lesson, the counts still match even though the student never did the new lesson. And if a lesson is removed, the method returns false even though the student finished everything that remains.

I needed a way to actually compare what was in completed_lessons and lesson_order.

Set Theory in Full Motion

Set theory review: The intersection of two sets is the set of all common elements. Ruby’s Array class has such a method which can be used like an operator: &.

The & operator is called the set intersection operator.

how-&-works
[ 1, 1, 3, 5 ] & [ 1, 2, 3 ]    #=> [ 1, 3 ]

Failed Solution, Learned Lesson

def topic_complete?
  completed_lessons & topic.lesson_order == topic.lesson_order
end

This failed because the order of the operands matters: Array#& returns the common elements in the order of the receiver (the left-hand array), so the result may not == lesson_order even when every lesson has been completed.

order-matters
lesson_order = [3, 2, 5]
completed_lessons = [5, 2, 3]

lesson_order & completed_lessons
   # => [3, 2, 5]

completed_lessons & lesson_order
   # => [5, 2, 3]

completed_lessons & lesson_order == lesson_order
   # => false

Win

def topic_complete?
  topic.lesson_order & completed_lessons == topic.lesson_order
end

BONUS: The Algorithm Behind the Set Intersection Operator

I took a deep dive into the Ruby source code to figure this one out. Full disclosure: I’ve never read or written C in my life, but with the help of the MRI Identifier Search, I was able to piece together how the algorithm works.

static VALUE
rb_ary_and(VALUE ary1, VALUE ary2)
{
    VALUE hash, ary3, v;
    st_table *table;
    st_data_t vv;
    long i;

    ary2 = to_ary(ary2);
    ary3 = rb_ary_new();
    if (RARRAY_LEN(ary2) == 0) return ary3;
    hash = ary_make_hash(ary2);
    table = rb_hash_tbl_raw(hash);

    for (i=0; i<RARRAY_LEN(ary1); i++) {
        v = RARRAY_AREF(ary1, i);
        vv = (st_data_t)v;
        if (st_delete(table, &vv, 0)) {
            rb_ary_push(ary3, v);
        }
    }
    ary_recycle_hash(hash);

    return ary3;
}

The key to understanding this is in the for loop.

O(x+y)

In the worst case, the algorithm first iterates through ary2 to build a hash table, which is O(x) for x elements in ary2. It then iterates through ary1 once: y iterations, each doing a constant-time lookup-and-delete in the table, which is O(y). In total, the algorithm is O(x + y).
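To make the two passes concrete, here’s a rough Ruby translation of that C function. This is my own sketch, not actual MRI code:

ruby-sketch-of-rb_ary_and
def intersection(ary1, ary2)
  result = []
  return result if ary2.empty?

  # Pass 1: O(x). Build a lookup table from ary2.
  table = {}
  ary2.each { |v| table[v] = true }

  # Pass 2: O(y). Keep each element of ary1 that can be found (and deleted) in the table.
  ary1.each do |v|
    result << v if table.delete(v)
  end

  result
end

intersection([ 1, 1, 3, 5 ], [ 1, 2, 3 ])    #=> [ 1, 3 ], just like the & operator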


Testing Rails Callbacks: A Step in the Right Direction

Anyone who’s worked with tightly coupled models in Rails will likely know the pain of tracing the stack for an error triggered from a callback method. Such painful experiences have led developers to a consensus on general rules like "Use a callback only when the logic refers to state internal to the object", according to Jonathan Wallace. Samuel Mullen covers this topic further in a great blog post, The Problem with Rails Callbacks.

The following post revolves around an Invitation model I wrote with my teammates; we later debated which callback to use.

The Invitation model generates a unique token which it uses to identify users who have clicked on a registration link.

The callback I chose to use first was after_initialize because an invite object must always have a token.

invitation.rb
class Invitation < ActiveRecord::Base
  after_initialize :add_unique_token

  def add_unique_token
    self.token = Tokenizer.generate
  end
end

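By the way, the Tokenizer class isn’t shown in this post. Here’s a hypothetical sketch of what such a helper might look like, using Ruby’s SecureRandom:

tokenizer_sketch
require 'securerandom'

# Hypothetical helper; the real Tokenizer isn't included in this post.
class Tokenizer
  def self.generate
    SecureRandom.hex(16)  # 32 random hex characters
  end
end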

The associated test:

invitation_spec.rb
describe Invitation do
  it "generates a random token on initialize" do
    invite = Invitation.new
    expect(invite.token).not_to be_nil
  end
end

However, another teammate suggested using before_create instead.

His model and test:

invitation.rb
class Invitation < ActiveRecord::Base
  before_create :add_unique_token

  def add_unique_token
    self.token = Tokenizer.generate
  end
end
invitation_spec.rb
describe Invitation do
  it "generates a random token before create" do
    invite = Invitation.new
    expect(invite.token).to be_nil
    invite.save
    expect(invite.token).not_to be_nil
  end
end

Why does this subtle change matter?

First, using after_initialize could produce a false positive. The after_initialize test does exactly what its it statement says, but because it never saves, it can miss, say, a validation error that only surfaces on save. The test would pass, yet you might find that your invitations aren’t working properly. Testing after_initialize tempts us to omit saving from the test entirely, which feels justified given that callbacks should only "refer to state internal to an object", and that is exactly where the false positive sneaks in.

Additionally, if we’re testing a callback that affects the internal state of an object, we should test for a state change. Using after_initialize, we can tell that a new invite has a token, but we never assert the token’s absence and then its presence. The pivotal point again is #save: #save triggers the change, so we can say with certainty not only that we’ve seen a "zero to one" change happen, but that the change has persisted in our database.

UPDATE: Twitter comment from @scottadhoc: “what happens if you load that record from the db into a new object and then assert on that field?”

Good point! If I understand your suggestion correctly, here’s how I might refactor the after_initialize version of the spec:

invitation_spec.rb
describe Invitation do
  it "generates a random token #after_initialize" do
    invite = Invitation.new
    expect(invite.token).not_to be_nil
    invite.save!
    expect(Invitation.find(invite.id).token).not_to be_nil
  end
end

This way, we can make sure that the data persisted. While this works, my feeling is that testing after_initialize implicitly invites you to omit the persistence check. I certainly fell into that trap and wouldn’t be surprised if I fall into it again the next time I test an after_initialize callback.

Lastly, if you use after_initialize with the underscore in a block of code in Octopress, your blog will refuse to generate. Coincidence? I think not.


RSpec Custom Matchers: A Deep Dive

What’s a Matcher?

expect(dan.current_zipcode).to eq(10002)

In the line above, the matcher is the method eq because RSpec will check to see if the value of dan.current_zipcode matches 10002.

Let’s say that this was a line you wrote as part of a test for a location-tracking app. You want your code to be as clear as possible for collaborators, or even for your future self.

Wouldn’t it sound even better if you could write this?

expect(dan).to be_in_zipcode(10002)

This requires be_in_zipcode to be a method in RSpec’s matcher library. I’m going to briefly explain how to create this new matcher, but for alternate explanations, check out the RSpec GitHub wiki and Chapter 7 of Aaron Sumner’s Everyday Rails Testing with RSpec.

My intention is not to write another matcher tutorial (though that might inevitably happen). After reading Sumner’s chapter, I could imagine how matchers were created, but ultimately I still felt a bit uncertain. With a little Pry magic and doc digging, I think I’ve finally put most of the puzzle together, and I’m looking forward to sharing it below! Constructive feedback and deeper insights are always welcome. :)

Match-Maker, Match-Maker, Make Me a Match!

Let’s write some code so that we can use the method be_in_zipcode(10002) from above.

General syntax at bare minimum first:

syntax.rb
RSpec::Matchers.define :new_matcher_name do |expected|
  match do |actual|
    # Lines of code you want this matcher to run
  end

  #optional description and failure message definition blocks
end

# From the describe block:
# expect(actual).to matcher_method(expected)

NOTE: Store your matcher files in spec/support/matchers/ and require them in spec/spec_helper.rb with the following line:

Dir[File.dirname(__FILE__) + "/support/**/*.rb"].each {|f| require f}

Now onto our example:

be_in_zipcode.rb
RSpec::Matchers.define :be_in_zipcode do |zipcode|
  match do |friend|
    friend.in_zipcode?(zipcode)
  end

  # Optional failure messages
  failure_message_for_should do |actual|
    "expected friend to be in zipcode"
  end

  failure_message_for_should_not do |actual|
    "expected friend not to be in zipcode"
  end

  # Optional method description
  description do
    "checks user's current zipcode"
  end
end

Again, the failure messages and description are optional blocks. The match block is obviously mandatory. So now that we have our new matcher, let’s check out what’s going on behind the scenes.
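To see it in use, a spec might read like this. This is just a sketch: the User model, its attributes, and the in_zipcode? method it would need are my assumptions, not part of the matcher itself.

example_spec_sketch
describe "friend locations" do
  it "confirms dan's current zipcode" do
    dan = User.new(name: "dan", current_zipcode: 10002)  # hypothetical model and attributes
    expect(dan).to be_in_zipcode(10002)                  # the matcher calls dan.in_zipcode?(10002)
  end
end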

Under the hood

RSpec::Matchers.define :be_in_zipcode do |zipcode|

The define method, extended from the DSL module, does the following:

define.rb RSpec::Matchers::DSL#define
def define(name, &declarations)
  matcher_template = RSpec::Matchers::DSL::Matcher.new(name, &declarations)
  define_method name do |*expected|    # => expected for us is the var zipcode
    matcher = matcher_template.for_expected(*expected)
    matcher.matcher_execution_context = @matcher_execution_context ||= self
    matcher
  end
end

# name = be_in_zipcode

The second line setting matcher_template initializes an RSpec::Matchers::DSL::Matcher object which stores the method name and the declarations block.

define_method then creates the be_in_zipcode method. On the next line, for_expected(*expected) builds a matcher from the template, setting the zipcode as the expected value along with the failure messages/description you declared. The messages and description are stored in the @messages attribute hash.

The next line I’m not sure about since matcher_execution_context refers to an attribute accessor. I don’t see it in the Matcher module either. Any insight here would be great!

Moving on to the last piece: the match block.

  match do |friend|
    friend.in_zipcode?(zipcode)
  end

The matcher’s match method stores the given block in @match_block without running it, so essentially @match_block holds { |friend| friend.in_zipcode?(zipcode) } for later.

Later, when you run your specs with the new matcher, @match_block gets called from the matcher’s matches? method.

matches?
def matches?(actual)
  @actual = actual
  if @expected_exception
    begin
      instance_exec(actual, &@match_block)
      true
    rescue @expected_exception
      false
    end
  else
    begin
      instance_exec(actual, &@match_block)
    rescue Spec::Expectations::ExpectationNotMetError
      false
    end
  end
end

And that’s it!

So if I were to run the original line, expect(dan).to be_in_zipcode(10002), the following would happen:

  1. expect(dan) returns an RSpec::Expectations::ExpectationTarget object storing the object dan. Specifically: => #<RSpec::Expectations::ExpectationTarget:0x007fbf511e4c78 @target=#<User id: 1, name: dan>>
  2. the to method passes in be_in_zipcode(10002) and calls a method called handle_matcher
  3. … which checks if be_in_zipcode has been defined as a matcher.
  4. Since we’ve defined be_in_zipcode, matches? runs our match block, which we saw above.
  5. handle_matcher then figures out how to respond, given the messages you passed in when you defined the new matcher.

That’s our deep dive into RSpec’s Custom Matchers. Again, feedback always welcome. Hope you enjoyed the complexity as much as I did! :)


ActiveRecord, Equality, and Rogue

Almost done with my model specs! I ran into a couple of interesting problems that I posted on Stack Overflow, one of which I’ll cover briefly below. Oh yeah, and Rogue! I’ll explain that in the last section of this post.

The problem I ran into (from my StackOverflow post)

If you’ve already read the whole post over on Stack Overflow, skip to the last section of this blog post.

I’m testing chats between users in my app, using RSpec and FactoryGirl.

The test that’s not passing:

chat_spec.rb
it "creates a chat if one does not exist" do
  bob = create(:user, username: "bob")
  dan = create(:user, username: "dan")
  new_chat = Chat.create(user_id: dan.id, chatted_user_id: bob.id)
  expect(Chat.where("chatted_user_id = ?", bob.id).first).to equal(new_chat)
end

The failure message says:

Failure/Error: expect(Chat.where("chatted_user_id = ?", bob.id).first).to equal(new_chat)

   expected #<Chat:70120833243920> => #<Chat id: 2, user_id: 2, chatted_user_id: 3>
        got #<Chat:70120833276240> => #<Chat id: 2, user_id: 2, chatted_user_id: 3>

   Compared using equal?, which compares object identity,
   but expected and actual are not the same object. Use
   `expect(actual).to eq(expected)` if you don't care about
   object identity in this example.

Why is my query returning a different object id?

Answer from Simone Carletti on StackOverflow

equal checks object identity. The objects you are testing are two objects (instances) referencing the same record, but they are actually different objects from a Ruby virtual machine point of view.

You should use

expect(Chat.where("chatted_user_id = ?", bob.id).first).to eq(new_chat)

To better understand the problem, look at the following example

2.0.0-p353 :001 > "foo".object_id
 => 70117320944040
2.0.0-p353 :002 > "foo".object_id
 => 70117320962820

Here I’m creating two identical strings. They are identical, but not equal because they are actually two different objects.

2.0.0-p353 :008 > "foo" == "foo"
 => true
2.0.0-p353 :009 > "foo".equal? "foo"
 => false

That’s the same issue affecting your test. equal checks if two objects are actually the same at the object_id level. But what you really want to know is if they are the same record.
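Translating that back to my Chat spec, a quick console sketch shows the same thing with the records from the failing test:

eq-vs-equal
chat_a = Chat.where("chatted_user_id = ?", bob.id).first
chat_b = Chat.where("chatted_user_id = ?", bob.id).first

chat_a == chat_b                        # => true  (ActiveRecord compares class and primary key)
chat_a.equal?(chat_b)                   # => false (two distinct Ruby objects wrapping the same row)
chat_a.object_id == chat_b.object_id    # => false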

So what’s happening under the hood?

The where query returns a collection of objects matching the query, and those objects are Active Record objects that just wrap the row data. If I make another query, Active Record creates another set of objects to wrap each row all over again.

I wanted to confirm this before throwing it out on the web as “fact”, so here’s what an Active Record object is from Martin Fowler himself:

An object that wraps a row in a database table or view, encapsulates the database access, and adds domain logic on that data.

Think of an Active Record object sort of like Rogue (see above). For those of you who don’t remember, Rogue absorbs the memories and abilities of any person she touches.

Similarly (sorta), an Active Record object takes the data (memories) of a certain row and then takes on the behavior (abilities) of the class matching the table name.

Ok, so that was really just an excuse to get Rogue somewhere on my blog. While I’m at it, here’s one of her destroying Ororo (Storm).

So there you have it! Moral of the story: generally remember to use eq to test equivalence in RSpec with ActiveRecord.

Oh, and here’s a bonus resource: equal, eql, eq, and == in RSpec.


How and Why I Began Testing in Ruby on Rails


Testing is one of those things that feels a bit like eating well or going to the gym. Since it’s already ten days into the New Year and I have yet to commit to any resolutions, let this be it: no more will I make excuses for skipping testing.

What does this really mean to me though? Am I eating broccoli just because someone told me it’s good for me? Here are three reasons that have driven me to make the change.

TDD Reason 1: Manual Testing Sucks; TDD Saves Time

My friend told me about the job she left before coming to the Flatiron School; one of her responsibilities was, essentially, manually testing the whole site by hand after every change.

What. A. Nightmare. There should be a -phobia for this.

Don’t get me wrong: I’m sure this could be pretty efficient if the scope of the UX is very small, or if you don’t know how to implement TDD yet. But how can you be sure your site won’t expand? And why not learn to use a gem like Capybara to save yourself time in the future? Used in conjunction with Guard, you could really save some serious time while coding!

TDD Reason 2: Code Confidence

Now that I’ve had a bit more experience writing web apps (still without TDD), I’ve had the pleasure of seeing them grow and change throughout development. On the other hand, every time I implemented a new feature, I had less and less confidence that the rest of my code base would keep working. At some point, the prospect of a new feature started feeling like a laundry list, or the guarantee of a new bug. Testing will certainly help me find bugs as I introduce them, and therefore boost my confidence in my code.

TDD Reason 3: Thinking Before Doing

Writing tests before writing production code forces you to clearly articulate what you want your application to do before you start coding. Ideally, you also want your tests to last as long as possible, which forces you to make important design decisions up front. The prospect of well-thought-out code with a lasting design excites me! I’ve already had several experiences trying to refactor poorly designed code, and it’s quite painful.

My Next Steps

I’ve decided to begin my TDD exploration by refactoring my first app, Little Red Wagon (LRW). My current reads are Aaron Sumner’s book, Everyday Rails Testing with RSpec, and Practical Object-Oriented Design in Ruby by Sandi Metz. I’m about 30% through each book right now and have written a few model specs already for LRW. I’ve decided not to use Factory Girl since it will slow down testing and Rails’ built-in fixtures are satisfactory for what I currently have. In any case, the grand plan is to get some hands-on practice writing tests using the TDD book first and then taking on the much larger challenge of redesigning my code base. Thanks for reading, and more updates to come on TDD!


Big O Notation

Big O notation might look something like this: O(log N)

N = number of input data
O( ) = function notation for Big O (just tells you we're using Big O notation)
log N = the function that describes the efficiency of the procedure or algorithm as N grows bigger and bigger

Programmers use Big O Notation as a way to express how efficient an algorithm or procedure is at performing its designated task (e.g. sorting, searching). The function tells us the maximum number of “actions” it will take for the algorithm to achieve its goal, given N data.

Example 1: O(N)

O(N) is the linear case. Given 10 data, the worst-case scenario is that it will take 10 “actions” to search for a target, or sort the data, or whatever the goal of the algorithm is. Given 100 data, it will take 100 actions. Given 2000 data, it will take 2000 actions. Given N data, it will take N actions.

A real life example would be if you were looking for a book on your friend’s bookshelf.

Lets say you want your program to find any given book. You could tell the program to start checking books one by one from the left. If you wanted Harry Potter 3, it would only take 3 actions (or tries) because it’s the third book in. However, Big O tells us the worst-case scenario. What book would cause the algorithm to take the most possible actions? Harry Potter 7 — it’s the last book so the algorithm would have to run all 7 times. Given 7 books, it took the algorithm 7 actions / tries to find the book in the worst case scenario.

What if we ran this algorithm on 10 books? The worst case would take all 10 tries. How about 1,000,000 books? It would take 1 million tries.

So, TL;DR — O(N) tells us that given N data, it will take an algorithm N actions / tries to accomplish a task.
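Here’s a quick Ruby sketch of the bookshelf search. It’s a toy example of my own, not production code:

linear_search
def linear_search(books, target)
  books.each_with_index do |book, index|
    return index + 1 if book == target  # number of actions it took to find the book
  end
  nil  # the book isn't on the shelf
end

shelf = (1..7).map { |n| "Harry Potter #{n}" }
linear_search(shelf, "Harry Potter 3")  #=> 3 actions
linear_search(shelf, "Harry Potter 7")  #=> 7 actions, the worst case: O(N)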

Example 2: O(1)

What does O(1) tell us? The function inside is simply “1”. This means that no matter what number N is, it will always accomplish the task in 1 try. It represents the most efficient algorithm possible.

Does this exist? Sure it does! But it might not give you exactly what you’re looking for. Here’s an algorithm that can be described with O(1).

def efficient_but_useless
  return true
end

TL;DR algorithms with O(1) always finish in 1 try.

Example 3: O(N²) or any higher exponent

Is this an efficient algorithm? In other words, if I input a large amount of data for N, what does that tell me about the algorithm? If I input N=1, then no matter what, it will take 1 try. But if I put in N=10, then it will take 10² = 100 actions / tries to complete the algorithm (worst case scenario). Imagine if the exponent were higher, or if N were higher. Then the algorithm would be super inefficient!

What kinds of algorithms would have O(N²)? Simply put, algorithms with nested loops usually exhibit this behavior. If you have to run a procedure on each item, and each item has to check itself against every other item in some way, then the worst case is that you run the procedure N² times. In other words, it takes N² actions.
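For instance, a naive duplicate check compares every item against every other item. A sketch:

nested_loops
# Does the list contain any duplicate values? Naive O(N^2) version.
def has_duplicates?(items)
  items.each_with_index do |a, i|      # N passes...
    items.each_with_index do |b, j|    # ...each scanning all N items again
      return true if i != j && a == b
    end
  end
  false
end

has_duplicates?([1, 2, 3, 4])  #=> false, after 4 * 4 = 16 comparisons in the worst case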

Example 4: O(log N)

This time the function is logarithmic.

Imagine that the x-axis represents N, how much input data you have, and the y-axis tells us how many actions or tries it takes the algorithm to finish (worst case scenario, as usual). As N grows larger, the number of tries grows, but slowly.

Let’s try calculating a few examples. Keep in mind that when programmers write log N, it refers to log base 2. Also, since the number of actions has to be a whole number, if you get a decimal, the rule of thumb is to round up. So given 10 data, it will take log(10) = 3.32 which rounds up to 4 actions, max. This isn’t that much more efficient than O(N), but what if N=1,000,000? It would take your algorithm log(1000000) = 19.93 which rounds up to 20 tries, max. For 1000000 data, that’s pretty efficient!

When would we use a logarithmic function to describe the efficiency of an algorithm? One example is the binary search. The binary search algorithm finds a target value within a set of sorted data by repeating the process below until you’re left with one value.

  1. Find the middle value. If there is more than one, take the higher value.
  2. Compare the target value with the middle value. If it’s equal, then we’ve found the target. Otherwise, if the target > middle value, then eliminate all values to the left of the middle (lower than the middle); we know the target is in the half with greater values. If the target is <= middle, eliminate the right.
  3. Repeat steps 1 and 2 on the remaining side. Continue until you are left with one value; this is your target.

So let’s say this is your data, and you’re looking for the value, 4.

data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
target = 4
  1. Find the middle value. There are 10 values, thus two middles. Take the higher: 5.
first_action
middle = 5
[0, 1, 2, 3, 4]; [5, 6, 7, 8, 9]

# left side = [0, 1, 2, 3, 4]
# right side = [5, 6, 7, 8, 9]
  2. Compare the target with the middle. Is it equal to 4? No. Since target <= middle (4 <= 5), eliminate the right side.

  3. Repeat steps 1 and 2 on the left side. We’ll keep doing this until we find the target value, 4.

second_action
middle = 2
[0, 1, 2, 3, 4]

# left side = [0, 1]
# right side = [2, 3, 4]

Check: is target = middle? No. So onto step 2, compare: target > middle (4 > 2). Use the right side.

third_action
middle = 3
[2, 3, 4]

# left side = [2]
# right side = [3, 4]

target = middle? No. So compare again: target > middle (4 > 3). Use the right side.

fourth_action
middle = 4  # note, we always take the higher of two as the middle for an even set.
[3, 4]

# left side = [3]
# right side = [4]

Target = middle? Yes. We’ve found the target and it took 4 tries!

Now let’s make sense of this as a logarithmic function: O(log N). We saw above that the calculations work, but why is it log base 2, and why log? If you think of logs as the inverse of exponents, and exponents as repeated multiplication, then logs are repeated division. Where are we dividing in the above procedure? Every step! We’re cutting each set in half over and over again until we reach the target, hence repeated division. Why is it log base 2? Log base 2 simply means that we keep dividing by 2, which makes sense because we’re cutting the data set in half.

I chose the target number 4 in the binary search method above because it represents the worst case scenario for a set of 10 data. It took 4 tries / actions, and log(10), again with an assumed base 2 since we’re in the world of programming, gives us 3.32 which rounds up to 4. Tada!
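Here’s a Ruby sketch of that exact procedure, counting the tries as it goes. It follows the "take the higher middle" rule from step 1 and assumes the target is actually in the list, just like the walkthrough:

binary_search_sketch
def binary_search(data, target)
  tries = 0
  low, high = 0, data.length - 1

  loop do
    tries += 1
    middle = (low + high + 1) / 2    # take the higher of the two middles
    return tries if data[middle] == target

    if target > data[middle]
      low = middle                   # keep the middle and everything to its right
    else
      high = middle - 1              # eliminate the middle and everything to its right
    end
  end
end

binary_search((0..9).to_a, 4)    #=> 4 tries, matching the walkthrough above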

If you want to see the binary search method or linear methods in action, here’s a rather straight-forward video demonstrating what I showed above.

Conclusion and Resources

I hope that this post helps clarify big O notation for those of you who are brand new to the idea. A few other resources I feel are really well done: this visual, real-world explanation and this much more technical explanation. I kept the code in this post to small sketches because these algorithms are all solved problems already, and with some Google sleuthing you will find repos full of complete implementations in a multitude of languages. One more resource I like is the Big O cheat sheet.

Happy coding!


Self-Referential Associations, AKA Self Joins

Week 2 of learning Rails. After 6 weeks at Flatiron, I feel like a beginner again, but not quite like I’m stranded on a deserted island — more like I’m Link, and although I wield a wooden sword, new items abound and I’m eager to learn how to use them.

Although the Rails Guides have been invaluable, trying new constructs myself is how I internalize a concept. This blog post will be about associations, particularly self joins.

Mini-Project: Self-Joins

I created a basic rails app and deployed it on Heroku here. Try it out to see how the customer data behaves. Imagine what the schema (or “table”) might look like and then read on.

What is a Self Join?

Below is a self join. Foreign keys are usually used to link data between two tables, but you can also use a foreign key to refer to data within the same table, creating a structure that looks something like this:

(source site)

I originally tried the associations below.

customer.rb BROKEN VERSION
class Customer < ActiveRecord::Base
  has_many :referrals, class_name: "Customer", foreign_key: "referring_customer_id", conditions: {:referring_customer_id => :id}
  belongs_to :referring_customer, class_name: "Customer", foreign_key: "referring_customer_id"
end

This didn’t work. After some serious research, stackoverflow, posting my own stackoverflow question (thanks Peter!!!), trial-and-error, and pry sessions, I refactored my code to:

customer.rb WORKING VERSION
class Customer < ActiveRecord::Base
  has_many :referrals, class_name: "Customer", foreign_key: "referring_customer_id"
  belongs_to :referring_customer, class_name: "Customer"
end

Let’s think about this for a second. Why does this work, and what does it do? has_many and belongs_to provide Customer with a set of methods to access its associated data. The options passed often don’t change the database, since Rails tries to keep this logic defined within the application. Why? I’m not exactly sure, but I suppose it’s partially why Rails easily supports a wide range of databases.

Here’s what the arguments do: class_name: "Customer" tells Rails that both associations point back to the Customer model, since it can’t infer that from names like :referrals or :referring_customer. foreign_key: "referring_customer_id" tells has_many which column on the customers table holds the referrer’s id. The belongs_to side doesn’t need that option because Rails infers referring_customer_id from the association name referring_customer.
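With those two associations in place, console usage might look like this. This is a sketch; it assumes the customers table has a name column:

console_sketch
alice = Customer.create(name: "Alice")
bob   = Customer.create(name: "Bob",   referring_customer: alice)
carol = Customer.create(name: "Carol", referring_customer: alice)

alice.referrals          #=> [bob, carol]
bob.referring_customer   #=> alice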

When else can I use self-joins?

Generally you can use self-joins anytime you need to create a hierarchy-like, “nested” structure.

Some other applications could be: an employees table where each employee belongs to a manager (who is also an employee), threaded comments that reference a parent comment, or categories nested under parent categories.

The Master Sword!

Just when you thought THAT was exciting, there are also plenty of GEMS that allow you to create these nested structures! Two from the Ruby Toolbox are Awesome Nested Set and Ancestry.


Ruby ‘Exceptional’ Knowledge

Foreword

Most of us have encountered the methods raise, rescue, catch, throw, and break in Ruby. We can generally understand what’s going on when we read code that uses them, but what exactly is the difference between them all? Here’s a quick guide, plus some fun and useful facts.

If you remember just one thing…

Raise and Rescue

raise and rescue are used exclusively for handling errors. By default, raising an error will exit the program.

exception.rb
def test_rescue
  puts "This is before raise"
  raise "Raised an error"
  puts "This is after raise. It won't ever run."
end
test_rescue
# => This is before raise
# => test.rb:3:in `test_rescue': Raised an error (RuntimeError)
  from test.rb:9:in `<main>'

This happens unless there is a rescue statement which will run in case of an exception.

exception.rb
def test_rescue
  puts "This is before raise"
  raise "Raised an error"
  puts "This is after raise. It won't ever run."
  rescue
  puts "I'm rescued!"
end
test_rescue
# => This is before raise
# => I'm rescued!

The above Ruby code can be rewritten like so:

exception.rb
def test_rescue
  puts "This is before raise"
  raise "Raised an error"
  puts "This is after raise. It won't ever run."
  rescue Exception => e
  puts e
end
test_rescue
# => This is before raise
# => Raised an error

Ruby has many different types of exceptions (see the documentation). raise takes up to three parameters: the exception type, an error message, and an array of callback information.

All three are optional and Ruby knows that if you only pass in a string that it’s the message. Usually you don’t set the last parameter since Kernel#caller automatically creates that array.

Here are a couple of valid raise statements.

Message parameter given
raise "This is an error"
Error and Message parameters
raise StandardError, "Most error subclasses extend StandardError"

Exceptional Ruby

Exception is the root of Ruby’s exception hierarchy. It’s the class from which all exceptions descend. It is king. This has a very interesting consequence.
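You can see the hierarchy for yourself in irb:

exception_hierarchy
RuntimeError.superclass                        #=> StandardError
StandardError.superclass                       #=> Exception
Exception.superclass                           #=> Object
ArgumentError.ancestors.include?(Exception)    #=> true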

rescue Exception rescues from EVERYTHING, including syntax errors raised at runtime (say, by eval), load errors, and even the nonsense in the example below.

loop do
  begin
    eval "dinosaurs ru1ez the pl@netz!!! ROArR{bark}[:ARF].ENV"
  rescue Exception
    puts "What meteor?"
  end
end

Break, Catch, and Throw

break, catch and throw are used in order to terminate execution early when no other work is needed. break leaves the current loop while the catch and throw combination can be used to break out of any number of loops at one time.

break example
array = ["brainfuck", "ruby", "befunge", "python", "perl"]
array.each do |language|
  puts "My favorite computer language is #{language}"
  break
end
#  => "My favorite computer language is brainfuck"
hypothetical throw and catch example
# recipe_hash = { meal_type => { dish => dish_recipe } }
def show_recipe_for(recipe_name)
  recipe = catch(:recipe) {
    recipe_hash.each do |meal_type, dish_hash|
      dish_hash.each do |dish, dish_recipe|
        if recipe_name == dish
          throw :recipe, dish_recipe
        end
      end
    end
  }
end

Notice that the two loops are enclosed in the catch block. Once the throw statement executes, its second argument becomes the return value of the surrounding catch(:recipe) block, so execution exits both loops as soon as the first recipe match is found. From there, the method finishes executing as normal.

Because my example is a bit contrived, I will post a real-life example from another blog by rubyist Avdi Grimm.

google search scraping example
def show_rank_for(target, query)
  rank = catch(:rank) {
    each_google_result_page(query, 6) do |page, page_index|
      each_google_result(page) do |result, result_index|
        if result.text.include?(target)
          throw :rank, (page_index * 10) + result_index
        end
      end
    end
    "<not found>"
  }
  puts "#{target} is ranked #{rank} for search '#{query}'"
end

Since loading pages over and over again can be an expensive process, the coder above uses throw/catch to exit the loops as soon as the first matching result is found.

Throw, Catch and Sinatra

An even more mind-blowing example from the same blog post reveals that Sinatra has a built-in catch around its #last_modified helper. You might use this method to check which version of a certain page the user already has cached on his/her machine. Why would you do this? Simple! To cut out any expensive and unnecessary processing: if the cached copy is out of date, render the page again; otherwise, just let the browser load from cache.

For your convenience, here’s the simplified code Grimm posted to demonstrate.

def last_modified(time)
  response['Last-Modified'] = time
  if request.env['HTTP_IF_MODIFIED_SINCE'] > time
    throw :halt, response
  end
end

When Ruby encounters the throw, it zips back up the call stack looking for a matching symbol, :halt. Where’s the catch block though? It’s clearly not in the same method as the throw. This means that it must be further up the stack. In other words, #last_modified was called within a catch block.

catch (:halt) do
  # code
  last_modified(time) # => the throw in this method sends :halt up to the encapsulating catch
  # code
end

ActiveRecord

I took a bit of a dive today and looked through a Ruby on Rails presentation. Here’s what I gleaned from the slides.

Here are some concrete query comparisons between using SQLite3 versus Rails’ ActiveRecord.

Assume we have a table called “dogs,” each with a name, age, weight, and type. Each attribute would be a column, and each row would represent a different dog.

Table using SQLite3
CREATE TABLE dogs (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT,
  age INTEGER,
  weight INTEGER,
  type TEXT
);

We can also imagine this as a class of Dog objects in Ruby. Here’s how one might create the barebones structure in Ruby.

Dogs as objects of class Dog in Ruby
class Dog
  attr_accessor :id, :name, :age, :weight, :type
end
Table initialized using Rails’ ActiveRecord
class Dog < ActiveRecord::Base
end

See how simple that was? Now we can make queries on the table.

SQLite3 query for first dog in table
SELECT * FROM dogs WHERE id=1;
Rails query for first dog in table
Dog.find(1)

That’s just the beginning! Check this out.

SQLite3 query appending conditions
SELECT * FROM dogs WHERE age=8 AND type='corgi';
Rails query appending same conditions
Dog.find_by_age_and_type 8, "corgi"

It seems like ActiveRecord must use certain keywords like ‘by’ and ‘and’ in order to dynamically use the attribute names that the user inputs.
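For comparison, the same lookup can be written with a plain hash of conditions, which skips the dynamically generated method names entirely. A sketch (find_by with a hash is the style newer Rails versions favor):

hash_conditions
Dog.where(age: 8, type: "corgi")      # all matching dogs
Dog.find_by(age: 8, type: "corgi")    # just the first match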

What is really beautiful, though, is the idea of “convention over configuration” that Rails implements in order to shorten the amount of code you need to write. Slides 42-44 of the presentation give a particularly good example of the idea. No wonder people love Rails!

The slides go on to cover the MVC pattern.


Recursive Arrays

Foreword

I feel a certain level of confidence declaring that every programmer has been in the situation where he/she has 10+ tabs open trying to troubleshoot a problem and finally concludes, “There must be an easier way.” Or, if you’ve been ‘in the zone’ for long enough, you might start wondering what exactly got you into this troubleshooting hell to begin with.

In this post, I will share what I learned from one such experience.

Context and Problem

I was writing an automatic schedule-maker in Ruby. create_groups is a method that takes an array of student names and returns an array of the students divided into groups of a specified size. The method also takes a parameter that lets you specify the number of groups. The student groups should be randomized.

First, I defined the method create_groups, set up an array of grouped students that I can push groups into, and returned the value.

students = [.......] #=> This array contains 41 student names. Check my Github Gist linked below for the array I used in the completed program.

def create_groups(students, group_size, number_of_groups)
  groups = []  #=> store the groups of students in this array
  # Implement magic.
  groups  #=> Return groups
end

create_groups(students, 4, 20)

After considering several strategies, I decided that it would make the most sense to sort the students using what I dubbed the “card dealing method”: the first student goes to group 1, the second to group 2, the third to group 3, and so forth.

def create_groups(students, group_size, number_of_groups)
  groups = []
  normalized_list = normalize(students, group_size, number_of_groups)
  normalized_list.each_with_index do |name, i|
    # This part will sort students using the "card-dealing method"
  end
  groups
end

Since there are 20 groups of 4, I needed 80 students. On an abstract level, the number of students I need in order to sort (the “desired length”) is “# of groups” x “# of students per group”.

I decided to create another method which would normalize my list to this set amount.

The #normalize method will return an array of desired length by replicating the students a number of times, and then slicing out the desired number of students from the replicated array.

Here’s what I came up with at first:

def create_groups(students, group_size, number_of_groups)
  #...folded this code for now...
end

def normalize(list_to_norm, group_size, number_of_groups)
  desired_length = group_size*number_of_groups
  new_list = list_to_norm
  while new_list.length < desired_length
    new_list << list_to_norm
    new_list = new_list.flatten
  end
end

When I ran this given a student array though, I received the following error:

rb24:`flatten': tried to flatten recursive array (ArgumentError)

WAT? After many attempts at troubleshooting and searching the web for an answer, I decided to inspect what exactly I was trying to flatten.

def normalize(list_to_norm, group_size, number_of_groups)
  desired_length = group_size*number_of_groups
  new_list = list_to_norm
  new_list << list_to_norm
  puts new_list
end

The output was VERY telling. The ... line below is just a placeholder for names 4 through 40.

Name1
Name2
Name3
...
Name41
[...]

The clue was in the [...].

I then tried running new_list.object_id and list_to_norm.object_id and they turned out to be the same.

A-ha!

The [...] indicated that new_list is now a recursive array. More on recursive arrays here.

As it turns out, my problem was in the 2nd line of my normalize method.

new_list = list_to_norm

This line of code sets the new_list variable to point at the exact same object as the list_to_norm array. So later, when I called new_list << list_to_norm, I ended up pushing the Array object into itself.

Ruby clearly didn’t like this, and it makes sense when you consider how Array#flatten works: flattening an array that contains itself would never terminate, so Ruby raises the ArgumentError instead. The takeaway: make sure you’re flattening genuinely separate Array objects.
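For the record, here’s a minimal sketch of one possible fix (not necessarily the exact version in my gist): copy the array first so you never push it into itself, then slice out exactly the desired length.

normalize_fix_sketch
def normalize(list_to_norm, group_size, number_of_groups)
  desired_length = group_size * number_of_groups
  new_list = list_to_norm.dup                      # a separate Array object, so no recursion
  new_list.concat(list_to_norm) while new_list.length < desired_length
  new_list.take(desired_length)                    # return exactly the number of names we need
end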

So that’s what I learned in 40 minutes of head-banging against a figurative brick wall. I hope you learned something as well from this post!

Also, as promised, here is a link to my full gist.


(Not the) First Post


Level up!

This is my first post using the Octopress framework. I’m excited to be able to start posting code without taking screenshots of my Sublime Text.

My original code blog on Wordpress.