Ruby Array Set Intersection: A Rails UseCase and Algorithm Analysis in C
Here’s some smelly code I wrote for the Flatiron School prework progresstracker app.
Brittle Methods
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 

add_lesson
is brittle. A new coder on this project has to know to add lessons using this method; the design is therefore NOT transparent. There is also a duplication of data: calling lesson_order.length
gives you the total number of lessons, making a total_lessons
attribute a violation of the DRY principle. It should be a method instead.
topic_complete?
also felt prone to false positives and negatives. If an admin replaces a lesson, is a completed progress still complete? What about if a lesson is removed? Our method would return false.
I needed a way to actually compare what was in completed_lessons
and lesson_order
.
Set Theory in Full Motion
Set theory review: The intersection of two sets is the set of all common elements. Ruby’s Array class has such a method which can be used like an operator: &
.
The & operator is called the set intersection operator.
1


Failed Solution, Learned Lesson
1 2 3 

This failed because the order in which the comparison happens matters.
1 2 3 4 5 6 7 8 9 10 11 

Win
1 2 3 

BONUS: The Algorithm Behind the Set Intersection Operator
I took a a deep dive into Ruby source code to figure this one out. Full disclosure: I’ve never read or written C in my life, but with the help of the MRI Identifier Search, I was able to piece together how the algorith works.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 

The key to understanding this is in the for
loop.
 ary1 and ary2 represent the arrays for which we’re finding the intersection. ary3 is what the function
rb_ary_and
returns.  ary2 is turned into a hash table so that elements from ary1 can be easily matched up.
 v represents
ary1[i]
, in Rubyspeak.  vv is somehow tied to that value.
st_delete
returns 1 ifvv
was successfully deleted fromtable
, which really refers to ary2, and 0 if it wasn’t found. If it gets deleted, then that means ary2 must have had an element in common; push that element into ary3.
O(x+y)
In the worstcase scenario, the algorithm iterates through ary2 to create a table, giving us O(x) for x elements in ary2. Then, given there are y elements in ary1, we iterate through ary1 y times and look for the value in the hash table, giving us O(y). Therefore, in total, the algorithm has a bigo of O(x+y).