“What” is more Important than “How”
June 10, 2010
The return on an investment in “How” is much less than the return from an investment in “What”.
Learning about “What” concerns itself with things like “what exists?”, “what do people want?” and “what can be done?”. Learning about “How” concerns itself with “how is this done”, “how does that work”, “how do they do that”. Knowledge about “What” leads to answers about “when” and “why”. These kinds of answers lead to leverage which can lead to profit. Learning about “How” is easy but leads only to competition. Anyone can learn “How” to do something, but if they don’t know when or why, then they’ve only positioned themselves to be used like a tool.
Tools are interchangeable so the ones who know “How” are easily swapped around leaving them with no real power over their destiny. Tools can also become obsolete and get replaced by something better. If the only goal is acquiring knowledge about “How” but no research into “What” has been done then good decisions about picking the “How’s” to invest in can’t be made.
To be really useful as an individual or company, it’s knowledge about “What” that is most important.
Safari 5 – Reader feature
June 8, 2010
The newest version of Safari came out today and I really like the new “Reader” feature, which lets the user focus on the article they are trying to read. I’m not sure how sophisticated it is yet, but it’s supposed to detect multi-page articles and merge them into one page, and it also provides some additional controls for sharing. It removes the clutter of ads and other surrounding content to make for a better reading experience. Yesterday I found this article which suggests that the busyness of the internet makes it hard to focus on and understand what we are reading. Maybe this new Safari feature will help with this. I’ve attached a video demonstrating the Reader feature in Safari 5.
I was looking through some old code I had written and found this small and cryptic method for converting a number of seconds into a string of the format “hh:mm:ss”. I will describe how it works. First, the code:
def seconds_to_time(secs)
  hms = [3600, 60].inject([secs]) {|x, y| x += x.pop.divmod(y)}
  return hms.map {|y| "%02d" % y}.join(":")
end
This method leverages the behavior of the Enumerable#inject method and the Numeric#divmod method to creatively convert the seconds into an array containing hours, minutes and seconds. It then uses map to format each value as a two-digit string and join to combine them into a recognizable time string. Here is some sample output:
puts seconds_to_time(63628) # prints 17:40:28
The inject method is passed in a starting value and on each iteration the return value of the previous iteration is passed again. This parameter is commonly called an accumulator. The accumulator shows up in the provided block as the first block variable called x. The initial value of this accumulator is an array containing the original seconds. The second parameter to the block (y) is the value from the original array that is the focus of the current iteration. In this example y starts out as 3600.
In the first iteration the original seconds value is pop’d out of the accumulator array and divmod is called on it, passing in the first value from the array that inject is being called on. This value is 3600 and represents the number of seconds in one hour. The return value of the divmod method is an array with the quotient in the first position and the remainder in the second position [17, 2428]. Since the original seconds were pop’d from x, x will now be empty and += will cause x to fill up with the 2 values that were returned from divmod. At the end of the first iteration of the inject method the accumulator variable will contain [17, 2428].
In iteration 2 y is set to 60 which is the number of seconds in a minute while x is set to the value of the accumulator from the previous iteration [17, 2428]. 2428 will be pop’d from this array and divmod will be called on this value with a parameter of y. The result of this call is the array [40, 28] which represents the number of minutes in 2428 seconds and the remainder. This array is then appended to x causing it to become [17, 40, 28] which is the end of our second iteration.
Since this is our last iteration the return value for the inject call will be the array [17, 40, 28]. This array is then map’d into an array of strings formatted as 2 digit integers. This step is needed to pad single digit positions with a leading zero. The final step is to call the join method to glue these pieces together with a colon.
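To make the walkthrough concrete, here is a small sketch that records the accumulator after each pass. The helper name trace_seconds_to_time and the variable names are my own for illustration; the logic is meant to mirror the one-liner above rather than replace it.

```ruby
# trace_seconds_to_time is a hypothetical helper, not part of the original method.
def trace_seconds_to_time(secs)
  steps = []
  hms = [3600, 60].inject([secs]) do |acc, divisor|
    remaining = acc.pop                 # take out the seconds not yet converted
    acc += remaining.divmod(divisor)    # append [quotient, remainder]
    steps << acc.dup                    # remember the accumulator after this pass
    acc                                 # becomes the accumulator for the next pass
  end
  [steps, hms.map { |n| "%02d" % n }.join(":")]
end

steps, formatted = trace_seconds_to_time(63628)
p steps          # [[17, 2428], [17, 40, 28]]
puts formatted   # 17:40:28
```

Note that the sketch only handles values below 100 hours cleanly, the same as the original, since the hours position is simply padded to two digits.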
I’ve decided to start a new series of blog posts focused on the analysis of existing code that all developers depend on, and in some cases, take for granted. I know this exercise will uncover useful programming practices and “tricks” to help me become a better programmer. My hope is that by sharing this information in this blog, other developers will be helped as well. My goal is to keep the length of these posts small enough that they are easy to digest in a few minutes of reading but still provide useful insights to the reader.
Installment 1 – ActionView::Base.word_wrap
The word_wrap method within ActionView is a very useful method for reformatting text to a specific column width (default is 80 characters). A high level description of how it works is that it breaks up the original text into chunks based on any existing EOL characters found in the text. For each chunk or paragraph, additional EOL characters are inserted at appropriate locations so as to limit the sentence width to the desired number of columns without cutting any words in half. The resulting chunks are then glued back together with additional EOL characters.
A simple example using a line_width of 10 characters is as follows:
word_wrap('Level Five Solutions', :line_width => 10)
# => Level Five\nSolutions
Let’s look at how this is implemented in the TextHelper module (text_helper.rb) bundled with ActionView:
def word_wrap(text, *args)
  options = args.extract_options!
  unless args.blank?
    options[:line_width] = args[0] || 80
  end
  options.reverse_merge!(:line_width => 80)
  text.split("\n").collect do |line|
    line.length > options[:line_width] ?
      line.gsub(
        /(.{1,#{options[:line_width]}})(\s+|$)/,
        "\\1\n").strip :
      line
  end * "\n"
end
I had to do some creative “wrapping” of the code to make it fit our blog layout. How ironic!
The relevant parts of this method begin at the call to split, where it splits the original text on any existing “\n” characters and then operates on each “line” within the body of the block passed in to the collect method. A simple ternary operator separates the lines that are longer than the line_width from those that are not. The lines that are longer are “embedded” with EOL characters through a call to gsub with a very creative regular expression:
/(.{1,#{options[:line_width]}})(\s+|$)/
The first pair of parentheses forms a capturing group. The period matches any character, and the curly brace expression that follows says to match between 1 and line_width of them. The curly brace section is greedy in that it tries to match up to the line_width first and then starts falling back from there. It decides to fall back based on the trailing group, which requires that the captured text be followed by whitespace or the end of the line. That trailing whitespace is part of the match but not part of the first capture, so it is dropped by the replacement. After the regular expression matcher extracts a particular subset of the text, it is replaced by the gsub method with the following expression:
"\\1\n"
This expression builds a new string containing the substring captured by the regular expression (\\1) and a trailing EOL character (\n). The “g” in gsub means global so that the substitution is applied across the entire string. This means that the regular expression is repeated as many times as it takes to fully modify the original text to include EOL characters at all the “appropriate” locations based on the line_width provided.
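The gsub step can be tried outside of Rails with plain Ruby. The sketch below pulls just that call into a stand-alone helper; wrap_line is a name I made up for illustration and is not part of ActionView.

```ruby
# wrap_line is a hypothetical helper that only mirrors the gsub call discussed above.
def wrap_line(line, width)
  # Capture up to `width` characters that are followed by whitespace or end of line,
  # then put the capture back with a newline in place of that whitespace.
  line.gsub(/(.{1,#{width}})(\s+|$)/, "\\1\n").strip
end

puts wrap_line("Level Five Solutions", 10)
# Level Five
# Solutions

puts wrap_line("one two three four", 8)
# one two
# three
# four
```

In the second call the matcher first grabs "one two " (8 characters), finds that the next character is not whitespace, and backs off by one character so that the space after "two" can satisfy the trailing group.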
The final tail end of this method “multiplies” the result of the collect statement by a string containing the EOL character (\n). The result of the collect method is an array containing each “line” from the original text that was broken by the call to split. The multiplication operator (*) on the Array class has special logic when working with strings. It concatenates each element of the array with the string provided as the second parameter and additionally concatenates the entire list into one long string. It does not add a final trailing occurrence of the EOL character. Here’s an example to clarify this point:
['1','2','3'] * 'a' # => "1a2a3"
Notice that the ‘a’ character only shows up between the elements. It does not show up at the end.
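A quick way to convince yourself of this behavior is to compare it with join, which gives the same result when the right-hand side is a string. The variable names below are only for illustration.

```ruby
parts = ['1', '2', '3']

joined_with_star = parts * 'a'        # Array#* with a String argument
joined_with_join = parts.join('a')    # the more familiar spelling

puts joined_with_star                        # 1a2a3
puts joined_with_star == joined_with_join    # true

# With an Integer argument, * repeats the array instead of joining it.
p parts * 2                                  # ["1", "2", "3", "1", "2", "3"]
```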
It looks like I’ve gotten to the end of this method, and therefore, the end of this blog post. Some of the lessons I’m taking away from this post include additional regular expression knowledge and a better understanding of how the multiplication operator of the array class behaves when operating on strings.
I have not yet decided what my next post will cover (or when I’ll get it done) but I hope to continue the series by diving into other useful code looking for interesting techniques and cool “tricks”. If anyone reading this series has an idea for some code that would make a good topic, just post it to the comments and I’ll try to cover it in a future post.
Breaking up a list into random groups of size n
April 28, 2010
class Array
  # Useful for breaking down one large array into
  # smaller randomly organized arrays of size 'number'
  # How it's used: list.random_groups_of(4) =>
  # returns [[...],[...]...]
  # When the last left over group is "small" (at most
  # half of 'number') and no_small_groups was set to
  # true then add these items to the other groups.
  def random_groups_of(number, no_small_groups = false)
    randomized = sort_by{ rand }
    groups = []
    if (number && number > 0)
      randomized.each_slice(number) do |slice|
        groups << slice
      end
      if (groups.size > 1 &&
          groups[-1].size <= (number/2.0).round &&
          no_small_groups)
        groups.pop.each_with_index do |e, i|
          groups[i % groups.size] << e # round robin
        end
      end
    end
    groups
  end
end
Here is an example of how to use the random_groups_of method:
names = ['one',
'two',
'three',
'four',
'five',
'six',
'seven',
'eight',
'nine',
'ten',
'eleven',
'twelve',
'thirteen',
'fourteen',
'fifteen',
'sixteen',
'seventeen',
'eighteen',
'nineteen',
'twenty',
'twentyone',
'twentytwo']
names.random_groups_of(4, true).
  each_with_index do |g, i|
    puts "#{i+1}. #{g.join(", ")}"
  end
The output from this example:
1. thirteen, two, eleven, eighteen, seventeen
2. twenty, three, twentyone, eight, one
3. nine, fifteen, six, ten
4. seven, four, twelve, fourteen
5. sixteen, nineteen, twentytwo, five
Here is the output when the no_small_groups argument is left off or set to false:
1. fifteen, twenty, nineteen, eight
2. two, one, ten, eighteen
3. fourteen, sixteen, three, four
4. thirteen, seven, twentytwo, nine
5. twentyone, twelve, six, seventeen
6. eleven, five
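Because the output above changes on every run, it can help to look at the slicing and round robin steps on a fixed list. The following is only a sketch: groups_of is a hypothetical, non-random variant I wrote for illustration, and it assumes that a trailing group counts as "small" when it holds at most half of the requested size, which is what the sample output above suggests.

```ruby
# groups_of is a hypothetical, deterministic variant used only for illustration.
# It expects number to be a positive integer.
def groups_of(list, number, no_small_groups = false)
  groups = list.each_slice(number).to_a
  # Assumption: "small" means at most half of the requested group size.
  small = groups.size > 1 && groups.last.size <= (number / 2.0).round
  if no_small_groups && small
    groups.pop.each_with_index do |item, i|
      groups[i % groups.size] << item   # hand the leftovers out round robin
    end
  end
  groups
end

p groups_of((1..10).to_a, 4)         # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
p groups_of((1..10).to_a, 4, true)   # [[1, 2, 3, 4, 9], [5, 6, 7, 8, 10]]
```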
Safari Browser Tip
April 9, 2010
The Safari browser has a shortcut bar where you can dock websites that you visit frequently, and I figured out that you can activate these links using the keyboard. If you press command+1 (Apple key plus the number 1) you will activate the shortcut in the first position. Other shortcuts can be activated using a different number. In the past I’ve activated this feature by accident and have been annoyed that it happened. But now that I’ve had time to digest this feature, I’ve figured out a good way to use it. One of the websites I visit most often is my todo list in Basecamp. If I put a shortcut to this page in position 1 I can easily get to my todo list with a quick keystroke. Here’s a short video that does a better job of explaining this tip.
Still living with the original iPhone
March 31, 2010
I was in a store a week ago and someone in the store commented about how cool my iPhone was. I too think it’s still very cool! The original iPhone is just so darn useful and durable that I find it hard to justify replacing it with a newer version. The touch screen form factor has proven very durable and the battery continues to deliver adequate power for my needs. I’ve never used a case and all the original functions still work. There are very cool features in the newer models that I’m “missing out on” but to say that this phone is no longer relevant because of these missing features would be a gross overstatement. When people ask me about my phone I’m proud to say that I’m still using the original iPhone, which is now almost 3 years old. If and when I do replace it I plan to jailbreak it and set it up as my Google Voice phone for use around the house.
Using Rails ActiveRecord to incrementally update a database when a long running update statement simply won’t work.
March 30, 2010
If you’ve ever tried to use SQL to perform various operations on database tables with millions of records you’ll know first hand how frustrating it can be waiting hours and even days for a single update statement to return. If you should lose network connectivity or if the server should crash in the middle of one of these long statements, the database nicely rolls back the transaction that it’s been working on for the past 10 hours. Also there is no way to track the progress of the operation in order to predict how long it will take to execute. Using Rails ActiveRecord and a small amount of Ruby code (in the form of a rake task), these same operations can be performed incrementally, with the added ability to stop, continue and monitor progress. The database updates may take longer to run but that is a fair tradeoff given the above benefits. The Ruby code will quietly chug away updating records little by little until they are all done.
Here is some sample code demonstrating this technique:
...
desc "populate res_count in zip9"
task 'zip9toRes' => :environment do
  conn = ActiveRecord::Base.connection
  # start after a specific db id so an interrupted run can be resumed
  start_id = 701407
  zip7s = Zip7.find(:all,
                    :select => 'id, zip',
                    :order => 'id',
                    :conditions => ['id > ?', start_id])
  zip7s.each do |z|
    # one small update per zip7 instead of one huge statement
    query = <<SQL
update zip9 z set res_count =
(select count(*) from residential
where zip9 = z.zip)
where zip7 = '#{z.zip}'
SQL
    conn.update query
    show_progress()
    if @@stop
      puts "stopped at #{z.id}"
      break
    end
  end
end
...
The above task is looping through 1.1 million zip7 records and for each one it’s telling the zip9 table to populate a res_count column for all zip9 rows in that zip7. There are roughly 60 million zip9 records and for each one we’d like to know how many homes there are. The residential table contains 124 million address records that are counted to populate the res_count column.
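The stop-and-resume idea behind this task can be sketched without Rails or a database. In the sketch below, process_in_steps, its arguments and the in-memory id list are all hypothetical names of mine; the block stands in for the small per-record update, and the lambda stands in for the stop flag.

```ruby
# Plain-Ruby sketch of a resumable, interruptible loop (all names are hypothetical).
def process_in_steps(ids, start_id, should_stop)
  last_done = start_id
  ids.sort.each do |id|
    next unless id > start_id    # skip work finished in an earlier run
    break if should_stop.call    # checked between small units of work
    yield id                     # one small, independently completed update
    last_done = id
  end
  last_done                      # pass this back in as start_id to resume
end

done = []
stop_after_three = lambda { done.size >= 3 }
resume_from = process_in_steps([1, 2, 3, 4, 5], 0, stop_after_three) { |id| done << id }
puts resume_from                 # 3

process_in_steps([1, 2, 3, 4, 5], resume_from, lambda { false }) { |id| done << id }
p done                           # [1, 2, 3, 4, 5]
```

The design choice is the same as in the rake task: each unit of work completes on its own, so stopping in the middle loses nothing, and the last finished id is all that is needed to pick up again.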
The show_progress method is a neat way to give some indication that the process is still running:
...
def show_progress()
  wheel = ["|", "/", "-", "\\"]
  moveleft = "\033[D"
  print wheel[@@progress_counter % 4], moveleft
  @@progress_counter += 1
  if @@progress_counter % 100 == 0
    print @@progress_counter, ".."
  end
  $stdout.flush()
end
...
The @@stop variable is initially set to false, and a couple of traps are set up to switch it to true, which causes the long running task to stop gracefully.
...
@@stop = false
trap("INT") {
  stop()
}
trap("TERM") {
  stop()
}
def stop()
  @@stop = true
end
...
Here is a brief video demonstrating the progress indicator: