In my 3n + 1 post rickhg12hs suggested I use the erlang ets module to memoize my solution.
To time the code before and after changes I found this little gem from David King
-module(timer).
-export([timer/3]).
timer(Mod,Fun,Args) ->
Start=erlang:now(),
Result = apply(Mod,Fun,Args),
Stop=erlang:now(),
io:format("~p~n", [time_diff(Start,Stop)]),
Result.
time_diff({A1,A2,A3}, {B1,B2,B3}) ->
(B1 - A1) * 1000000 + (B2 - A2) + (B3 - A3) / 1000000.0 .
Using this I found that my old code for 1 -> 1,000,000 took 13.1 seconds..
I upgraded the code to use an ets ordered_set like this:
-module(ets3np1).
-export([tnpo_range/2]).tnpo_range(Min, Max) ->
Tib = ets:new(x, [ordered_set,private]),
Result = lists:max([tnpo(Tib, X) || X <- lists:seq(Min, Max) ]),
ets:delete(Tib),
Result.
tnpo(_, 1) -> 1;
tnpo(Tib, Num) ->
tnpo( ets:lookup(Tib, Num), Tib, Num ).
tnpo([], Tib, Num) ->
case( Num rem 2) of
1 -> NewNum = (Num * 3) + 1;
0 -> NewNum = Num div 2
end,
Val = tnpo(Tib, NewNum) + 1,
ets:insert(Tib, { Num, Val }),
Val;
tnpo([{_,Val}], _, _) -> Val.
which ran at the improved time off 8.8 seconds.
Initially I had forgotten to delete the ets tables so my lappy started grinding after a few iterations as it thrashed the swap. Ah the joys of learning a feature via the API reference…
One nagging question I had was the speed of using the lists:seq and the list comprehension. So I swapped to an explicit loop via parameter matching, which ran slightly faster. Luckily I realised the code smell: the original could be mapped across many processors with little effort, but the latter was hard coded looping.
So here’s that bad code:
tnpo_range_b(Min, Max) ->
Tib = ets:new(x, [ordered_set, private]),
Result = tnpo_range_b(Tib, Min, Max, 0),
ets:delete(Tib),
Result.
tnpo_range_b(Tib, Num, Max, Best ) when Num =< Max ->
Result = tnpo(Tib, Num),
tnpo_range_b(Tib, Num + 1, Max, lists:max([Result,Best]));
tnpo_range_b(_, _, _, Best ) -> Best.
After that I then wondered what the difference between set (hash-table) and ordered_set (b-tree), so I altered my code again.
So the final results are (averaged over three runs):
Input 1 -> 1,000,000
Original: 13.1 seconds
Ordered_Set: 8.8 seconds
Bad-Code: 8.7 seconds
Set: 8.9 seconds
Input 1 -> 2,000,000
Original: 27.8 seconds
Ordered_Set: 18.6 seconds
Bad-Code: 18.2 seconds
Set: 18.8 seconds
Of funny coincidence, the day after doing the coding, I read the ets section in Programming Erlang, which turns out to cover the topic nicely.