Based on profiling, significant time was spent in growSlice() and memMove() as
a result of the per-number slice append(). Consolidating the allocations resulted
in significant speed and memory usage improvements.
Benchmarking command:
time ./ch1-breeding-numbers -generations 13 -silent
go tool pprof ch1-breeding-numbers 13gen-memprofile-gen12.mprof
top25
Before:
71.978u 3.726s 0:31.79 238.0% 0+0k 40+75io 353pf+0w
Showing nodes accounting for 3.70GB, 100% of 3.70GB total
After:
38.303u 2.233s 0:23.41 173.1% 0+0k 40+62io 354pf+0w
Showing nodes accounting for 2.29GB, 100% of 2.29GB total
// right sets. This covers all the reduced form permutations. Any longer
// permutations will be similar to one of the reduced forms.
func permuteExistingNumbers(generation int, universe surrealUniverse) []surrealNumber {
// right sets. This covers all the reduced form permutations. Any longer
// permutations will be similar to one of the reduced forms.
func permuteExistingNumbers(generation int, universe surrealUniverse) []surrealNumber {
- var numbers []surrealNumber
+ // Profiling indicates that most calls to growSlice() and memMove()
+ // originate in this function. Allocating a sufficiently large slice
+ // upfront results in 2x speed improvement and 1/3 reduction in heap usage.
+ numberOfPermutations := ((universe.cardinality()+1) * (universe.cardinality()+1)) - 1
+ numbers := make([]surrealNumber, 0, numberOfPermutations)
// First build permutations with 1 element in each set.
for i := 0; i < universe.cardinality(); i++ {
// First build permutations with 1 element in each set.
for i := 0; i < universe.cardinality(); i++ {