the crystal programming language always inlines blocks, which is great for performance but trades off space for speed. using blocks effectively means keeping this in mind.

somewhere along the line, i learned the habit of passing a block to a function as a means of customizing the behavior of the function. if the function that takes the block is large, it's important to remember that the body of the function is inlined where the function is called, which may not be what you are expecting. if you call the function multiple times, you even get multiple copies.
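
to make that concrete, here is a minimal sketch of the pattern. the method name `each_doubled` and its body are made up for illustration and are not from the actual code. because the method uses `yield`, the block is not captured, and (as described above) each call site gets its own inlined copy of the method body together with the block.

```crystal
# hypothetical method for illustration; it yields to a non-captured block
def each_doubled(values : Array(Int32))
  values.each do |v|
    # imagine many more lines here: all of them are
    # duplicated at every call site, along with the block
    yield v * 2
  end
end

sum = 0

# two call sites, so two inlined copies of the method body
each_doubled([1, 2, 3]) { |x| sum += x }
each_doubled([4, 5]) { |x| sum += x }

puts sum # => 30
```

with a short method this duplication is harmless. it only starts to matter when the method body is large or the call is generated many times over.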

i just committed code that fixes an egregious example of this problem. in this case these ~30 lines of code replace the blocks with procs (which aren't inlined) and cut ~24mb (that's megabytes) off the executable (over a third of its size).
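
the general shape of that kind of change looks something like the sketch below. this is not the committed code: `each_doubled` is a made-up name and the body is a stand-in. the idea is to capture the block as a typed proc (`&block : Int32 ->`) and invoke it with `block.call` instead of `yield`, so the method is compiled as an ordinary function and each call site just passes it a proc.

```crystal
# hypothetical method for illustration; the block is captured as a proc
def each_doubled(values : Array(Int32), &block : Int32 ->)
  values.each do |v|
    # the body lives in one place and is not copied to call sites
    block.call(v * 2)
  end
end

sum = 0

# block syntax still works at the call site
each_doubled([1, 2, 3]) { |x| sum += x }

# an explicit proc can be passed with & as well
adder = ->(x : Int32) { sum += x; nil }
each_doubled([4, 5], &adder)

puts sum # => 30
```

the trade-off is that a captured block becomes a closure and is invoked through an indirect call, so it is likely a little slower per call than the inlined version.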

i regularly shoot myself in the foot trying to be clever, so i don't know how prevalent this problem is in practice, but it's definitely something to keep in mind, especially if you see compile times and executable sizes growing!

@toddsundsted interesting, can you feed me some numbers, like compilation time and executable size for each version?

@toddsundsted I mean, this is expected, but I want to see how much impact this had in your case.

@beta in my case the executable went from 62,767,016 bytes to 38,063,896 bytes. i'll rerun the builds to get the timings. the code was in a macro that was used to wire models together, so it was used a lot!

it might be worth pointing out in the documentation that not only are blocks inlined but the functions to which the blocks are passed are, as well. in my case i had a lengthy function that takes a short block, which is possibly the opposite of how most blocks are used, especially in iterators. in any case, as near as i can tell, both the function taking the block and the block itself are inlined at the point in the code where the function is invoked.

@beta fwiw this also improved the situation with the stack that i posted about a while back. stack usage is still high, but it's about 20-30x less than it was before, and enough to avoid problems. to close the loop on that i've attached a backtrace from before and after the change, using Exception::CallStack.print_backtrace.

# before

[0x1011b8f9b] *Exception::CallStack::print_backtrace:Nil +107 in /Users/tsundsted/.cache/crystal/crystal-run-spec.tmp
[0x102d5e0f5] *School::Domain#each_match<Array(School::BasePattern+), Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), &Proc(Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), Nil)>:Nil +149 in /Users/tsundsted/.cache/crystal/crystal-run-spec.tmp
[0x103636fb8] *School::Domain#each_match<Array(School::BasePattern+), Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), &Proc(Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), Nil)>:Nil +9277272 in /Users/tsundsted/.cache/crystal/crystal-run-spec.tmp
[0x102f4c952] *School::Domain#each_match<Array(School::BasePattern+), Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), &Proc(Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), Nil)>:Nil +2025714 in /Users/tsundsted/.cache/crystal/crystal-run-spec.tmp
[0x101e9d348] *School::Domain#each_match:trace<Array(School::BasePattern+), (School::TraceNode | Nil), &Proc(Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), Nil)>:Nil +3101976 in /Users/tsundsted/.cache/crystal/crystal-run-spec.tmp
[0x101ba7c79] *School::Domain#run:School::Domain::Status +633 in /Users/tsundsted/.cache/crystal/crystal-run-spec.tmp

# after

[0x101eec99b] *Exception::CallStack::print_backtrace:Nil +107 in /Users/tsundsted/.cache/crystal/crystal-run-spec.tmp
[0x102979db5] *School::Domain#each_match<Array(School::BasePattern+), Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), &Proc(Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), Nil)>:Nil +149 in /Users/tsundsted/.cache/crystal/crystal-run-spec.tmp
[0x1029c6dd2] *School::Domain#each_match<Array(School::BasePattern+), Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), &Proc(Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), Nil)>:Nil +315570 in /Users/tsundsted/.cache/crystal/crystal-run-spec.tmp
[0x10298a145] *School::Domain#each_match<Array(School::BasePattern+), Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), &Proc(Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), Nil)>:Nil +66597 in /Users/tsundsted/.cache/crystal/crystal-run-spec.tmp
[0x1028f7866] *School::Domain#each_match:trace<Array(School::BasePattern+), (School::TraceNode | Nil), &Proc(Hash(String, Bool | Char | Float32 | Float64 | Int32 | Int64 | School::DomainType | String | Symbol | Nil), Nil)>:Nil +114742 in /Users/tsundsted/.cache/crystal/crystal-run-spec.tmp
[0x1028db679] *School::Domain#run:School::Domain::Status +633 in /Users/tsundsted/.cache/crystal/crystal-run-spec.tmp

in the before case, there are stack frames as large as 9,277,272 bytes, which drop to 315,570 bytes after the change (which still seems large, but i'll tackle that next). i'm assuming the inlining of the block added to the values on the stack, though i admit it's hard to see what could cause that much!
