Topic: 2GB executable limit under ARM64
In migrating my scientific code to a Mac with an M2 chip, I ran into ugly problems with dynamic library files not being found, problems that had never occurred on my Intel machine. The problem appeared both when compiling directly with gfortran 12.2 and when compiling as a Simply Fortran project. I finally found the cause, explained in detail in a post by an Apple developer named Tom Duff. I wasn't allowed to post the link, so here is the text:
PASTE ON
Here's what's going on. TL;DR: on arm64 Macs, executables fail to launch if they are larger than 2GB.
macOS, for efficiency reasons, removed the standard libraries from disk: libSystem.B.dylib (which contains the C runtime), libc++, and so on. Instead, they live in a “shared cache” that’s loaded into memory and then linked by dyld when the executable is run. The shared cache is always loaded at a fixed address range. If your executable is large enough that it extends into that range, you’ve overwritten part of the shared region the cache lives in, and the shared cache is no longer available. In that case, the linker refuses to use the shared linker cache, which means you can't link against the C runtime in libSystem, leading to the error you received.
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/mach/shared_region.h gives the address where the shared region begins and its size. Look for SHARED_REGION_BASE_ARM64 and SHARED_REGION_SIZE_ARM64: these give the starting address and size of the shared region containing the shared linker cache. On arm64, the region starts at address 0x180000000, which is 6GB above 0x0. But on arm64 your executable gets loaded starting at 0x100000000 (4GB above 0x0). That is a security measure: if you accidentally store a pointer in a 32-bit variable, then when it gets converted back to 64 bits, the result is less than 4GB above address 0x0, which will crash the app. This leaves just 2GB for an executable to fit between its load address and the start of the shared region.
Your example above uses global variables to push the executable size past 2GB, and that triggers the error. Shrinking the arrays enough that the executable no longer overlaps the shared region means dyld will use the shared cache, and thus will be able to link against the C runtime in libSystem, letting the program run normally.
This problem won't occur on Intel Macs, because on Intel the SHARED_REGION_BASE is at a much higher address (0x7FFF00000000).
My problem is that for my numerical-relativity code, it's not so clear how to get our executable size below 2GB.
So my follow-up question to your original post is the following: is there any way to run an executable larger than 2GB in size on M1 Macs? I.e., is there a workaround to the issue your example uncovered?
Anyway, I hope this helps! I'd be happy to discuss further if you have any questions.
PASTE OFF
I've known for a long time that my code, which uses a lot of large fixed arrays, was abusing RAM, but it always worked on my Intel machines, so I had put off learning to allocate space as needed. With the array dimensions I was using previously, the executable occupied 2.4 GB (per the "size" shell command). After reducing the dimensions of the worst offenders a couple of times, I got down to 1.8 GB, and the code now runs as before (though with somewhat smaller problem-size limits, obviously).
I think it would be much appreciated if Simply Fortran would check for this issue and warn the user. I would expect that the gfortran developers will eventually add a check for this as well.
Eric