> You will also need to be aware of minor differences between the Darwin ABI and other platform ABIs. A notable example is that the x18 register is reserved by the Darwin ABI and is explicitly zeroed on context switches in some cases. This register is also reserved on Android, but not on GNU/Linux or Alpine.
x18 is "the platform register", reserved for the OS. The ISA manual says not to touch it unless you know what you're doing. Also, I don't know for sure, but I could believe that Android and non-Googly Linux use different ABIs (though probably not, because pretty much everyone uses the same ABI on AArch64 from what I've seen). But surely Alpine is Linux and has the same ABI as other Linuxes.
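To make the x18 point concrete, here is a minimal C sketch of how a code generator might decide whether x18 is off limits. The function name `x18_is_reserved` is made up for illustration. The Apple and Android cases come from the quoted article; the Windows case is my own addition, based on the Windows ARM64 ABI also reserving x18, so treat it as an assumption.

```c
/* Hypothetical helper: returns 1 if the code is being built for a 64-bit
 * ARM target whose platform ABI reserves x18, and 0 otherwise (including
 * every non-AArch64 target, where the question does not apply). */
int x18_is_reserved(void) {
#if defined(__aarch64__) || defined(_M_ARM64)
  #if defined(__APPLE__) || defined(__ANDROID__)
    /* Reserved per the quoted article. */
    return 1;
  #elif defined(_WIN32)
    /* Assumption added here: Windows on ARM also reserves x18. */
    return 1;
  #else
    /* e.g. GNU/Linux or Alpine: not reserved, per the quoted article. */
    return 0;
  #endif
#else
    return 0;
#endif
}
```

A register allocator could consult a check like this once and simply leave x18 out of its scratch pool on the platforms where it returns 1.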
You know, it always rubs me the wrong way when I'm reading an ISA manual and it tells me how I am supposed to use general-purpose registers. Why do ISA designers even believe they're in a place to design the user-level ABIs? Like, sure, you've hardwired BL and RET to use x30, that's fine. But every other register? If I want to pass return values in x21 and x23, that's none of your business.
> It just requires being aware of a few differences between the Mach-O and ELF ABIs, as well as knowing what Apple-specific syntax extensions to avoid.
I have so many `#ifdef __APPLE__` hacks in my assembler/compiler. And they were much harder to fix when I had no Apple Silicon available, only GitHub Actions, some models, and the gas sources.
But basically it's similar to ELF vs COFF. Windows also uses several modern alignments and a shadow stack to make life easier. But ARM has so many more scratch registers, which is much more fun.
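One example of the kind of `__APPLE__` special case involved: Mach-O prefixes C symbol names with an underscore, while ELF does not. A rough sketch of how a code generator might handle that is below. The names `ASM_SYM_PREFIX` and `asm_symbol_name` are hypothetical, and the sketch assumes only 64-bit ARM targets matter (other targets have their own prefix rules that it ignores).

```c
#include <stdio.h>

/* Mach-O (Darwin) prepends '_' to C symbol names; ELF does not.
 * Assumption: every non-Apple target we care about uses no prefix. */
#if defined(__APPLE__)
#define ASM_SYM_PREFIX "_"
#else
#define ASM_SYM_PREFIX ""
#endif

/* Hypothetical helper: writes the assembler-level name for the C symbol
 * c_name into buf (at most n bytes, NUL-terminated) and returns the
 * length the full name needs, as snprintf does. */
int asm_symbol_name(char *buf, size_t n, const char *c_name) {
    return snprintf(buf, n, "%s%s", ASM_SYM_PREFIX, c_name);
}
```

With this, emitting a label for a routine called `fast_sin` gives `_fast_sin` on macOS and `fast_sin` on Linux, and the rest of the emitter does not need to know which object format it is targeting.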
Right. I am saying there is a difference between portable and non-portable assembly code. If you interacted with the machine via the call 05h interface, it was portable. If you accessed the computer’s video memory buffer directly, it wasn’t.
Good portable assembly would stub the system stuff off anyway, and once that was done for the CPU class in focus, it was very possible to have a thin HAL and write portable code. A great many successful products of the era were written in pure assembly this way.
In any case, you could also get high performance multiplatform video/io assembly libraries on the market, soon enough, back in the day .. it begat a lot of Delphi units too, I seem to recall ..
> The good news is that it is very easy to write assembly which targets Apple’s computers as well as the other 64-bit ARM devices running operating systems other than Darwin.
If you can understand what someone means when they talk about a “small elephant”, then you can understand what they mean when they talk about “portable assembly”. In this case, the relevant point is that you can write ARM64 assembly routines that do useful work (e.g. optimized matrix multiplication, or something like that) in such a way that they’ll work correctly on a number of different ARM64 platforms.
There are portable (or not) software applications, portable algorithm code, and non-portable algorithm code.
Assembly based software for the Motorola 68000 /and/ the Amiga will not run on a 68000 Mac. A "polynomial based packed fastSin()" subroutine written using the XMM x64 registers will work on most x86 CPUs. The same written for the ZMM registers will not work on a number of x86 CPUs.
Surely 40 years ago we could safely assume the discussion was about specific implementations of software applications; today we clearly see problems as being optimizable for whole classes of platforms (e.g. the vectors in ARM work differently from the vectors in x64 CPUs).
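The XMM vs ZMM point is usually handled with runtime dispatch: ship both routines and pick one after asking the CPU what it supports. Below is a minimal sketch, assuming GCC or Clang on x86 (it relies on their `__builtin_cpu_supports` builtin). The enum and the function name `pick_fast_sin_path` are made up for illustration, and on any other compiler or architecture the sketch just reports the scalar fallback.

```c
/* Which implementation of a hypothetical fastSin() routine to call. */
enum simd_path { SIMD_SCALAR = 0, SIMD_XMM = 1, SIMD_ZMM = 2 };

/* Returns the widest vector path the current CPU can run.
 * Assumption: GCC/Clang x86 builtins; everything else falls back
 * to the scalar path. */
enum simd_path pick_fast_sin_path(void) {
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();                    /* populate the CPU feature data */
    if (__builtin_cpu_supports("avx512f"))   /* ZMM registers usable */
        return SIMD_ZMM;
    if (__builtin_cpu_supports("sse2"))      /* XMM registers usable */
        return SIMD_XMM;
    return SIMD_SCALAR;
#else
    return SIMD_SCALAR;
#endif
}
```

The XMM version is the portable one in the sense used above, since SSE2 is part of the x86-64 baseline. The ZMM version is the one that needs this kind of guard.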
And completely ignoring PE and Windows on ARM.
Even the PC clones didn't have anything like portable assembly if you ventured outside the 10h and 21h interrupts.
Never ‘touching the hardware’ was attainable for a great many assembly programs.
You could do a lot with interrupts 10h and 21h on DOS.
> The good news is that it is very easy to write assembly which targets Apple’s computers as well as the other 64-bit ARM devices running operating systems other than Darwin.
Sometimes you do seem to make negative comments just for the sake of it.