Tutorial: Phy's ARM Fun Facts [Episode 1: Stack Magic]
So, I constantly find myself posting random fun facts about ARM to the Discord, where pretty much nobody sees them. Well, now they have a home. Every time I figure out a fun fact that's worth sharing, I'll post it in one of these threads, starting with tonight's batch:

1. PUSH and POP don't actually exist
Any seasoned assembly programmer is very familiar with the stack and with the PUSH and POP instructions. An ARM programmer isn't, though, because on ARM those instructions don't actually exist. You may already know that it's possible to push more than one register to the stack in the same instruction, like this:
PUSH    { R0, R1 }
but did you know why you can? Don't worry, neither did I when I started out. You can find many of my programs that start and end something like this:
func:  PUSH    { R4 }
       PUSH    { R5 }
       PUSH    { R6 }
       PUSH    { R7 }

       ; code that uses these 4 goes here

       POP     { R7 }
       POP     { R6 }
       POP     { R5 }
       POP     { R4 }
       BX      LR      ; return
But in reality, it only needed to look like this, since we can push and pop multiple registers at the same time (in fact, we can do it with all of them):
func:  PUSH    { R4, R5, R6, R7, LR }

       ; code goes here

       POP     { R4, R5, R6, R7, LR }
       BX      LR
and in fact, there are 2 more tricks we can do as well.
1. Since ARM always stores the listed registers in order of register number (lowest register at the lowest address), we can push RANGES of registers
2. Since we can push and pop any register like this, we can pop the saved link register (LR / R14) value directly into the program counter (PC / R15), which returns from the function without a separate BX LR
So we end up with something that looks SOOOOO much cleaner:
func:  PUSH    { R4-R7, LR }
       ; code goes here
       POP     { R4-R7, PC }
Now, that looks a lot cleaner than the first one, but how does it work? Well, it turns out that PUSH and POP don't actually exist: they're pseudo-mnemonics. They actually translate to an ARM store multiple (STM) or load multiple (LDM) in a special format. I won't explain the entirety of these instructions, so here's a graphic from the ARM manual:
[Image: JwbtSSI.png]
[Image: C4GxjVg.png]
So, a PUSH instruction is actually just an STM (store multiple) instruction with the following settings:
  1. The base register is R13 (SP / Stack Pointer)
  2. The direction is decrement (descending), applied before each store
  3. Write back is enabled
  4. The stack pointer always points to the last used (full) cell
and this means that the following two instructions are IDENTICAL
PUSH    { R4-R7, LR }
STMDB   SP!,    { R4-R7, LR }
And POP follows the same rules, except that it is a load (LDM) and the direction is increment, applied after each load (because POP is the opposite of PUSH), so these are identical as well
POP     { R4-R7, PC }
LDMIA   SP!,    { R4-R7, PC }
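If it helps to see the bookkeeping spelled out, here's a rough C model of what those two instructions do. This is only a sketch under my own assumptions: the Cpu struct, the 64-word memory, and the push / pop / reg_count names are made up for illustration, and SP is treated as a word index instead of a byte address (a real SP moves by 4 bytes per register).

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, not real hardware: the stack is an array of words and
   r[SP] is an index into it (a real SP moves by 4 bytes per register). */
enum { SP = 13, LR = 14, PC = 15, MEM_WORDS = 64 };

typedef struct {
    uint32_t r[16];            /* R0-R15 */
    uint32_t mem[MEM_WORDS];   /* stack memory, one word per cell */
} Cpu;

/* Number of registers named in a register list (bit n set = Rn). */
unsigned reg_count(uint16_t list) {
    unsigned n = 0;
    for (; list != 0; list >>= 1)
        n += list & 1u;
    return n;
}

/* STMDB SP!, {list}: what PUSH {list} assembles to. */
void push(Cpu *c, uint16_t list) {
    unsigned n = reg_count(list);
    assert(c->r[SP] >= n && c->r[SP] <= MEM_WORDS);
    uint32_t addr = c->r[SP] - n;         /* decrement before the transfer */
    c->r[SP] = addr;                      /* write-back, the "!" */
    for (int reg = 0; reg < 16; reg++)    /* lowest register, lowest address */
        if (list & (1u << reg))
            c->mem[addr++] = c->r[reg];
}

/* LDMIA SP!, {list}: what POP {list} assembles to. */
void pop(Cpu *c, uint16_t list) {
    uint32_t addr = c->r[SP];
    assert(addr + reg_count(list) <= MEM_WORDS);
    for (int reg = 0; reg < 16; reg++)    /* same order as the store */
        if (list & (1u << reg))
            c->r[reg] = c->mem[addr++];
    c->r[SP] = addr;                      /* increment after, written back */
}
```

In this model PUSH { R4-R7, LR } is push(&cpu, 0x40F0) and POP { R4-R7, PC } is pop(&cpu, 0x80F0). Because the value that was saved from LR is the fifth word either way, it lands in PC on the way out, which is the whole return trick.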
Ok, that's my rambles on those operations. Just a couple of final words about how it can be applied:
STMDA, STMIB, STMIA, LDMIB, LDMDB, LDMDA also exist, allowing you to have stacks in both directions, and stacks that always point to the next empty cell instead of the last used one
Because those instructions exist, it is insanely easy to reverse a stack
And finally, because you can specify whatever register you want in place of SP, you can have as many stacks as you want. This is really useful if your program implements, say, 5 stacks but you don't want to keep changing the value of SP, or if your program also depends on the regular stack.
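To keep the four variants straight, here's a small C sketch of the address arithmetic as I understand it. The AddrMode, Transfer and block_transfer names are hypothetical; the function only computes where the lowest-numbered register goes and what the base register becomes with write-back, for any base register you like.

```c
#include <assert.h>
#include <stdint.h>

/* The four LDM/STM addressing modes. */
typedef enum { IA, IB, DA, DB } AddrMode;

typedef struct {
    uint32_t first;     /* byte address of the lowest-numbered register */
    uint32_t new_base;  /* base register value after write-back ("!")   */
} Transfer;

/* Work out where a block transfer of nregs registers lands.
   The lowest-numbered register always ends up at the lowest address;
   the mode only changes where the block sits relative to the base. */
Transfer block_transfer(AddrMode mode, uint32_t base, unsigned nregs) {
    assert(nregs >= 1 && nregs <= 16);
    uint32_t span = 4u * nregs;           /* 4 bytes per register */
    Transfer t;
    switch (mode) {
    case IA: t.first = base;            t.new_base = base + span; break;
    case IB: t.first = base + 4;        t.new_base = base + span; break;
    case DA: t.first = base - span + 4; t.new_base = base - span; break;
    default: /* DB */
             t.first = base - span;     t.new_base = base - span; break;
    }
    return t;
}
```

For example, block_transfer(DB, 0x1000, 5) describes PUSH { R4-R7, LR } with SP = 0x1000: the block starts at 0x0FEC and SP ends up there too. Nothing in the arithmetic cares which register supplied the base, which is why any register can act as a stack pointer.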

2. Pseudo-ops for stacks exist
So I just briefed you on the fact that PUSH and POP are fake instructions that translate to STMDB and LDMIA respectively, but there are actually more pseudo-mnemonics for building stacks. The base for all of these is always LDM or STM, followed by one of 4 different two-character postfixes. Here's an ARM graphic describing them:
[Image: je5qV1H.png]
Now, these are really useful if you don't want to remember which postfix goes with which stack direction, because the assembler does that for you. You just need to know if your stack is descending (the default for the system) or ascending, and use FD or FA respectively (the F stands for full).
Interestingly, there are also the ED and EA postfixes (E for empty), which are the ones that cause the stack pointer to always point to an empty cell rather than a used one. I find it a little neat.
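For reference, here's the translation the assembler does for you, written as a tiny C lookup. The stack_alias function is just a hypothetical helper for illustration; the pairs themselves are the standard ones.

```c
#include <assert.h>
#include <string.h>

/* Map a stack-style mnemonic (FD/FA/ED/EA) to the addressing-mode
   mnemonic it assembles to. Returns NULL if the name is not one of them. */
const char *stack_alias(const char *mnemonic) {
    static const char *const table[][2] = {
        {"STMFD", "STMDB"}, {"LDMFD", "LDMIA"},  /* full descending  */
        {"STMFA", "STMIB"}, {"LDMFA", "LDMDA"},  /* full ascending   */
        {"STMED", "STMDA"}, {"LDMED", "LDMIB"},  /* empty descending */
        {"STMEA", "STMIA"}, {"LDMEA", "LDMDB"},  /* empty ascending  */
    };
    assert(mnemonic != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(mnemonic, table[i][0]) == 0)
            return table[i][1];
    return NULL;
}
```

Notice that the load always uses the opposite mode of the matching store (DB pairs with IA, IB pairs with DA), which is exactly the part that's annoying to remember by hand.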

3. There are 260 MILLION instructions you can use if you write in machine code
I won't go too deep into this (maybe it can be the topic of another tutorial if you actually voice that you want to learn about it), but there is an extremely awesome part of the ARM instruction set: the ability to have instructions that the CPU alone can't process. Here's a graphic that describes the ARM instruction set and the categories that the decoder uses:
[Image: Tzl7Yud.png]
Now all of those make sense except for 2:
1. Software interrupt
2. Undefined
The first one may be new to you, since x86 uses some weird as hell interrupt mechanism. The short answer is that software interrupts are effectively no-ops that, when executed, change the execution mode to SVC (supervisor / service) and then cause a jump to address 0x00000008. The OS places a branch (jump) instruction there to some code in the kernel that handles the interrupt, and at the end it sets itself back to user mode and branches (jumps) back to your program.
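Since the instruction itself does nothing, the kernel has to read it back to find out what was asked for. Here's a minimal C sketch of that decode step, assuming the classic ARM-state encoding where bits 27-24 are all ones and the low 24 bits are a free-form number. The is_swi and swi_number names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Exception vector addresses on classic ARM cores. */
enum { VECTOR_UNDEFINED = 0x04, VECTOR_SWI = 0x08 };

/* A software interrupt has bits 27..24 all set; the top 4 bits are
   the usual condition field. */
bool is_swi(uint32_t instr) {
    return (instr & 0x0F000000u) == 0x0F000000u;
}

/* The CPU ignores the low 24 bits, so the handler is free to treat
   them as a service number. */
uint32_t swi_number(uint32_t instr) {
    assert(is_swi(instr));
    return instr & 0x00FFFFFFu;
}
```

So the word 0xEF123456 (an always-executed software interrupt) would hand the kernel the number 0x123456. A handler running in ARM state can find the instruction word at LR - 4.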
The second is actually really similar, except it has a little bit more to it. Undefined instructions in x86 just cause complete and total failure of the program, and you want to avoid them at all costs, but that's not the case for ARM. These instructions are actually some of the most useful instructions ever made. Here's the process the CPU follows when it hits these:
First, it stops the pipeline
Next, it offers the instruction to the first coprocessor. If the coprocessor accepts it, the CPU waits for it to complete, then unlocks the pipeline and continues on
If the coprocessor rejects it, the CPU does the same for the second coprocessor
And so on until it runs out of coprocessors to try. At that point, it
1. Changes the operating mode to UND (undefined execution)
2. Branches (jumps) to address 0x00000004
At that address, there is normally an instruction like this:
LDR PC, [PC, #4]
And what this does is trigger an instruction fetch abort, which is total failure of your program. However, you can make it do something entirely different. You can replace that instruction with something like
LDR PC, =software_instruction
and then write a function (named software_instruction) that looks back at the undefined instruction that triggered the exception (in ARM state it sits at LR - 4), masks off the processor-related bits, and leaves you with a number. You can use that number to implement your own instructions in software. This means that you could implement things like cryptography instructions in software, or even emulate having a specific coprocessor installed when it isn't. You can do the second one because the ARM CPU actually talks to certain coprocessors exclusively through these instructions (of course, there are coprocessor registers and such, but again, that's easy to emulate). It just happens that these instructions range from (in machine code)
08 00 00 6x
where x is the 4 bit condition that's applied to all instructions. If you add up this range, you get a total of 260,046,840 undefined instructions. Each and every one of them can be used to talk to a coprocessor, emulate hardware, or implement your own instructions.
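Here's a hedged C sketch of the "mask off the bits and you're left with a number" step. It assumes the undefined pattern shown in the classic ARM instruction set chart (bits 27-25 are 011 and bit 4 is 1, everything else free), which is what older cores like the ARM7TDMI document; newer architectures reuse parts of that space, so check your core's manual. The function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed pattern: cond 011 xxxxxxxxxxxxxxxxxxxx 1 xxxx
   (bits 27..25 = 011, bit 4 = 1, all x bits are free). */
bool is_undefined(uint32_t instr) {
    return (instr & 0x0E000010u) == 0x06000010u;
}

/* Squeeze the 24 free bits (24..5 and 3..0) into one number that a
   software handler could use as its own opcode. */
uint32_t undefined_payload(uint32_t instr) {
    assert(is_undefined(instr));
    uint32_t high = (instr >> 5) & 0xFFFFFu;  /* bits 24..5, 20 bits */
    uint32_t low  = instr & 0xFu;             /* bits 3..0,  4 bits  */
    return (high << 4) | low;
}

/* With 24 free bits there are 2^24 distinct payloads per condition. */
uint32_t undefined_per_condition(void) {
    return 1u << 24;
}
```

Under that assumption you get 16,777,216 encodings for each condition value, so with 16 possible condition fields the total comes out a bit over 268 million, the same ballpark as the figure above. A handler would typically switch on undefined_payload() to pick which software instruction to run.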

Thanks for hanging around for the first episode of Phyrrus9's ARM Fun Facts. I hope you enjoyed these. Feel free to ask me any questions, or let me know if you want a tutorial based on any of these fun facts, or on any part of them. See you next time, whenever that is.

