Tutorial ARM [Part 2: Writing basic programs]
ARM [Part 2: Writing basic programs] #1
Ok, it's been a long while since I started this series, and it's time for another round. This time we're going to go over your first ARM programs, using just basic instructions. This practice will form the foundations of every program you will ever write for ARM. For the most part, you're only going to be using a couple instructions, and we'll discuss how to optimize them in a later part. For now, let's just go over what they are.

So, as I discussed in Part 1 of this series, ARM, like all RISC processors, uses a load-store architecture. This is a fundamental change to how you write assembly code if you have any experience with x86, Z80, or a similar architecture. What that means is that all of the instructions that do work can ONLY do work on register arguments. For example, something like the following would be valid for x86:
Code:
ADD EAX, [ESP]
which would mean
add the value of EAX with the value from a variable held at the top of the stack, and place it in EAX

This is invalid for ARM. Our add instruction is not able to interface with memory. So that brings us to the first fundamental difference.

Loading and Storing
So, since ARM can't interface directly with memory in working instructions, we need a way to get data from memory and put it back. We do that using the LDR and STR instructions. They're pretty simple, and they're among the very few ARM instructions that take 2 operands: a register, and a memory address written in square brackets. Remember that ARM normally takes 3 operands: the destination, the source, and op-n.
The LDR instruction (short for "load register") will retrieve a value from memory and place it in the destination register. To accomplish the same task as we did above, we'd do the following:
Code:
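LDR R1, [R13] @ remember, R13 is the stack pointer. You can also use SP in some assemblers
ADD R0, R0, R1 @ R0 = R0 + R1
Ok, that was pretty simple. And yes, it is one more instruction than the x86 variant, but that doesn't mean it's slower. Keep in mind when writing ARM programs that most register-to-register instructions are designed to finish in a single clock cycle on a simple ARM core (loads, stores, and taken branches cost a few more), whereas x86 instruction timings vary widely from one instruction to the next. To keep the arithmetic simple, the cycle counts in this post treat every ARM instruction as 1 cycle. (The @ starts a comment in GNU as on ARM; some other assemblers use ; instead.)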
STR - the STR instruction is the reverse of LDR, and it's short for "store register". This is the second half to the load-store architecture. Let's look at the following x86 program:
Code:
ADD EAX, [ESP]
MOV [ESP], EAX
What that does is it adds the value of EAX to the variable held at the top of the stack. In x86, this is a pretty short, but potentially confusing (to some) sequence of instructions. It takes around 7 clock cycles to complete.
We can use our LDR and STR instructions now to make the same snippet for ARM:
Code:
LDR R1, [R13]
ADD R0, R0, R1
STR R0, [R13] @ store the sum (R0), matching MOV [ESP], EAX
Counting one cycle per instruction, this snippet takes 3 clock cycles.

Now, let's talk about another fundamental difference with ARM:

Preserving arithmetic values
Since ARM is a 3-operand system, we can actually preserve the value of R0/EAX when we do our addition. To do this for x86, we would need to do a bunch of crazy stack operations:
Code:
PUSH EAX
ADD EAX, [ESP + 4] ; after the PUSH, the old top of the stack sits 4 bytes up
MOV [ESP + 4], EAX
POP EAX
This uses at least 11 clock cycles, and it looks messy as all hell. We need to do this because x86's instructions all act as <op>=, meaning ADD is actually += on the first operand. ARM doesn't work that way: it's destination = source + op-n. With this, we can preserve R0 without having to use memory in the process:
Code:
LDR R1, [R13]
ADD R1, R0, R1 @ the only thing changed is the first operand
STR R1, [R13]
Now, R0 has been preserved, because instead of R0 = R0 + R1, we now have R1 = R0 + R1, effectively R1 += R0. This is extremely useful if we want to use the value of R0 just a couple lines down: we don't have to store it in a different register or push it to the stack in order to save its value. Our snippet still uses exactly 3 instructions. Taking the cycle counts above at face value, our code runs 8 cycles faster than the x86 example, which at 1 GHz is 8 ns of saved time. Those nanoseconds add up inside a hot loop. This is part of why the ARM processor in your phone is able to run very intensive programs very quickly, even at 1-2 GHz, whereas your desktop computer might be a little laggy at 3.5 GHz for medium load applications.
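As a sanity check on converting cycles to time, here's a small Python sketch. It assumes an idealized core that retires the counted cycles back to back at a fixed clock, and it takes the 11-cycle and 3-cycle figures from this post at face value. cycles_to_ns is a name made up for this example.

```python
def cycles_to_ns(cycles, clock_hz):
    """Time in nanoseconds for `cycles` clock ticks at `clock_hz` ticks per second."""
    return cycles * 1e9 / clock_hz

# 11 cycles for the x86 snippet versus 3 for the ARM one, both at 1 GHz
saved = cycles_to_ns(11 - 3, 1e9)
# the same saving repeated across a million passes through a loop
saved_per_million = cycles_to_ns((11 - 3) * 1_000_000, 1e9)

print(saved, "ns saved per pass")
print(saved_per_million / 1e6, "ms saved per million passes")
```

The takeaway is the unit: one cycle at 1 GHz is one nanosecond, so savings like this only matter when the code runs a lot.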

Now, some of you might have been wondering why I was using R13 instead of SP. That's because I wanted to get the thought in your mind that aliases exist for registers. This brings us to another key difference between RISC and x86 (CISC).

Lots of registers
ARM has 15 general purpose registers. This means that you can put whatever you want in them, and your program will still run (for the most part). Our x86 counterpart only has 8, and two of those (ESP and EBP) are normally tied up as the stack and frame pointers. This means that for x86, you will need to put a lot more data into RAM when you don't need it this second, and you waste all of that time having to push and pull it as you work. Each of these registers is 32 bits long. For the sake of consistency I'm basing this series on ARM 32-bit, since it's more common, and x86 rather than AMD64. So, that means with x86, you have a total of 256 bits, or 32 bytes of register storage on the chip. That's 32 bytes of data you are limited to at any given time, without having to wait for memory to get more data. In ARM, since you have 15 general purpose registers, you have 480 bits, or 60 bytes of data you can work with. That's almost twice as much! This means less time waiting for memory, and less memory used for your programs. Remember that the stack is a place in memory. Of course, it's not advised for you to use all 15 of these registers, though you can. At any given time, you should only modify 14 of them, and I'll tell you why.
With x86, only 4 of those registers (EAX, EBX, ECX, EDX) work with every instruction form, including the byte-sized ones. You can put data into ESI, EDI, and EBP if you want to, but several instructions use them implicitly (string operations, stack frames), so you often end up shuffling values in and out of them, which costs CPU time. Here in ARM-land, we don't have that limitation. We can use 15 of our 16 total registers with any instruction we like. We can use the 16th register with at least half of the instructions. However, there is a catch. ARM treats all registers (except R15) as a user register. This means that all of those registers can operate on data, but they might have other purposes. Exactly 3 of the 16 registers have a primary purpose other than storing computations. These are marked by register aliases. Here's what they are:
R13 => SP (the stack pointer)
R14 => LR (the link register, we'll get to this one shortly)
R15 => PC (the program counter, aka the instruction pointer)
So, like I said. You can put data in all of these, with the exception of R15 (if you put a computation in that, bad things will happen). But, you really don't ever want to overwrite the stack pointer, so you'd want to leave that one alone. This direct access to these registers makes it very easy to do some awesome hacks with your code. For example, because the program counter is just another register, you can jump through a table by adding an offset to R15, or return from a subroutine by loading R15 straight off the stack. One thing you can't do is the x86 trick of jumping into the middle of an instruction to get a second program out of the same bytes: ARM instructions are always 4 bytes wide and have to sit on a 4-byte boundary.
I do want to note, that R0 - R12 have no other primary purpose. At any time, you can modify these to whatever you feel like. This makes up 52 bytes of truly free storage on the CPU die.
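The byte counts in this section are just register count times register width. A tiny Python sketch of that arithmetic, with register_file_bytes being a name invented here:

```python
def register_file_bytes(count, bits_per_register=32):
    """Total storage, in bytes, of `count` registers of the given width."""
    return count * bits_per_register // 8

all_general = register_file_bytes(15)   # R0-R14
truly_free = register_file_bytes(13)    # R0-R12, no special role
print(all_general, truly_free)
```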

Now, I mentioned something in the last section about the link register, so let's talk about that for a bit:

The link register (R14)
With x86, when you call a subroutine, the CPU pushes the address of the next instruction onto the stack for you. This takes up time, and it also means that you have to have very careful control over your stack or your program will break (and it leaves room for attackers). However, it does provide you with the means to "return" (RET) out of your subroutine, and you'd store your return value in EAX. ARM is a little bit different. A call doesn't store this value on the stack (saving a little memory and some time), and the ABI actually allows you TWO return registers (R0 and R1). The second key bit is that because ARM has lots of registers, you mostly don't need to push your arguments onto the stack either. The standard calling convention passes the first 4 arguments in R0-R3 (in your own hand-written code, caller and callee can agree on more), saving you even more time and memory.
So, how does ARM know where to return to?
Short answer: it doesn't, the programmer is in charge of that. ARM doesn't actually have a call and return setup, but it uses a series of branches. In their base form, a branch is identical to a JMP in x86. However, there are 2 special forms of this: the branch with exchange (BX) and the branch with link (BL). It's the second one we're interested in at the moment. That's more like x86's CALL: before it jumps, BL copies the address of the next instruction into the link register, so returning is just a branch back to LR. Here's a sample x86 program:
Code:
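start:
 MOV EAX, 5
 PUSH EAX
 ADD EAX, [ESP + 4] ; the old top of the stack is now 4 bytes up
 CALL write
 POP EAX ; rebalance the stack so RET finds the return address
 XOR EAX, EAX
 RET
write:
 MOV [ESP + 8], EAX ; skip our return address and the saved EAX
 RET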
This program does exactly the same thing as our above examples, but it uses a subroutine (like a function) to write the added value back to memory. Let's take a look at how we write this in ARM:
Code:
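start:
 MOV R12, LR @ BL below overwrites LR, so park our own return address in a spare register
 MOV R0, #5
 LDR R1, [R13]
 ADD R1, R0, R1
 BL write @ branch with LINK: LR = address of the next instruction
 EOR R0, R0, R0 @ ARM spells exclusive-or EOR, not XOR
 BX R12 @ this is how we return
write:
 STR R1, [R13] @ notice how there's no crazy stack math
 BX LR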
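If the link register still feels abstract, here's a toy Python model of BL and BX. It's only a sketch: addresses count in whole instructions rather than bytes, None stands in for "return to whoever started us", and the run function and the tuple format are invented for this example. It also shows why a routine that calls another routine has to keep its own return address somewhere before using BL.

```python
def run(program, entry=0, max_steps=100):
    """Execute a toy program and return the list of addresses visited."""
    regs = {"LR": None, "R12": None}   # None means "return to the OS"
    pc, trace = entry, []
    while pc is not None and len(trace) < max_steps:
        trace.append(pc)
        op, arg = program[pc]
        if op == "BL":        # branch with link: remember where to come back to
            regs["LR"] = pc + 1
            pc = arg
        elif op == "BX":      # branch to the address held in a register
            pc = regs[arg]
        elif op == "MOV":     # copy one register to another
            dst, src = arg
            regs[dst] = regs[src]
            pc += 1
        else:                 # everything else is a one-slot no-op here
            pc += 1
    return trace

program = [
    ("MOV", ("R12", "LR")),   # 0: start: keep our own return address
    ("BL", 4),                # 1: call write, LR becomes 2
    ("NOP", None),            # 2: execution resumes here after write
    ("BX", "R12"),            # 3: return to the OS
    ("NOP", None),            # 4: write: body of the subroutine
    ("BX", "LR"),             # 5: return to the caller
]
trace = run(program)
print(trace)
```

Swap the BX R12 at address 3 for BX LR and the trace never leaves the 2, 3 pair, because LR still holds the value BL put there.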

Now, these programs should look pretty much the same. Ok, so we've now covered all of the basics! Let's get to writing our first programs!

We'll start with the classic x86 linux hello world:
Code:
section     .text
global      _start                              ;must be declared for linker (ld)

_start:                                         ;tell linker entry point
   mov     edx,len                             ;message length
   mov     ecx,msg                             ;message to write
   mov     ebx,1                               ;file descriptor (stdout)
   mov     eax,4                               ;system call number (sys_write)
   int     0x80                                ;call kernel
   mov     eax,1                               ;system call number (sys_exit)
   int     0x80                                ;call kernel

section     .data
msg     db  'Hello, world!',0xa                 ;our dear string
len     equ $ - msg                             ;length of our dear string
Note: code copied from here

Oh boy....every time I see that code I cringe a little. Not because it's terribly different from the ARM variant, but just because it looks so yucky. Let's start writing the ARM version and I'll explain the differences as we go.
First of all, the GNU assembler starts out in the .text section by default, so we already get to skip our section .text line. We do still need a .data section for the string, since the OS won't allow us write and execute permission on the same memory (ARM has had its own version of the NX bit for ages, do some research on it).
Also, since ARM has a large register file, we pass everything in as registers.
Code:
.global _start @ we still have to declare this, that's just basic linker stuff
_start:
 MOV R7, #4 @ R7 is our syscall number. This gives us 7 parameters (R0-R6) we can pass to system calls via registers.
 MOV R0, #1 @ file descriptor
 LDR R1, =hello @ buffer (LDR Rn, =label loads an address, MOV can't)
 MOV R2, #12 @ count
 SWI 0 @ call the operating system
 MOV R0, #0 @ exit status
 MOV R7, #1 @ syscall for sys_exit
 SWI 0 @ call the operating system
.data
hello:
 .ascii "Hello world\n"
If you've ever wondered how we know all of these numbers, we don't. This is the published Linux syscall interface. Find it here. We follow the C ABI to interface with it.
Code:
ssize_t sys_write(unsigned int fd, const char * buf, size_t count);

The next thing I want to talk about is how we called the OS. As you'll see, with x86 we call int 0x80, but for ARM we're calling SWI 0 (newer assemblers spell it SVC 0, it's the same instruction). This makes them basically the same, but it wasn't always like that. In fact, using Software Interrupt 0 with the syscall number in R7 is a relatively new EABI feature, and you shouldn't count on it to always be that. Under the older Linux ABI you didn't load R7 at all; the syscall number was encoded in the instruction itself, so sys_write was SWI 0x900004. ARM is great in that it's very easy to determine the interrupt number that was called. For example:
Code:
SWI 8
This would jump into OS code, and notice how we didn't provide the OS with any info, we just called the interrupt? Well, we can make use of the link register to make sense of it all. This means that if you're writing the OS side, you can define your own syscalls and not interfere with the existing ones (since Linux now uses SWI 0 for everything). To do this you would have a handler, and your handler would load a register with the instruction word at R14 - 4 (the SWI that got us here), then mask off the top byte, which holds the condition code and opcode. That would leave you with the 24-bit interrupt number, and you can go about your business from there.
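Here's what that masking step looks like, sketched in Python on raw instruction words. The encodings are the standard ARM ones (condition in the top 4 bits, 1111 in bits 27-24 for SWI, number in the low 24 bits); decode_swi and the constant names are just made up for this example.

```python
SWI_OPCODE_MASK = 0x0F000000   # bits 27-24 are all ones for SWI/SVC
SWI_NUMBER_MASK = 0x00FFFFFF   # the low 24 bits carry the interrupt number

def decode_swi(word):
    """Return the 24-bit SWI number, or None if `word` is not a SWI instruction."""
    if (word & SWI_OPCODE_MASK) != SWI_OPCODE_MASK:
        return None
    return word & SWI_NUMBER_MASK

swi_8 = decode_swi(0xEF000008)     # SWI 8 with the "always" condition
old_abi = decode_swi(0xEF900004)   # SWI 0x900004, the old-style sys_write
not_swi = decode_swi(0xE3A07004)   # MOV R7, #4 is not a SWI at all
print(swi_8, hex(old_abi), not_swi)
```

A real handler does the same thing with one LDR from LR - 4 and one BIC to clear the top byte.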

Ok, everything else in that program was basically the same, and for this stage in the tutorial series it pretty much will always be the same. I want to spend a little bit of time to show you a list of some other basic instructions you'll be using:
Here's a graphic that lists out ALL of the instructions that do data processing
[Image: NOuGJ4X.png]
You may have seen this before in our CYFA tutorial series (if you haven't, go take a look). These are very basic instructions, and I don't think I need to explain them to you, however do note that the mnemonics are different from x86 in some places.

So, you saw the hello world program, now let's write a program (using subroutines) that lets us write "Hello World" to a file. We'll write our very own version of strlen for this, because hard coding the value just isn't right. Let's start with that (strlen).
I'm going to use the same definition of strlen as in manual chapter 3
Code:
size_t strlen(const char *s);

Let's start off with a basic subroutine structure:
Code:
@ strlen - calculate the length of a string
@ R0 => string
@ R0 <= length
strlen:
 BX LR @ note how I used LR here rather than R14. They're the same thing

Now, for this we're really only going to need 1 variable, so we'll shove that in R1. For the sake of making the programmer's life easier we'll also preserve that register, and note it in our comments.
Code:
@ strlen - calculate the length of a string
@ R0 => string
@ R0 <= length
@ Preserves all registers used
strlen:
 PUSH {R1} @ save register (the register list always goes in braces)
 EOR R1, R1, R1 @ zero it (ARM spells exclusive-or EOR, not XOR)
 MOV R0, R1 @ after we calculate the length, move in the return value
 POP {R1} @ restore it when we're done
 BX LR

From this point forward, I'm not going to leave in the previous comments, just so it doesn't look so messy. Ok, so at this point we have our basic structure, all we need is our loop. We're going to loop as long as the current value isn't 0 (the NULL terminator). With ARM this is really easy if we use the S-bit (if you've forgotten what that was, reread the first part of this series).
Code:
@ strlen - calculate the length of a string
@ R0 => string
@ R0 <= length
@ Preserves all registers used
strlen:
       PUSH    {R1}
       EOR     R1, R1, R1
@ our loop goes here
       MOV     R0, R1
       POP     {R1}
       BX      LR

I aligned the columns so you can see better. Let's start our loop
Code:
@ strlen - calculate the length of a string
@ R0 => string
@ R0 <= length
@ Preserves all registers used
strlen:
       PUSH    {R1}
       EOR     R1, R1, R1
.ltop: LDRB    R2, [R0]        @ load one byte from the address in R0
       CMP     R2, #0          @ check if R2 is 0
       ADD     R0, #1          @ increment the string pointer (flags untouched)
       ADDNE   R1, #1          @ if not 0, add 1 to count
       BNE     .ltop           @ if not 0, jump up
       MOV     R0, R1
       POP     {R1}
       BX      LR
Ok, that's a pretty simple ARM subroutine for strlen. One thing to notice here is that we need to push and pop one more register (R2). Now we can do it like it's x86 and just add another PUSH and another POP instruction, but ARM has a trick up its sleeve for cases like this: the push multiple. With ARM, the PUSH instruction doesn't actually exist either, it's just an alias for STMDB using R13 as the base register (with writeback, so R13 gets updated). It just looks better to do it using the PUSH. So, let's make that change
Code:
@ strlen - calculate the length of a string
@ R0 => string
@ R0 <= length
@ Preserves all registers used
strlen:
       PUSH    {R1, R2}        @ push both R1 and R2
       EOR     R1, R1, R1
.ltop: LDRB    R2, [R0]
       CMP     R2, #0
       ADD     R0, #1
       ADDNE   R1, #1
       BNE     .ltop
       MOV     R0, R1
       POP     {R1, R2}        @ pop both R1 and R2
       BX      LR
Notice the changes there. Also notice how the order isn't reversed. It doesn't have to be, since ARM stores the registers in order based on their number, it knows how to pop them out.
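If you want to check the loop's logic without an ARM board handy, here's the same walk modelled in Python. It's a sketch, not a translation tool: memory is a bytes object standing in for RAM, pointer plays the part of R0, and strlen_model is a name invented for this example. The comments say which instruction each line stands for.

```python
def strlen_model(memory, pointer):
    """Count bytes from `pointer` up to, but not including, the first 0 byte."""
    count = 0                      # R1 = 0
    while True:
        byte = memory[pointer]     # LDRB R2, [R0]
        pointer += 1               # ADD R0, #1
        if byte == 0:              # zero byte: ADDNE and BNE are both skipped
            break
        count += 1                 # ADDNE R1, #1, then BNE loops back
    return count                   # MOV R0, R1

memory = b"Hello\x00junk after the terminator"
length = strlen_model(memory, 0)
print(length)
```

Note that the pointer moves even on the terminating byte, exactly like the unconditional ADD in the assembly, but the count does not.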

Perfect, our strlen function is done. Now, let's make our open and close file subroutines. We'll start with close, since it's the easiest. Again, start with our base:
Code:
@ close - closes a file descriptor
@ R0 => fd
@ Preserves all registers used
close:
Now, we need to go back to our handy linux syscall reference and figure out the syscall number and parameters for sys_close
Code:
#6
sys_close(unsigned int fd);

Alright, now we know which registers we'll need. Looks like all we need to save is R7, because the fd is already in R0 and all sys_close hands back is a status code in R0. So, let's push it, pop it, and don't bother with clearing R0.
Code:
@ close - closes a file descriptor
@ R0 => fd
@ Preserves all registers used
close:
       PUSH    {R7}    @ syscall num here
       POP     {R7}    @ restore it
       BX      LR      @ return

Now, the rest of this is pretty simple. We just need to move integer 6 into R7 and call the OS. Since the user supplied us with the fd in R0, we don't need to change it at all.
Code:
@ close - closes a file descriptor
@ R0 => fd
@ Preserves all registers used
close:
       PUSH    {R7}
       MOV     R7, #6  @ syscall for sys_close
       SWI     0       @ call the OS
       POP     {R7}
       BX      LR

Now, we need to work out how to open the file. We'll do this in a very basic way. We'll hard code some of the options so that the file always opens in read-write mode, creating the file if it doesn't exist, truncating it if it does, and opening with permissions 0666 (octal: read and write for everyone). This is the same as opening with fopen with the mode string being "w+b". Let's first figure out our syscall number and argument list, then figure out our hardcoded values.
Code:
#5
int sys_open(const char * filename, int flags, int mode);
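Constants (from fcntl.h) - http://unix.superglobalmegacorp.com/Net2...ntl.h.html
Careful: that header is the BSD one, and the numbers differ between kernels. Since we're making Linux syscalls, we need the Linux values:
Code:
O_RDWR => 0x002
O_CREAT => 0x040
O_TRUNC => 0x200
Ok, so to get the flags, we just logical OR all of those 3. Rather than doing that in code, we'll just do it by hand (since it's hard coded). The mode is octal 0666, which is 0x1B6 in hex.
Code:
flags => 0x242
mode => 0x1B6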
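It's easy to slip on these by hand, so here's a quick Python check of the arithmetic. This sketch assumes the Linux flag values (written in octal, the way the kernel headers write them) and treats the permission bits 666 as octal, which is how chmod-style permissions are always written. The variable names are only for illustration.

```python
# Linux values for the open() flags, in octal
O_RDWR = 0o2
O_CREAT = 0o100
O_TRUNC = 0o1000

flags = O_RDWR | O_CREAT | O_TRUNC
mode = 0o666                 # rw-rw-rw-
decimal_slip = 666           # what you get if 666 is read as a decimal number

print(hex(flags), hex(mode), hex(decimal_slip))
```

Reading 666 as decimal gives 0x29A, which is octal 1232 and a completely different set of permission bits, so the octal prefix matters.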

Alright, let's get started with our barebones subroutine
Code:
@ open - opens a file
@ R0 => path
@ R0 <= fd
open:
       BX      LR
Now, I do want to make a point here, we don't want to run it like this. We need to actually call everything first, since we're relying on sys_open to handle our errors for us.
Now, let's figure out what registers we need. We know we're going to need R7 for the syscall, and then we need 2 registers for our hard coded arguments. So, we need to preserve R1, R2, R7 and note that we destroy R0 (which we already did)
Code:
@ open - opens a file
@ R0 => path
@ R0 <= fd
open:
       PUSH    {R1, R2, R7}    @ save our parameter registers
       MOV     R7, #5          @ our syscall for sys_open
       POP     {R1, R2, R7}    @ restore before returning
       BX      LR
Alright, now all we need to do is load our parameters and call the OS. Neither constant fits in a MOV immediate, so we use the LDR Rn, =value form and let the assembler sort it out.
Code:
@ open - opens a file
@ R0 => path
@ R0 <= fd
open:
       PUSH    {R1, R2, R7}
       MOV     R7, #5
       LDR     R1, =0x242      @ O_RDWR | O_CREAT | O_TRUNC (Linux values)
       LDR     R2, =0x1B6      @ mode 0666 (octal)
       SWI     0               @ call the OS for sys_open
                               @ this overwrites R0 with the fd
       POP     {R1, R2, R7}
       BX      LR
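A note on constants like these: a MOV immediate on ARM is not a free-form 32-bit number. The encoding holds an 8-bit value rotated right by an even amount, and anything that doesn't fit has to be loaded another way (GNU as offers LDR Rn, =value for that). Here's a small Python sketch of the rule. is_arm_immediate is a helper invented for this example, not part of any toolchain.

```python
MASK32 = 0xFFFFFFFF

def is_arm_immediate(value):
    """True if `value` is an 8-bit number rotated right by an even amount."""
    value &= MASK32
    for rot in range(0, 32, 2):
        # rotating left by `rot` undoes a rotate right by `rot`
        undone = ((value << rot) | (value >> (32 - rot))) & MASK32
        if undone < 256:
            return True
    return False

for candidate in (5, 0xFF, 0x3FC, 0xFF000000, 0x1FE, 0x242, 0x1B6, 0x602, 0x29A):
    print(hex(candidate), is_arm_immediate(candidate))
```

Small numbers like the syscall numbers always fit. Bit patterns wider than 8 bits, or 8 bits wide but starting on an odd bit, do not.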

Perfect! At this point, we have everything we need, aside from our main subroutine. Your code should look like this right now
Code:
@ strlen - calculate the length of a string
@ R0 => string
@ R0 <= length
strlen:PUSH    {R1, R2}
       EOR     R1, R1, R1
.ltop: LDRB    R2, [R0]
       CMP     R2, #0
       ADD     R0, #1
       ADDNE   R1, #1
       BNE     .ltop
       MOV     R0, R1
       POP     {R1, R2}
       BX      LR
@ close - closes a file descriptor
@ R0 => fd
close: PUSH    {R7}
       MOV     R7, #6
       SWI     0
       POP     {R7}
       BX      LR
@ open - opens a file
@ R0 => path
@ R0 <= fd
open:  PUSH    {R1, R2, R7}
       MOV     R7, #5
       LDR     R1, =0x242      @ O_RDWR | O_CREAT | O_TRUNC (Linux values)
       LDR     R2, =0x1B6      @ mode 0666 (octal)
       SWI     0
       POP     {R1, R2, R7}
       BX      LR
I like to keep my labels actually on a code line, it makes the code look a little less like spaghetti to me. I also removed all of the "preserves all registers" lines, because the ABI is that we need to preserve them anyways.


Now, let's quickly pseudocode our main subroutine. It will be a pretty simple one now that we have all these nice wrappers for us.
Code:
1. open file
2. get length of string
3. sys_write to file
4. close file
5. clean up and return to OS
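Before we write it in assembly, here are the same five steps in Python, using the os module's thin wrappers around the same system calls. This is only a sketch to show the order of operations: write_hello and demo_path are names made up here, and the demo writes into a fresh temporary directory rather than /tmp/hello.txt so it can't clobber anything.

```python
import os
import tempfile

def write_hello(path, text=b"Hello, World!\n"):
    """Open `path`, write `text` to it, close it, and report the bytes written."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o666)  # 1. open file
    length = len(text)                                              # 2. get length of string
    written = os.write(fd, text[:length])                           # 3. sys_write to file
    os.close(fd)                                                    # 4. close file
    return written                                                  # 5. clean up and return

demo_path = os.path.join(tempfile.mkdtemp(), "hello.txt")
written = write_hello(demo_path)
print(written, "bytes written to", demo_path)
```

Notice that the fd has to survive from step 1 all the way to step 4. That's the detail that takes some care in the assembly version.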

For now, let's pretend we have already defined the symbols string and file. We can start with the easiest part of it, step 5.
Code:
_start:
       @ 5 - clean up and return to OS
       EOR     R0, R0, R0      @ return status = 0
       MOV     R7, #1          @ sys_exit
       SWI     0

Now, let's work on #1. We know that it takes an argument (R0) that is a pointer to our file string, and returns a number (R0) that is the fd it opened. This is as simple as moving the arg in and branching to it (with a LINK!)
Code:
_start:
       @ 1 - open file
       LDR     R0, =file       @ path of our file (LDR Rn, =label loads an address)
       BL      open            @ call the subroutine
       @ 2 - get length of string
       @ 3 - sys_write to file
       @ 4 - close file
       @ 5 - clean up and return to OS
       EOR     R0, R0, R0
       MOV     R7, #1
       SWI     0

Great, but now we have a different issue, when we go to get the length of the string, we'll end up overwriting our fd with that value. No worries, this is where we get to take advantage of ARM's massive register file. From now on, let's store the fd in R1. For the sake of making this fewer steps, we're going to store the length of the string in R2 (this way we don't have to move it). We'll then need to save R1, R2, and R7 (for the sys_write syscall).
Code:
_start:
       PUSH    {R1, R2, R7}    @ save our used registers
       @ 1 - open file
       LDR     R0, =file
       BL      open
       MOV     R1, R0          @ copy the value so we have it later
       @ 2 - get length of string
       @ 3 - sys_write to file
       @ 4 - close file
       @ 5 - clean up and return to OS
       POP     {R1, R2, R7}    @ restore our registers
       EOR     R0, R0, R0
       MOV     R7, #1
       SWI     0

Great, now let's work on step 2. This one is nearly identical to step 1, so I won't comment it. We're just moving string to R0, branching (with link!) to strlen, then moving it to R2.
Code:
LDR     R0, =string     @ LDR Rn, =label loads an address, MOV can't
BL      strlen
MOV     R2, R0

Now, on to step 3. This one is slightly different, because we're calling the syscall directly. Remember our sys_write stuff from hello world?
Code:
#4
ssize_t sys_write(unsigned int fd, const char * buf, size_t count)
So, for this we just need to
R7 <= 4
R0 <= fd
R1 <= string
R2 <= string_len
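Isn't it great that we already did the last step in that list? :P One catch: R1 has to hold the buffer for the syscall, so we park the fd in R3 (a scratch register) and put it back afterwards.
Code:
MOV     R7, #4
MOV     R0, R1          @ fd
MOV     R3, R1          @ park the fd, R1 is about to hold the buffer
LDR     R1, =string     @ buf
SWI     0
MOV     R1, R3          @ fd back in R1 for the close step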

And now for our last step (that we still have to write), we need to close our file. This takes the fd in on R0 and doesn't return anything, so it's simple
Code:
MOV     R0, R1
BL      close

Sweet! Our finished _start subroutine should look like the following:
Code:
_start:
       PUSH    {R1, R2, R7}
       @ 1 - open file
       LDR     R0, =file
       BL      open
       MOV     R1, R0
       @ 2 - get length of string
       LDR     R0, =string
       BL      strlen
       MOV     R2, R0
       @ 3 - sys_write to file
       MOV     R7, #4
       MOV     R0, R1
       MOV     R3, R1          @ park the fd, R1 is about to hold the buffer
       LDR     R1, =string
       SWI     0
       MOV     R1, R3          @ fd back in R1
       @ 4 - close file
       MOV     R0, R1
       BL      close
       @ 5 - clean up and return to OS
       POP     {R1, R2, R7}
       EOR     R0, R0, R0
       MOV     R7, #1
       SWI     0

That means that our code is completely finished! All we need to do now is add the little bits of linker fluff around it. We know already that we need to have
Code:
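.global _start
above our _start subroutine (the linker looks for that exact name as the entry point). Let's go ahead and do the data section
Code:
.data
string: .asciz "Hello, World!\n"  @ .asciz appends the 0 byte that strlen stops on
file:   .asciz "/tmp/hello.txt"   @ sys_open needs a 0-terminated path too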

And we're done! Our finished ARM assembly program should look like the following (I've removed the step comments as well)
Code:
@ strlen - calculate the length of a string
@ R0 => string
@ R0 <= length
strlen:PUSH    {R1, R2}
       EOR     R1, R1, R1
.ltop: LDRB    R2, [R0]
       CMP     R2, #0
       ADD     R0, #1
       ADDNE   R1, #1
       BNE     .ltop
       MOV     R0, R1
       POP     {R1, R2}
       BX      LR
@ close - closes a file descriptor
@ R0 => fd
close: PUSH    {R7}
       MOV     R7, #6
       SWI     0
       POP     {R7}
       BX      LR
@ open - opens a file
@ R0 => path
@ R0 <= fd
open:  PUSH    {R1, R2, R7}
       MOV     R7, #5
       LDR     R1, =0x242
       LDR     R2, =0x1B6
       SWI     0
       POP     {R1, R2, R7}
       BX      LR
.global _start
_start:PUSH    {R1, R2, R7}
       LDR     R0, =file
       BL      open
       MOV     R1, R0
       LDR     R0, =string
       BL      strlen
       MOV     R2, R0
       MOV     R7, #4
       MOV     R0, R1
       MOV     R3, R1
       LDR     R1, =string
       SWI     0
       MOV     R1, R3
       MOV     R0, R1
       BL      close
       POP     {R1, R2, R7}
       EOR     R0, R0, R0
       MOV     R7, #1
       SWI     0
.data
string: .asciz "Hello, World!\n"
file:   .asciz "/tmp/hello.txt"

A hello world printed to a file program in pure ARM assembly using only 53 lines!
Now, I do want to note, this is NOT the most optimized form of this program, there are a couple ways to shave 2-7 lines out of this, but I wanted to keep it as simple as possible for you guys for now. I'll hand out rep or NSP to anyone who does make it more efficient.

10 NSP and +2 rep to the first person who can tell me exactly how long this program will take to run on a 1 GHz CPU! (You can treat the software interrupts like a NOP for this.)

I hope you enjoyed this one, it took me like 4 hours to type all of this up for you. @Ender I know you wanted to read this one, so here you go. In part 3, we'll talk about how to plan out these programs so they aren't as inefficient as this one is, and in part 4 we'll do some hard core optimizations of our code and really show those x86 idiots that ARM is king!


RE: ARM [Part 2: Writing basic programs] #2
Well that was uhhm... long... Nice though, thanks for writing this
I expected it to end at "Hello World", but nope, you dove into file I/O.
I'll get that NSP later today (I hope), and I'll also read part 3. Time to break out a Raspberry Pi, hell, maybe I'll even write an ARM kernel to learn more about this.
I find it interesting how ARM chose a 3-argument format, you generally see only 2.
I'm considering learning a few assemblies for the hell of it, MIPS, 6502, Z80, whatever else. Z80 would be an easy one though, I already know 8086...
(This post was last modified: 03-18-2018, 02:13 AM by Blink.)



RE: ARM [Part 2: Writing basic programs] #3
(03-18-2018, 02:08 AM)Ender Wrote: Well that was uhhm... long... Nice though, thanks for writing this
I expected it to end at "Hello World", but nope, you dove into file I/O.
I'll get that NSP later today (I hope), and I'll also read part 3.  Time to break out a Raspberry Pi, hell, maybe I'll even write an ARM kernel to learn more about this.
I find it interesting how ARM chose a 3-argument format, you generally see only 2.
I'm considering learning a few assemblies for the hell of it, MIPS, 6502, Z80, whatever else.  Z80 would be an easy one though, I already know 8086...

They actually did some crafty stuff so that using 3-arg instructions and 32-bit fixed width wouldn't prevent you from doing the same things that you can with variable width.
[Image: wOZGRtu.png]
The example here being the very wise use of the barrel shifter, which iirc only ARM has. It allows you to effectively do
Code:
R0 <= R2 + (R1 << 25)
all in one instruction, in one step:
Code:
ADD R0, R2, R1, LSL #25
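To see what that one instruction computes, here's a Python model of it. It assumes plain 32-bit registers where anything shifted or carried past bit 31 is simply lost; add_lsl is a name invented for this sketch.

```python
MASK32 = 0xFFFFFFFF

def add_lsl(rn, rm, shift):
    """Model of ADD Rd, Rn, Rm, LSL #shift on 32-bit registers."""
    shifted = (rm << shift) & MASK32   # the barrel shifter works on the second operand first
    return (rn + shifted) & MASK32     # then the ALU adds, wrapping at 32 bits

r0 = add_lsl(3, 1, 25)                 # R2 = 3, R1 = 1
print(hex(r0))
```

The shift happens on the way into the ALU, which is why it doesn't need an instruction of its own.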

[+] 1 user Likes phyrrus9's post