CYFA - Creating Your First Assembler - Single Data Transfers 10-16-2017, 10:27 AM
#1
Ok, first thing I want to do before I start this off is make a slight correction from last time. In the previous part I implied that the rotate field on the immediate section was a number of bits to rotate, but I was wrong. In fact, the 8-bit immediate is first zero-extended to 32 bits and then rotated right by two times that field. Secondly, in the last part I said I would start working on this part after I had gotten 5 replies; that just didn't happen. I decided to write this anyway because I'm fucking bored to death and I actually want this series to get finished. Fucking reply to these threads, god dammit! Alright, with that taken care of, let's start on part 4: the single data transfer.
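Quick sketch of that rotate correction, in Python rather than anything from this series. The function name is made up for illustration, and the rotate direction (right) is the standard ARM behaviour for data-processing immediates:

```python
def decode_rotated_immediate(imm8: int, rotate: int) -> int:
    """Expand an 8-bit immediate plus 4-bit rotate field into a 32-bit value.

    Hypothetical helper: zero-extend imm8 to 32 bits, then rotate right
    by twice the rotate field.
    """
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must fit in 8 bits")
    if not 0 <= rotate <= 0xF:
        raise ValueError("rotate must fit in 4 bits")

    amount = 2 * rotate  # the field counts in steps of two bits
    value = imm8         # zero-extended: the upper 24 bits are all 0
    if amount == 0:
        return value
    # Rotate right inside a 32-bit word.
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


print(hex(decode_rotated_immediate(0xFF, 4)))  # rotate right by 8 bits
```

So a rotate field of 4 moves the byte by 8 bit positions, not 4.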
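Side note, this is my 100th thread (in nearly 5 years ![Tongue](https://sinister.ly/images/smilies/set/tongue.png)). Woot, throw a party, I've already got an 8 drink head start. I know the number is low, but I don't write threads unless it's good info.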
First off, we gotta know what these instructions do. The single data transfer type of instruction has one and only one purpose: to move data between registers and memory. We don't care what happens to the data before or after that. Here's a graphic that explains the layout of this one:
![[Image: vK4LLFZ.png]](https://i.imgur.com/vK4LLFZ.png)
Ok, I'm really not going to go into too much detail on what all of these bits do; the graphic explains it well enough. For our assembler, the following bits will be static:
Code:
I = 0
P = 1
B = 0
W = 0
Ok, so that leaves us with bits U and L, which are really all we need. There's a lot of other stuff you can do with these instructions, but we won't.
NOTE: our assembler has to check that R15 is not Rd.
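Here's a rough sketch of how the static bits, U, L, and the R15 check could come together, written in Python. The function name is hypothetical, the bit positions are the standard ARM single data transfer layout (which the graphic should match), and the condition field is hard-coded to "always" (0xE) as an assumption since conditions haven't been covered yet:

```python
COND_ALWAYS = 0xE  # assumption: every instruction executes unconditionally


def encode_single_transfer(load: bool, up: bool, rn: int, rd: int, offset12: int) -> int:
    """Build the 32-bit word for LDR/STR with I=0, P=1, B=0, W=0."""
    if rd == 15:
        raise ValueError("R15 is not allowed as Rd")
    if not (0 <= rn <= 15 and 0 <= rd <= 15):
        raise ValueError("register numbers must be 0..15")
    if not 0 <= offset12 <= 0xFFF:
        raise ValueError("offset must fit in 12 unsigned bits")

    i_bit, p_bit, b_bit, w_bit = 0, 1, 0, 0  # the static configuration
    return (
        (COND_ALWAYS << 28)
        | (0b01 << 26)       # identifies the single data transfer class
        | (i_bit << 25)
        | (p_bit << 24)
        | (int(up) << 23)    # U: 1 adds the offset, 0 subtracts it
        | (b_bit << 22)
        | (w_bit << 21)
        | (int(load) << 20)  # L: 1 = LDR, 0 = STR
        | (rn << 16)         # base register
        | (rd << 12)         # source/destination register
        | offset12
    )


# Load r0 from the address in r1 with a zero offset.
print(hex(encode_single_transfer(load=True, up=True, rn=1, rd=0, offset12=0)))
```

Only `load`, `up`, the two registers, and the offset vary between instructions; everything else is constant.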
Now, the U bit: this one is important to touch on. In most programming languages, integers are signed by default, meaning that you can have the following:
Code:
int i = -1;
This is exactly backwards from assembly. If you interpreted it in the literal sense:
Code:
mov r0, -1
you would actually get 0xFFFFFFFF (4294967295).
So, negative numbers aren't possible then?
No, they are. In a 32-bit register, 4294967295 and -1 are actually the same bit pattern. This works by using two's complement (we won't get into it, you can google it though). Because of that, positive and negative numbers share the same space (there's no separate sign stored next to the value). So, to get a negative number, you just interpret the bits as a signed number, and the CPU doesn't do that for you. To the CPU, these aren't even numbers. You have to be the one who interprets them.
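To make the "same bits, two readings" point concrete, here's a small Python sketch. Both helper names are invented for this example:

```python
def as_unsigned32(value: int) -> int:
    """Keep only the low 32 bits, i.e. read the pattern as unsigned."""
    return value & 0xFFFFFFFF


def as_signed32(bits: int) -> int:
    """Read a 32-bit pattern as a two's complement signed number."""
    bits &= 0xFFFFFFFF
    # If the top bit is set, the signed reading is the pattern minus 2**32.
    return bits - 0x100000000 if bits & 0x80000000 else bits


pattern = as_unsigned32(-1)
print(hex(pattern), pattern, as_signed32(pattern))
```

The register only ever holds `pattern`; which of the two readings applies is up to whoever looks at it.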
Ok, how does this apply?
Well, our 12-bit immediate field is an unsigned field (like all fields), so what if we want to address something that's 4 bytes below us (say we want to read the instruction we're executing)? In assembly we'd write this as -4, but it gets encoded as positive 4 with the U bit cleared (U = 0 means "down", U = 1 means "up"). That tells the CPU to subtract the offset from the base register instead of adding it. (On a real ARM core, PC reads 8 bytes ahead of the executing instruction, so the exact offset you need depends on how you model PC.)
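A minimal Python sketch of how an assembler might turn the signed offset you write into the U bit plus the unsigned 12-bit field. The function name is hypothetical; the U meaning is the standard ARM one (1 = add, 0 = subtract):

```python
from typing import Tuple


def split_offset(offset: int) -> Tuple[int, int]:
    """Return (u_bit, magnitude) for a signed offset like -4 or 0x10."""
    magnitude = abs(offset)
    if magnitude > 0xFFF:
        raise ValueError("offset does not fit in the 12-bit field")
    u_bit = 1 if offset >= 0 else 0  # 1 = add to base, 0 = subtract from base
    return u_bit, magnitude


print(split_offset(-4))  # U cleared, field holds 4
```

Note that the field never stores a two's complement value; the sign lives entirely in U.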
Next, we have the L bit. This one is pretty simple: it differentiates between a load (LDR) and a store (STR) instruction. These will be the only 2 instructions we support (no MRS or any of that shit, feel free to read about them though), so really this will be one of the easiest fields to populate. We simply have to set 2 bits and an offset (which will likely be 0), and know if it's a setter or a getter. Pretty nice, huh?
Ok, cool. Since this one is really short, I want to work through a couple of examples as well as explain another bit, namely the I bit.
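The I bit tells us what type of offset we're dealing with. These can get really complicated, so we're not going to bother too much with it, but note that with it set to 0, the offset is the 12-bit immediate, applied relative to the base register. That makes addresses relative, so the following 2 lines do not reference the same spot in memory:
Code:
ldr r0, [pc, -0x4]
ldr r0, [pc, -0x4]
Each of those lines would reference itself, but not the other. Another neat example would be a self-perpetuating program:
Code:
ldr r0, [pc, -0x4]
str [pc, 0x0], r0
These two lines would fill memory with copies of themselves until they ran out of memory; PC would eventually hit 0, and the CPU would fault (exception vectors).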
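For the assembler side, here's one possible way to pull the pieces out of lines written like the examples above, sketched in Python. Everything here (the regexes, `parse_transfer`, `reg_number`) is invented for illustration, and it assumes exactly the two operand shapes shown in this post: `ldr rd, [rn, offset]` and `str [rn, offset], rd`.

```python
import re

REG = r"(r\d{1,2}|pc)"
NUM = r"(-?0x[0-9a-fA-F]+|-?\d+)"
MEM = rf"\[\s*{REG}\s*,\s*{NUM}\s*\]"
LDR_RE = re.compile(rf"^ldr\s+{REG}\s*,\s*{MEM}$", re.IGNORECASE)
STR_RE = re.compile(rf"^str\s+{MEM}\s*,\s*{REG}$", re.IGNORECASE)


def reg_number(name: str) -> int:
    """Map 'r0'..'r15' or 'pc' to a register number."""
    name = name.lower()
    if name == "pc":
        return 15
    number = int(name[1:])
    if number > 15:
        raise ValueError(f"no such register: {name}")
    return number


def parse_transfer(line: str):
    """Return (load, rd, rn, signed_offset) for one LDR/STR line."""
    text = line.strip()
    match = LDR_RE.match(text)
    if match:
        rd, rn, offset = match.groups()
        return True, reg_number(rd), reg_number(rn), int(offset, 0)
    match = STR_RE.match(text)
    if match:
        rn, offset, rd = match.groups()
        return False, reg_number(rd), reg_number(rn), int(offset, 0)
    raise ValueError(f"not a single data transfer: {line!r}")


print(parse_transfer("ldr r0, [pc, -0x4]"))
print(parse_transfer("str [pc, 0x0], r0"))
```

The first tuple element is the L bit; the signed offset still has to be split into U and a 12-bit magnitude before encoding.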
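Since we have the I bit cleared (I = 0), we always have to access RAM relative to a register. We can't do this:
Code:
ldr r0, 0x5005
but we can do this, if the address isn't within 12 bits of another register:
Code:
ldr r0, 0x5005
ldr r0, [r0, 0x0]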
Alright, that's probably all for this one. I tried to keep it REALLY basic. Next time we're going to talk about branches, and I'll likely get into the condition fields at that point.
FUCKING REPLY DAMMIT!