Tutorial CYFA - Creating Your First Assembler - Instruction Set Design

phyrrus9 · 10-11-2017, 12:21 PM

The introduction to this series seemed to get quite a lot of buzz, so I'm going to write these a little quicker. Hopefully I'll be able to get them all out in a couple of weeks.

So, part 2. This is where we get into the cool details without getting into the complicated shit. This one will feature the instruction set, sort of: we'll be talking about instruction set design, that is, what makes up an instruction set (we won't look at more than a few instructions).

Ok. For most processors, assembly mnemonics correspond directly to an operation code, or opcode for short. This opcode is what tells the CPU what to do, and is usually followed by a couple of bytes of encoded parameters. In a quick example, that might look like this:

opcodes:

Code:
0 - add
1 - sub
2 - mul
3 - div
4 - neg
...
A - mov
B - ldr
C - str
....

and then maybe some register encodings like this:

Code:
0 - r0
1 - r1
2 - r2
...

So, if we have the following program:

Code:
unsigned int *a = (unsigned int *)0x1F;
*a += 5;

it would become the following in assembly:

Code:
MOV R2, 0x5 ; R2 = 5
LDR R1, 0x1F ; R1 = contents of memory at 0x1F
ADD R0, R1, R2 ; R0 = R1 + R2
STR R0, 0x1F ; memory at 0x1F = R0

which our assembler would then translate to the following hex code (binary):

Code:
A2 05 B1 1F 00 12 C0 1F

It's pretty simple in operation: the assembler knows what each token resolves into, and memory addresses or constants always resolve to themselves.
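To make that lookup idea concrete, here's a minimal Python sketch of an assembler for the toy instruction set above. It assumes the layout implied by the example (one hex digit of opcode, one of register, then either an 8-bit constant or two more register digits). The table and function names are made up for illustration, and none of this is real ARM.

```python
# Toy 16-bit encoding from the example above (NOT real ARM).
# Assumed layout: [opcode][register][8-bit constant]
# or, for three-register instructions: [opcode][rd][rn][rm].
OPCODES = {"ADD": 0x0, "SUB": 0x1, "MUL": 0x2, "DIV": 0x3, "NEG": 0x4,
           "MOV": 0xA, "LDR": 0xB, "STR": 0xC}


def resolve(token):
    """Registers resolve to their number; constants resolve to themselves."""
    token = token.strip()
    if token[:1].upper() == "R" and token[1:].isdigit():
        return int(token[1:])
    return int(token, 0)  # base 0 accepts 0x1F as well as 31


def assemble_line(line):
    """Turn one line of toy assembly into a 16-bit instruction word."""
    code = line.split(";", 1)[0].strip()  # drop the comment
    mnemonic, _, rest = code.partition(" ")
    operands = [resolve(t) for t in rest.split(",")]
    word = (OPCODES[mnemonic.upper()] << 12) | (operands[0] << 8)
    if len(operands) == 3:  # e.g. ADD Rd, Rn, Rm
        word |= (operands[1] << 4) | operands[2]
    else:  # e.g. LDR Rd, constant
        word |= operands[1] & 0xFF
    return word


def assemble(source):
    """Assemble every non-blank line and return the raw bytes."""
    out = bytearray()
    for line in source.splitlines():
        if line.split(";", 1)[0].strip():
            out += assemble_line(line).to_bytes(2, "big")
    return bytes(out)
```

For example, `assemble_line("LDR R1, 0x1F")` gives `0xB11F`: opcode B, register 1, then the address byte.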

But how do we know what the programmer will write?
We don't. The programmer has to know what we expect, and that's what makes assembly a language that's not for the faint of heart.

The ARM instruction set:
In our little example there, we used what's called a static, fixed-width 16-bit instruction set. That is, instructions are always encoded in the same way, and all instructions are 16 bits (2 bytes) wide. ARM uses a fixed-width 32-bit instruction set with a variable layout. This means that every instruction is 32 bits (4 bytes) wide and most instructions get encoded in the same way, but some are different (like instructions that need more space for parameters, so they have less space for registers).

Here's a graphic that explains how they are put together:
[Image: instr.gif]

For this tutorial series, we will only be working with the data processing, single data transfer, and branch categories. Using these 3, we can build usable software, and for the most part you will be able to test this software on your own.

Let's get into what each of these means. First of all, you'll notice that the only one that contains an opcode is the data processing category. That's because there are more data processing instructions than anything else. The other instruction categories are controlled by flags; we'll get into that as we go along. These DP instructions are things like add, subtract, add with carry, etc. They get used the most. We'll go into depth on these in part 3 of this series.
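As a preview, here's a minimal Python sketch that packs a data processing instruction by hand. The bit positions and opcode values follow the standard 32-bit ARM data processing format (condition in bits 31-28, immediate flag at bit 25, opcode in bits 24-21, set-flags bit at 20, then Rn, Rd and a 12-bit second operand). The function name and arguments are made up for illustration, and only a plain register second operand with no shift is handled.

```python
# A few standard ARM data processing opcodes (bits 24-21).
DP_OPCODES = {"AND": 0x0, "EOR": 0x1, "SUB": 0x2, "RSB": 0x3,
              "ADD": 0x4, "ADC": 0x5, "ORR": 0xC}
COND_AL = 0xE  # "always": the condition used when none is written


def encode_dp(mnemonic, rd, rn, rm, cond=COND_AL, set_flags=False):
    """Encode e.g. ADD Rd, Rn, Rm (register operand, no shift)."""
    word = cond << 28                    # bits 31-28: condition
    # bits 27-26 stay 00 (data processing); bit 25 stays 0 (register operand)
    word |= DP_OPCODES[mnemonic] << 21   # bits 24-21: opcode
    word |= int(set_flags) << 20         # bit 20: update the flags?
    word |= rn << 16                     # bits 19-16: first operand register
    word |= rd << 12                     # bits 15-12: destination register
    word |= rm                           # bits 3-0: second operand register
    return word
```

With this, `hex(encode_dp("ADD", rd=0, rn=1, rm=2))` gives `0xe0810002`, the encoding of ADD R0, R1, R2.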

Secondly, the single data transfer category. These instructions allow the CPU to communicate with the memory controller. Don't be fooled into thinking that these interface over a known bus; they don't. They ONLY talk to the memory banks. You can't read data from a hard drive with these (you would use a software interrupt or an undefined instruction for that; we won't talk about those). For our example, these make up the LDR and STR instructions, and we'll talk a lot about these and their uses in part 4.
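Here's a rough Python sketch of how the flags pick between LDR and STR in this category. The bit positions follow the standard 32-bit ARM single data transfer format; the function name and arguments are made up for illustration, and it only covers a word-sized access with an immediate offset and no write-back. Note that real ARM addresses memory relative to a base register, unlike the toy LDR with a bare address earlier.

```python
COND_AL = 0xE  # "always"


def encode_transfer(load, rd, rn, offset=0, cond=COND_AL):
    """Encode LDR/STR Rd, [Rn, #offset] (word access, no write-back)."""
    if not -4095 <= offset <= 4095:
        raise ValueError("offset must fit in 12 bits")
    word = cond << 28                     # bits 31-28: condition
    word |= 0b01 << 26                    # bits 27-26: single data transfer
    word |= 1 << 24                       # P: apply the offset before the access
    word |= (1 if offset >= 0 else 0) << 23  # U: add (1) or subtract (0) offset
    word |= (1 if load else 0) << 20      # L: 1 = LDR, 0 = STR
    word |= rn << 16                      # bits 19-16: base register
    word |= rd << 12                      # bits 15-12: register to load/store
    word |= abs(offset)                   # bits 11-0: offset magnitude
    return word
```

So `encode_transfer(True, rd=0, rn=1)` is LDR R0, [R1] and `encode_transfer(False, rd=0, rn=1)` is STR R0, [R1]; the two words differ only in the L flag.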

Finally, the branch category. Technically this type of instruction can be accomplished with a data processing instruction, but it's less CPU intensive to know that it's a branch beforehand. Branches are jumps in CPU level terms. You can relate these to goto, break, continue, return, etc. in C. This will be part 5 of the series, and will likely be a short one, as these can get really deep and I want to avoid spending a week working on branches. We will likely be talking about branches a lot throughout the series though, so pay attention.
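A quick Python sketch of what a branch looks like once encoded, assuming the standard 32-bit ARM branch format: the condition, the bits 101, a link flag, and a signed 24-bit offset counted in words from the address of the branch plus 8. The function name and arguments are made up for illustration.

```python
COND_AL = 0xE  # "always"


def encode_branch(from_addr, to_addr, link=False, cond=COND_AL):
    """Encode a B (or BL) placed at from_addr that jumps to to_addr."""
    # The CPU sees PC as the branch address + 8 when it applies the offset.
    delta = to_addr - (from_addr + 8)
    if delta % 4:
        raise ValueError("branch target must be word aligned")
    offset = delta >> 2                   # offset is stored in words
    if not -(1 << 23) <= offset < (1 << 23):
        raise ValueError("branch target out of range")
    word = cond << 28                     # bits 31-28: condition
    word |= 0b101 << 25                   # bits 27-25: branch category
    word |= int(link) << 24               # L: 1 = BL (save return address)
    word |= offset & 0xFFFFFF             # bits 23-0: two's complement offset
    return word
```

A branch to itself, `encode_branch(0x100, 0x100)`, comes out as `0xEAFFFFFE`: an offset of -2 words, because of the +8.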

Now, let's have a talk about the first field in every one of those categories, the condition. This is fairly specific to ARM, and is amazing to have. For an Intel processor, the CPU actually has different instructions for whether something is greater than, less than, or equal to. ARM only has one instruction, but adds a condition. For Intel, pretty much only jumps/branches can be conditional; with ARM, nearly all instructions can be. Here's an example of an Intel increment/decrement until 0 block:

Code:
.top:
  CMP RAX, 0x0 ; compare the value against 0
  JE .done ; RAX is 0
  JL .lt ; RAX is less than 0
  SUB RAX, 0x1 ; RAX is greater than 0, subtract 1
  JMP .top
.lt:
  ADD RAX, 0x1 ; add 1
  JMP .top
.done:
  RET

Now, let's do the same with ARM assembly:

Code:
.top:
  CMP R0, 0x0 ; compare the value against 0
  B.EQ .done
  ADD.LT R0, R0, 1 ; add 1 if less than 0
  SUB.GT R0, R0, 1 ; subtract 1 if greater than 0
  B  .top
.done:

Not only is that 2 instructions shorter (not counting the RET), but by my count it's also about 5 clock cycles shorter (on a 1 Hz processor, the ARM one would finish about 5 seconds sooner). The exact saving depends on the processor.

These conditionals are something that we will have to program our assembler to understand, but since we're using ARM, it means that we have to program it to understand fewer instructions and opcodes. This is kick ass!
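Here's a minimal Python sketch of what teaching the assembler about conditions could look like. The 4-bit condition values are the standard ARM ones; the dotted suffix follows the syntax used in this post, and the function names are made up for illustration.

```python
# Standard ARM condition codes (bits 31-28 of every instruction).
CONDITIONS = {"EQ": 0x0, "NE": 0x1, "CS": 0x2, "CC": 0x3,
              "MI": 0x4, "PL": 0x5, "VS": 0x6, "VC": 0x7,
              "HI": 0x8, "LS": 0x9, "GE": 0xA, "LT": 0xB,
              "GT": 0xC, "LE": 0xD, "AL": 0xE}


def split_condition(mnemonic):
    """Split 'ADD.LT' into ('ADD', 0xB); no suffix means 'always'."""
    base, _, suffix = mnemonic.upper().partition(".")
    if not suffix:
        return base, CONDITIONS["AL"]
    if suffix not in CONDITIONS:
        raise ValueError("unknown condition: " + suffix)
    return base, CONDITIONS[suffix]


def apply_condition(word, cond):
    """Overwrite bits 31-28 of an already encoded instruction."""
    return (word & 0x0FFFFFFF) | (cond << 28)
```

The nice part of this design is that the assembler only needs one encoder per instruction: `ADD R0, R0, 1` encodes to `0xE2800001`, and `apply_condition(0xE2800001, split_condition("ADD.LT")[1])` turns it into the conditional version, `0xB2800001`.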

That's the end of this one. I hope you have a little bit more understanding of what we're about to go through. I'm taking this slow for a reason: I want everybody to understand this. I'm basically teaching you assembly language and compiler theory at the same time.

Please let me know if I need to slow down, speed up, or go back and touch on something! YOUR FEEDBACK IS CRITICAL!

Blink · 10-12-2017, 12:06 AM

That was a pretty good post.

ARM is a lot more efficient than x86, as x86 is CISC, and ARM is RISC. (you only mentioned that ARM is RISC, not the other way)

x86 is a lousy design, and was made pretty quickly. However, it seems that since the Pentium Pro, Intel CPUs have a RISC core, with CISC instructions that are broken down into smaller instructions (still not great). A well designed CISC CPU could have some advantages over RISC ones (but RISC is still better in the long run), but x86 is not well designed. Intel did try to make a CPU line that uses neither RISC nor CISC, called Itanium (IA-64), but that is a complete failure (yeah, it's still around), as everyone already targets ARM and x86, and nobody is targeting IA-64.

phyrrus9 · 10-12-2017, 03:51 AM

Very interesting that you know how all that works. That process is called microcoding. You can see hints of this in the Linux kernel, which contains a sort of map of the individual things the CPU can do.

Inori · 10-12-2017, 04:55 AM

I'm trying to get up to speed with all the threads I missed and I'm really tired so a lot of this went over my head. As is pretty standard from your tutorials, I actually learned a bunch from what I grasped. Definitely going to read this again when I've had my coffee tomorrow. Nice job!

phyrrus9 · 10-12-2017, 04:57 AM

I don't know if you saw the thread, but I put up a page with a list of all of them. It helps a lot to have them in one place. Link in my signature.

I'm going slow with this series because to teach someone to make an assembler they have to know the assembly language in question. Let me know if I go too far over your head.