CYFA - Creating Your First Assembler - Data Processing Instructions 10-12-2017, 04:35 AM
#1

Again, it seems like my last iteration of this series (Instruction Set Design) got some good attention, so I'm going to try to hammer this one out today. Here's my plan for pacing these: once the latest part of the series gets 5 replies, I will start working on the next one. Hopefully we can keep this one alive and active until it's done.

Alright, as discussed in the last one, we will be discussing data processing instructions. These are the most common instructions in the model we talked about (see graphic below), but this iteration will go quickly.

Data processing instructions are used to manipulate data. That means doing basic math, moving data around in the CPU, and some boolean logic. We'll be focusing on just a few of the possible operational codes within this tutorial, but here is a list of all of them for your reference:

For this tutorial series, we will probably just use the following:

Ok, so let's ignore the condition field for right now; we'll learn about conditions later. The part we care about starts at bit 25 (bits 27 and 26 are always 0 for data processing instructions). There are 2 forms of the instruction, selected by bit 25. That bit is called the Immediate bit, and it tells the CPU how to interpret the Operand2 field:

These instructions follow this format:

Note: the rotate value isn't entered by the user. For an immediate like #0x1000, the rotate would be 0xC (a rotate right by 24 bits, since the CPU rotates by twice the field value), and the value would be 0x10. We need to calculate that at the time of assembly.
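Since the assembler has to work out the rotate on its own, here is a minimal sketch of one way to do it, in Python. The function names are made up for illustration, and Python is just my pick of language. It tries each of the 16 rotate values and keeps the first one that leaves an 8-bit value, so for 0x1000 it returns rotate 0xA with value 0x01, which is one of several valid encodings (value 0x10 with rotate 0xC is another).

```python
def rotate_left32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def encode_immediate(value):
    """Return (rotate, imm8) so that imm8 rotated right by 2*rotate equals value.

    Hypothetical helper: raises ValueError if no encoding exists.
    """
    value &= 0xFFFFFFFF
    for rotate in range(16):
        # Undo the CPU's rotate-right by rotating left the same amount.
        imm8 = rotate_left32(value, 2 * rotate)
        if imm8 <= 0xFF:
            return rotate, imm8
    raise ValueError(f"{value:#x} cannot be encoded as an immediate")


rotate, imm8 = encode_immediate(0x1000)
print(hex(rotate), hex(imm8))  # prints: 0xa 0x1
```

The 12-bit Operand2 field would then be `(rotate << 8) | imm8`.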

Ok, so let's talk about some of these bits:

The S bit

The S bit, aka the set condition codes bit, will be something new to you if you're an Intel assembly programmer. On Intel, most arithmetic and logic instructions affect the flags register whether you want them to or not. ARM works the other way around: the only instructions that set the flags by default are TST, TEQ, CMP, and CMN.

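However, you can have the flags set by any data processing instruction by setting the S bit. For example, TST r1, r2 sets the same flags as AND.S r0, r1, r2; the only difference is that TST throws the result away instead of writing it to r0.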

This format saves you some time. In a previous thread, we talked about a loop that ran a test on every iteration. We can rewrite that loop so the test runs once up front and the S bit keeps the flags current on every pass.

This lets us write extremely dense code. For this series we'll count every instruction as exactly one clock cycle (real ARM cores spend extra cycles on taken branches and memory accesses, so treat these numbers as a rough model). By that count, this code takes a minimum of 4 cycles to execute, and each iteration of the loop takes only 3, rather than 7 like our Intel example or 4 like the original ARM version.
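To sanity-check those cycle counts, here is a rough Python simulation of the rewritten loop. It is only a sketch under two assumptions of mine: every instruction slot costs one cycle even when its condition fails, and the values are small enough that the overflow flag stays clear, so LT just means "negative" and GT means "positive". The function name is hypothetical.

```python
def run_loop(r0):
    """Simulate the loop; return (final r0, instruction slots executed)."""
    # Flags as left by the initial test of r0 against zero.
    negative, zero = r0 < 0, r0 == 0
    executed = 1  # the initial test

    while True:
        if negative:                     # ADD.LT.S: only runs when r0 < 0
            r0 += 1
            negative, zero = r0 < 0, r0 == 0
        if not negative and not zero:    # SUB.GT.S: only runs when r0 > 0
            r0 -= 1
            negative, zero = r0 < 0, r0 == 0
        executed += 3                    # ADD, SUB and B each take a slot
        if zero:                         # B.NE falls through once Z is set
            break
    return r0, executed


print(run_loop(0))   # prints: (0, 4)
print(run_loop(3))   # prints: (0, 10)
```

Starting from 0 gives the 4-cycle minimum, and every extra step toward zero adds 3.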

The I bit

This looks like a binary 1 in the diagram, but it is the letter I. This bit is actually relatively simple: it tells the CPU whether the instruction has 3 registers, or 2 registers and a constant value (known at assemble time). This is probably the most useful bit in the entire instruction set. With Intel instructions, most operations overwrite one of their operands, which sucks if you're chaining math operations. With ARM, you get to pick where the result goes (whether I is 0 or 1), which means an arithmetic instruction can also do the job of a MOV in the same cycle. In fact, a MOV can be replaced by an ADD of zero: ADD r0, r1, #0x00 does the same thing as MOV r0, r1.

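NOTE: not every opcode uses all three operand fields, and the most significant bit of the operational code is the hint. TST, TEQ, CMP, and CMN (opcodes 10xx) throw their result away, so the Rd field is ignored. MOV and MVN (1101 and 1111) ignore the Rn field. Those six instructions only take 2 parameters. ORR and BIC (1100 and 1110) still take all 3.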
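An assembler needs to know how many parameters to expect for each mnemonic. The sketch below, in Python, derives that from the opcode bits. The helper names are my own invention; the opcode values are the ones from the opcode list in this post.

```python
OPCODES = {
    "AND": 0b0000, "EOR": 0b0001, "SUB": 0b0010, "RSB": 0b0011,
    "ADD": 0b0100, "ADC": 0b0101, "SBC": 0b0110, "RSC": 0b0111,
    "TST": 0b1000, "TEQ": 0b1001, "CMP": 0b1010, "CMN": 0b1011,
    "ORR": 0b1100, "MOV": 0b1101, "BIC": 0b1110, "MVN": 0b1111,
}


def uses_rd(opcode):
    """False for the 10xx group (TST, TEQ, CMP, CMN): no result is stored."""
    return (opcode & 0b1100) != 0b1000


def uses_rn(opcode):
    """False for MOV (1101) and MVN (1111): there is no first operand."""
    return (opcode & 0b1101) != 0b1101


def operand_count(mnemonic):
    """Number of parameters the mnemonic takes; Operand2 is always present."""
    opcode = OPCODES[mnemonic]
    return 1 + uses_rd(opcode) + uses_rn(opcode)


for name in ("ADD", "TST", "MOV", "ORR"):
    print(name, operand_count(name))
```

A table lookup would work just as well; the bit tests are only there to show the pattern in the opcode numbering.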

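If the I flag is set, Operand2 holds an 8-bit value and a 4-bit rotate, and the CPU rotates the value right by twice the rotate amount. This means you only get 12 bits to describe the immediate, not a full 32-bit constant. 256 still works (0x01 rotated right by 24 bits), but you can't load an integer like 257 (0x101, which needs 9 significant bits) with 1 instruction!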
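To see which constants are reachable, here is a small Python sketch of the decoding side: it expands every (rotate, value) pair the way the CPU would and collects the results. The names are hypothetical, and this is a brute-force illustration rather than how a real assembler would check.

```python
def decode_immediate(rotate, imm8):
    """Expand a 4-bit rotate and an 8-bit value into the 32-bit constant."""
    amount = 2 * rotate  # the rotate field counts in steps of 2 bits
    return ((imm8 >> amount) | (imm8 << (32 - amount))) & 0xFFFFFFFF


# Every constant a single immediate operand can represent.
ENCODABLE = {
    decode_immediate(rotate, imm8)
    for rotate in range(16)
    for imm8 in range(256)
}

print(256 in ENCODABLE)  # prints: True
print(257 in ENCODABLE)  # prints: False
```

Anything outside that set has to be built up over several instructions or loaded some other way.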

Ok, well this one went by quicker than I thought, but I think I covered it pretty well. Let me know what you guys think. I'll be updating the Master List Page with this thread. I've also added the titles of the next segments. The faster this thread starts a discussion, the faster those will get written.

PLEASE REPLY TO THIS THREAD, DISCUSSION IS KEY!

Code (all 16 opcodes):

```
0000 - AND
0001 - EOR (exclusive or)
0010 - SUB
0011 - RSB (reversed subtract)
0100 - ADD
0101 - ADC (add with carry)
0110 - SBC (subtract with carry)
0111 - RSC (reverse subtract with carry)
1000 - TST (test with AND)
1001 - TEQ (test with EOR)
1010 - CMP (compare)
1011 - CMN (add and set flags, no result stored)
1100 - ORR (inclusive or)
1101 - MOV
1110 - BIC (bit clear - AND NOT)
1111 - MVN (move not)
```

Code (the subset used in this series):

`ADD, SUB, AND, EOR, ORR, ADC, SBC, MOV, MVN`

Code (the two field layouts):

```
I=0: PPPP S NNNN DDDD 222222222222
I=1: PPPP S NNNN DDDD RRRR MMMMMMMM

I = Immediate bit
P = operational code (4 bits)
S = Set bit (see below)
N = Rn (first operand register)
D = Rd (destination register)
2 = Operand2 (12 bits)
R = Immediate Rotate (4 bits)
M = Immediate Value (8 bits)
```
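The field layout can be packed into a 32-bit word with a few shifts. Below is a minimal Python sketch of that step. The function and its parameter names are hypothetical, the condition defaults to 1110 ("always") because conditions are covered later, and with I=0 it assumes Operand2 is just a register number with no shift applied.

```python
OPCODES = {
    "AND": 0b0000, "EOR": 0b0001, "SUB": 0b0010, "RSB": 0b0011,
    "ADD": 0b0100, "ADC": 0b0101, "SBC": 0b0110, "RSC": 0b0111,
    "TST": 0b1000, "TEQ": 0b1001, "CMP": 0b1010, "CMN": 0b1011,
    "ORR": 0b1100, "MOV": 0b1101, "BIC": 0b1110, "MVN": 0b1111,
}

COND_ALWAYS = 0b1110  # assumption: execute unconditionally


def encode_data_processing(mnemonic, rd, rn, operand2,
                           immediate=False, set_flags=False,
                           cond=COND_ALWAYS):
    """Pack the fields into one 32-bit instruction word."""
    if not (0 <= rd < 16 and 0 <= rn < 16):
        raise ValueError("register numbers must be 0-15")
    if not 0 <= operand2 < (1 << 12):
        raise ValueError("Operand2 must fit in 12 bits")

    word = cond << 28                 # bits 31-28: condition
    # bits 27-26 stay 00 for data processing instructions
    word |= int(immediate) << 25      # bit 25: I
    word |= OPCODES[mnemonic] << 21   # bits 24-21: opcode
    word |= int(set_flags) << 20      # bit 20: S
    word |= rn << 16                  # bits 19-16: Rn
    word |= rd << 12                  # bits 15-12: Rd
    word |= operand2                  # bits 11-0: Operand2
    return word


# ADD r0, r1, r2
print(hex(encode_data_processing("ADD", rd=0, rn=1, operand2=2)))
# prints: 0xe0810002
```

For the immediate form, pass `immediate=True` and put `(rotate << 8) | value` in `operand2`.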

Code (assembly syntax):

```
<operation>{cond}{S} Rd, Rn, Operand2
<operation>{cond}{S} Rd, Rn, #0x1000
```

Code (instructions that set the flags by default):

`TST, TEQ, CMP, CMN`

Code (TST written out as an AND with the S bit):

`AND.S r0, r1, r2 ; sets the same flags as TST r1, r2, but also writes the result to r0`

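Code (the rewritten loop):

```
CMP r0, #0x0          ; runs once; CMP rather than TEQ so the V flag that LT/GT read is defined
.top:
ADD.LT.S r0, r0, #0x1 ; r0 < 0: count up, and S refreshes the flags
SUB.GT.S r0, r0, #0x1 ; r0 > 0: count down, and S refreshes the flags
B.NE .top             ; repeat until r0 reaches zero
```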

Code (MOV written as an ADD):

```
ADD r0, r1, #0x00
; is the same as
MOV r0, r1
```
