Bit masking and shifts 08-30-2015, 08:31 PM
#1
One of the hardest concepts to grasp with C (or most GPPLs) is the concepts of masking and shifting (and likewise rotating with and without carry).
Well, I am going to break it all down for you.
Bit masking
Bit masking is a way to extract 0 or more bits from a number (though you really don't have a reason to extract 0 bits from anything...). Think of it like this: the number 3 (hex 0x3, binary 00000011) is actually made up of 2 and 1. Now this number: 5 (hex 0x5, binary 00000101) is made up of 4 and 1. What do the two numbers share? the 1. We can determine this pragmatically with bit masks. Though, this is not really the intended purpose. Say we have a function that allocates memory for us (through malloc), and has options to zero the memory, secure erase the memory (humor me), set the full contents to its max value, or securely set the contents to its maximum value.
This means that we must ALWAYS pass 5 arguments into the function....even if we just want to allocate regular memory....One way to solve this (if you are using C++) is default arguments. But, if you are using C, you have no such option. We instead use bitmasks:
Now, I hope that makes some sense...now we only pass 2 arguments to the function but get the exact same functionality. We can also combine them...by say defining only
ZERO = 1
MAX = 2
SECURE = 3
then we can check for opts & zero, opts & max, opts & (zero | secure), and opts & (max | secure) to do the same thing.
Bit shifts
Alright, in the last example we defined our options as values 0, 1, 2, 4, and 8 to mask off bits NA, 0, 1, 2, and 3 respectively. But that isn't very readable...we solve this with the LEFT shift.
in binary if we have 00000001 and we left shift that by 1, it moves every bit over to the left 1 (the bit farthest left is lost, and a 0 is put in bit 0's place) and we get 00000010. This is equivalent of multiplying by 2 and uses the operator << in C.
MUCH easier to read...
The same is true for right shifts, except that bit 0 is lost and bit 7 is now 0.
But how do we keep this data?
Bit rotations without carry
There is no C operator for this (but there is an assembler opcode for it), but there exists a bit rotate. For example a left rotate instead of bit 7 being lost, it is put in the place of bit 0.
10000000 ROTL 1 would become 00000001 and that ROTR would be 10000000
Bit rotations with carry
This is a really difficult concept to grasp, but basically there are 9 bits involved here, using the carry bit (in the status register) as an extension.
10000000 rotated left through carry by 1 bit would become 00000000 and doing the same operation again would become 00000001. I won't go into this concept any further because it would fit great in an assembly tutorial later.
Well, I am going to break it all down for you.
Bit masking
Bit masking is a way to extract 0 or more bits from a number (though you really don't have a reason to extract 0 bits from anything...). Think of it like this: the number 3 (hex 0x3, binary 00000011) is actually made up of 2 and 1. Now this number: 5 (hex 0x5, binary 00000101) is made up of 4 and 1. What do the two numbers share? the 1. We can determine this pragmatically with bit masks. Though, this is not really the intended purpose. Say we have a function that allocates memory for us (through malloc), and has options to zero the memory, secure erase the memory (humor me), set the full contents to its max value, or securely set the contents to its maximum value.
Code:
void *ourmalloc(unsigned int size, char zero, char securezero, char max, char securemax)
{
void *ret = malloc(size);
if (NULL == ret) return NULL;
if (zero)
memset(ret, 0, size);
else if (securezero)
{
memset(ret, 0, size);
memset(ret, 0xff, size);
memset(ret, 0, size);
}
else if (max)
memset(ret, 0xff, size);
else if (securemax)
{
memset(ret, 0xff, size);
memset(ret, 0, size);
memset(ret, 0xff, size);
}
return ret;
}
This means that we must ALWAYS pass 5 arguments into the function....even if we just want to allocate regular memory....One way to solve this (if you are using C++) is default arguments. But, if you are using C, you have no such option. We instead use bitmasks:
Code:
#define OURMALLOC_REGULAR 0x00
#define OURMALLOC_ZERO 0x01
#define OURMALLOC_SZERO 0x02
#define OURMALLOC_MAX 0x04
#define OURMALLOC_SMAX 0x08
void *ourmalloc(unsigned int size, char opts)
{
void *ret = malloc(size);
if (NULL == ret) return NULL;
char zero = 0, securezero = 0, max = 0, securemax = 0;
//parse the options
if (opts & OURMALLOC_ZERO) //mask off with mask=0x1 (extract only bit #0)
zero = 1;
else if (opts & OURMALLOC_SZERO) //mask off bit #1
securezero = 1;
else if (opts & OURMALLOC_MAX) //mask off bit #2
max = 1;
else if (opts & OURMALLOC_SMAX) //mask off bit #3
securemax = 1;
//do stuff
if (zero)
memset(ret, 0, size);
else if (securezero)
{
memset(ret, 0, size);
memset(ret, 0xff, size);
memset(ret, 0, size);
}
else if (max)
memset(ret, 0xff, size);
else if (securemax)
{
memset(ret, 0xff, size);
memset(ret, 0, size);
memset(ret, 0xff, size);
}
return ret;
}
Now, I hope that makes some sense...now we only pass 2 arguments to the function but get the exact same functionality. We can also combine them...by say defining only
ZERO = 1
MAX = 2
SECURE = 3
then we can check for opts & zero, opts & max, opts & (zero | secure), and opts & (max | secure) to do the same thing.
Bit shifts
Alright, in the last example we defined our options as values 0, 1, 2, 4, and 8 to mask off bits NA, 0, 1, 2, and 3 respectively. But that isn't very readable...we solve this with the LEFT shift.
in binary if we have 00000001 and we left shift that by 1, it moves every bit over to the left 1 (the bit farthest left is lost, and a 0 is put in bit 0's place) and we get 00000010. This is equivalent of multiplying by 2 and uses the operator << in C.
Code:
#define OURMALLOC_REGULAR 0
#define OURMALLOC_ZERO 1 << 0
#define OURMALLOC_SZERO 1 << 1
#define OURMALLOC_MAX 1 << 2
#define OURMALLOC_SMAX 1 << 3
MUCH easier to read...
The same is true for right shifts, except that bit 0 is lost and bit 7 is now 0.
But how do we keep this data?
Bit rotations without carry
There is no C operator for this (but there is an assembler opcode for it), but there exists a bit rotate. For example a left rotate instead of bit 7 being lost, it is put in the place of bit 0.
10000000 ROTL 1 would become 00000001 and that ROTR would be 10000000
Bit rotations with carry
This is a really difficult concept to grasp, but basically there are 9 bits involved here, using the carry bit (in the status register) as an extension.
10000000 rotated left through carry by 1 bit would become 00000000 and doing the same operation again would become 00000001. I won't go into this concept any further because it would fit great in an assembly tutorial later.