Bits, Bytes, shifting and masking in Assembly (Yul)

Performing operations on byte strings in assembly

Bit

A bit is the smallest unit of data in a computer. It can hold only one of two values, 1 or 0, which can represent on/off, yes/no or true/false.

But a bit is too small to represent any meaningful data.

Byte

A byte is a collection of 8 bits.

A byte can represent 256 possible patterns. To see why, let’s look at how many patterns 2 bits and 3 bits can represent.

2 bits: 00, 01, 10, 11 (base 10: 0, 1, 2, 3)

3 bits: 000, 001, 010, 011, 100, 101, 110, 111 (base 10: 0, 1, 2, 3, 4, 5, 6, 7)

The base 10 values of the binary patterns are also included.

2 bits can represent 4 different patterns and 3 bits can represent 8, so n bits can represent \(2^n\) patterns: 2 bits = \(2^2\) = 4, 3 bits = \(2^3\) = 8 and 8 bits = \(2^8\) = 256.
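The \(2^n\) rule can be checked with a single left shift, since shifting 1 left by n bits gives \(2^n\). The sketch below is only an illustration: the contract name PatternCount and the function patterns are hypothetical and are not used anywhere else in this article.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical helper contract, used only to illustrate the 2^n rule.
contract PatternCount {
    // Returns 2^n, the number of patterns that n bits can represent.
    // n must be below 256 because 2^256 does not fit in a uint256.
    function patterns(uint256 n) public pure returns (uint256 count) {
        require(n < 256, "n must be below 256");
        assembly {
            // Shifting 1 left by n bits is the same as computing 2^n.
            count := shl(n, 1)
        }
    }
}
```

Calling patterns(2) gives 4, patterns(3) gives 8 and patterns(8) gives 256, matching the counts above.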

An Ethereum address is 20 bytes long, which means it can take \(2^{20 \times 8} = 2^{160}\) possible values.

The EVM stores a contract's data in 32-byte slots, usually displayed as hexadecimal. What does that mean? It means that each slot is a collection of 32 bytes, which is 32 * 8 = 256 bits. The hexadecimal simply means the binary representation of the bytes is converted from base 2 to base 16 for display.

Storage in EVM

Ex.

Convert 13901371 in base 10 to a 32-byte hexadecimal representation.

Convert 13901371 to base 2 = 110101000001111000111011

The base 2 representation above is 24 bits long, meaning it can fit into 3 bytes just fine.

Convert the base 2 to base 16 = d41e3b

Let’s see how this is represented in the EVM.

contract test{
    uint val = 13901371;

    function getVal() public view returns(bytes32 _val)
    {
        uint _slot;
        assembly{
            _slot := val.slot
            _val := sload(_slot)
        }
    }

}

The function getVal() returns:

0x0000000000000000000000000000000000000000000000000000000000d41e3b

The EVM pads numbers on the left with zeros to complete 32 bytes, so the value itself sits at the right end of the word.
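One way to convince yourself of where the padding sits is to convert the number to bytes32 and back without touching storage. This is a minimal sketch; PaddingDemo, asWord and paddingIsOnTheLeft are hypothetical names introduced only for this illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical demo contract, not part of the examples in this article.
contract PaddingDemo {
    // Reinterprets a number as a raw 32-byte word without changing any bits.
    function asWord(uint256 x) public pure returns (bytes32) {
        return bytes32(x);
    }

    // Returns true: the zero bytes are leading zeros, so the numeric value
    // of the padded word is still 0xd41e3b (13901371 in base 10).
    function paddingIsOnTheLeft() public pure returns (bool) {
        return uint256(asWord(13901371)) == 0xd41e3b;
    }
}
```

asWord(13901371) should return the same 32-byte value that getVal() read from storage above.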

Ex 2.

Convert the uint256 max value from base 10 to base 16.

Base 10 = 115792089237316195423570985008687907853269984665640564039457584007913129639935

Binary = 1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111

Base 16 = ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff

The binary is 256 bits long, which is exactly 32 bytes.

contract test{

    uint public val = type(uint256).max;

    function getVal() public view returns(bytes32 _val)
    {
        assembly{
            _val := sload(val.slot)
        }
    }

}

The function getVal() returns:

0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff

It’s important to note that everything stored in a bytes32 is represented in binary, as this will come in handy in the next section.


Packed Data

Just as a single value can be stored in one slot, multiple smaller values can be packed into a single storage slot.

contract test{
    uint16 public  home;
    uint24 public  apartments;
    uint104 public beach;
    uint104 public house;
    uint8 public skyscrapper;
}

The variables above fit in just one slot: 16 + 24 + 104 + 104 + 8 = 256 bits, which makes up 32 bytes.
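To see what happens when the sizes do not add up to 32 bytes or less, compare the two layouts below. This is a small sketch with hypothetical contracts (FitsInOneSlot and SpillsIntoSecondSlot) that only report the slot numbers the compiler assigned.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical contracts, used only to compare two storage layouts.
contract FitsInOneSlot {
    uint128 a; // 16 bytes
    uint128 b; // 16 bytes, 32 bytes in total: shares slot 0 with a

    function slots() public pure returns (uint256 slotA, uint256 slotB) {
        assembly {
            slotA := a.slot // 0
            slotB := b.slot // 0
        }
    }
}

contract SpillsIntoSecondSlot {
    uint128 a; // 16 bytes
    uint136 b; // 17 bytes, 33 bytes in total: does not fit next to a

    function slots() public pure returns (uint256 slotA, uint256 slotB) {
        assembly {
            slotA := a.slot // 0
            slotB := b.slot // 1
        }
    }
}
```

A variable is never split across two slots, so once the next variable no longer fits in the remaining bytes it starts a new slot.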

Representation in a single bytes32

contract test{

    uint16 public  home = 11;
    uint24 public  apartments = 291;
    uint104 public beach = 171;
    uint104 public house = 890;
    uint8 public skyscrapper = 39;

    function getSlots() public pure returns(uint _home, uint _apartments, uint _beach, uint _house, uint _skyscrapper)
    {
        assembly{
            _home := home.slot
            _apartments := apartments.slot
            _beach := beach.slot
            _house := house.slot
            _skyscrapper := skyscrapper.slot
        }
    }

   function getValues() public view returns(bytes32 values){

        assembly{
            values := sload(home.slot)
        }

   }

}

Every return value from getSlots() is 0, which means the variables are all packed into slot 0. So in getValues(), loading the slot of any of these variables returns the same bytes32.

Let’s break down the values from base 10 numbers to their respective base16 and their representation in a single bytes32 string, which is in a single slot.

home: base10 = 11 base16 = b

apartments: base10 = 291 base16 = 123

beach: base10 = 171 base16 = ab

house: base10 = 890 base16 = 37a

skyscrapper: base10 = 39 base16 = 27

Note: values are packed from right to left in declaration order. The first variable, home, occupies the rightmost (lowest-order) bytes, and the last variable, in this context skyscrapper, is the first byte shown in the bytes32 string.

The function getValues() returns :

0x270000000000000000000000037a000000000000000000000000ab000123000b

From the returned bytes32 we can clearly see each of the base16 values in the string, starting from skyscrapper on the left. As mentioned earlier, the variables are packed starting from the right. Since the packed values are all integers, each one is right-aligned within its own space and padded on the left; padding here means the leading zeros.
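The same word can be rebuilt by hand, which makes the layout explicit: each value is shifted left to the bit position where its space starts and the results are combined. The sketch below assumes the bit positions follow from the declared widths (16, 24, 104, 104 and 8 bits); PackByHand and pack are hypothetical names.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical contract that rebuilds the packed slot without using storage.
contract PackByHand {
    function pack() public pure returns (bytes32 word) {
        uint256 home = 11;        // uint16,  bits 0 to 15
        uint256 apartments = 291; // uint24,  bits 16 to 39
        uint256 beach = 171;      // uint104, bits 40 to 143
        uint256 house = 890;      // uint104, bits 144 to 247
        uint256 skyscrapper = 39; // uint8,   bits 248 to 255
        assembly {
            // Start with home in the lowest bits, then place every other
            // value at the bit position where its space begins.
            word := home
            word := or(word, shl(16, apartments))
            word := or(word, shl(40, beach))
            word := or(word, shl(144, house))
            word := or(word, shl(248, skyscrapper))
        }
    }
}
```

pack() should return the same bytes32 that getValues() read from storage.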

Reading from packed Data

To recover values from a bytes32 string in Yul, shift and mask operations are used. Each handles a different part of the string. A shift removes the bytes after the wanted value (the postfix), and a mask removes the bytes before it (the prefix). For example, in a byte string 0xxxx444xxxx, the bytes after the wanted value 444 are shifted out first.

Solve:

Shift

0x00004440000 >> 16 = 0x00000000444 (a shift of 4 hex digits, at 4 bits per digit)

Mask

0x00000000fff & 0x00000000444 = 0x444
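The two steps can be tried on a small made-up word that has data on both sides of the wanted value. In this sketch the input 0xabc4440000 and the names ShiftThenMask and extract are hypothetical; abc plays the role of the prefix and 0000 the postfix.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical contract showing shift-then-mask on a small value.
contract ShiftThenMask {
    function extract() public pure returns (uint256 result) {
        assembly {
            let word := 0xabc4440000
            // Drop the 4 hex digits (16 bits) after the wanted value.
            let shifted := shr(16, word) // 0xabc444
            // Keep only the lowest 3 hex digits (12 bits).
            result := and(shifted, 0xfff) // 0x444
        }
    }
}
```

Note that shr() counts in bits, so removing 4 hex digits means shifting by 16.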

uint16 public  home = 11;
uint24 public  apartments = 291;
uint104 public beach = 171;
uint104 public house = 890;
uint8 public skyscrapper = 39;

function getValues() public view returns(bytes32 _value, uint _slot, uint _offset)
{
    assembly{
        _slot := apartments.slot
        _offset := apartments.offset
        _value := sload(_slot)

    }
}

bytes32: _value 0x270000000000000000000000037a000000000000000000000000ab000123000b
uint256: _slot 0
uint256: _offset 2

Suppose we now want to fetch the value of apartments from the bytes32 string. We have to perform a few operations using shifts and masks.

Right Shift to unset values after wanted Variable.

The bitwise operator we will be using here is right shift.

Ex.

1111111 >> 2 = 0011111

To get apartments, we right shift the bytes32 string to clear everything after apartments. A right shift removes bits on the right and pads the left with zeros to make up for them. Here we shift out the bits that represent home, so that apartments becomes the rightmost value. In assembly the right shift is performed with shr(), which takes the shift amount in bits. To get that amount we use the assembly property offset: apartments.offset gives the position of apartments within the slot, counted from the right, in bytes. In this case it is 2, so the shift is 2 bytes * 8 bits = 16 bits.

pragma solidity ^0.8.19;

contract test{

    uint16 public  home = 11;
    uint24 public  apartments = 291;
    uint104 public beach = 171;
    uint104 public house = 890;
    uint8 public skyscrapper = 39;

    function getApartment() public view returns(bytes32 _shift)
    {
        assembly{
            let _value := sload(apartments.slot)
            _shift := shr(mul(apartments.offset,8), _value)
        }
    }
}

The function getApartment() returns:

0x0000270000000000000000000000037a000000000000000000000000ab000123

Now that apartments is clearly the rightmost value, we need to mask the bytes32 string to unset the values before apartments.

Mask To get wanted Variable.

The bitwise operator we would be using here is & (and).

Ex.

100111 & 111111 = 100111
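The two small binary examples so far can be reproduced in Yul. Solidity has no binary literals, so the sketch below uses the base 10 equivalents; BinaryExamples, shiftRight and maskAnd are hypothetical names.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical contract reproducing the small binary examples.
contract BinaryExamples {
    // 1111111 in binary is 127. Shifting right by 2 gives 0011111, which is 31.
    function shiftRight() public pure returns (uint256 r) {
        assembly {
            r := shr(2, 127)
        }
    }

    // 100111 in binary is 39 and 111111 is 63. AND keeps only the bits
    // that are set in both operands, giving 100111, which is 39.
    function maskAnd() public pure returns (uint256 r) {
        assembly {
            r := and(39, 63)
        }
    }
}
```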

Mask

0x00000000fff & 0xb0000000444 = 0x444

Remember that zeros before an integer are just padding.

Solve.

0x0000270000000000000000000000037a000000000000000000000000ab000123 & 0x0000000000000000000000000000000000000000000000000000000000ffffff = 0x123

Each f marks a hex digit we want to keep and each 0 a digit we want to discard. These are purely binary operations, as described initially.

We use the assembly function and() to mask the shifted value, which leaves us with the value of apartments as required.

pragma solidity ^0.8.19;

contract test{

    uint16 public  home = 11;
    uint24 public  apartments = 291;
    uint104 public beach = 171;
    uint104 public house = 890;
    uint8 public skyscrapper = 39;

    function getApartment() public view returns(uint _apartments)
    {

        assembly{
            let _value := sload(apartments.slot)
            let _shift := shr(mul(apartments.offset,8), _value)
            _apartments := and(_shift, 0x0000000000000000000000000000000000000000000000000000000000ffffff)
        }
    }
}

The function getApartment() returns:

291
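The same shift and mask steps work for any of the packed variables if the mask is derived from the width instead of being written out by hand. The following is a sketch under that assumption; PackedReader and readField are hypothetical names, and the offset and size are passed in bytes to match the offset property.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical generic reader for a value packed inside a 32-byte word.
contract PackedReader {
    // word:        the full 32-byte slot content
    // offsetBytes: distance of the value from the right end, in bytes
    // sizeBytes:   width of the value, in bytes
    function readField(bytes32 word, uint256 offsetBytes, uint256 sizeBytes)
        public
        pure
        returns (uint256 value)
    {
        require(
            sizeBytes > 0 && sizeBytes <= 32 && offsetBytes + sizeBytes <= 32,
            "field outside the word"
        );
        assembly {
            // (1 << bits) - 1 gives a mask with the lowest `bits` bits set,
            // for example 0xffffff for a 3-byte value.
            let mask := sub(shl(mul(sizeBytes, 8), 1), 1)
            // Shift the value to the right end, then drop everything before it.
            value := and(shr(mul(offsetBytes, 8), word), mask)
        }
    }
}
```

With the packed word from this article, readField(word, 2, 3) gives 291 for apartments, readField(word, 0, 2) gives 11 for home and readField(word, 31, 1) gives 39 for skyscrapper.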

Writing to packed Data

Writing a value to a packed slot uses the same strategies as reading.

To change apartments from 291 to 25, we first need to mask out the bytes that apartments is assigned to.

The entire packed bytes32 storage slot currently holds:

0x270000000000000000000000037a000000000000000000000000ab000123000b

and apartments, as we know from the previous examples, occupies the space below, demarcated with brackets for clarity.

0x270000000000000000000000037a000000000000000000000000ab[000123]000b

From the previous examples we also know how to mask a byte string. Here we want to do just that, this time to reset the space that apartments occupies to zero.

Note: in the mask, 0 clears the hex digit at that position and f keeps it, so we use 0 across the bytes that belong to apartments.

0x270000000000000000000000037a000000000000000000000000ab000123000b & 0xffffffffffffffffffffffffffffffffffffffffffffffffffffff000000ffff

function setSingleValue() public view returns(bytes32 reformed) {
        assembly{
            let slot := apartments.slot
            let value := sload(slot)
            reformed := and(0xffffffffffffffffffffffffffffffffffffffffffffffffffffff000000ffff, value)
        }
    }

The function setSingleValue() returns:

0x270000000000000000000000037a000000000000000000000000ab000000000b
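Writing a long mask by hand is error-prone, so it can also be derived: take a mask covering the width of apartments, shift it to where apartments sits, and invert it. This is a minimal sketch; ClearMask and clearMask are hypothetical names, and the constants 0xffffff (24 bits) and 16 (offset in bits) are specific to apartments.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical contract that derives the clearing mask for apartments.
contract ClearMask {
    function clearMask() public pure returns (bytes32 mask) {
        assembly {
            // 0xffffff covers the 24 bits of a uint24. Shifting it left by
            // 16 bits lines it up with apartments, and not() flips every
            // bit so that only the apartments space is zero.
            mask := not(shl(16, 0xffffff))
        }
    }
}
```

clearMask() should produce the same value as the mask literal used in setSingleValue().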

So the space assigned to apartments has been reset, now it’s time to input the new value.

function setSingleValue(uint24 newVal) public view returns(bytes32 reformedVal) {
        assembly{
            let slot := apartments.slot
            let value := sload(slot)
            let reformed := and(0xffffffffffffffffffffffffffffffffffffffffffffffffffffff000000ffff, value)
            reformedVal := shl(mul(apartments.offset, 8), newVal)
        }
    }

With newVal set to 25, note that inside the assembly block newVal is treated as a full 32-byte word,

0x0000000000000000000000000000000000000000000000000000000000000019

where 19 is the hexadecimal representation of 25.

Back in the setSingleValue function, reformedVal evaluates to

0x0000000000000000000000000000000000000000000000000000000000190000

This is the result of shifting the bits to the left by 8 * 2 = 16. shl() performs the same operation as shr(), only in the opposite direction: left.

Combine and store

The bitwise operator we would be using here is | (or).

Ex.

100111 | 111111 = 111111
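The or operation can be tried the same way, first on the binary example and then on two values that occupy different positions. The sketch uses base 10 and hex equivalents because Solidity has no binary literals; OrExample, orBinary and mergeFields are hypothetical names.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical contract showing how or() combines bits.
contract OrExample {
    // 100111 in binary is 39 and 111111 is 63. OR sets a bit if it is set
    // in either operand, giving 111111, which is 63.
    function orBinary() public pure returns (uint256 r) {
        assembly {
            r := or(39, 63)
        }
    }

    // When the two operands have no set bits in common, OR simply merges
    // them: 0x190000 and 0x00000b combine into 0x19000b.
    function mergeFields() public pure returns (uint256 r) {
        assembly {
            r := or(0x190000, 0x00000b)
        }
    }
}
```

The second case is why the target space has to be cleared before combining: or() can only set bits, it can never reset them.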

So now we have to combine the two variables reformed and reformedVal to get the new slot bytes32 value.


  function getApartment() public view returns(uint _apartments)
  {
      assembly{
          let _value := sload(apartments.slot)
          let _shift := shr(mul(apartments.offset,8), _value)
          _apartments := and(_shift, 0x0000000000000000000000000000000000000000000000000000000000ffffff)
      }
 }

function setSingleValue(uint24 newVal) public {
        assembly{
            let slot := apartments.slot
            let value := sload(slot)
            let reformed := and(0xffffffffffffffffffffffffffffffffffffffffffffffffffffff000000ffff, value)
            let reformedVal := shl(mul(apartments.offset, 8), newVal)
            let result := or(reformedVal, reformed)
            sstore(slot, result)
        }
    }

The result of the combined hex should give us

0x270000000000000000000000037a000000000000000000000000ab000019000b

sstore() stores the new value in the given slot. It takes two parameters: the slot and the data to store.

When we call getApartment(), it now returns the new value of apartments, which is 25 (or whatever is passed to the setter)!
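The clear, shift and combine steps can be collected into one helper that works on a word in memory, which makes it easy to experiment without touching storage. This is a sketch under stated assumptions: PackedWriter and writeField are hypothetical names, the offset and size are given in bytes, and the new value is truncated to the width of its space so it cannot spill into a neighbouring variable.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

// Hypothetical generic writer for a value packed inside a 32-byte word.
contract PackedWriter {
    function writeField(
        bytes32 word,
        uint256 offsetBytes,
        uint256 sizeBytes,
        uint256 newVal
    ) public pure returns (bytes32 updated) {
        require(
            sizeBytes > 0 && sizeBytes <= 32 && offsetBytes + sizeBytes <= 32,
            "field outside the word"
        );
        assembly {
            let shift := mul(offsetBytes, 8)
            // Mask with the lowest sizeBytes * 8 bits set.
            let fieldMask := sub(shl(mul(sizeBytes, 8), 1), 1)
            // Step 1: reset the target space to zero.
            let cleared := and(word, not(shl(shift, fieldMask)))
            // Step 2: cut the new value down to the width of the space,
            // then move it into position.
            let positioned := shl(shift, and(newVal, fieldMask))
            // Step 3: combine the two.
            updated := or(cleared, positioned)
        }
    }
}
```

Calling writeField with the original packed word, offset 2, size 3 and value 25 should return the combined hex shown above. Truncating silently is a design choice made for this sketch; reverting when newVal does not fit would be a stricter alternative.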
