Dev's Den: Counter.sol in Yul

Unleash your blockchain potential with our CTO's must-read article on how the Solidity compiler leverages Yul for efficient bytecode generation.

1/ What / Why

Yul is the intermediate representation that the Solidity compiler transforms the abstract syntax tree into after it parses your Solidity code. The compiler optimizes the Yul, generates bytecode from it, and then performs further optimizations on the bytecode. Reading the Yul is a useful way to get a better understanding of the code you write.

[Figure: How the Solidity compiler works]

2/ Let’s walk through the generated Yul

For reference, this is our Counter.sol Solidity code: one public storage variable, one function that takes a single 32-byte argument, and one function that takes no arguments and increments the storage variable.

We will use forge to inspect the Yul representation. To do so, run:

forge inspect Counter irOptimized

Note that there are a lot of interesting <FIELD> values you can use in place of irOptimized. Some that are worth mentioning: abi, storageLayout, methodIdentifiers, gasEstimates. For the full list of available values: forge inspect --help.

Let’s focus on the deployed contract part:

The mstore(64, _1) statement initialises the free memory pointer. 64 is the decimal value of 0x40, which we are more used to seeing in assembly, and _1 evaluates to 128 (0x80) here. So mstore(64, _1) is mstore(0x40, 0x80).
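To make the decimal/hex correspondence concrete, here is a minimal Python sketch of that store. It is a toy model, not the EVM: the fixed-size `memory` buffer and the `mstore`/`mload` helpers are stand-ins written for this illustration.

```python
# Toy model of EVM memory: a byte buffer large enough for this example.
# (Real EVM memory grows on demand; the fixed size is a simplification.)
memory = bytearray(0x80 + 32)


def mstore(offset: int, value: int) -> None:
    """Write value as a 32-byte big-endian word at offset, like Yul's mstore."""
    memory[offset:offset + 32] = value.to_bytes(32, "big")


def mload(offset: int) -> int:
    """Read the 32-byte big-endian word at offset, like Yul's mload."""
    return int.from_bytes(memory[offset:offset + 32], "big")


# mstore(64, 128) is the same call as mstore(0x40, 0x80).
mstore(64, 128)
free_memory_pointer = mload(0x40)
```

After the store, the word at 0x40 holds 0x80, the address where free memory begins.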

## lt(), calldatasize()

Next, lt(calldatasize(), 4) checks the size of the calldata. Here lt() stands for 'less than' and it's a built-in Yul function. It takes two parameters, compares them, and returns 1 if the first argument is less than the second, and 0 if it is not.

calldatasize() is another built-in Yul function that returns the size of the calldata in bytes. Calldata is the data that is sent along with a function call to the contract, and includes the function selector (first four bytes) and the encoded function arguments.

So, lt(calldatasize(), 4) will return 1 if the calldata is shorter than 4 bytes (too short to hold a function selector) and 0 otherwise.

We then check whether this expression is 0 with iszero(). If it is, we enter the body of the if clause and switch on the function selector. Otherwise, we fall through to revert(0, 0).
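The guard can be modelled in a few lines of Python. This is only a sketch: `lt` and `iszero` mirror the Yul built-ins, while `enters_dispatch` is a hypothetical helper name introduced for this illustration.

```python
def lt(a: int, b: int) -> int:
    """Unsigned less-than, returning 1 or 0 like the Yul built-in."""
    return 1 if a < b else 0


def iszero(a: int) -> int:
    """Return 1 if a is zero, otherwise 0, like the Yul built-in."""
    return 1 if a == 0 else 0


def enters_dispatch(calldata: bytes) -> bool:
    """True when the calldata is long enough to contain a 4-byte selector."""
    calldatasize = len(calldata)
    return iszero(lt(calldatasize, 4)) == 1
```

Calldata of 0 to 3 bytes skips the switch and ends in the revert; anything from 4 bytes upward reaches the selector dispatch.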

## forge inspect <contract name> methodIdentifiers

I mentioned earlier that forge has a nice <FIELD> value for the inspect command: methodIdentifiers. Let’s run it: forge inspect Counter methodIdentifiers.

Here is the output:

Notice how these selectors match the Yul case labels exactly.

## calldataload(), shr()

We then have a switch statement. Let’s see what shr(224, calldataload(_2)) does. This is a Yul expression that loads the calldata at position _2 (which was defined as zero) and shifts it right by 224 bits.

  1. calldataload(_2): This operation loads 32 bytes (256 bits) of calldata from the position indicated by _2. Since _2 is previously initialized as 0, this loads the first 32 bytes of the calldata.

In the context of Ethereum smart contracts, function calls are encoded into calldata as the function selector followed by the encoded parameters. The selector is the first 4 bytes of the keccak256 hash of the function signature, so loading 32 bytes gets the function selector (4 bytes) followed by the beginning of the function parameters (the next 28 bytes).

  2. shr(224, ...): The shr operation stands for "SHift Right". This takes the 256 bits loaded from the calldata, and shifts them right by 224 bits.

Since each byte is 8 bits and we know that the function selector is 4 bytes (32 bits), the number 224 is the size of the loaded word minus the size of the function selector: 256 bits - 32 bits = 224 bits.

By shifting right 224 bits, it effectively discards the 28 bytes after the function selector, leaving only the 4-byte function selector in the rightmost position of the 256-bit word. This 4-byte value can then be used directly in the switch statement to compare with the function selectors defined in the Solidity code. In summary, this line is used to extract the 4-byte function selector from the calldata of the function call.
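The extraction can be sketched in Python as follows. The selector bytes `deadbeef` and the argument 42 are made-up values for illustration, not selectors from Counter.sol, and the helpers only imitate the Yul built-ins of the same name.

```python
def calldataload(calldata: bytes, offset: int) -> int:
    """Load 32 bytes starting at offset as a big-endian integer.

    Reads past the end of the calldata are padded with zero bytes.
    """
    chunk = calldata[offset:offset + 32]
    chunk = chunk + bytes(32 - len(chunk))
    return int.from_bytes(chunk, "big")


def shr(shift: int, value: int) -> int:
    """Logical shift right on a 256-bit word, like the Yul built-in."""
    return value >> shift if shift < 256 else 0


# Hypothetical calldata: a made-up 4-byte selector followed by one
# 32-byte argument holding the number 42.
calldata = bytes.fromhex("deadbeef") + (42).to_bytes(32, "big")

word = calldataload(calldata, 0)   # selector plus the first 28 argument bytes
selector = shr(224, word)          # only the 4 selector bytes remain
```

Here `selector` ends up as 0xdeadbeef, with the 28 argument bytes that shared the word shifted out.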

Notice how each case branch contains the following two lines (the last line is a little different for functions that don’t take arguments)

Mentally, substitute 0 for _2, since Yul defines let _2 := 0 before the switch line. The first line checks callvalue(): since none of the functions are payable, we should revert if callvalue() is non-zero. The second line is more fun.

## slt(), not()

The not() function in Yul performs a bitwise NOT operation. This operation flips each bit in the binary representation of a number.

In Yul and the EVM, all numbers are represented as 256-bit integers. This means they can range from 0 to 2²⁵⁶ - 1.

A bitwise NOT operation flips all the bits in the binary representation of the number. For example, the number 3 is represented in binary as 11 (ignoring leading zeros). The not operation flips each bit, so you get 00. However, since we're dealing with 256-bit numbers, there are another 254 bits that we've left out. All of these bits are 0 in the number 3, and they all get flipped to 1 by the not operation.

The result of not(3) is a 256-bit number where all bits are 1, except the last two, which are 0. In decimal, this is equivalent to 2²⁵⁶ - 4.

Here’s an example with smaller 8-bit numbers to illustrate the idea: 3 is 00000011 in binary, so an 8-bit NOT gives 11111100, which is 252, or 2⁸ - 4.

The not() function can take any 256-bit integer as input (i.e., any integer from 0 to 2²⁵⁶ - 1). The output will also be a 256-bit integer, and can be any integer in the same range.

In a bitwise NOT operation, not(x) is equivalent to (2²⁵⁶ - 1) - x when x is a non-negative integer less than 2²⁵⁶. This is because flipping all the bits in a binary number is the same as subtracting the number from the maximum possible value with the same number of bits.
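The identity is easy to check with Python's arbitrary-precision integers. This is a sketch; `not_` is a stand-in name for Yul's not(), since `not` is a reserved word in Python.

```python
MAX_UINT256 = 2**256 - 1


def not_(x: int) -> int:
    """Bitwise NOT on a 256-bit word: flip every bit, then keep 256 bits."""
    return ~x & MAX_UINT256


# The same operation restricted to 8 bits, matching the small example.
not_3_8bit = ~3 & 0xFF

# not(x) == (2**256 - 1) - x for any x in range.
identity_holds = all(not_(x) == MAX_UINT256 - x for x in (0, 1, 3, 12345, MAX_UINT256))
```

Python's `~x` on its own yields a negative number (`~3 == -4`), so the mask is what turns it back into the unsigned 256-bit value the EVM works with.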

Now, let’s look at slt(…, 32). The slt() function stands for “Signed Less Than”: it interprets both operands as signed (two's complement) integers. Read as a signed number, not(3) = 2²⁵⁶ - 4 is -4, so add(calldatasize(), not(3)) computes calldatasize() - 4, the number of calldata bytes that follow the 4-byte function selector. The result is negative if calldatasize() is less than 4, and slt() still compares it correctly against 32. The condition therefore checks whether the calldata, after excluding the function selector, has fewer bytes than a full 32-byte argument. We use the setNumber(uint256) function as an example here, where this check makes sense: to set a number, we need a uint256 argument, which takes up 32 bytes. Hence, if slt(add(calldatasize(), not(3)), 32) returns 1, the function reverts.
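A small Python model shows how the wrapped addition and the signed comparison work together. It is a sketch only: `add`, `not_` and `slt` imitate the Yul built-ins, and `args_too_short` is a hypothetical helper name.

```python
WORD = 2**256


def not_(x: int) -> int:
    """Bitwise NOT on a 256-bit word."""
    return ~x & (WORD - 1)


def add(a: int, b: int) -> int:
    """Addition modulo 2**256, like the Yul built-in."""
    return (a + b) % WORD


def to_signed(x: int) -> int:
    """Reinterpret a 256-bit word as a two's complement signed integer."""
    return x - WORD if x >= 2**255 else x


def slt(a: int, b: int) -> int:
    """Signed less-than, returning 1 or 0 like the Yul built-in."""
    return 1 if to_signed(a) < to_signed(b) else 0


def args_too_short(calldatasize: int) -> bool:
    """True when fewer than 32 bytes follow the 4-byte selector."""
    return slt(add(calldatasize, not_(3)), 32) == 1
```

With 36 bytes of calldata (selector plus one word) the check passes; with 35 bytes, or with just the 4-byte selector, it would trigger the revert.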

Note that the increment() and number() cases, which take no arguments, carry the same kind of check with 0 in place of 32: slt(add(calldatasize(), not(3)), _2), where _2 is 0.

If calldatasize() is 4 bytes long, then the add() returns 0, because the sum wraps around the 256-bit word. We then check whether 0 is less than 0, which is false, and so we do not revert. If calldatasize() is larger than 4 bytes, say 5, we essentially have slt(1, 0), which is also false, so we do not revert either: extra trailing calldata is not rejected. The check could only fire if calldatasize() were less than 4, which would make the left operand negative, and that cannot happen here because iszero(lt(calldatasize(), 4)) has already established that we have at least a four-byte function selector. So for these two functions (one of which is actually the getter of a public uint256 storage variable) the check never triggers. In this particular example, each function’s slt could safely be changed to lt, I think. Why Yul uses slt in this case remains an open question for me.
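To see which calldata sizes would trip the zero-argument form of the check, slt(add(calldatasize(), not(3)), 0), here is a short Python sketch. The helpers imitate the Yul built-ins, and `zero_arg_check_reverts` is a hypothetical name chosen for this illustration.

```python
WORD = 2**256


def not_(x: int) -> int:
    """Bitwise NOT on a 256-bit word."""
    return ~x & (WORD - 1)


def add(a: int, b: int) -> int:
    """Addition modulo 2**256."""
    return (a + b) % WORD


def slt(a: int, b: int) -> int:
    """Signed less-than on 256-bit two's complement words."""
    def to_signed(x: int) -> int:
        return x - WORD if x >= 2**255 else x
    return 1 if to_signed(a) < to_signed(b) else 0


def zero_arg_check_reverts(calldatasize: int) -> bool:
    """Model of: if slt(add(calldatasize(), not(3)), 0) { revert(0, 0) }."""
    return slt(add(calldatasize, not_(3)), 0) == 1
```

In this model the check is false for 4 bytes and for 5 bytes, and true only for sizes below 4, which the earlier iszero(lt(calldatasize(), 4)) guard already excludes.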

## sstore()

Next, let’s explore the sstore() call in the setNumber(uint256) function (that’s the first case in our switch statement).

In Yul, sstore(key, value) is a built-in function that stores value at the location specified by key in the contract's storage. Contract storage is a key-value store where both the keys and values are 32 bytes (256 bits). The keys can be thought of as addresses in the storage space.

The calldataload(offset) function returns the 32 bytes of call data starting at offset bytes into the call data. The first four bytes of any call hold the function selector, so the actual arguments to the function start at offset 4 (the fifth byte).

So, sstore(_2, calldataload(4)) takes the first argument of the function call (the 32 bytes starting at byte 4 of the call data) and stores it in the contract's storage at slot 0 (because _2 was initialized to 0). This corresponds to the setNumber(uint256 newNumber) function in our Solidity code, where newNumber is stored in the number state variable of the contract (which is located at the 0th storage slot).
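As a rough Python sketch of that store: storage is modelled as a dictionary from slot to value, the selector bytes are a made-up placeholder, and dispatch, the callvalue() check and the length check are deliberately left out.

```python
# Toy model of contract storage: unset slots read as zero.
storage: dict[int, int] = {}


def sstore(key: int, value: int) -> None:
    """Store a 256-bit value at the given storage slot."""
    storage[key] = value


def sload(key: int) -> int:
    """Load the value at the given storage slot (0 if never written)."""
    return storage.get(key, 0)


def calldataload(calldata: bytes, offset: int) -> int:
    """Load 32 bytes from calldata at offset, zero-padded past the end."""
    chunk = calldata[offset:offset + 32]
    return int.from_bytes(chunk + bytes(32 - len(chunk)), "big")


def set_number(calldata: bytes) -> None:
    """Model of the setNumber case body: sstore(0, calldataload(4))."""
    sstore(0, calldataload(calldata, 4))


# Placeholder selector bytes followed by the argument 7.
example_calldata = bytes.fromhex("deadbeef") + (7).to_bytes(32, "big")
```

Calling `set_number(example_calldata)` leaves 7 in slot 0, which is where the `number` state variable lives.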

## mstore(), return(), eq(), shl()

mstore(0x80, sload(0)): I have substituted the actual values defined previously. This is the number() getter. We simply copy the value in the zeroth storage slot into free memory (recall that we initialized the free memory pointer to 0x80). This is in preparation for returning the value to the caller.

return(0x80, 32): we return what we have just placed into free memory. The value starts at location 0x80 and is 32 bytes long (a 256-bit unsigned integer).
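These two steps can be sketched in Python as follows. The example value 42 in slot 0 is an assumption for illustration, and `return_` is a stand-in for Yul's return(), which is a reserved word in Python.

```python
# Toy state: slot 0 holds an example value; memory is a fixed-size buffer.
storage = {0: 42}
memory = bytearray(0x80 + 32)


def sload(key: int) -> int:
    """Load the value at a storage slot (0 if never written)."""
    return storage.get(key, 0)


def mstore(offset: int, value: int) -> None:
    """Write value as a 32-byte big-endian word at offset."""
    memory[offset:offset + 32] = value.to_bytes(32, "big")


def return_(offset: int, size: int) -> bytes:
    """Model of return(offset, size): hand back a slice of memory."""
    return bytes(memory[offset:offset + size])


def number() -> bytes:
    """Model of the getter body: mstore(0x80, sload(0)); return(0x80, 32)."""
    mstore(0x80, sload(0))
    return return_(0x80, 32)
```

The caller receives exactly 32 bytes, which decode as the big-endian uint256 stored in slot 0.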

At this point we have covered every function but increment(). Let’s look at it now.

Let’s skip over the reverts since we have talked about these already.
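The increment() listing is not reproduced here, so what follows is only a rough Python sketch. It assumes the generated code follows Solidity 0.8's checked-arithmetic pattern: compare the stored value with not(0) using eq(), and on a match revert with a Panic(uint256) payload whose selector 0x4e487b71 is left-aligned with shl(224, ...) and whose code is 0x11 (arithmetic overflow). The `Revert` class and the exact statement order are assumptions made for this illustration.

```python
WORD = 2**256
MAX_UINT256 = WORD - 1          # the value of not(0)
PANIC_SELECTOR = 0x4e487b71     # selector of Panic(uint256)
OVERFLOW_CODE = 0x11            # panic code for arithmetic overflow


class Revert(Exception):
    """Hypothetical stand-in for an EVM revert carrying return data."""

    def __init__(self, data: bytes):
        super().__init__(data.hex())
        self.data = data


storage = {0: 0}


def shl(shift: int, value: int) -> int:
    """Shift left on a 256-bit word, discarding bits that overflow."""
    return (value << shift) % WORD if shift < 256 else 0


def increment() -> None:
    """Assumed shape of the increment() case: overflow check, then store."""
    current = storage.get(0, 0)
    if current == MAX_UINT256:                      # eq(current, not(0))
        head = shl(224, PANIC_SELECTOR)             # selector in the top 4 bytes
        payload = head.to_bytes(32, "big")[:4] + OVERFLOW_CODE.to_bytes(32, "big")
        raise Revert(payload)                       # 4 + 32 = 0x24 bytes
    storage[0] = current + 1                        # sstore(0, add(current, 1))
```

Under these assumptions, incrementing from 0 stores 1, while incrementing from 2²⁵⁶ - 1 reverts with a 36-byte payload instead of wrapping around to 0.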

Hope you learned something new from this article. I will be writing posts like this primarily to serve me as a reference point in the future.


Copyright© 2023 reNFT