2 changes: 1 addition & 1 deletion Changelog.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
Language Features:

Compiler Features:
* Constant Optimizer: More efficient computed constants. Approximately a 2.9% reduction in optimized bytecode size, along with a small improvement in average gas costs.

Bugfixes:
* NatSpec: Disallow `@return` tag in event documentation.
Expand Down Expand Up @@ -96,7 +97,6 @@ Compiler Features:
* EVM: Set default EVM Version to `prague`.
* NatSpec: Capture Natspec documentation of `enum` values in the AST.


Bugfixes:
* SMTChecker: Do not consider loop conditions as constant-condition verification target as this could cause incorrect reports and internal compiler errors.
* SMTChecker: Fix incorrect analysis when only a subset of contracts is selected with `--model-checker-contracts`.
Expand Down
63 changes: 62 additions & 1 deletion libevmasm/ConstantOptimiser.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -253,7 +253,68 @@ AssemblyItems ComputeMethod::findRepresentation(u256 const& _value)
if (_value < 0x10000)
// Very small value, not worth computing
return AssemblyItems{_value};
else if (numberEncodingSize(~_value) < numberEncodingSize(_value))

// A constant having a single contiguous run of ones, common in masking,
// can be efficiently created by first filling with 1's, then shifting
// left and/or right as needed to make the sides full of zeros.
//
//                                   onesEnd    onesStart
//                                      v           v
// 0x000000000000000000000000000000000000ffffffffffff0000000000000000
//
// If the value is all zeros, both onesEnd and onesStart will be 256.

// Find the index of the lowest 1 bit.
unsigned onesStart;
for (onesStart = 0; onesStart < 256; ++onesStart)
if (((_value >> onesStart) & 1) != 0)
break;

// Find the index just past the highest 1 bit in the run of 1's.
unsigned onesEnd;
for (onesEnd = onesStart; onesEnd < 256; ++onesEnd)
if (((_value >> onesEnd) & 1) == 0)
break;

// Check that there are no ones after onesEnd
bool const isOnlyContiguousOnes = (onesEnd == 256 || (_value >> onesEnd) == 0);

bool const worthTrying =
_value != 0 && // defensive check
m_params.evmVersion.hasBitwiseShifting() &&
onesEnd - onesStart > 32 && // push would be more efficient otherwise
(onesEnd < 256 || onesStart > 16); // negation more effective for 0xFF..FFFF00

if (isOnlyContiguousOnes && worthTrying)
{
// Build up the code, starting with a negated 0 to produce all ones.
// not(0x00) ==
// 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
AssemblyItems newRoutine = AssemblyItems{u256(0), Instruction::NOT};

// Shift right as needed to create the correct number of 1 bits.
// 0x0000000000000000000000000000000000000000000000000000ffffffffffff
// If left aligned, we only need a left shift, and skip the right shift.
if (onesEnd != 256)
newRoutine += AssemblyItems{u256(256 - (onesEnd - onesStart)), Instruction::SHR};

// If needed, shift left to position the bits in the correct place
// or to set up a left-aligned mask.
// 0x000000000000000000000000000000000000ffffffffffff0000000000000000
if (onesStart > 0)
newRoutine += AssemblyItems{u256(onesStart), Instruction::SHL};
return newRoutine;
}

// pure negation can sometimes produce bad results
// example: 0xff00000000000000000000000000000000000000000000000000000000000000
// 0xff at the most significant byte of u256
// without the extra condition: not(sub(shl(0xf8, 0x01), 0x01))
// the extra condition turns that into: shl(0xf8, 0xff)
if (
numberEncodingSize(~_value) < numberEncodingSize(_value) &&
(onesEnd < 256 || onesEnd - onesStart > 16)
)
Comment on lines +309 to +317

Contributor

This is interesting!
We could try to separate this change and see what effect we get from this small change.

This depends on the computed mask variables (onesEnd, onesStart), so separating it would require duplicating a chunk of code in both PRs.

Contributor

What I am suggesting is to keep the computation of onesEnd and onesStart, but skip the computation of newRoutine. Do you expect that would still provide some benefit?

Yes, it would provide a tiny benefit for rare cases like the given example (easy to see in a test case, but it would hardly move the needle for real contracts). IMHO, not worth a separate PR.

P.S. Did you notice yul optimizer doesn't need this? It's because it's structurally better than the libevmasm version (yul compares different computations; libevmasm compares a single computation with other choices, like DATACOPY, so the single computation has to have extra conditionals).

Contributor

> P.S. Did you notice yul optimizer doesn't need this? It's because it's structurally better than the libevmasm version (yul compares different computations; libevmasm compares a single computation with other choices, like DATACOPY, so the single computation has to have extra conditionals).

I definitely see some differences. If the yul version does something better, the evmasm version could be improved to do the same thing, no?

In any case, no need to separate this into a new PR.

> Removing the yul optimizer on top of the not(0) constant optimizer makes the yul code even more optimized

That's weird. The tests included in this repo's CI tell a different story. When I added the not(0) optimizations to the yul path, there were improvements. Compare
#15935 (comment)
and
#15935 (comment)

For example, "colony" went from -1.91 to -2.33, etc.

Contributor Author

This uses the not(0) constant optimizer as the baseline, so it compares the not(0) constant optimizer PR alone (0 changes by definition) against having both the not(0) constant optimizer and the yul constant optimizer removed (the splat version).

The not(0) constant optimizer is definitely a win in either case.

@DanielVF DanielVF May 22, 2026

Contributor Author

Okay, I now get it, and I also understand why, on the performance optimization side, I've been seeing things that only speed up yul speed up the regular compiler too.

appendYulUtilityFunctions() is called by appendMissingFunctions() and places yul code into contracts. Doing this runs the yul side, including the yul optimizers, even on non-IR contracts.

Contributor Author

@moh-eulith Here's the chart, with both against a baseline of the current development branch. This is probably what you were expecting.

[Image: chart comparing both variants against a baseline of the current development branch]

I assume "splat" is without the yul optimizer. I think the code this is benchmarked against matters. I just tried this on my own contracts: removing yul-constant-optimizer increased the total contract size from 61,593 to 61,903 (I compile with --via-ir --optimize --optimize-runs 2, so presumably similar to your via-ir-low-runs).

@blishko's PR shows the same thing (#16738 (comment)), where ir-optimize-evm+yul has some regressions and some improvements.

// Negated is shorter to represent
return findRepresentation(~_value) + AssemblyItems{Instruction::NOT};
else
Expand Down
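
The run detection in the hunk above can be tried in isolation. The following is a minimal sketch, not the PR's code: it assumes C++17, substitutes `std::bitset<256>` for `u256`, and the names `ShiftPlan`, `planContiguousOnes`, `apply` and `makeRun` are hypothetical. It deliberately leaves out the `worthTrying` profitability checks and only shows how `onesStart` and `onesEnd` turn into the two shift amounts.

```cpp
#include <bitset>
#include <cassert>
#include <optional>

using Word = std::bitset<256>;

// Hypothetical result type: the shift amounts applied to not(0).
struct ShiftPlan
{
    unsigned shr; // 0 means the right shift is skipped
    unsigned shl; // 0 means the left shift is skipped
};

// Returns a plan only if `value` is a single contiguous run of 1 bits.
std::optional<ShiftPlan> planContiguousOnes(Word const& value)
{
    // Index of the lowest 1 bit.
    unsigned onesStart = 0;
    while (onesStart < 256 && !value.test(onesStart))
        ++onesStart;
    if (onesStart == 256)
        return std::nullopt; // all zeros

    // Index just past the highest 1 bit of that run.
    unsigned onesEnd = onesStart;
    while (onesEnd < 256 && value.test(onesEnd))
        ++onesEnd;

    // A 1 bit above onesEnd means there is more than one run.
    if (onesEnd < 256 && (value >> onesEnd).any())
        return std::nullopt;

    unsigned const runLength = onesEnd - onesStart;
    return ShiftPlan{onesEnd == 256 ? 0u : 256u - runLength, onesStart};
}

// Rebuilds the constant from a plan: not(0), then shr, then shl.
Word apply(ShiftPlan const& plan)
{
    Word result = ~Word{};
    result >>= plan.shr;
    result <<= plan.shl;
    return result;
}

// Example helper: `length` ones whose lowest bit is at index `start`.
Word makeRun(unsigned start, unsigned length)
{
    assert(start + length <= 256);
    Word w;
    for (unsigned i = 0; i < length; ++i)
        w.set(start + i);
    return w;
}
```

For the constant drawn in the code comment (48 ones starting at bit 64), `planContiguousOnes(makeRun(64, 48))` gives a right shift of 208 followed by a left shift of 64, and `apply` reproduces the original word.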
55 changes: 55 additions & 0 deletions libyul/backends/evm/ConstantOptimiser.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,8 @@ struct MiniEVMInterpreter
return exp256(args.at(0), args.at(1));
case evmasm::Instruction::SHL:
return args.at(0) > 255 ? 0 : (args.at(1) << unsigned(args.at(0)));
case evmasm::Instruction::SHR:
return args.at(0) > 255 ? 0 : (args.at(1) >> unsigned(args.at(0)));
case evmasm::Instruction::NOT:
return ~args.at(0);
default:
Expand Down Expand Up @@ -135,6 +137,59 @@ Representation const& RepresentationFinder::findRepresentation(u256 const& _valu

Representation routine = represent(_value);

// A constant having a single contiguous run of ones, common in masking,
// can be efficiently created by first filling with 1's, then shifting
// left and/or right as needed to make the sides full of zeros.
//
//                                   onesEnd    onesStart
//                                      v           v
// 0x000000000000000000000000000000000000ffffffffffff0000000000000000
//
// If the value is all zeros, both onesEnd and onesStart will be 256.

// Find the index of the lowest 1 bit.
unsigned onesStart;
for (onesStart = 0; onesStart < 256; ++onesStart)
if (((_value >> onesStart) & 1) != 0)
break;

// Find the index just past the highest 1 bit in the run of 1's.
unsigned onesEnd;
for (onesEnd = onesStart; onesEnd < 256; ++onesEnd)
if (((_value >> onesEnd) & 1) == 0)
break;

// Check that there are no ones after onesEnd
bool const isOnlyContiguousOnes = (onesEnd == 256 || (_value >> onesEnd) == 0);

bool const worthTrying =
_value != 0 && // defensive check
m_dialect.evmVersion().hasBitwiseShifting() &&
onesEnd - onesStart > 32 && // push would be more efficient otherwise
(onesEnd < 256 || onesStart > 16); // negation more effective for 0xFF..FFFF00

if (isOnlyContiguousOnes && worthTrying)
{
// Build up the code, starting with a negated 0 to produce all ones.
// not(0x00) ==
// 0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
Representation newRoutine = represent(*auxHandles.not_, represent(0));

// Shift right as needed to create the correct number of 1 bits.
// 0x0000000000000000000000000000000000000000000000000000ffffffffffff
// If left aligned, we only need a left shift, and skip the right shift.
if (onesEnd != 256)
newRoutine = represent(*auxHandles.shr, represent(256 - (onesEnd - onesStart)), newRoutine);

// If needed, shift left to position the bits in the correct place
// or to set up a left-aligned mask.
// 0x000000000000000000000000000000000000ffffffffffff0000000000000000
if (onesStart > 0)
newRoutine = represent(*auxHandles.shl, represent(onesStart), newRoutine);
routine = min(std::move(routine), std::move(newRoutine));
}


if (numberEncodingSize(~_value) < numberEncodingSize(_value))
// Negated is shorter to represent
routine = min(std::move(routine), represent(*auxHandles.not_, findRepresentation(~_value)));
Expand Down
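
The `SHL`, `SHR` and `NOT` cases in `MiniEVMInterpreter` let the optimiser check that a candidate routine really evaluates to the requested constant. A rough stand-alone sketch of that check follows. It is an illustration rather than the interpreter itself: `std::bitset<256>` stands in for `u256`, the shift amount is narrowed to `std::uint64_t` for simplicity, and `evmNot`, `evmShl`, `evmShr` and `lowOnes` are hypothetical names. The argument order follows the Yul builtins, with the shift amount first.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Word = std::bitset<256>;

// not(x): bitwise complement of a 256-bit word.
Word evmNot(Word const& value)
{
    return ~value;
}

// shl(shift, value): shifts of 256 or more yield zero.
Word evmShl(std::uint64_t shift, Word const& value)
{
    return shift > 255 ? Word{} : value << static_cast<std::size_t>(shift);
}

// shr(shift, value): logical right shift, zero for shifts of 256 or more.
Word evmShr(std::uint64_t shift, Word const& value)
{
    return shift > 255 ? Word{} : value >> static_cast<std::size_t>(shift);
}

// Example helper: a word whose lowest `count` bits are set,
// i.e. the value of sub(shl(count, 1), 1) for count < 256.
Word lowOnes(unsigned count)
{
    assert(count <= 256);
    Word w;
    for (unsigned i = 0; i < count; ++i)
        w.set(i);
    return w;
}
```

With these helpers, `evmShr(192, evmNot(Word{}))` equals `lowOnes(64)`, and `evmShl(160, evmNot(Word{}))` equals `~lowOnes(160)`, which are the same rewrites that appear in the updated test expectations.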
1 change: 1 addition & 0 deletions libyul/backends/evm/EVMDialect.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -264,6 +264,7 @@ EVMDialect::EVMDialect(langutil::EVMVersion _evmVersion, std::optional<uint8_t>
m_auxiliaryBuiltinHandles.mul = EVMDialect::findBuiltin("mul");
m_auxiliaryBuiltinHandles.not_ = EVMDialect::findBuiltin("not");
m_auxiliaryBuiltinHandles.shl = EVMDialect::findBuiltin("shl");
m_auxiliaryBuiltinHandles.shr = EVMDialect::findBuiltin("shr");
m_auxiliaryBuiltinHandles.sub = EVMDialect::findBuiltin("sub");
}

Expand Down
1 change: 1 addition & 0 deletions libyul/backends/evm/EVMDialect.h
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,7 @@ class EVMDialect: public Dialect
std::optional<BuiltinHandle> mul;
std::optional<BuiltinHandle> not_;
std::optional<BuiltinHandle> shl;
std::optional<BuiltinHandle> shr;
std::optional<BuiltinHandle> sub;
};
/// Constructor, should only be used internally. Use the factory functions below.
Expand Down
8 changes: 4 additions & 4 deletions test/cmdlineTests/ir_subobject_order/output
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ object "C_33" {
if callvalue() { revert(0, 0) }
let _2 := datasize("A_13")
let _3 := add(_1, _2)
if or(gt(_3, sub(shl(64, 1), 1)), lt(_3, _1))
if or(gt(_3, shr(192, not(0))), lt(_3, _1))
{
mstore(0, shl(224, 0x4e487b71))
mstore(4, 0x41)
Expand All @@ -44,11 +44,11 @@ object "C_33" {
returndatacopy(pos, 0, returndatasize())
revert(pos, returndatasize())
}
sstore(0, or(and(sload(0), not(sub(shl(160, 1), 1))), and(expr_address, sub(shl(160, 1), 1))))
sstore(0, or(and(sload(0), shl(160, not(0))), and(expr_address, shr(96, not(0)))))
let _4 := mload(64)
let _5 := datasize("B_7")
let _6 := add(_4, _5)
if or(gt(_6, sub(shl(64, 1), 1)), lt(_6, _4))
if or(gt(_6, shr(192, not(0))), lt(_6, _4))
{
mstore(0, shl(224, 0x4e487b71))
mstore(4, 0x41)
Expand All @@ -62,7 +62,7 @@ object "C_33" {
returndatacopy(pos_1, 0, returndatasize())
revert(pos_1, returndatasize())
}
sstore(0x01, or(and(sload(0x01), not(sub(shl(160, 1), 1))), and(expr_address_1, sub(shl(160, 1), 1))))
sstore(0x01, or(and(sload(0x01), shl(160, not(0))), and(expr_address_1, shr(96, not(0)))))
let _7 := mload(64)
let _8 := datasize("C_33_deployed")
codecopy(_7, dataoffset("C_33_deployed"), _8)
Expand Down
4 changes: 2 additions & 2 deletions test/cmdlineTests/optimizer_BlockDeDuplicator/output
Original file line number Diff line number Diff line change
Expand Up @@ -7,11 +7,11 @@ EVM assembly:
0x00
dup1
sload
not(sub(shl(0x40, 0x01), 0x01))
shl(0x40, not(0x00))
and
/* "input.sol":201:206 fun_x */
or(tag_0_7, shl(0x20, tag_2))
sub(shl(0x40, 0x01), 0x01)
shr(0xc0, not(0x00))
/* "input.sol":179:210 function() r = true ? fun_x : f */
and
or
Expand Down
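
A back-of-the-envelope byte count helps explain why the new forms in the assembly output above are shorter. The sketch below is only an estimate under stated assumptions: every opcode is one byte, a push costs one byte plus its immediate, shift amounts fit in one byte, and `PUSH0` is available for the zero (Shanghai or later). All function names are hypothetical, and the real assembler output may differ.

```cpp
#include <cassert>
#include <cstddef>

// Bytes for a PUSH carrying `immediateBytes` bytes of immediate data.
// immediateBytes == 0 models PUSH0 (assumed available).
std::size_t pushSize(std::size_t immediateBytes)
{
    assert(immediateBytes <= 32);
    return 1 + immediateBytes;
}

// Every non-push instruction used here is a single byte.
std::size_t const opSize = 1;

// sub(shl(n, 1), 1): three one-byte pushes, then SHL and SUB.
std::size_t oldLowMaskSize()
{
    return 3 * pushSize(1) + 2 * opSize;
}

// shr(256 - n, not(0)): PUSH0, NOT, one-byte push, SHR.
std::size_t newLowMaskSize()
{
    return pushSize(0) + opSize + pushSize(1) + opSize;
}

// not(sub(shl(n, 1), 1)): the old low mask plus a trailing NOT.
std::size_t oldHighMaskSize()
{
    return oldLowMaskSize() + opSize;
}

// shl(n, not(0)): PUSH0, NOT, one-byte push, SHL.
std::size_t newHighMaskSize()
{
    return pushSize(0) + opSize + pushSize(1) + opSize;
}

// not(literal): a push of `literalBytes` bytes followed by NOT.
std::size_t negatedLiteralSize(std::size_t literalBytes)
{
    return pushSize(literalBytes) + opSize;
}
```

Under these assumptions `sub(shl(0x40, 0x01), 0x01)` takes 8 bytes against 5 for `shr(0xc0, not(0x00))`, and `not(0xffffffffffffffff)` takes 10 bytes against 5 for `shl(0x40, not(0x00))`. The 2.9% figure in the changelog comes from the PR's benchmarks, not from this arithmetic.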
Original file line number Diff line number Diff line change
Expand Up @@ -84,7 +84,7 @@ sub_0: assembly {
/* "input.sol":147:152 x = f */
dup1
sload
not(0xffffffffffffffff)
shl(0x40, not(0x00))
and
/* "input.sol":151:152 f */
tag_17
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,11 +17,11 @@ tag_1:
/* "input.sol":93:98 x = f */
dup1
sload
not(sub(shl(0x40, 0x01), 0x01))
shl(0x40, not(0x00))
and
/* "input.sol":97:98 f */
or(tag_0_12, shl(0x20, tag_4))
sub(shl(0x40, 0x01), 0x01)
shr(0xc0, not(0x00))
/* "input.sol":93:98 x = f */
and
or
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@
dup4
add
swap2
sub(shl(0x40, 0x01), 0x01)
shr(0xc0, not(0x00))
dup4
gt
dup5
Expand Down Expand Up @@ -104,7 +104,7 @@ sub_0: assembly {
jumpi(tag_26, callvalue)
jumpi(tag_26, slt(add(not(0x03), calldatasize), 0x00))
sload(0x00)
sub(shl(0xff, 0x01), 0x01)
shr(0x01, not(0x00))
dup2
eq
tag_14
Expand Down Expand Up @@ -350,7 +350,7 @@ sub_0: assembly {
dup4
add
swap2
sub(shl(0x40, 0x01), 0x01)
shr(0xc0, not(0x00))
dup4
gt
dup5
Expand Down Expand Up @@ -450,7 +450,7 @@ sub_0: assembly {
jumpi(tag_26, callvalue)
jumpi(tag_26, slt(add(not(0x03), calldatasize), 0x00))
sload(0x00)
sub(shl(0xff, 0x01), 0x01)
shr(0x01, not(0x00))
dup2
eq
tag_14
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -620,7 +620,7 @@ object \"C_54\" {
let programSize := datasize(\"C_54\")
let argSize := sub(codesize(), programSize)
let newFreePtr := add(_1, and(add(argSize, 31), not(31)))
if or(gt(newFreePtr, sub(shl(64, 1), 1)), lt(newFreePtr, _1))
if or(gt(newFreePtr, shr(192, not(0))), lt(newFreePtr, _1))
{
mstore(/** @src -1:-1:-1 */ 0, /** @src 0:79:510 \"contract C...\" */ shl(224, 0x4e487b71))
mstore(4, 0x41)
Expand Down Expand Up @@ -685,7 +685,7 @@ object \"C_54\" {
if callvalue() { revert(0, 0) }
if slt(add(calldatasize(), not(3)), 0) { revert(0, 0) }
let _3 := sload(0)
if eq(_3, sub(shl(255, 1), 1))
if eq(_3, shr(1, not(0)))
{
mstore(0, shl(224, 0x4e487b71))
mstore(4, 0x11)
Expand Down Expand Up @@ -1457,7 +1457,7 @@ object \"D_72\" {
let programSize := datasize(\"D_72\")
let argSize := sub(codesize(), programSize)
let newFreePtr := add(_1, and(add(argSize, 31), not(31)))
if or(gt(newFreePtr, sub(shl(64, 1), 1)), lt(newFreePtr, _1))
if or(gt(newFreePtr, shr(192, not(0))), lt(newFreePtr, _1))
{
mstore(/** @src -1:-1:-1 */ 0, /** @src 1:91:181 \"contract D is C(3)...\" */ shl(224, 0x4e487b71))
mstore(4, 0x41)
Expand Down Expand Up @@ -1529,7 +1529,7 @@ object \"D_72\" {
if callvalue() { revert(0, 0) }
if slt(add(calldatasize(), not(3)), 0) { revert(0, 0) }
let _3 := sload(0)
if eq(_3, sub(shl(255, 1), 1))
if eq(_3, shr(1, not(0)))
{
mstore(0, shl(224, 0x4e487b71))
mstore(4, 0x11)
Expand Down