delvingbitcoin
Libbitcoin for Core people
Posted on: November 30, 2024 07:33 UTC
The post examines how the limits of native SHA256 hashing affect performance, particularly when few threads are available.
Although the CPU is not fully utilized, these limits still weigh heavily on Core's operations, which underscores how sequential the process is. Vectorized implementations (SSE4, AVX2, and AVX512) are used in Merkle tree construction and message scheduling, but the benchmark hardware lacks AVX512 support.
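To show why Merkle tree construction lends itself to vectorization, here is a minimal Python sketch of a Bitcoin-style Merkle root built with double SHA256. It is only an illustration, not code from Core or Libbitcoin; the function names `dsha256` and `merkle_root` are hypothetical. The point is that every pair on a level is hashed independently of the others, which is the property SIMD paths such as SSE4 or AVX2 can exploit by processing several pairs per instruction.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA256, as used for Bitcoin block and transaction hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Reduce a list of 32-byte hashes to a single Merkle root.

    Hypothetical helper for illustration only.
    """
    if not leaves:
        raise ValueError("at least one leaf hash is required")
    level = list(leaves)
    while len(level) > 1:
        # Bitcoin duplicates the last hash when a level has an odd count.
        if len(level) % 2:
            level.append(level[-1])
        # Each pair is hashed independently of every other pair on this
        # level; a vectorized implementation processes several at once.
        level = [
            dsha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]
```

Only the levels depend on one another: level k+1 cannot start until level k is complete, while the work inside a level is fully parallel.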
Efforts to ease these constraints are underway, including integrating SHANI (the x86 SHA extensions) and implementing several SHA optimizations. These include caching entire block paddings, rewriting functions for efficiency, and copying arrays in a vectorization-friendly way. Together, these measures aim to speed up SHA256 handling within the limits of the algorithm and the available hardware.
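One plausible reading of "caching entire block paddings" can be sketched as follows. SHA256 padding depends only on the message length, so for fixed-size inputs it is a constant that can be computed once and reused. This Python sketch assumes that interpretation and is not taken from Libbitcoin; `sha256_padding`, `PAD_64`, and `PAD_32` are hypothetical names.

```python
from functools import lru_cache

BLOCK_SIZE = 64  # SHA256 processes 64-byte (512-bit) blocks


@lru_cache(maxsize=None)
def sha256_padding(message_len: int) -> bytes:
    """Return the SHA256 padding for a message of message_len bytes.

    Padding is a 0x80 byte, then zeros until the length is 56 mod 64,
    then the message length in bits as a 64-bit big-endian integer.
    Results are cached because they depend only on the length.
    """
    zeros = (55 - message_len) % BLOCK_SIZE
    bit_length = (message_len * 8).to_bytes(8, "big")
    return b"\x80" + b"\x00" * zeros + bit_length


# A 64-byte input (two concatenated 32-byte hashes in a Merkle node)
# is followed by one entire block that consists only of padding.
PAD_64 = sha256_padding(64)

# A 32-byte input (the second pass of double SHA256) has its block
# completed by 32 constant bytes.
PAD_32 = sha256_padding(32)
```

With a raw compression function, a constant padding block also has a constant message schedule, so that part of the work could in principle be precomputed as well. Whether the post's optimization goes that far is not stated in the summary.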