I am very much intrigued by the differences between the algorithms in the two code snippets.
The algorithm in the C code looks slower than the one in the C# code because it has three multiplications that depend on each other, whereas the last two multiplications of the C# code are independent of each other.
I am also curious why you did not use ‘val = (val * 2561) >> 8’ in the second-to-last line of the C# code.
I have implemented both algorithms in Java, and measurements with JMH indicate that the C# algorithm is slower than the C algorithm. However, when I use ‘val = (val * 2561) >> 8’ in the C# algorithm, it has the same speed as the C algorithm.
I have merged the two versions. Thanks for your comment.
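For illustration, here is a minimal C sketch of a merged eight-digit SWAR routine along the lines discussed above; the function name, exact constants, and masks are one standard formulation of the trick and should be treated as assumptions rather than a verbatim copy of the post's final code. A little-endian target is assumed.

    #include <stdint.h>
    #include <string.h>

    /* Minimal sketch: parse exactly eight ASCII digits with SWAR.
       Assumes a little-endian target so the first character lands in
       the least significant byte. */
    static uint32_t parse_eight_digits_swar_sketch(const char *chars) {
      uint64_t val;
      memcpy(&val, chars, sizeof(val)); /* load eight bytes at once */
      /* Keep the low nibble of each byte: '0'..'9' -> 0..9. Then the
         step asked about above: 2561 = (10 << 8) + 1 merges adjacent
         digits into base-100 pairs. */
      val = (val & 0x0F0F0F0F0F0F0F0FULL) * 2561 >> 8;
      /* 6553601 = (100 << 16) + 1 merges pairs into 4-digit groups. */
      val = (val & 0x00FF00FF00FF00FFULL) * 6553601 >> 16;
      /* 42949672960001 = (10000 << 32) + 1 merges the two groups. */
      return (uint32_t)((val & 0x0000FFFF0000FFFFULL) * 42949672960001ULL >> 32);
    }

Note that each multiplication here feeds the next, which is exactly the dependency chain the comment above measures.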
Todd Lehman says:
Hi, Daniel. A couple of things I just noticed in parse_eight_digits_swar… (1) It seems to be missing the declaration and loading of the variable val. (2) val was declared earlier as type int64_t, but the return value of the function is uint32_t. This should be fine since $\log_2(10^8) < 32$, but it is likely to generate a compiler warning in the absence of an explicit cast, e.g., ‘return (uint32_t)val;’.
Yes. It is likely to generate a warning since there is an implicit cast.
It seems to be missing the declaration and loading of the variable val.
It is passed as a parameter to the function.
Todd Lehman says:
It is passed as a parameter to the function.
It is now (thank you for fixing it!), but at the time I posted my comment, it was not. The only parameter being passed was const char *chars (I’m looking at it right now on my screen, which still has the old code in another window.)
Cheers!
Thanks for your comment. You are correct.
It may be worth noting that this technique is not only useful for converting an 8-digit decimal integer to binary, but can also be used to load 8 digits at a time from a larger number, if you treat the 8 characters as a base-100,000,000 digit. So, say you were loading a 40-digit base-10 number with leading zeros into a uint128_t variable. It would require only five invocations of the 8-digit parsing function, with five additional multiplications (by the constant 100,000,000) and five additions to accumulate the five base-100,000,000 digits.
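A minimal sketch of that accumulation, assuming GCC/Clang's __uint128_t extension; the exact signature of the post's eight-digit routine is assumed here, and the helper name is illustrative:

    #include <stdint.h>

    /* Assumed prototype for the routine discussed in the post. */
    uint32_t parse_eight_digits_swar(const char *chars);

    /* Sketch: parse a 40-digit, zero-padded decimal string into a
       128-bit integer, eight digits (i.e., one base-100,000,000
       digit) at a time. */
    static __uint128_t parse_forty_digits(const char *chars) {
      __uint128_t result = 0;
      for (int i = 0; i < 5; i++) {
        /* Shift the accumulator left by one base-100,000,000 digit,
           then add the next eight-digit block. */
        result = result * 100000000u + parse_eight_digits_swar(chars + 8 * i);
      }
      return result;
    }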
Todd Lehman says:
One other related thought for possible exploration is whether SWAR techniques can be used to efficiently count the number of ASCII base-10 digits (‘0’..‘9’) appearing sequentially at a given position in a string or at a given memory location. If so, then one could construct eight versions of the SWAR parser, one for each count of available characters, 1 to 8, and invoke the appropriate version, thus eliminating the requirement for leading zeros to be present.
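One possible SWAR digit-count sketch along those lines, assuming a little-endian target and GCC/Clang's __builtin_ctzll; all names here are illustrative, not from the post:

    #include <stdint.h>
    #include <string.h>

    /* Sketch: count how many consecutive ASCII digits ('0'..'9')
       appear at p, examining eight bytes. A byte is a digit iff its
       high nibble is 3 and its low nibble is at most 9. Assumes at
       least eight readable bytes at p and a little-endian target, so
       the first character is the least significant byte. */
    static int count_leading_digits8(const char *p) {
      uint64_t v;
      memcpy(&v, p, sizeof(v));
      /* Zero byte iff the high nibble is 3. */
      uint64_t hi = (v & 0xF0F0F0F0F0F0F0F0ULL) ^ 0x3030303030303030ULL;
      /* 0x10 in each byte whose low nibble exceeds 9 (adding 6
         carries into bit 4), zero otherwise; carries never cross
         byte lanes. */
      uint64_t lo = ((v & 0x0F0F0F0F0F0F0F0FULL) + 0x0606060606060606ULL)
                    & 0x1010101010101010ULL;
      uint64_t t = hi | lo; /* zero byte iff that byte is a digit */
      /* Set bit 7 of every nonzero byte of t, without cross-byte
         carries. */
      uint64_t nondigit = ((((t & 0x7F7F7F7F7F7F7F7FULL)
                             + 0x7F7F7F7F7F7F7F7FULL) | t)
                           & 0x8080808080808080ULL);
      /* The lowest set flag marks the first non-digit character. */
      return nondigit ? __builtin_ctzll(nondigit) / 8 : 8;
    }

With such a count, one could dispatch to the appropriate 1-to-8-digit parser that the comment above describes.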
Ioann_V says:
Hey, Dave, this might be interesting for you:
About a year ago, I wrote a publication (in Russian) about parsing decimal numbers without using a LUT:
https://habr.com/ru/post/522820/
It also uses this trick, which you named SWAR :]
Please put an HTML code element inside of the HTML pre text:
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/code#notes
There’s a typo which is a little confusing.
1000000*(10*b1+b2) + 100*(10*b3+b6)
should read:
1000000*(10*b1+b2) + 100*(10*b5+b6)
Thank you!