Keen to go _ExtInt? LLVM Clang compiler adds support for custom width integers

ST Silver badge
Linux

Re: Ugh!

> Extra code for bitshifting is a given, but do you really want to be the one to write it?

I never said that. I said it has no real benefits.

> If it's in a structure with alignment turned off via a compiler option, pragma, or what have you, and you string a bunch together, it will save memory.

That's nonsense.

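For one, you can't turn off alignment in the compiler. Packing directives (#pragma pack, or __attribute__((packed)) in GCC and Clang) change how a struct is laid out, and in some compilers you can force an alignment for structs/classes/unions that is larger than the natural alignment would be, via a command-line flag. But turning off the alignment requirements of the hardware? Nope, you can't do that.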

For two, misaligned reads/writes of scalars or vectors (read: loads and stores) incur a run-time performance penalty that can be giant, depending on the ISA.

For example, let's say that loading a properly aligned 64-bit integer into a register costs 2 cycles. That is the case on some ISAs; others might need 3 cycles. The value is loaded into the register atomically. On an ISA without hardware support for misaligned access, loading a misaligned 64-bit integer into a register decays into sequential byte-by-byte loads, each one costing 2 cycles. And the load is not atomic.

A 64-bit integer has 8 bytes. 8 bytes * 2 cycles per byte load = 16 cycles for loading a misaligned 64-bit integer, instead of 2 cycles for the properly aligned case. So, here we are: we're spending 16 cycles loading a misaligned 64-bit integer into a register, and to that we add another 8 cycles of overhead for the shift + mask for the bitfield. That's 24 cycles in total.

Is that the performance improvement you were aiming for?
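For reference, the shift + mask work mentioned above looks roughly like this when you write it by hand. This is only a sketch: extract_field and insert_field are made-up helper names, not anything from the compiler or a library, and the cycle cost of these operations is not something the code itself can show.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read `width` bits starting at bit `shift`
   (bit 0 is the least significant bit). width must be 1..63. */
static inline uint64_t extract_field(uint64_t word, unsigned shift, unsigned width)
{
    assert(width >= 1 && width < 64 && shift + width <= 64);
    uint64_t mask = (UINT64_C(1) << width) - 1;   /* `width` low bits set */
    return (word >> shift) & mask;                /* shift down, then mask */
}

/* Hypothetical helper: write `value` into the same field and
   leave every other bit of `word` untouched. */
static inline uint64_t insert_field(uint64_t word, unsigned shift, unsigned width,
                                    uint64_t value)
{
    assert(width >= 1 && width < 64 && shift + width <= 64);
    uint64_t mask = ((UINT64_C(1) << width) - 1) << shift;  /* field position */
    return (word & ~mask) | ((value << shift) & mask);      /* clear, then set */
}
```

Every read of a field is a shift plus an AND, and every write is a read-modify-write of the whole containing word.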

Here's how you can trigger a misaligned read of a 64-bit integer:

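// needs <inttypes.h>, <stdint.h>, <stdio.h>, <string.h>

unsigned char B[16]; // char/unsigned char have an alignment requirement of 1

(void) memset(B, 0, sizeof(B));

uint64_t* U = (uint64_t*) (B + 1); // misaligned unless B + 1 happens to land on an 8-byte boundary

uint64_t V = *U + UINT64_C(119); // Boom! (undefined behavior in C when U is misaligned)

(void) fprintf(stderr, "U=%" PRIu64 " V=%" PRIu64 "\n", *U, V);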

This is independent of hardware and ISA: the cast produces the same misaligned pointer on x86_64, SPARC, SPARC64, ARM64, ARM, PPC64, etc. What differs is how each processor reacts to the access.

SPARC and SPARC64 used to trap and send SIGBUS to the process whenever they encountered a misaligned read or write. They finally gave up on that restriction with the last two versions, SPARC M7 and M8. But the performance penalty is still there.

The only reason you don't see SIGBUS that often these days is that most processors disable trap-on-misaligned by default. Where the hardware supports it, you can re-enable it by tickling the right registers at program startup.
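If the bytes really do have to live at an arbitrary address, the usual way to stay out of trouble is to never dereference a misaligned pointer at all and to copy through memcpy instead. A minimal sketch, assuming host byte order is what you want; load_u64 and store_u64 are names I made up for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a uint64_t from any byte address,
   aligned or not, in host byte order. */
static inline uint64_t load_u64(const void *p)
{
    assert(p != NULL);
    uint64_t v;
    memcpy(&v, p, sizeof v);   /* byte copy: no alignment requirement on p */
    return v;
}

/* Hypothetical helper: write a uint64_t to any byte address. */
static inline void store_u64(void *p, uint64_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

This is well-defined C on every target. Compilers typically turn a fixed-size memcpy like this into whatever load sequence the target allows, so it does not make the misaligned access any cheaper on hardware that penalizes it; it only removes the undefined behavior and the trap.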

Bitfields don't save you memory, and they don't save you cycles either.

Biting the hand that feeds IT © 1998–2021