Keen to go _ExtInt? LLVM Clang compiler adds support for custom width integers

Erich Keane, a compiler frontend engineer at Intel, has committed a patch to the LLVM Clang project that enables custom width integers - such as 31 bit, 3 bit or 512 bit. The assumption of power-of-two integer sizes is baked into computing and into the C language. "Historically these types have been sufficient for nearly all …
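For those who haven't seen the extension, a minimal sketch of what it looks like in Clang (the widths and names here are purely illustrative):

    /* A sketch of Clang's _ExtInt extension (widths chosen arbitrarily). */
    typedef unsigned _ExtInt(12) u12;  /* exactly 12 bits wide: 0..4095 */
    typedef unsigned _ExtInt(3)  u3;   /* exactly 3 bits wide: 0..7     */

    struct sample {
        u12 channel;   /* no rounding up to the next power of two */
        u3  mode;
    };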

  1. Andy Non Silver badge

    Sounds like a good idea

    No point wasting time processing unused bits. Reminds me of my early days programming in the 70's and 80's when RAM and disk space was at an absolute premium and I (everyone really) used all manner of weird and wonderful ways to compress data to the minimum. It did of course eventually lead to problems like Y2K with an assumed "19" or "20" depending on whether the year was more or less than 80 for example. It did make handling other people's poorly documented code a nightmare though, especially if they munged multiple values into one integer variable (using higher bit positions) to hold boolean or other data; no bits wasted. I've come across some real pigs to debug later when the bits "unexpectedly" overflowed into each other.

    1. Wilseus

      Re: Sounds like a good idea

      "I've come across some real pigs to debug later when the bits "unexpectedly" overflowed into each other."

      That can be avoided by only ever setting the bit(s) using carefully written macros that mask out the untouchable bits.
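Something along these lines, presumably - a rough sketch, with the field layout and names invented for illustration:

    /* Hypothetical packed word: bits 0-11 hold a count, bits 12-15 hold flags. */
    #define COUNT_MASK    0x0FFFu
    #define FLAGS_SHIFT   12
    #define FLAGS_MASK    (0xFu << FLAGS_SHIFT)

    /* Update one field without disturbing the bits belonging to the other. */
    #define SET_COUNT(w, v) ((w) = ((w) & ~COUNT_MASK) | ((unsigned)(v) & COUNT_MASK))
    #define SET_FLAGS(w, v) ((w) = ((w) & ~FLAGS_MASK) | (((unsigned)(v) << FLAGS_SHIFT) & FLAGS_MASK))
    #define GET_COUNT(w)    ((w) & COUNT_MASK)
    #define GET_FLAGS(w)    (((w) & FLAGS_MASK) >> FLAGS_SHIFT)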

      1. Andy Non Silver badge

        Re: Sounds like a good idea

        The one instance that sprang to mind when I wrote the above was when the original programmer was storing a value that never went negative, so he'd used the highest bit to store something else. Until the inevitable happened and it did store a negative value.

        You can also have lots of fun compressing alpha-numeric characters. If the user's input data can only consist of A-Z, 0-9, comma, full stop and Space - a total of 39 characters, you can encode this as binary in 5 bits (0 - 38 decimal). Leaving 3 bits free per byte. Luxury! So you can start your next character using the remaining 3 bits from the previous byte and so on. It was a relief when RAM and disk space increased and such binary gymnastics were no longer required.
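For illustration, a sketch of the symbol-to-code mapping described above (the exact ordering of the original scheme isn't given, so this one is made up); note that codes running 0-38 actually need 6 bits per symbol, as pointed out further down the thread, which still beats 8 when packed back to back:

    /* Map the 39-symbol alphabet (A-Z, 0-9, comma, full stop, space) to codes 0..38.
       Returns -1 for anything outside the alphabet. */
    int sym_to_code(char c) {
        if (c >= 'A' && c <= 'Z') return c - 'A';        /*  0..25 */
        if (c >= '0' && c <= '9') return 26 + (c - '0'); /* 26..35 */
        if (c == ',') return 36;
        if (c == '.') return 37;
        if (c == ' ') return 38;
        return -1;
    }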

        1. Pier Reviewer

          Re: Sounds like a good idea

You've basically described how security holes arise. Make assumption. Assumption is invalidated. Shit happens.

It's also why we (should) unit test for such things before pushing to prod. But hey, testing is boring so we don't do it, right?

As to using unused bits - plenty of tech still does that: the Deflate algo, ASN.1 PER, etc. It's not going away.

          1. Andy Non Silver badge

            Re: Sounds like a good idea

            Of course it should be tested! I never said it shouldn't. I'm saying the idea in principle sounds good and should be looked into, not just dismissed out of hand. There could easily be issues found that make it a non-starter. We won't know until it is properly researched and tested.

          2. Red Ted
            Mushroom

            Re: Sounds like a good idea

It also describes the mechanism by which the first Ariane 5 launch ended in a loud BANG!

They reused the inertial reference system from Ariane 4, but the parameter it measured was larger than it was on Ariane 4, so it overflowed a type conversion before it was sent to another flight computer. The overflow then caused the first system to generate error messages that were then interpreted by the second system as valid data.

        2. Jonathan Richards 1
          Go

          Re: Sounds like a good idea

          > compressing alpha-numeric characters

          That rang a bell, and nobody's doing anything useful, so I got off the shelf a 1980 diary which I used as a notebook during the subsequent year or so. On the page for 4 February I have hand-compiled a 31-byte routine for the MCS 6502 which unpacks 4 6-bit wide characters from a 3-byte package, and on the next page a 27-byte routine to do the reverse, saving 25% on RAM. I'm typing on a machine with 8,589,934,592 bytes of RAM, though, so I don't think I'll fire up the emulator to check my hand-coding!
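The same trick in C, for anyone not keen on hand-assembling 6502 - a rough sketch of packing four 6-bit codes into three bytes and unpacking them again:

    #include <stdint.h>

    /* Pack four 6-bit codes (each 0..63) into three bytes. */
    void pack4(const uint8_t c[4], uint8_t out[3]) {
        out[0] = (uint8_t)((c[0] << 2) | (c[1] >> 4));
        out[1] = (uint8_t)((c[1] << 4) | (c[2] >> 2));
        out[2] = (uint8_t)((c[2] << 6) | c[3]);
    }

    /* Unpack three bytes back into four 6-bit codes. */
    void unpack4(const uint8_t in[3], uint8_t c[4]) {
        c[0] = in[0] >> 2;
        c[1] = (uint8_t)(((in[0] & 0x03u) << 4) | (in[1] >> 4));
        c[2] = (uint8_t)(((in[1] & 0x0Fu) << 2) | (in[2] >> 6));
        c[3] = in[2] & 0x3Fu;
    }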

          1. Anonymous Coward
            Anonymous Coward

            Re: Sounds like a good idea

            The PDP-8 used 6 bit bytes, and base 64 encoding has hung around for a long time. I confess to having used it in anger before passing guaranteed A-Z/a-z/0-9 data into and out of a link, back in the days when baud rates were low. It's left as an exercise for the reader to work out which byte values, if any, can never occur and so can be used as markers.

        3. Warm Braw

          Re: Sounds like a good idea

          a total of 39 characters, you can encode this as binary in 5 bits (0 - 38 decimal)

          I think it might be difficult to represent 39 values using only 5 bits. Perhaps you're thinking of Rad-50 (which is really Rad-40 in decimal).

          1. Andy Non Silver badge

            Re: Sounds like a good idea

            Sorry, yes, you need 6 bits to encode 39 characters. My excuse is that it was a very long time ago in a universe far, far away. Well late 1970's anyway.

            1. Claptrap314 Silver badge

              Re: Sounds like a good idea

              Actually, this is a nice demonstration of why this is a BAD idea.

            2. martinusher Silver badge

              Re: Sounds like a good idea

The original Baudot code was 5 bits. It used a shift character to switch between letters and numbers, a bit like how we switch keyboards on a phone.

Fitting shorter variables into a longer space would be done using bit fields these days. Some types of processors have bit manipulation instructions which the compiler's code generator can take advantage of. (Cue discussion of the merits of bit fields -- for the record, I don't like them, but then I didn't write all the code I've worked on.)

        4. G.Y.

          50 Re: Sounds like a good idea

          DEC's RADIX50 (using base 40) was a better idea; hence .OBJ rather than .OB
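For the unfamiliar, the trick is that 40^3 = 64,000, which fits in a 16-bit word, so three characters from a 40-symbol set pack into one word - a minimal sketch of the packing step (the symbol-to-code table is omitted here):

    #include <stdint.h>

    /* Pack three RADIX-50 codes (each 0..39) into one 16-bit word. */
    uint16_t rad50_pack(unsigned c0, unsigned c1, unsigned c2) {
        return (uint16_t)((c0 * 40u + c1) * 40u + c2);
    }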

      2. Dan 55 Silver badge

        Re: Sounds like a good idea

        That can be avoided by only ever setting the bit(s) using carefully written macros that mask out the untouchable bits.

        Or bit fields, which are easier, nicer, and let the compiler do the hard work.

        1. Wilseus

          Re: Sounds like a good idea

          "Or bit fields, which are easier, nicer, and let the compiler do the hard work."

          They have their own problems though, unfortunately.

      3. bombastic bob Silver badge
        Devil

        Re: Sounds like a good idea

        C already has bit-size designators within a structure, like

        struct thingy {
            unsigned int nine_bit:9;
        };

        etc. - it'll get padded out to a power of 2 [probably native word size] but you can modify that with packing and so on.

Thing is, as I understand it, this can cause a bit of trouble with endian-ness, so it's almost a YMMV kind of thing. As a result I end up hard-implementing the non-standard integer types with macros so that it's consistent regardless of integer size or endian-ness. [portable structure definitions that compile on x86, amd64, ARM, _and_ an Arduino, using those binary structures to transfer data back/forth between all of those]

        having the designated _ExtInt support would probably help a LOT.

    2. Anonymous Coward
      Anonymous Coward

      "No point wasting time processing unused bits"

Only if the system architecture (CPU, memory, etc.) supports it.

When you have fixed-size registers, the CPU will act on the whole register regardless of how many bits you define as "used". Sometimes using the smaller registers (like the 8-bit ones in the Intel architecture) can mean slower performance. In fact, if you define, say, an 11-bit integer, the compiler will need to add more instructions to work on it properly, as the CPU's native instructions will expect a different format.

It may use less memory (remember the "packed" specifier in Pascal?), but once again most modern architectures take a performance hit when data are not aligned on some boundary - and CPU caches will move blocks of a fixed size anyway.

I fully understand the value of saving bits in architectures like FPGAs where resources are constrained - on others it may just lead to new "Y2K" problems in the future, without any real saving.

    3. Anonymous Coward
      Anonymous Coward

      Re: Sounds like a good idea

Actually the exact opposite. Not using a natural-size integer for your platform can slow things down, as the compiler may have to generate extra code to keep values within the bounds of the reduced size.

So for, say, a 30-bit integer on a 32-bit system, the compiler might have to keep masking off the top 2 bits.
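Roughly what that extra work looks like, as a sketch (assuming the narrow type is meant to wrap):

    #include <stdint.h>

    /* A "30-bit" add on a 32-bit machine: the result has to be re-masked
       to 30 bits - an extra AND that a plain 32-bit add wouldn't need. */
    uint32_t add30(uint32_t a, uint32_t b) {
        return (a + b) & 0x3FFFFFFFu;
    }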

      1. Anonymous Coward
        Anonymous Coward

        Re: Sounds like a good idea

        Not sure why you posted AC.

        But you're stating the obvious. :-)

While it may sound inefficient to round up to a power of 2, it's actually more efficient.

    4. Anonymous Coward
      Angel

      Re: Sounds like a good idea

      The domain where _ExtInt would be applicable is very narrow, and that was clearly explained in the blog entry.

This applies to, and is useful for, FPGAs. It is of no use on general-purpose computers whose CPUs lack integer registers smaller than 32 bits.

Not coincidentally, x86 and x86_64 ISAs do have registers smaller than 32 bits, which is why Intel has been pushing this for a while. RISC CPUs don't have small registers, and don't care. On RISC machines you get 32-bit integer registers or 64-bit integer registers. On RISC-V - in theory - you also get 128-bit registers. Haven't seen it yet on a real RISC-V CPU.

So, the whole theory about saving massive amounts of memory won't apply to a RISC machine. A 16-bit short will still be loaded into a 32-bit register. The concept of a C short on a RISC machine is really just the lower 16 bits of a 32-bit register, with the upper ones zeroed out, simply because doing the entire sequence of shift + mask to store two 16-bit shorts into a 32-bit register is much more expensive computationally than using the whole 32-bit register.

Technically, it is a very nice feature to have in Clang, but it has a very narrow focus and applicability domain. No-one is going to rewrite their software running on Linux x86_64 to use 4-bit integers anytime soon.

  2. Detective Emil
    Holmes

    I get this feeling of deja vu …

    Gets well-thumbed copy of Kernighan & Ritchie (1978 edition) off the shelf …

    Ah, here it is on page 137, in section 6.6, Fields: "Fields behave like small unsigned integers, and may participate in arithmetic expressions, just like any other integer."

    To be fair, nobody used them much at the time.

  3. Anonymous Coward
    Anonymous Coward

Just like Mentor Graphics' Handel-C compiler then, except with an arbitrary limit on the width of 2^24. Fine in practice of course, but worth being mindful of if doing arithmetic at those widths.

  4. Dan 55 Silver badge

    What am I missing?

    typedef struct {
        unsigned char flag:1;
        unsigned char nibble:4;
        unsigned char munch:2;
        unsigned int mouthful:18;
    } someStructure;

    What's wrong with that?

    Edit: By the way, the code tag is a bit broken on El Reg.

    1. Anonymous Coward
      Anonymous Coward

      Re: What am I missing?

      padding.

      1. Dan 55 Silver badge

        Re: What am I missing?

        typedef struct {
            unsigned char flag:1;
            unsigned int pad1:0;
            unsigned char nibble:4;
            unsigned int pad2:0;
            unsigned char munch:2;
            unsigned int pad3:0;
            unsigned int mouthful:18;
            unsigned int pad4:0;
        } someStructure;

        Some people want it all on a plate.

        1. Anonymous Coward
          Anonymous Coward

          Re: What am I missing?

          Nearly ;) You can't name zero width bit-fields

          typedef struct {
              unsigned char flag:1;
              unsigned int :0;
              unsigned char nibble:4;
              unsigned int :0;
              unsigned char munch:2;
              unsigned int :0;
              unsigned int mouthful:18;
              unsigned int :0;
          } someStructure;

    2. bombastic bob Silver badge
      Trollface

      Re: What am I missing?

      "What's wrong with that?"

      you used K&R bracing style. (use Allman style instead - more readable!)

      1. Anonymous Coward
        Mushroom

        Re: What am I missing?

        People are having mental health issues already, let's not start a flame war.

    3. Phil Endecott

      Re: What am I missing?

      What you’re missing is using that syntax anywhere other than in a field of a struct.

      E.g. int:6 foo(int p1:3, unsigned p2:15);

      And sizes > 32 or 64. I’d quite like a standardised 128-bit and maybe 256-bit int.

      1. Dan 55 Silver badge

        Re: What am I missing?

        Ok, I'll buy the first one off you (although you can work round it), but I'm not sure about the second one.

Most CPUs generally don't have 128-bit and 256-bit arithmetic operators, and passing big integers isn't fully supported by Windows and UNIX function calling conventions, so everything's got to be done in software anyway. That, and the article was about saving space on FPGAs, not having big integers. But yes, I guess they would be nice to have.

        1. Phil Endecott

          Re: What am I missing?

The issue with e.g. 128-bit integers in C is that there's no access to the carry flag, so what would be a sequence of 4 add-with-carry instructions (on a 32-bit CPU) ends up much more complicated. (If you're lucky the compiler will spot what you're doing and use add-with-carry, but do you know what you have to write in C for it to reliably do that?)

    uint64_t a, b, c, d, e, f;
    ....
    c = a + b;
    bool carry = c < a;
    f = d + e + (carry ? 1 : 0);

          Is that correct? Does that produce the optimal two-instruction sequence?

          .... now write a version for signed arithmetic!
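For what it's worth, GCC and Clang have overflow builtins that make the intent explicit, so you're not relying on the optimiser spotting the c < a idiom - a sketch using 64-bit halves, unsigned case only:

    #include <stdint.h>
    #include <stdbool.h>

    /* Add two 128-bit values held as (hi, lo) pairs of uint64_t. */
    void add128(uint64_t a_hi, uint64_t a_lo, uint64_t b_hi, uint64_t b_lo,
                uint64_t *r_hi, uint64_t *r_lo) {
        /* __builtin_add_overflow returns true if the low-word addition wrapped. */
        bool carry = __builtin_add_overflow(a_lo, b_lo, r_lo);
        *r_hi = a_hi + b_hi + (carry ? 1u : 0u);
    }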

  5. Giovani Tapini

    i am now quite worried

    with the apparently easy access to 70s coding books and 80's diaries...

    backs out quietly....

    1. Gene Cash Silver badge

      Re: i am now quite worried

      Eh, my current Android checkbook app has entries going back to 1998.

This is because it started on a Palm device (something like 13 of them - lots of upgrades & warranty replacements), then was migrated to a Nokia N810, then to a Motorola Droid (AKA Milestone in the UK), and is currently on my Moto G6 Play.

  6. Arthur the cat Silver badge

    Anyone else remember PL/I?

    Declaring variables with specific widths like BINARY(17) or DECIMAL(7). Or more likely the abbreviated forms BIN(17) and DEC(7). PL/I also had the wonderful "feature" that the checkout compiler handled a subtly different language from the one the optimising compiler supported so code deemed valid by checkout would fail to compile under optimise (or vice versa).

    1. david 136

      Re: Anyone else remember PL/I?

      Also MODULA2/3

  7. Anonymous Coward
    Anonymous Coward

    A one bit integer isn't an integer.

    It's a bit.

    1. This post has been deleted by its author

  8. Chris Gray 1
    FAIL

    Ugh!

For many years I've had the very strange hobby of creating programming languages and writing compilers for them. My first readily-available one ran on 8-bitters under CP/M. It had the ability to define and use integers with user-specified bit widths. Thought it would be useful on the memory-constrained machines of the day. Tried using them in one major-ish project I did. Bad idea. Never tried to use them again.

For programming FPGAs, having various-sized fields is pretty basic. But why does that have to reflect itself back into something like the C programming language, which is intended for general-purpose programming? My gut tells me that they will be patching weird issues for years, and that any actual benefit will not be worth the overall cost.

    1. bombastic bob Silver badge
      Devil

      Re: Ugh!

      "But why does that have to reflect itself back into something like the C programming language, which is intended for general-purpose programming?"

It is highly likely, especially in the world of IoT, that an FPGA or microcontroller could define structures and/or data with custom bit-sized integers, sized that way to cope with limited RAM, NVRAM or EEPROM on the target device.

      So the IoT device needs to have its data interpreted or 'firmly packed' before being sent to the device. It's much better if you can define the data structures and other things using the same C code for the FPGA or microcontroller *AND* the thing controlling it. Yeah, been there, done that. See my earlier post.

      And, yeah, IoT makes this even more important to consider.

      If LLVM implements it, gcc will no doubt follow. I use llvm with FreeBSD already, so good news for me if it gets in there this time. If it can be made a standard for the C language, that'd be awesome! [but yeah I'd expect some changes before that happens - committees need to "do things" after all]

      1. Chris Gray 1

        Re: Ugh!

Bob, I'm quite aware of using structs to overlay hardware resources - done lots of that. But I recall comp.arch discussions of a few (several?) years ago saying essentially that using bitfields in C structs and expecting to produce correct portable code is not going to work well. The biggest issue was endianness, I believe. C doesn't say enough about how bitfields are laid out to make them safely usable across architectures.

In my latest programming language, I've split the concepts apart - structs and "bits" types. In the latter, the endianness is, I hope, well enough defined to be usable. It's clearly usable for space-saving, but I've had no opportunity to try it on hardware interfacing.

        1. Anonymous Coward
          Anonymous Coward

          Re: Ugh!

          > [ ... ] using bitfields in C structs and expecting to produce correct portable code is not going to work well.

Aside from the endianness problem, there's also the cost problem. Declaring

          unsigned int i : 3;

          does not create a bitfield of 3 bits. It creates an unsigned int, of which only 3 bits are usable. The remaining 29 bits are still there, they're just not accessible.

          Depending on how good your compiler is, it might or might not warn if you then write something like this:

          i = 19;

          The accessible 3 bits in this example are accessed via shift + mask. This adds CPU cycles.

          The cost of loading and storing the unsigned int is unavoidable. By using a bitfield, the compiler is forced to add shifts and and's to manipulate the bitfield, on top of the cost of loads and stores.

          I've seen a lot of code that does something like this:

          unsigned int isValid : 1;

          instead of using a bool, under the incorrect assumption that this saves a lot of memory. It doesn't save any memory, and it's more expensive than just using a bool.

          1. Dan 55 Silver badge

            Re: Ugh!

            Extra code for bitshifting is a given, but do you really want to be the one to write it? Personally I think it's the kind of stuff best left to the compiler.

            If it's in a structure with alignment turned off via a compiler option, pragma, or what have you, and you string a bunch together, it will save memory. Then again, behaviour is compiler dependent (VC tends to create bigger structures with unused bits between structure elements).

            1. Anonymous Coward
              Linux

              Re: Ugh!

              > Extra code for bitshifting is a given, but do you really want to be the one to write it?

              I never said that. I said it has no real benefits.

              > If it's in a structure with alignment turned off via a compiler option, pragma, or what have you, and you string a bunch together, it will save memory.

              That's nonsense.

For one, you can't turn off alignment in the compiler. In some compilers you can brute-force alignment for structs/classes/unions that is larger than the natural alignment would be, via a compile-time flag. But turning alignment off - nope, you can't do that.

For two, misaligned reads/writes of scalars or vectors (read: loads and stores) always incur a giant run-time performance penalty.

For example, let's say that loading a properly aligned 64-bit integer into a register might cost 2 cycles - this is a fact on some ISAs; others might require 3 cycles. The value is loaded into the register atomically. Loading a misaligned 64-bit integer into a register decays into sequential byte-by-byte loads, each one costing 2 cycles. And the load is not atomic.

A 64-bit integer has 8 bytes. 8 bytes * 2 cycles per byte load - that's 16 cycles for loading a misaligned 64-bit integer, instead of 2 cycles for the properly aligned case. So, here we are: we're loading a misaligned 64-bit integer into a register for 16 cycles, and to that we add another 8 cycles of overhead for shift + mask for the bitfield.

              Is that the performance improvement you were aiming for?

              Here's how you can trigger a misaligned read of a 64-bit integer:

    unsigned char B[8];            // char/unsigned char have no alignment
    (void) memset(B, 0, sizeof(B));
    uint64_t* U = (uint64_t*) B;   // this is misaligned
    uint64_t V = *U + 119UL;       // Boom!
    (void) fprintf(stderr, "U=%lu V=%lu\n", *U, V);

              This property is independent of hardware and ISA. It's the same on x86_64, SPARC, SPARC64, ARM64, ARM, PPC64, etc.

              SPARC and SPARC64 used to trap and send SIGBUS to the process whenever they encountered a misaligned read or write. They finally gave up on that restriction with the last two versions of SPARC M7 and M8. But the performance penalty is still there.

              The only reason you don't see SIGBUS that often these days is because most processors disable trap-on-misaligned by default. You can re-enable it by tickling the right registers at program startup.

              Bitfields don't save you memory, and they don't save you cycles either.

              1. Dan 55 Silver badge

                Re: Ugh!

                In that case it would be impossible to have bit flags packed together or have a short followed by an int without a two-byte space between them, and you can, and I do. I turn off alignment with #pragma pack.

The resultant code generated by the compiler to access this structure is obviously more complicated, but I don't care, because I haven't claimed there are going to be performance gains - there aren't any.

It is good for using less memory space (where did you get the idea that non-aligned structures don't use less memory space?), which is in turn good for reading/writing records from binary files and for sending binary data across a network, yet the source code is kept relatively simple. The object code is more complicated, but as I'm not writing assembly language I don't care about that.
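A sketch of the size difference being argued over (member names are arbitrary; exact sizes depend on compiler and target, but these are typical on a 64-bit platform):

    #include <stdint.h>

    struct natural {        /* natural alignment: padding after 's' and after 'c' */
        uint16_t s;
        uint32_t i;
        uint8_t  c;
    };                      /* typically sizeof == 12 */

    #pragma pack(push, 1)
    struct packed {         /* no padding, but 'i' may now be misaligned */
        uint16_t s;
        uint32_t i;
        uint8_t  c;
    };                      /* typically sizeof == 7 */
    #pragma pack(pop)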

                1. Mike 125

                  Re: Ugh!

                  >I turn off alignment with #pragma pack. The resultant code generated by the compiler to access this structure is obviously more complicated,

                  You're forcing the compiler to use a non-native data representation. That is nearly always a bad idea.

                  The resultant source code is subtly non-portable and dangerous, giving rise to one of the worst categories of bug: it can come and go depending on something as obscure as the exact alignment of the binary in memory. Even on the local machine, its runtime can vary according to alignment, causing further strange behaviour.

Saving memory by sensible ordering of C structures is a well explored and explained issue.

                  The Lost Art of Structure Packing.
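The gist of that article, as a quick sketch: keep natural alignment but order members largest-first, and most of the padding disappears with no #pragma required.

    #include <stdint.h>

    struct badly_ordered {  /* 1 byte + 7 pad + 8 + 1 + 7 tail pad */
        char     a;
        uint64_t b;
        char     c;
    };                      /* typically sizeof == 24 */

    struct well_ordered {   /* 8 + 1 + 1 + 6 tail pad */
        uint64_t b;
        char     a;
        char     c;
    };                      /* typically sizeof == 16 */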

                  1. Dan 55 Silver badge
                    Thumb Up

                    Re: Ugh!

                    Thanks for the link. I'm familiar with memalign, htons, and friends but an interesting read nonetheless.

      2. david 136

        Re: Ugh!

        Not buying IoT as a justification. IoT stuff can and does run full kernels with full TCP/IP stacks.

        It's only the FPGA-like things that need this, for custom logic at high speed.

        Plain IoT devices will be using $0.25 SoCs that don't need this level of tweakage.

    2. heyrick Silver badge

      Re: Ugh!

      Don't be so sure it'll be accepted. The standard never took on the x86 near and far peculiarities, and I think I can be fairly confident in saying that there was a hell of a lot more C written for x86 back then than there is ever likely to be for FPGAs.

  9. Claptrap314 Silver badge

    This smells Cthulhuian

    Masses of writhing bits, forced together by some unnatural force....

    Seriously, if you are only using 5 bits out of eight, there is a good chance that your overflow won't hurt anyone. I'm betting there is a LOT of sloppy programming practice that is going to be exposed (over the course of decades) by going this route.

    I appreciate that FPGA are bit-tight. But when you start doing arithmetic, (and not just logic) things tend to get messy.

    I appreciate the desire to avoid wasting bits. I also appreciate the horrors that get created when we start bit-cramming.

  10. Anonymous Coward
    Facepalm

    Been there, done that, don't do it

One of my first languages (in high school) was Autocoder for the IBM 1401. It allowed you to define character-based, base-10 integers from 1 to 1,000 characters long (floating point? we didn't need no stinking floating point!). The excuse was that it mimicked punch cards.

    I later worked (in a telco, in a production environment) with call records that were defined at the bit level. The excuse was that the switches couldn't produce bigger records.

    Neither was easy to maintain and the latter ended up costing the company a chunk of change due to poor maintenance (not mine).

    Decades ago, space was at a premium, it's why Y2K happened.

Now it's just not worth it. The first rule of programming should be "write code that can be easily maintained" and part of that is "don't write weird code."

    1. J27

      Re: Been there, done that, don't do it

      This only makes sense for tiny embedded hardware, where performance is still at a premium. Those definitely still exist, I work with them all the time.

  11. EveryTime

    This is quite an old concept.

    Almost 30 years ago I worked on a compiler for dbC (https://ieeexplore.ieee.org/document/279474). A key feature of this language variation was arbitrary length variables. The initial motivation was SIMD, then quickly FPGAs.

Since the hardware structure has no inherent word size, a language that allowed exactly the desired precision resulted in code that was smaller and faster. And for a certain class of problem that used modulus, it resulted in significantly clearer code.

    The problem was that for all other types of problems, allowing arbitrary precision was a huge distraction. Programmers micro-optimized the range, and then were bitten by bugs or unexpected behavior. A 32 bit variable is a huge waste when you are typically iterating to 100, and still a huge waste when you change that to 500, but you don't have to worry about a u_int8 (or the equivalent of a u_int7) biting you in the ass when the change is made.

Back then it wasn't a stupid idea. It was a research project that happened to produce a negative confirmation. Today...

  12. ebyrob

    Optimizing too early?

    Donald Knuth can't be rolling over in his grave since he's still alive thank goodness.

    Didn't these guys ever learn the old rule: "Premature optimization is the root of all evil" ?

  13. -tim

    This is amazingly useful when it is needed

I think it should have endianness included in there as well, and I'm not sure it should be limited to integers, as it could be fixed point. The implementation of pointers will get weird, as now a pointer to the 5th element of a 5-bit array will be larger than a pointer to a 64-bit int on classical architectures, since it needs to include a real memory base pointer and an offset as well as a size. It would also be useful to be able to tell the compiler what the base char, int and long sizes are. An option to set int=31 and crash on overflow conversions would be very useful for testing most C code.
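A sketch of what such a "fat pointer" might have to carry around (layout purely hypothetical):

    #include <stdint.h>

    /* Hypothetical pointer to an element of a packed 5-bit array: a machine
       pointer alone can't address mid-byte, so a bit offset (and arguably the
       element width) has to ride along with it. */
    struct bit_ptr {
        uint8_t *base;    /* byte containing the start of the element */
        unsigned bit;     /* bit offset within that byte, 0..7        */
        unsigned width;   /* element width in bits, e.g. 5            */
    };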

  14. Ian Mason

    "transistor layouts for FPGAs"

    Produce "transistor layouts for FPGAs"? Hmm, I think somebody doesn't actually know how FPGAs work. A clue is in the name, "Field Programmable Gate Array". Any transistor layout was well and truly fixed when the original FPGA design was cast into silicon, the "field programmable" bit works by routing signals using switches (and their component transistors) that are already part of the physical design. Note that the erroneous description is Keane's, not the reporter's. Do we really want someone specifying a feature that is ONLY intended to be used for FPGAs when they don't know how an FPGA works?

    1. Anonymous Coward
      Anonymous Coward

      Re: "transistor layouts for FPGAs"

      Exactly.

      They won't be running compiled C on the FPGA so what's the point.

      The C could be used to generate the map for the FPGA so just mask the standard ints.

      Mind you a native FPGA C Compiler would be awesome :-)

      1. diodesign (Written by Reg staff) Silver badge

        Re: Re: "transistor layouts for FPGAs"

        Xilinx (for one) sells a C compiler for FPGAs - you can absolutely write logic in C and compile into a design language using today's tools. Heck, you can even use Python these days (with nMigen).

        I know of one UK startup that's made a toolchain that compiles Go down to Verilog for FPGAs in Azure.

        C.

        1. Electronics'R'Us
          Stop

          Re: "transistor layouts for FPGAs"

The current high-level synthesis tools supported by Xilinx are Vivado and Vitis, which do indeed have support for C, C++ and SystemC.

          The problem I have with trying to introduce this into C is that it encumbers the language with something that outside of hardware optimisation for FPGAs (and ASICs) is not going to be useful and simply muddies the waters.

          When writing Verilog or VHDL we state the size of the fields required explicitly anyway; the problem arises when using high level syntax (in whatever language) in a HDL environment (with the inherent specifics of what data types are available) for something that was never originally intended by the language.

          There could be an extension for the FPGA tools where the necessary size could be stated; there is nothing stopping the FPGA tool vendors from adding such an extension (they would need to advertise it as C / C++ / SystemC / <insert your favourite language here> with hardware extensions).

          I understand why we use high level synthesis (take a look at the OpenSparc verilog source if you want to see how complex, and therefore bug prone, a large project can be) and on today's monsters such an approach is necessary.

          This does not, however, justify adding this formally to C (my view, obviously).

  15. zb42

    33bit time

    This sounds like a great way to solve the year 2038 problem without wasting those precious extra bits in a 64bit variable. A single extra bit could extend unix time to the year 2106, by which time we will all be dead and it will be somebody else's problem. </sarcasm>
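In the same spirit, the declaration would presumably look something like this (illustrative only - a signed 33-bit count of seconds since 1970 runs out around 2106):

    /* Tongue-in-cheek 33-bit Unix time, good until roughly the year 2106. */
    typedef signed _ExtInt(33) time33_t;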

    1. david 136

      Re: 33bit time

      "Our integers go to 33 bits!"

      - Nigel Tuffnel, in a future advertisement

  16. gnasher729 Silver badge

    Any custom width?

    For desktop / mobile programming a 13 bit int isn’t very useful. But having 128, or 1024, or 25,000 bit integers supported directly in the language, that would be nice for some people. So I’m all for it.

Obviously some compiler changes would be needed. I wouldn't want to have to use long long long long long long int.

  17. Anonymous Coward
    Anonymous Coward

    Intel? Excuse me?? INTEL????

    Wow. People have short memories.

    Intel has been responsible for many of the biggest unfixable security balls ups in recent history. Speculative execution is just the beginning of the utter mess they've made. And as far as I know, it's still not even close to sorted.

    How about Intel fixing its own problems before spouting cr'p about fundamental changes to a language which has proved itself over nearly 50 years? Why the hell does Intel get any say whatsoever in C language design? It would be like allowing Boeing to design aircraft control systems. Insane!

    C is a simple, portable, efficient, CPU-aware language. Its whole design philosophy is based on those attributes. If common CPU registers are powers of 2, then that's what its fundamental integer types should be. Much of C's optimisation and efficiency depend on that.

    C was not created for FPGAs! It is utter nonsense to try to shoe horn it into a hardware description / design language. And before all the C# etc. managed-memory language spouters start off on here: *Use C, where appropriate*. And if you don't know what that means, then let someone who does know, do your job. C's only problem is its success, and idiots using it when they shouldn't.

    /Rant done.

    1. Dan 55 Silver badge

      Re: Intel? Excuse me?? INTEL????

      The late 60s-early 70s CPUs had all kinds of weird and wonderful bit widths, what goes around comes around.

    2. gnasher729 Silver badge

      Re: Intel? Excuse me?? INTEL????

      C isn’t created for FPGAs, but Clang is created for everything. Including graphics cards and why not FPGAs.

  18. Man inna barrel

    An FPGA is not a CPU

    When generating code for a typical CPU, it is generally more efficient to work with native integer sizes, rather than penny-pinching on word lengths. I believe the CPU has to do some work to align smaller types onto the favoured word boundaries. In an FPGA, I am assuming that there is no optimum word size, and allocating more bits than you need consumes some gate resources. So declaring a five bit integer type could be more efficient than using a standard eight bit type, because the larger type commits more gate resources, which are not needed in practice. I do not think this type of economy happens much in normal coding for CPUs rather than FPGAs. Packing data into smaller sizes is an optimisation for disk storage, perhaps. There might be some merit in packing data if one is concerned about RAM usage, but I have found that there is always some work unpacking the data before it can be used.

    The important thing, of course, is that appropriate type declarations tell the compiler what you want to do, which then means that the constraints imposed by your stated intents can be checked at compile time, and optimised code is generated within these constraints.
