+1 for the headline and the all-too-funny tag line!
Glad I wasn't drinking tea... Buffet overflow!
Here's a programming gremlin that caught our eye this week: a share price exceeded the 32-bit unsigned integer limit of a stock exchange's code. Berkshire Hathaway is an investment group known not just for being run by billionaire tycoon Warren Buffett, but also because its BRK.A stock is ridiculously priced: at time of …
Storing financial quantities in floating point will only give you a headache; e.g. do not store a bank balance in pounds. Store it as an integer number of pennies (or whatever the smallest currency unit is). If you store it in floating point you might get a rounding error of a penny, and the auditors will go bananas looking for someone bacon or salami slicing.
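A minimal Python sketch of the pennies point, using made-up amounts. The helper name format_pence is hypothetical and only handles non-negative balances.

```python
# Add ten 10p items: once as float pounds, once as integer pennies.
float_total = 0.0
penny_total = 0
for _ in range(10):
    float_total += 0.10   # pounds held as binary floating point
    penny_total += 10     # pennies held as an exact integer

print(float_total == 1.0)   # False: the float total ends up just short of 1.0
print(penny_total == 100)   # True


def format_pence(pence: int) -> str:
    """Render a non-negative integer number of pennies as pounds, for display only."""
    pounds, pennies = divmod(pence, 100)
    return f"£{pounds}.{pennies:02d}"


print(format_pence(penny_total))   # £1.00
```

The conversion to pounds happens only when formatting; every stored and summed value stays an integer.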
Having said that: I do see people manipulating prices in JavaScript, where numbers are stored as double-precision floating point (IEEE 754); but the maximum integer value that can be safely held is 9,007,199,254,740,992 ~= 9×10^{15} - which can easily hold a BRK.A.
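The same limit can be poked at from Python, whose float is also an IEEE 754 double. This is only a sketch, and the $450,000 price is an illustrative figure, not a real quote.

```python
LIMIT = 2 ** 53   # 9,007,199,254,740,992

# Integers up to LIMIT survive a round trip through a double unchanged...
print(int(float(LIMIT - 1)) == LIMIT - 1)   # True
# ...but LIMIT + 1 has no double of its own and lands on a neighbour.
print(int(float(LIMIT + 1)) == LIMIT + 1)   # False

# An illustrative BRK.A-sized price of $450,000, held as whole cents.
price_in_cents = 450_000 * 100
print(price_in_cents < LIMIT)               # True, with a great deal of room to spare
```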
You might need to do some calculations in 1/100 of pennies - to keep the VAT people happy.
Maybe electronic trading allows fractions of a penny or a cent. But you can't get 1/3 of a penny out of a cash till. Sooner or later you have to make a decision about how to round to integer currency units - be it one penny or 2.220446049250313e-16 of a penny.
And while 1/3 + 1/3 + 1/3 happens to work, 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 != 1.0. Should you truly need to work with rationals, use a class that implements them.
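For anyone who wants to try it, a short Python sketch using the standard library's fractions.Fraction as the "class that implements them". The float sum is written out term by term on purpose, because newer Python versions use compensated summation inside the built-in sum() for floats.

```python
from fractions import Fraction

third = 1 / 3
sixth = 1 / 6

# Plain left-to-right float addition.
print(third + third + third == 1.0)   # True: the rounding happens to cancel
float_sum = sixth + sixth + sixth + sixth + sixth + sixth
print(float_sum == 1.0)               # False

# The same sum with exact rationals.
exact_sum = sum([Fraction(1, 6)] * 6)
print(exact_sum == 1)                 # True
```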
And it's a good job these rules do exist, otherwise it's a flawed system whereby money could be created out of thin air via rounding errors. One person's view of a debit in their currency would be different to the recipient's view of the corresponding credit in their currency, given the agreed rate of exchange. That would mean there was an exploitable flaw, which would be exploited, effectively causing inflation which was not the result of the central banks printing more notes.
Ultimately it's all about confidence in the global money system. Everyone needs to know that when someone has spent a dollar the accounts show that a dollar was spent, and that they cannot persuade someone else they've spent only 99c.
The much maligned Imperial system of weights and measures had little islands of binary sanity. Apart from ha'pennies and farthings in currency there were pounds and ounces (binary ratios are particularly suitable for weighing) and stones, quarters and hundredweights. It was just the bridge between pounds and stones which was irrational.
12 inches = 1 foot
3 feet = 1 yard
22 yards = 1 chain
10 chains = 1 furlong (factor of 10?? Surely some mistake)
8 furlongs = 1 mile
16 ounces = 1 pound
14 pounds = 1 stone
8 stones = 1 hundredweight
20 hundredweights = 1 ton
12 pennies = 1 shilling
20 shillings = 1 pound
21 shillings = 1 guinea
All looks perfectly logical to me.
>>Maybe electronic trading allows fractions of a penny or a cent. But you can't get 1/3 of a penny out of a cash till.
>This is why it is called electronic trading, they don't need cash tills.
But nobody would set a price as a fraction in the first place.
"Explain to me why 33 +33+33 is not 100."
Because that sum is absolutely and precisely 99. You don't deploy any imprecision in your statement apart from some dodgy spacing.
If you had gone for 33.0 + 33.0 + 33.0 != 100.0 that might have required a PhD (*) to explain away satisfactorily (not me, I'm an IT bod) but 33 + 33 + 33 = 100 is so obviously wrong you can use your fingers to determine its falsehood. That's the real point here - you used integers ie thingies that have a fixed value with no error lines or slight indiscretions. 1, 2, 3 etc are absolute things and are different to 1.0, 2.0, 3.0. The second lot are numbers with a described precision. 1.0 is really something that slides between 1.00 to slightly less than 1.05. The slightly bit is quite important and so is what I've called "slides" which isn't a formal description of anything ... but I'm not a mathematician.
(*) Maths is a bit odd: A couple of Polish chaps proved that a sphere can be turned into two spherical cows with some judicious sleight of sphere. The https://en.wikipedia.org/wiki/Banach%E2%80%93Tarski_paradox is called a paradox despite the fact it has been formally proven. We are weird, not cows and spheres. OK I lied about the cows bit. Maths is still odd.
JS has had arbitrarily long integers - BigInt - as a primitive since 2019, and it is now supported across all major browsers.
Before that the only numeric primitive was "Number" which silently switched between integer and 64 bit floating point when integer could not be represented - kind of a nightmare.
Nasdaq chose not to store prices using a floating-point number format normally encountered in applications...and instead chose to multiply quotes by 10,000 and store them as 32-bit unsigned integers.
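A rough Python sketch of what that scheme implies for the highest representable quote. The scale factor and the 32-bit width come from the quote above; the helper name encode_u32 and the little-endian byte order are assumptions made purely for illustration.

```python
import struct

SCALE = 10_000               # four implied decimal places
UINT32_MAX = 2 ** 32 - 1     # 4,294,967,295

print(UINT32_MAX / SCALE)    # 429496.7295, the highest price that fits


def encode_u32(scaled_price: int) -> bytes:
    """Pack a scaled price into four bytes; struct.error is raised if it does not fit."""
    return struct.pack("<I", scaled_price)


encode_u32(429_496 * SCALE)          # fits
try:
    encode_u32(430_000 * SCALE)      # does not fit in 32 bits
except struct.error as exc:
    print("overflow:", exc)

# The same value packed as an unsigned 64-bit integer is no trouble.
print(len(struct.pack("<Q", 430_000 * SCALE)))   # 8
```

Python refuses to pack the oversized value; a language with silent wraparound would simply store the wrong number.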
The advice from lawyers tends to be don't use a lawyer unless you absolutely have no choice. As someone who works with floats, my answer is the same. You could store an integer in a float; that would give you 53 bits. (1 implied, and 52 bits of data.) But why bother if it fits in a 32 bit int?
Apparently you long longer work with ints?
The questions are about performance and rounding behavior. The only way to get fixed point rounding in floating point is to use denormals--and hope that once the tea is ready, that you haven't hit that one case in 2^104 that they got wrong.
And if they said uint64 instead of uint32, the system would have worked. Or if they used one of the many languages which have an integer class which can resize itself when it needs to. Or if they tested some big numbers, because this is not really that big a number. Cobol also has big integer functionality, but so does basically everything in existence today. You still have to remember to use it or work in a language which doesn't give you a choice not to. Cobol isn't a panacea to this problem.
Better still, RPG.
That's the standard way of doing things. Huge integer, with the decimals implied by the currency.
The problem currency used to be the Turkish Lira. A million of the things bought a 5 minute taxi ride from the office to the hotel. Hotels would need to swipe my credit card several times, as both Visa and Mastercard could only handle up to a (US) billion in one transaction.
Years ago IBM spotted that using binary floating point when scaled integers are required caused problems. They came up with a way of storing numbers internally in decimal instead of binary. You can quickly try it out by firing up python3 and typing:
from decimal import Decimal
Decimal(2) / Decimal(7) == Decimal(1) / Decimal(7) * Decimal(2)
2 / 7 == 1 / 7 * 2
And you get the result False for Decimal and True for binary floating point. It is almost as if using Decimal when you need integers is just as broken as using binary floating point when you need integers. All that work implementing decimal arithmetic and it still does not solve the problem of programmers using floating point where scaled integers are required.
(In python3 int / int -> float. If you want C flavoured division, use the // operator.)
1 / 7 is not a fair test, since it is an irrational number. I would not be so quick to dismiss python on this basis. 1/10 is a more realistic test.
I think perl6 does some floating point nearly-equally thing with floating point. It calls them Rats. Cobol is probably ok if you put the full stop in after the definition section.
Is 1/10 really rational in binary?
1/10 = 2^(-4) + 2^(-5) + 2^(-8) + 2^(-9) + 2^(-12) + decimal 0.000146484375, and the expansion never terminates. Eventually the machine will not be able to represent the decimal remainder in binary, so math.(1/10) != machine.(1/10).
I don't think 1/7 is any different in that sense.
1/7 is a rational number because it can be expressed as a fraction of two integers. It is a recurring decimal (and a recurring binaral). 1/10 is not a recurring decimal but is a recurring binaral. Decimal arithmetic should get the right answers with rationals that happen to be non-recurring decimals - but even then there will be some limit (probably around 10^-28 for the implementation in python3). It should be straightforward to find examples involving 1/10 where binary arithmetic shows a small rounding error. I selected the example because a rounding error shows up in Decimal but cancels out in binary. With a little effort, it should be easy to find an example for which the rounding error shows up in binary but not decimal even with a recurring decimal.
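A small Python sketch of both halves of that argument. It assumes the decimal module's default context (28 significant digits).

```python
from decimal import Decimal
from fractions import Fraction

# The exact rational value of the double nearest to 1/10.
print(Fraction(0.1))                          # 3602879701896397/36028797018963968
print(Fraction(0.1) == Fraction(1, 10))       # False

# A non-recurring decimal: binary shows a rounding error, Decimal does not.
print(0.1 * 3 == 0.3)                         # False
print(Decimal("0.1") * 3 == Decimal("0.3"))   # True

# A recurring decimal: Decimal has to round as well.
third = Decimal(1) / Decimal(3)
print(third * 3 == Decimal(1))                # False
```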
IBM's original implementation required 4 bits per digit. That was so painful to work with ancient CPUs actually had instructions to add and subtract bytes containing two such digits. Later implementations stored three digits in 10 bits which was more space efficient but even worse to calculate with. Modern implementations use two binary integers that are interpreted as a*10^b instead of the usual a*2^b used by normal floating point numbers. That works well for multiplies but gets awkward with adds and subtracts.
Using decimal arithmetic does not actually solve problems that require scaled integer arithmetic. It just changes which examples show rounding errors.
It also creates an insane number of corner cases that have to be handled properly. I was at IBM when that joker started pushing base-1000 floating point. Nasty. Ten years after the Pentium bug, and the arrogant fool would not even listen to the fact that without formal proofs, it was a disaster in the making.
Those "standards" were as much political documents as technical. Once you had 754, (with or without the glossed-over undefined states), extending it to non-binary was pretty much fixed.
Having realized that they could, they never bothered to ask if they should.
They should not.
I agree with Mr Buffett on a lot of things. In general, his view on stock splits is one of them. People get strange views of stock prices being "cheap". If you don't understand why a split doesn't really make a stock "cheaper" look at it this way: suppose I'm selling 100# bags of potatoes for $10 each. Does the intrinsic value of the potatoes change if I "split" and start selling them in 10# bags for $1 each? No, they're the same price per unit of potato. It works the same way for businesses. A business is worth $x, a stock is a portion of ownership. Cutting smaller portions is like selling smaller bags of potatoes.
Of course, BRK is also the classic counter-example. They're selling 432,000# bags of potatoes. Once the unit size becomes too large to handle, you have practical issues to consider.
Stocks on the low side can have similar issues. "Cheap" stocks aren't suddenly good performing ones if you do a reverse split and tack a couple zeros on to the price. OTOH, if the price is under $1, maintaining a bid/ask spread is harder to do.
I've been close enough to the cool kids to overhear some conversations about this.
Suppose you are a market maker. Suppose you have two orders to process. One is for 1000 shares of a company priced at $1. The other is for one share of a company priced at $1000. Which do you service first?
In case you do not know, market makers make their money based on the difference between what the seller is asking and what the buyer is offering. It is not unreasonable to make $20 off the $1 company. It is much harder to make $20 off the $1000 company.
Certainly, back before we even had mechanical calculators, stock splits would have looked very much like a government inflating its currency. But today, EVERY stock history tool adjusts for the splits. Companies want their share price to be in a certain range to maintain a certain level of liquidity.
Buffett, of course, has no concerns about that. To him, the paperwork is an unnecessary expense. You want to play? Ante up your half million.
As a Berkshire stockholder (class B, which is BRK.B, not class A): Berkshire isn't a NASDAQ stock, it's an NYSE stock, so NASDAQ rules don't matter. The reason Buffett doesn't split the A class is that he doesn't want a lot of speculation, and stocks are traded at the institutional level in blocks of 100 shares. When you as an individual investor buy a number of shares that isn't a multiple of 100, the brokerage may have to buy the remainder of the block if they can't make the trade internally to the brokerage. (The stocks in the broker's clients' accounts are held in street name, in the brokerage's name, and the rest is internal accounting. So a big broker can often make the stock or mutual fund trades internally, thus pocketing the spread themselves.)
Anyway, with the very large share price and the behind-the-scenes block-level trading involved, the liquidity is reduced, so you keep the speculators out and skew the shareholder population towards owners who are long-term investors. If you want to churn your account and invest in Berkshire, buy the B shares, as they are only a couple of hundred dollars each.
Way back when, Microsoft released version 3.11 of Windows, also known as Windows for Warehouses because it brought no new features for anyone not connected to a network, which was most people at the time. There was much merriment that the Windows Calculator app asserted that 3.11 - 3.1 = 0. This wasn't fixed, IIRC, until Windows 98.
-A.
There was a plain version of Windows 3.11 which was just a bug fix release - there was also a version 3.1 for Work Groups - but most people would have probably got Windows 3.11 for Workgroups as I believe that was default for OEMs
"A solution would be to use 64-bit unsigned integers"
There are other solutions too, including quite a choice of arbitrary precision math libraries out there - (a simple example is Java's BigDecimal) - that don't suffer from the inaccuracies of conventional IEEE floating point. This ain't a hard nut to crack.
It's not hard to crack, but it's even easier to smash. Arbitrary precision is REALLY slow. It also has to deal with a bunch of edge cases that the 64 bit int folks have paid PhD mathematicians to generate formally checked proofs to catch.
And you really, REALLY don't want the successor of the Pentium bug messing up stock quotes.
Working at a Market Data Provider in the 80s and 90s we used all sorts of compression and limitations to keep our data feed fast, since we were sending ticker info out at the warp speed of 38.4kb async. (Inputs from the exchanges were a maximum of 19.2kb and we had about 15 different exchanges at the time as I recall.) The limit for a price dollar was 2 bytes but one bit was reserved, so we had a price limit of just under $32,768. Our stream used lots of bit masking where certain bits were used for various details. The encoding allowed us to take an 80 byte message from the exchange and put out the same data in about 14 bytes.
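A sketch of how a two-byte price field with one reserved bit might look. The real encoding is not described above, so the bit position, the names pack_price and unpack_price, and the meaning of the flag are all assumptions.

```python
FLAG_BIT = 0x8000     # assumed: the top bit is the reserved flag
PRICE_MASK = 0x7FFF   # the remaining 15 bits hold 0..32767 whole dollars


def pack_price(dollars: int, flag: bool) -> int:
    """Combine a whole-dollar price and a flag into one 16-bit word."""
    if not 0 <= dollars <= PRICE_MASK:
        raise ValueError(f"price {dollars} does not fit in 15 bits")
    return (FLAG_BIT if flag else 0) | dollars


def unpack_price(word: int) -> tuple[int, bool]:
    """Split a 16-bit word back into (dollars, flag)."""
    return word & PRICE_MASK, bool(word & FLAG_BIT)


print(unpack_price(pack_price(32_767, True)))   # (32767, True)
try:
    pack_price(32_768, False)                   # one dollar too many
except ValueError as exc:
    print(exc)
```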
One day the developers of that code started to panic when they realized that Berkshire Hathaway was reaching that limit. They had to spend days trying to figure how to change the code to allow for a different bit to be used for that flag bit and how to get that code out to the customers so they could read the new encoding. We didn't make it for all the customers because it required mailing a disk to them with the new software.
Floating point calculations in programs are a horror just waiting to trip you up when you least expect it. It's rather ironic that computers are actually pretty bad(*) at performing floating point calculations. The one thing everyone expects them to do and they are bad at it...
(*)There are of course ways to mitigate the problem but you have to choose to use them. I suspect most programmers don't even know there's a problem. They think that 1/10 is easy to calculate :)
Well, floating point representations were designed to allow scientists and engineers to use widely ranging quantities without having to think about scaling in each expression. Remember, at that time the alternatives to the (digital) computer doing it were the human computer with a slide rule or log tables! Or of course electronic or mechanical analog computers.
Recall also that on the 360, 1.0 + 1.0 = 4.0. I assume that they fixed it at some point.
Floating point is good for graphics. If you are disciplined, it is acceptable for statistics. After that? Stay away.
I built a floating point emulator. Spent 4 years doing FPU validation and another 6 doing CPU validation. Stay away.
Have one -->
I was talking to a lead dev for the Athlon microprocessor about the speed of integer divide. I proposed using the FPU algorithm. He misunderstood me, and computed the time to ship the data across the part to the FPU, convert it, do the divide, convert the result to int, and ship it back, "No, that would be a bit slower."
Emulation tends to be SLOW.
You understand that the original Pentium passed the Berkeley tests, right?
The comment about writing the emulator is not about being some sort of hot-shot programmer. Indeed, at the time, I was clawing my way up from hackerdom.
The comment was about learning IEEE-754 forwards and backwards. About understanding the full implications of each "may" and "should" in the document versus what Intel had actually done. And, implicitly, about understanding exactly what floating point representation is, and therefore what it is not. Which is a meaningful credential to present when talking about how floating point should or should not be used.
Christ on a fucking bike!
You NEVER use floating point for any kind of currency calculations.
Anyone in the banking industry would know this from the secret-to-most uckfup in the early 90s when some Clever Young Things, filled with the spirit of No More Cobol (which, of course, has a special scaled decimal currency type - "computational" - though that isn't spelled out in the code and the CYTs apparently never cracked a manual), did just that.
The reason that CYTs still *do* that is because newer IT manuals almost universally discuss FP with the same throwaway sentence used in the article and *never* go into why the Azathoth-damned type is approximate.
Jesus!
The problems with floating point are widely known if you bother to look around. I have 35-year-old texts that discuss FP problems in calculations. But I wonder how many 'programmers' bothered to consider these issues when writing their code for their Pile it Higher and Deeper feces. Which raises the question of how many computer analyses are actually correct.
I agree that the finance industry would never consider using floating point for storing values in any currency. Been there, done that: the rule is to hold integral values of the smallest unit used in that currency, so GBP amounts are held in pence, dollar values in cents, and Indian Rupees still seem to be stored as paise (100 paise to the rupee, even though the physical currency no longer has anything smaller than a rupee).
However, COBOL's numeric COMPUTATIONAL qualifier (COMP is also valid) isn't special for currency.
All it means is that the value in a variable may contain only a sign and a string of digits, and COMP may be omitted if the variable's PICTURE contains only the characters S and 9 (unless it's an edited value, when +-9Z. are all usable). It will always be handled as an integer value, with the decimal point being treated as implied and/or used to align values with differing numbers of digits beyond the decimal point.
This causes issues for newbie programmers because if you have two values stored as PIC "999.99" COMP and you try to divide 7.00 by 2.00 you will get 3.00 and a remainder of 1.0 because integer division.
You can also add the 'SYNC' qualifier, which causes the computer's 'natural' storage to be used for that variable. That would be a 24 bit signed integer on an ICL 1900 mainframe, a 16 or 32 bit signed integer on a 32 or 64 bit Intel-based machine. OTOH if you were running on a low end IBM S/360 such as a model 30 you'd be using Packed BCD (Packed Binary Coded Decimal) which stored digits as 4 bits, packed two per byte, and had special instructions for dealing with larger packed BCD values stored as strings of bytes.
COBOL can also handle floating point values (COMP-1 or COMP-2 type variables), but that's outside the scope of this discussion, since only a numpty or an inexperienced BASIC programmer would ever consider using floating point values to store monetary values.
By the time they got to the Spectrum it apparently wasn't that bad, at least as far as floating point was concerned.
On real computers the "SYNC RIGHT" gets automatically added because the mill expects that. You can override it of course, but it is scarcely worth the trouble unless really starved for memory/disk space or have single word fields with multiple short values packed in them (The UNIVAC 1100 family has a 36 bit word that can be chopped in a number of ways, so using Cobol's SYNC is often a waste of effort better spent in the Data Division where fewer runtime errors will be waiting to silently bite us, for example).
What makes COMP a 'currency type' is that it is a binary numeric representation with a *decimal* scale factor. This means that arithmetic works as long as you know how the language arithmetic operators/verbs do, and have a grasp of implied intermediate value field types. Screwing up DIVIDE is a different order of problem, nothing to do with the declared variable picture clauses.
COMP-? declarations are different beasts, retrofits that are implementation dependent. I well remember a newly hired boss giving me gyp about using report writer because "it doesn't handle packed decimal", until I pointed out that packed decimal was an IBM conceit and we were in the world of Univac, where report writer worked just fine.
I can do COMP-? discussions all day because I once made my money converting such fields for running on other hardware than they were originally intended (lots of contracts with people doing government work), and somewhere I have a Univac Cobol Supplemental manual, long out of print, that details all the bizarre COMP variants for working especially with IBM->UNIVAC conversions (tape formats being top of the list. IBM x-track, y-parity, that sort of thing).
And yes, a bunch of clever young things really did nerf Wall St bizzes plural in the early 90s by deploying float types to record and calculate currency amounts. It was quite the secret scandal.
Python is actually very much used in banking nowadays, mainly because it has arbitrary-precision integers built in without any fuss:
>>> int64bit = 2**63-1
>>> bigger = int64bit**10
>>> type(bigger)
<class 'int'>
>>> bigger
4455508415646675013373597242420117818453694838130159772560668808816707086990958982033203334310070688731662890013605553436739351074980172000127431349940128178077122187317837794167991459381249
Did the kettle whistle by the time that completed?
Arbitrary precision is slow. REALLY slow. It also contains a TON of corner cases, so unless the folks that put that library together truly are top notch, it will almost certainly have some errors. Somewhere.
I spent a decade doing microprocessor validation. Trust me.
Fixed point computations have their algorithms proven by PhD mathematicians. Those proofs are formally checked. The algorithms are expressed in a language that gets compiled into gates. And model equivalence checkers are used to ensure that the compilers did not mess it up.
It's perhaps fortunate then that Python was created by someone with a master's degree in mathematics and computer science. It uses Karatsuba multiplication for large integers, an algorithm which has been around since the early 1960s.
Python is also the only language that I am familiar with that does modulo operations in the mathematically correct manner in all cases. Every other language that I have tested (as well as spreadsheets) will produce mathematically incorrect answers under some circumstances because it was "faster" (because the CPUs produce incorrect answers). Python's creator insisted on mathematical correctness over performance in that case, so I'm pretty sure he did the same with other integer operations.
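A quick way to see the difference being described, as a sketch only: Python's % goes with floor division, while math.fmod follows the C library's truncating remainder.

```python
import math

print(-7 % 3)             # 2: the result takes the sign of the divisor
print(divmod(-7, 3))      # (-3, 2): floor division, and -3 * 3 + 2 == -7
print(math.fmod(-7, 3))   # -1.0: C-style remainder, sign of the dividend
print(int(-7 / 3))        # -2: truncation toward zero, as C integer division does
```

Both conventions satisfy quotient * divisor + remainder == dividend; they differ in which way the quotient is rounded.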
As a typical example, the sum of a large integer array in pure C is roughly 7 times faster than in Python (the exact amount in C will vary by integer size and type). In Python however the sum will never overflow while with C you are pretty much guaranteed to have an integer overflow even with a relatively "small" array.
You can do integer operations in Python without ever having to check for overflow. In typical languages using native integers people don't often check for overflows either, but then we get constant security notices about vulnerabilities due to unchecked integer overflows.
I can pretty much guarantee that the majority of programmers don't even know how to check for integer overflow correctly. There are many corner cases that will vary depending on operation and integer type. Several years ago I had reason to find all the integer overflow cases for all the integer types and operations, and I could not find a comprehensive single source for this. Not that it matters of course, since as I said the average programmer never checks anyway.
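To give a flavour of those corner cases, here is a sketch of pre-checks for a signed 64-bit type, written in Python so the arithmetic in the checks themselves cannot overflow. The function names are hypothetical and this is nowhere near a complete list of operations.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def add_overflows(a: int, b: int) -> bool:
    """Would a + b leave the int64 range? Tested without forming the sum."""
    if b > 0:
        return a > INT64_MAX - b
    return a < INT64_MIN - b


def neg_overflows(a: int) -> bool:
    """Negation overflows for exactly one value: the minimum has no positive twin."""
    return a == INT64_MIN


def div_overflows(a: int, b: int) -> bool:
    """The one overflowing quotient (division by zero is a separate error)."""
    return a == INT64_MIN and b == -1


print(add_overflows(INT64_MAX, 1))     # True
print(add_overflows(INT64_MIN, -1))    # True
print(div_overflows(INT64_MIN, -1))    # True
```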
If you really need maximum numerical performance in Python, then you are almost certainly using a numerical library anyway, the most common ones are written in C or Fortran and operate on native integers or floating point numbers in native arrays. The libraries can take care of the details.
If however you just need to add two numbers together, then just do it in Python and you can be assured that the result will never overflow without you having to add multiple lines of code around each operation to prevent it. If you are just adding a couple of numbers the performance overhead is insignificant in practical terms.
With Python, pragmatism is the preferred course of action.
It's Python. It's slower than native code, definitely. It is likely to be accurate though, so if you need very large numbers with arbitrary precision and you're willing to deal with the performance problem, that's a way to do it. Like most other systems, you have to know what you're doing while using it, since Python also has floats which work equally badly as everybody else's floats so you need to know when not to use them. A lot of possibilities would have avoided this bug. It just takes some consideration about possible inputs so the programmers store things in a type which can handle them.
"stopped BRK.A's data going out before it oveflowed"
I know they are highly regulated and the fines for non-compliance can be high, not to mention the reputational hit if they get it wrong, but wouldn't it be nice if other parts of the IT industry were watching for and fixing this type of error before it affects their users? Kudos to them for spotting it in time that the error didn't make it out into the wide world, although ideally they should have spotted it AND FIXED it before they needed to hide it.
This isn't striking me as a particularly intelligent action on their part. I would have imagined big numbers to be an important test case for the code before it got used at all. Even at the stage of throwing random data at the code to watch it work I think they could have found that. If they just theorized a test stock which increased in value geometrically, they could have tested the limits in a few seconds.
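That kind of test is only a few lines. A sketch, assuming the ×10,000 scaling and 32-bit unsigned storage reported in the article; the function name and the doubling factor are made up.

```python
SCALE = 10_000
UINT32_MAX = 2 ** 32 - 1


def first_failing_price(start_dollars: int = 1, factor: int = 2) -> int:
    """Grow a hypothetical price geometrically until its scaled value no longer fits in 32 bits."""
    price = start_dollars
    while price * SCALE <= UINT32_MAX:
        price *= factor
    return price


print(first_failing_price())   # 524288: nineteen doublings from $1
```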
They knew the limits when they first wrote the code. They probably even had an alert in place to automatically email everyone from the CTO down when that limit was being approached.
Real code in the real world has limits. Real businesses have to make real business decisions about which limits are the appropriate ones for the job at hand.
As mentioned above, arbitrary precision is dirt slow, and error prone. Almost certainly the wrong solution for order processing.
Bumping my head on the TRS-80's 16k limit changed the way I programmed (well, hacked) for a decade and a half. (Until I was programming professionally.) :D
But in this case, no. By the time electronic computers came out, there had already been multiple instances of galloping inflation and hyperinflation in history. It was a deliberate decision to not accept the ongoing cost of supporting a hyperinflation scenario.
None of them involving stock exchanges controlled by computers though.
It's not just the maximum monetary value capable of being stored in a field, it is also the rate at which it changes which is of note. There has to be a "revalue" mechanism in place geared to massive daily changes. Charts used to show share prices and indices will have to be revamped to take inflation rate into consideration, and this would be dependent on the points at which revaluations occur through trading hours.
I would be interested to read up on when and how such decisions were made. Can a link be supplied please?
===
I'm aware that factories were shut down at intervals. I'm also aware that the LSE, for example, has mechanisms for imposing brief pauses in trading to cope with trading anomalies through the day. These may relate either to individual shares, or to the market as a whole. Being a Control System - a very complex one, things that throw the volatility of it out of kilter due to price gearing and the velocity of change would necessitate extra, clearly defined shutdowns. These would need to have greater visibility and be clearly understood by traders, otherwise grave instability will occur.
At the moment ad hoc price monitoring extensions are communicated through the regulatory news system, but how much extra load can be put on that system? How would the "depth" of the market cope with these unpredictable demands?
When I worked for an organisation that shall be nameless I was cross attached to a team of permies who were having issues with summing SWIFT values (VB6), I was told it was a problem they had been working on for several weeks - and could be expected to be there for quite a long time till the tests worked.
As you can probably immediately guess I was there for about 15 minutes; 1 minute to read the code (well only about the first 100 lines), 11 minutes to stop laughing and 3 minutes to explain that floats were not appropriate for financial transactions. When I returned to my other team we then spent about 4 hours trying to estimate how much money had been wasted through their ignorance, and what the consequences would be had they proceeded.
In all fairness their testing was good enough to identify the issue - they were just woefully ignorant of data structures.
This is all too large a problem these days with the Clever Untested New Technology masters..
Schools skip over the basics and throw on the frameworks and languages.. and the new tech masters have no idea what goes on behind the interpreter/compiler curtain.
From data structures, compilers, interpreters, to algorithms, to how the high level code actually maps to a real machine.... nothing is known... and they get themselves into trouble... as the answers to problems they must deal with in real life are not found in the back of the book or in last year's classwork.
Unfortunately, most of the 'teachers' are themselves incompetent, and are teaching as they are unable to get a job themselves..... Of course this generalisation is not accurate, as there are some great teachers....
Double or extended precision is entirely appropriate - if you know what you’re doing. If you can estimate worst cases for rounding errors and compensate for them.
If you were nowhere near finished after reading “the first hundred lines” of code then they didn’t know what they were doing. And that would be the problem, not floating point.
One of the things I ponder about is if hyperinflation were to hit the dollar or the pound. Would financial systems be able to cope? This story indicates that this hasn't been thought through on a wider scale.
===
Just imagine if someone started a rumour that Berkshire Hathaway were a takeover target.
I remember now where I read this...
Joe Celko mentions it in his book Data and Databases: Concepts in Practice, section 6.9.1.
===
It can be done by having three internal fields for each display field (integer, numerator, denominator) separately maintained, and aggregated when needed for reports and display.
I would think that calculations such as standard deviations would not require the exact same precision that a financial transaction needs.
The idea of storing a third, a third and a third as numerator/denominator is there is no rounding if adding them together, the database engine would sum the numerators to give three thirds, which would collapse down to adding one in the numeric field.
As you know rounding can lead to bias in how numbers are aggregated, which banker's rounding attempts to address in a fair way, but it is still a compromise.
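The bias is easy to show with Python's decimal module. A sketch with made-up amounts that all sit exactly on a half; round_all is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

amounts = [Decimal("0.5"), Decimal("1.5"), Decimal("2.5"), Decimal("3.5")]
print(sum(amounts))   # 8.0 is the exact total


def round_all(values, mode):
    """Round each value to a whole unit with the given mode, then total them."""
    return sum(v.quantize(Decimal("1"), rounding=mode) for v in values)


print(round_all(amounts, ROUND_HALF_UP))     # 10: every half is pushed upwards
print(round_all(amounts, ROUND_HALF_EVEN))   # 8: banker's rounding, ups and downs cancel
```

The cancellation here is exact only because the sample was chosen to be balanced; in general banker's rounding reduces the bias rather than removing every error.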
What you're really saying is that the stock price couldn't be represented as an unsigned long because of overflow, so it needs to be stored in an unsigned long long. These standard types exist regardless of the system memory or machine instruction width. The actual value isn't really an unsigned long, either; it's a messed up version of fixed point arithmetic using a base-10 multiplier. In real fixed point arithmetic the multiplier is 2^n, that is a shift of 'n' bits, with those bits representing the fractional part and the rest of the number space the integer. There are libraries and macros that help with keeping track of fixed points -- typically you'll declare a unique type that will be a 'Q20.12', 20 bits for the integer and 12 for the fraction. The notation depends on the library; AFAIK it's not standardized.
So why not just use floating point? Two issues -- it's imprecise and relatively slow. It's convenient and good enough for a lot of calculations, but when you're running serious real-time you need the speed and predictability of integer arithmetic.
BTW -- If you're doing work with very large or small numbers then the preferred format is BCD. Numbers are represented by strings of indefinite length and are of arbitrary precision. Operations are necessarily slow but will be precise. As you develop the calculations it's possible that floating point may be usable for some or all of it, but you still need to use BCD to check that any errors that creep in are tolerable.
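A sketch of the Q20.12 idea in Python, for illustration only. The names to_q, from_q and q_mul are hypothetical, and nothing here checks that the integer part stays within 20 bits.

```python
FRACTION_BITS = 12
ONE = 1 << FRACTION_BITS   # 4096 represents 1.0


def to_q(value: float) -> int:
    """Convert to fixed point by scaling by 2^12 and rounding."""
    return round(value * ONE)


def from_q(raw: int) -> float:
    """Convert a fixed point value back for display."""
    return raw / ONE


def q_mul(a: int, b: int) -> int:
    """Multiply two fixed point values; the product has 24 fraction bits, so shift 12 away."""
    return (a * b) >> FRACTION_BITS


a = to_q(1.5)     # 6144
b = to_q(2.25)    # 9216
print(from_q(a + b))         # 3.75: addition is plain integer addition
print(from_q(q_mul(a, b)))   # 3.375
print(from_q(to_q(0.1)))     # 0.10009765625: a tenth is not a multiple of 2^-12
```

The last line hints at why a price feed might prefer a decimal multiplier such as 10,000: cents and hundredths of a cent then come out exact.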
Working for insurance company, in the dim and distant, when storage cost money.
So, how did we handle UK decimal currency in PL/1?
We defined a signed fixed (11,1) (packed decimal) field which took 6 bytes, and then moved the amount in pennies into it, so we got 01234567890C in hex. So £123.45 was x'00000123450C'
Then, we defined a 5-byte character field on top of that, using a pointer.
Then we moved the five bytes into the record for storage, to save the waste of space of the last x'0C'.
Reverse process when reading.
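Roughly what that trick amounts to, sketched in Python rather than PL/1. The layout is an assumption: eleven digit nibbles plus a trailing sign nibble (C for plus) in six bytes, with non-negative amounts only. The function names are made up.

```python
def pack_decimal(digits: str, negative: bool = False) -> bytes:
    """Pack a digit string plus a trailing sign nibble (C = plus, D = minus)."""
    nibbles = digits + ("D" if negative else "C")
    if len(nibbles) % 2:
        nibbles = "0" + nibbles   # pad to a whole number of bytes
    return bytes.fromhex(nibbles)


def store_amount(pence: int) -> bytes:
    """Build the 6-byte field, then keep only the first five bytes."""
    field = pack_decimal(f"{pence:010d}" + "0")   # ten digits of pennies, one zero after the point
    return field[:5]                              # the dropped byte is always x'0C'


def load_amount(stored: bytes) -> int:
    """Reverse process: the five stored bytes are just ten decimal digits."""
    return int(stored.hex())


print(store_amount(12345).hex())          # 0000012345
print(load_amount(store_amount(12345)))   # 12345
```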
It got worse. Disk storage was expensive. We had one file which was a Regional(1) dataset (if I remember rightly from 30+ years ago). These had no key, but were accessed by location on disk. The file held 10,000,000 one-byte records, one for each policy number. The first two bits of the byte held two status indicators, the last 6 bits held an integer that was the policy type (10-63).
Ah the good old days.
Biting the hand that feeds IT © 1998–2021