Speed before new USB standard
It would be nice to fill a USB stick in 2 seconds, rather than minutes. Would change the outcome of many a Hollywood movie plot.
Samsung has unveiled a new generation of high-bandwidth DRAM chips called Flashbolt. These are the first chips based on the Korean chaebol's updated High Bandwidth Memory (HBM2E) model, and are the fastest the company has ever produced. Samsung claims that Flashbolt is 33 per cent faster than the previous generation HBM2 system …
I can understand devoting R&D to increasing bandwidth for GPUs etc., but why not work on inventing cheaper memory with lower latency too? I know SRAM already exists; that's why I added the words invent and cheaper.
Just think how much cheaper, simpler and faster (and more secure) CPUs could become if they didn't have to dedicate precious die space to all that L1/L2/L3 cache, branch prediction and speculative execution type malarkey.
And before you say it, I know it's hard. But like all good solutions, once it's done, everyone will shake their heads in amazement and bewilderment as to why no one did it sooner and instead spent billions on coming up with convoluted cache and speculative execution type arrangements.
RAM and CPU speeds were rarely in sync. Early computers spread across multiple racks had to slow all clock rates down to allow signals to travel from one end to the other, and still the CPU was typically slower than memory. The CDC 6600 supercomputer was the fastest thing in the '60s, achieved mainly by shrinking the CPU circuit boards, using faster transistors, and tailoring the instruction set. The CPU was now much faster than RAM, so they had the CPU cycle between 10 peripheral processors (handling memory and external I/O). Kind of like hyperthreading, only on a hardware level.
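For a feel of that cycling arrangement, here's a toy sketch in Python. It is grossly simplified and the names and structure are illustrative only: one fast execution unit visits ten slow peripheral-processor contexts round-robin, so each one advances once every ten CPU cycles.

```python
# Toy sketch of a "barrel" arrangement (heavily simplified, illustrative only):
# one fast execution unit visits 10 peripheral-processor contexts round-robin,
# so each PP effectively runs at one tenth of the CPU clock.

from itertools import cycle

class PeripheralProcessor:
    def __init__(self, name: str):
        self.name = name
        self.pc = 0  # toy "program counter"

    def step(self) -> str:
        self.pc += 1  # pretend one instruction was executed
        return f"{self.name} ran instruction {self.pc}"

pps = [PeripheralProcessor(f"PP{i}") for i in range(10)]
barrel = cycle(pps)  # round-robin over the 10 contexts

for cpu_cycle in range(1, 21):  # 20 CPU cycles: each PP gets 2 turns
    print(f"cycle {cpu_cycle:2}: {next(barrel).step()}")
```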
RAM got faster too, moving from individual magnetic cores to integrated circuits.
Smaller and faster go hand in hand. The late '90s saw CPU clock rates climb towards the GHz range. The higher the clock rate, the smaller the maximum size of circuit you can control with one synchronous clock. A 10 GHz clock signal can travel about 30 mm per cycle, limited by the speed of light. And that's in a vacuum, in a straight line; it's somewhat less on a chip with lines routed all over the place. So at those speeds you're lucky to keep an entire CPU chip under control; you can't have the memory chip next to it running in sync on the same clock, much less an entire board with hundreds of GB of RAM. It's not that they can't make extremely fast RAM that could keep up with a CPU, but you can't route signals all over the motherboard at CPU clock rates. So faster CPUs required more and more layers of cache.
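To put rough numbers on that 30 mm figure, here's a back-of-the-envelope sketch in Python. The ~0.5c velocity factor for a typical PCB trace is my own assumption, not from the post above.

```python
# Rough numbers for how far a signal travels in one clock cycle.
# Assumes propagation at the vacuum speed of light (best case) and at
# ~0.5c, a typical figure for a trace on FR-4 PCB (assumption).

C = 299_792_458  # speed of light, m/s

def distance_per_cycle_mm(clock_hz: float, velocity_factor: float = 1.0) -> float:
    """Distance covered in one clock period, in millimetres."""
    return C * velocity_factor / clock_hz * 1000

for ghz in (1, 3, 10):
    vacuum = distance_per_cycle_mm(ghz * 1e9)
    pcb = distance_per_cycle_mm(ghz * 1e9, velocity_factor=0.5)
    print(f"{ghz:>2} GHz: ~{vacuum:5.0f} mm per cycle in vacuum, "
          f"~{pcb:5.0f} mm on a typical PCB trace")
```

At 10 GHz that works out to roughly 30 mm in free space and around 15 mm on a board, which is why a single synchronous clock domain can't stretch across a motherboard.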
To add on to what Scott mentioned: it's important to know that "speed" has a lot of facets in this kind of conversation. Clock speed only points to the signal generator and the ability to accept/process instructions. Things like bit width and efficiency also play a big part.
We often use the word speed interchangeably when it's not really appropriate. For example, a network link's speed is usually quoted in bits per second, but that isn't the only measurement to consider when latency can limit your ability to realise that potential.
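A toy illustration of that bandwidth-versus-latency point, in Python. The link speed, round-trip time and payload size are made-up numbers chosen only for illustration: an application that waits one round trip per response sees nowhere near the link's headline rate.

```python
# Sketch: why "bits per second" alone doesn't tell the whole story.
# Effective throughput of a request/response exchange, assuming the
# application waits one round trip per payload (hypothetical numbers).

def effective_throughput_bps(link_bps: float, rtt_s: float, payload_bits: float) -> float:
    """Throughput seen by an application that pays one RTT per payload."""
    transfer_time = payload_bits / link_bps   # time to serialise the payload
    return payload_bits / (transfer_time + rtt_s)

link = 1e9          # 1 Gbit/s link
rtt = 0.050         # 50 ms round-trip time
payload = 8 * 64e3  # 64 kB response

print(f"Link speed: {link / 1e9:.1f} Gbit/s")
print(f"Effective:  {effective_throughput_bps(link, rtt, payload) / 1e6:.1f} Mbit/s")
# Prints roughly 10 Mbit/s effective on a 1 Gbit/s link.
```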
If my memory serves me (and adding to what Scott mentioned), the divergence in the x86 world occurred when the clock generator in the CPU (circa the Intel 486) tried to go beyond 33 MHz. The memory available at the time couldn't operate at frequencies higher than that, and the system boards' circuit design had a problem with EM interference because of right-angle traces and electrons "flying off the track".
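That 33 MHz ceiling is roughly where clock multipliers came in: the core ran at an internal multiple of the bus clock while the board stayed at 33 MHz. A tiny illustration in Python, using the familiar retail 486 parts as examples:

```python
# Sketch: keep the bus at ~33 MHz and multiply the clock inside the CPU.
# Multipliers below match the well-known retail 486 parts; the bus figure
# is nominal (33.3 MHz).

bus_mhz = 33.3
parts = [("486DX-33", 1), ("486DX2-66", 2), ("486DX4-100", 3)]

for name, multiplier in parts:
    core = bus_mhz * multiplier
    print(f"{name:11}: {bus_mhz:.1f} MHz bus x {multiplier} = {core:.0f} MHz core")
```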
"...and devices for artificial intelligence training."
Annotating by dictionary definitions is surprisingly simple and fast: only around 1-50 of a definition's paragraphs need to be compared with 1-10 surrounding paragraphs. For a text of 10,000 words that's only around, or much less than, 10 million comparisons, which takes less than a second or a few seconds.
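One way to read those numbers (my interpretation of how the arithmetic combines, not the original poster's stated method, and the comparison rate is an assumed figure):

```python
# Back-of-the-envelope reading of the estimate above (my assumption about
# how the numbers combine, shown for illustration only).

words = 10_000              # size of the text
defn_paragraphs = 50        # worst case: paragraphs per dictionary definition
context_paragraphs = 10     # worst case: surrounding paragraphs compared against

comparisons = words * defn_paragraphs * context_paragraphs
rate = 10_000_000           # assumed comparisons per second on modest hardware

print(f"Worst case: {comparisons:,} comparisons")                  # 5,000,000
print(f"At ~{rate:,} comparisons/s: ~{comparisons / rate:.1f} s")  # ~0.5 s
```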
The indexing could take ten or more times longer, perhaps even hundreds or thousands of times longer.
As for the images, formulas, signs and signals, it depends on what software is used.