* Posts by Mye

12 publicly visible posts • joined 11 Jul 2022

Fining Big Tech isn't working. Make them give away illegally trained LLMs as public domain

Mye

The proposed fees for a sovereign fund are not a tax; they are rent, just like the monopoly holders of copyrighted and patented IP charge rent for using that IP. The foundation of why an IP holder or user should pay rent to society is that no IP is ever generated from nothing. It's all based on the work that comes before. As I said, all IP is part of a common pool. For example, many drugs result from research funded by governments. That research is based on something discovered previously. Pharmaceutical companies take that research for free and make obscene profits. They should be paying into a pool that could be used to pay for more study or benefit society in other ways.

The same is true of art. Artists train on the works of others and then incorporate the techniques and perspectives into their own work. This is why there are "schools" of art: collections of artists who use the same or similar techniques to produce art, such as the Impressionists, Pointillists, abstract, Hudson River Valley, and pastoral painters. Art is also not limited to Homo sapiens. Earlier hominids left behind similar works on cave walls or as carved shapes. Elephants are another creature that creates art.

We don't need to significantly restrict or change IP protections. What we need is FRAND (fair, reasonable, and non-discriminatory) licensing. Semiconductor manufacturers already use this practice. See https://www.jedec.org/about-jedec/patent-policy. This should become standard for all protected IP. You get paid for it, but you can't restrict what people do with it. This could benefit society in many ways, for example, lower-cost pharmaceuticals. Third parties could make the same drug as the original discoverer, and they could do it at a lower price rather than for more profit. One thing we need to protect against is the equivalent of patent trolls: someone "inventing" something and claiming part of the FRAND licensing pool without actually producing something themselves.

Yes, intellectual and physical labor are different. If you dig a ditch, you can't claim IP protection and extract rent from all ditch diggers for it. However, many people, including myself, have discovered that the results of our intellectual labor had already been created by somebody else earlier. This is why the patent office has a first-to-file rule. As a society, we need more intellectual honesty and effort to determine whether a piece of IP is unique enough to be worth protecting. I also think we need to expand IP protections to AI-generated intellectual labor.

Mye

Even 30 years is too long. At 20 years, the vast majority of IP is forgotten and abandoned. There is the issue of the time from discovery to reaching the market. I would also suggest that if IP has not been commercialized and is not actively generating sales five years after registration, it should be considered abandoned and become part of the public domain.

Mye

Re: Delete them

A better idea is to let them remove your IP. You will see that your IP does not matter to the system's function. In other words, when taken in aggregate with all the other content, your content is insignificant and has very little value. If we get IP tracking, you will see that your IP is not unique, and you owe others for what you generated/"plagiarized".

Mye

Re: Small thinking

Fixing the IP tracking is an engineering problem. However, I'm not entirely convinced it is needed. The model behind the chat window does not contain the works in literal form. It is more like turning a book into papier-mâché, which is permitted if they buy a single copy of the book (doctrine of first sale).

Mye

Re: Who wins?

This is why I recommended a RAND solution. On the other hand, they may only get 10^-9 of a cent for the fragment of their work that was used, but at least they get paid.

Mye

Re: Throwing the baby out with the bathwater

That's a good question. I don't think so. For example, with writing code, no, they can't. They need as big a training set as possible to synthesize the code matching the prompt. With Grammarly, it seemed that the bigger their data set, the more accurate the grammar corrections. Back in the deep, distant past, I had conversations with people at Dragon Systems. They told me that every generation of Dragon had an increasingly large vocabulary, allowing it to better match what was said to what should be written. From what I can tell, they were using a precursor to an LLM, with language models that depended on larger data sets. AI speech recognition is so much better than Dragon. Play with Aqua sometime and you will see what I mean. It still has problems with performance, but it's getting better.

In a different comment, I outlined a different solution than banning companies from using harvested information. Take a look and tell me what you think.

Mye

Throwing the baby out with the bathwater

For me, AI has created two major accessibility capabilities. My hands don't work right. I use speech recognition instead of keyboards. AI is significantly better at speech recognition than Dragon. Grammarly helps with fixing speech recognition errors, both in grammar and in missing words. With speech recognition, I'm now able to write code again. I'm sure that speech recognition could also aid in improving the usability of computers for vision-impaired people. With appropriate peripherals, it would probably also work for mobility-impaired people.

I suspect you and many others here are TABs (temporarily able-bodied). Computers have royally forked us over, and until the introduction of chatbots, it had only gotten worse. If you're looking for an ethical and moral reason to keep developing AI systems, accessibility is my number one reason. Then there are a whole host of other areas where AI is beneficial, such as weather forecasts, finding financial crimes, and developing new medical treatments. All of these systems are based on a mixture of public domain and copyrighted information. Without that information, the system's performance is degraded to something that is not worth running, and we, as a society, lose out.

Mye

Something I wrote elsewhere.

I know I have pissed off more than one person with my stance on the IP changes necessary to accommodate the unethical behavior of AI companies. In a nutshell, all information discovered by humans is part of a common pool that should be shared among all people. If you use that information to make a profit, the moral and ethical requirement is to put some of that profit into a sovereign fund (see Norway's revenue from oil fields, and FRAND licensing), and any IP discovered while creating the AI system should be declared public domain.

I concede that limited-time monopolies are needed to protect new IP discoveries and enable the discoverer to extract revenue from the market, which would incentivize the commercialization of discoveries, whether from an AI or a biological entity.

My view is driven by my desire to gain a new perspective on the topic and have new conversations without re-treading the same cemented positions.

I thought I was alone in this idea until I discovered this article in my morning news feed. While it starts from a different set of premises, it ends up with almost the same result.

IMO, there are some flaws, such as preserving the current copyright model, using GDPR to restrict what you can use for training, and focusing on punishment rather than social benefit. However, the outcome of training results being considered public domain is good.

XenServer teases free VMware migration package

Mye

Re: Proxmox VE

As they say, nobody ever got fired for buying ESXi. But that perspective prevents people from seeing good alternatives.

Of all the virtualization solutions available, only VMware and XCP-ng+XO provide complete end-to-end solutions that are not a bag of parts. XCP-ng has become my go-to virtualization solution, especially since Broadcom bought VMware.

Mye

Re: XCP-NG

I use XCP-ng in production at three customer sites. The current version of Xen Orchestra will import VMware virtual machines directly into XCP-ng and XO. I've imported several VMware virtual machines (Windows and Linux), and the only ones that gave me trouble were those whose guest OSs were more than five or six years old.

I agree with you that XO and XCP-ng are not yet as feature-full as VMware, but in my day-to-day use, there is no functional difference between the two except that I find XO much easier to work with than anything from VMware.

Xen hypervisor port to RISC-V moving – slowly, but moving

Mye

Re: Is that the same XEN hypervisor...

Kind of. XCP-ng is a fork of Citrix XenServer, which is built on the original Xen hypervisor.

I've worked with all the big hypervisor players in various organizations, and I have found XCP-ng plus Xen Orchestra to be one of the easier systems to manage, providing better functionality for the dollar than any of the others.

Xen lost popularity because of Citrix and its predatory pricing, not anything to do with its technical foundations, which I consider solid.

It's worth spending the time to understand XCP-ng and Xen Orchestra.

Even robots have the right to learn from open source

Mye

Re: The complaint isn't about the use of FOSS code, it's about attribution

Whatever it generates, it's not your unattributed code. There are numerous occasions where I have found code functionally identical to what I've written on a commercial project inside an open source project. My code came first. Should the open source project license my code from the company I worked for?

After decades in software development, I've come to the conclusion that the same code will be generated over and over again in different contexts and different languages, because we work with a very small and very finite set of ways of expressing solutions to problems in code.

Even in cases where there are multiple solutions to a problem, the number of solutions can be counted on one or at most two hands.

In other words, plagiarism is inevitable because the same description can cause multiple programmers to create the same code, and that is what we have here. Give Copilot a description, and from that description it uses its trained network to generate code from scratch. It's not copying your code or anybody else's code. It is generating new code based on your description.

Copilot is working just like a human programmer in that it recognizes a pattern and reapplies that pattern in new contexts. The only difference is that it is able to scan many orders of magnitude more code than you can in order to identify patterns and figure out how to generate similar code based on a description. Another way of thinking about it is that it's like you on Stack Overflow, except much, much more efficient.