In the new "virtual reality"
Absolutely everything and everyone is for sale to the highest bidder, and there are some very high bidders out there.
WeTransfer this week denied claims it uses files uploaded to its ubiquitous cloud storage service to train AI, and rolled back changes it had introduced to its Terms of Service after they deeply upset users. The topic? Granting licensing permissions for an as-yet-unreleased LLM product.
I think this will be like Big Data. Companies want our stuff, but the models aren't producing things on a level a human can anyway, and one person's excellence can be drowned out by everyone else's mediocrity. So even if your files include great groundbreaking things, they won't be weighted more against all the other idiots' things.
"Data needs to live in a database – not an Excel file. Access should be granted for each according to their needs."
Yeah, tell that to 98% of the population: they wouldn't have a clue how to run a database for their data needs and storage. If you've got a simple solution, Mr Eden, let's have it, but a bloomin' database it is fecking not.
Also Data should not be in something an A.I should be near unless you run it and know how it works, doesn't matter if it's perceived to be sensitive by you, the probability is that someone can derive value from it and that will likely not be to your benefit...
"An Excel file is an excellent way to implement many a simple database"
There's close to 19k people who would probably disagree with you on that one.... It was all down to a hidden tab, apparently....
https://www.gov.uk/guidance/data-incident-affecting-applicants-to-the-afghan-relocations-and-assistance-policy-scheme-and-afghanistan-locally-employed-staff-ex-gratia-scheme#background
But on the flip side, spinning up a database when a simple VLOOKUP will do is overkill, so the spreadsheet has valid use cases. Don't get me wrong, a database is the correct way to do many things, but a spreadsheet can do lightweight tasks too. It depends on how you get your spreadsheet data. Look at the Durham Uni CEM exam analysis spreadsheets: they should probably be a database, but sending those to schools as MDB files would be a disaster, so they have Excel macros. Not ideal, but portable for users.
I agree. A flat file system will meet the majority of most people's needs for data storage, even in a business context. There are certainly many situations where there is a better option, providing ability to create and manage it properly exists. However, it takes skill to develop and create a "proper database", skills that most people (and businesses) don't have. And for those that do, the media is full of accounts where the skill set fell short of arriving at what was actually needed (needed by the users, that is, not the consultants and company licensing the underlying software). I'm not claiming that spreadsheets (e.g. Excel) will meet all database needs, but most user needs are not really so complex that a more formal database application would be any more beneficial. Nor am I claiming that there aren't cases where people push Excel too far and stray into areas where there are better options. However, aiming for excellence is often the enemy of adequacy.
I'm now long retired but, from my working days, it was often the behaviour of those trying to push for the full-blown database that made the case to try Excel first (and once the spreadsheet was in use, and people managed to get the work done using it, it was an even harder challenge to change to what might have been better).
A screwdriver usually does the job of opening a paint tin, though there will be people who will try and sell you a tool just for doing that. The specialised tool (and similarly specialised tool for stirring the paint) will probably be a better choice for a painter who opens a lot of tins during their working day; at home, where it's only an occasional need, the screwdriver that's ready to hand does what's needed (opening the tin and, quite often, stirring the contents). And even if you've bought the special tool, you could have been well on with painting (the job that is actually needed) by the time you found it!
Absolutely. Add in that almost all small to medium businesses don't have any IT/software trained staff, and a proper DB is almost impossible to learn "on the fly" by inputting the data to hand,* even for a techie-minded admin/service agent, even if they had the time to try. Which they don't.
*Yes I tried. It didn't end well.
There used to be flatfile databases that ordinary users could, er, use. But they wouldn't have had the fine access control that a big grown-up DB has. However, they would be ideal for those mostly small-scale database systems that most people create in Excel. (There's no excuse if it's a complex, enormous, system-critical database; that's a different matter.)
Most small office workers don't have the skills, knowledge or time to master a big grown-up database. Even Access (an ironic name if ever I heard one) is far too complex for most people, even pretty techie ones, unless that's what they do 9 till 5.
When we think of databases we tend to think of big, complex stores of information. But there are an awful lot of people out there creating databases in Excel that list a few hundred objects and associated fields, who don't have a Scooby how to turn that into an access-controlled DB, and who never dreamt when it was first developed that it would one day hold thousands of objects.
Probably there needs to be a simple database programme designed in a modular way that can be upscaled over time. Starting with a simple flat file system and adding tools, complexity, security and so forth as needed from a menu system. It would, dare I say it, even be a good use of Co-Clippy.
" I see you are making a database with over a thousand data objects, would you like help with that?"
"There used to be flatfile databases that ordinary users could er use."
Not /rdb ? Not sure what became of this since '98.
Ashton-Tate's DBase II or III ? Surely not. ;)
To be honest most "ordinary" users couldn't be trusted with card index let alone anything vaguely resembling a database.
Personally I have never believed that spreadsheets were fit for any purpose whatsoever.
The format is reasonable presentation format for displaying numeric and other simple data in tabular form. Even a reasonable input format for tabular data or model parameters.
But integrating the model construction and execution into the input and presentation functionality was one of the more daft ideas in IT (and there have been and are some real doozies.)
A right bugger in this life that clouds never seem to remain yelled at.
† Have. Found it guilty as charged.
"To be honest most "ordinary" users couldn't be trusted with card index let alone anything vaguely resembling a database."
Some people do need some kind of simple database, privately (collectors) or professionally (about all non-IT professions). It really doesn't matter they don't have the l33t know-how required, they simply need one. Cue the use of any strange program they might decide to use for that, and Excel is something most people have access (no pun intended) to.
I don't say it's right, I just say it's unavoidable.
Absolutely. The typical small service company- one or two people on sales and billing, a handful of technicians/fitters/repairers/etc and someone who buys the parts. Every High Street has a few of these. And they probably use Excel for the customer list, bookings and purchases. Probably cobbled together by the boss, or delegated to the admin. Probably grown organically ("Why don't we add a list of <something they do>").
The nearest I've seen to "something a user can cope with" was FileMaker - but since it's not included in the "this is the only toolset I know about" 365 subscription most people use, then it doesn't exist.
I've seen some very non-technical users create quite complex things with FileMaker. With a previous work hat on, a significant part of the business ran on it.
"An Excel file is an excellent way to implement many a simple database"
No, Excel is a crap way to implement anything that even vaguely resembles a database.
First hurdle is stopping it from mangling your data: the "I think that looks like a [some type] so I'm going to mangle it whether you like it or not" approach. I've had that in the past. Send someone a file with the information they requested, and they come back and tell me it's wrong. Yup, they imported it into Excel, which cheerfully mangled it for them.
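To illustrate the kind of mangling being described, here is a minimal Python sketch. The `naive_autoconvert` function is a made-up stand-in for "helpful" type guessing, not Excel's actual logic, and the column names and sample values are invented for the example.

```python
import csv
import io

def naive_autoconvert(cell: str):
    """Hypothetical type guesser: tries int, then float, else keeps the text."""
    try:
        return int(cell)
    except ValueError:
        pass
    try:
        return float(cell)
    except ValueError:
        return cell

# Invented sample data: identifiers that merely look like numbers.
raw = "part_no,phone\n00123,07700900123\n1E5,01632960000\n"
rows = list(csv.reader(io.StringIO(raw)))
header, data = rows[0], rows[1:]

# The csv module hands every field back as text, so nothing is lost.
as_text = data

# Guessing types silently drops leading zeros and reinterprets "1E5".
guessed = [[naive_autoconvert(cell) for cell in row] for row in data]

for original, converted in zip(as_text, guessed):
    print(original, "->", converted)
```

The point of the sketch is only that the text-preserving path keeps `00123` intact, while any guess-the-type step has to be explicitly switched off or worked around.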
And that's before you get to strong data typing (doable, sort of, with a lot of effort), null value support (AWOL), referential integrity (AWOL), and many other things that are needed in even basic databases.
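For anyone wondering what null support and referential integrity actually buy you, a small sketch using Python's built-in `sqlite3` module follows. The `customer` and `booking` tables and their contents are invented for illustration; SQLite is just a convenient lightweight engine here, not a recommendation from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE booking (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        notes TEXT  -- NULL means "not recorded", distinct from an empty note
    )
""")

conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Sid''s Garage')")
conn.execute("INSERT INTO booking (customer_id, notes) VALUES (1, NULL)")
conn.execute("INSERT INTO booking (customer_id, notes) VALUES (1, '')")

# Null support: "missing" and "blank" can be told apart in one query.
missing = conn.execute(
    "SELECT COUNT(*) FROM booking WHERE notes IS NULL"
).fetchone()[0]

# Referential integrity: a booking for a non-existent customer is refused.
try:
    conn.execute("INSERT INTO booking (customer_id, notes) VALUES (99, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(missing, orphan_rejected)
```

In a spreadsheet both checks would be manual: an empty cell cannot say whether the value is blank or unknown, and nothing stops a row pointing at a customer that was deleted last week.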
With my work hat on (hence posting anon), I have to work with some of these so-called databases, and to say it's a p.i.t.a. would be an understatement. Having to spend months manually cross checking things that could be done in minutes (OK, perhaps a week or two as I also had to cross check the cabling diagrams) with a proper database - check. Manually filtering because there's no concept of one-many relationships (so someone decided the way around that was to put multiple entries in a multi-line text cell) - check. Manually looking for missing values because there's no null support - check. And at times it seems I'm the only one who thinks there might be a problem with that! And yes, I did find an error (not in the system I was checking, but one it relies on) which, had I not found it, would have caused some £[number with lots of zeroes] rework down the line on a high profile £[number with even more zeroes] project.
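As a rough illustration of the one-many point, here is a sketch in Python with `sqlite3`. The `cabinet` and `cable` tables, labels and destinations are all hypothetical; the real systems described above are not known. Each cable gets its own row instead of being one line in a multi-line cell, so the cross-checks become queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cabinet (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
    CREATE TABLE cable (
        id INTEGER PRIMARY KEY,
        cabinet_id INTEGER NOT NULL REFERENCES cabinet(id),
        destination TEXT  -- NULL while the far end is still unknown
    );
    INSERT INTO cabinet (id, label) VALUES (1, 'CAB-A'), (2, 'CAB-B'), (3, 'CAB-C');
    INSERT INTO cable (cabinet_id, destination) VALUES
        (1, 'Panel 1'), (1, 'Panel 2'), (2, NULL);
""")

# One cabinet, many cables: count them per cabinet, including cabinets with none.
cables_per_cabinet = conn.execute("""
    SELECT c.label, COUNT(k.id)
    FROM cabinet c LEFT JOIN cable k ON k.cabinet_id = c.id
    GROUP BY c.id
    ORDER BY c.label
""").fetchall()

# Missing values: list cabinets that have a cable with no recorded destination.
unfinished = conn.execute("""
    SELECT c.label
    FROM cable k JOIN cabinet c ON c.id = k.cabinet_id
    WHERE k.destination IS NULL
""").fetchall()

print(cables_per_cabinet)
print(unfinished)
```

With the multi-line-cell approach, both answers mean reading every cell by eye, which is presumably where the months go.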
No, Excel is not even a mediocre way to implement anything bigger than a shopping list. But it gets used because a) people don't know what an even playschool grade database looks like, and b) people tend to have it to hand, and c) they don't have anything better, and d) it's been crammed with functions to make it look like it might be a database (if you are clueless as to what a database actually looks like). And the costs to global society from all this misuse of a crap tool must be immense.
No, EXCEL IS NOT A DATABASE
Why?
The alternatives involve either opening up your own filestore/database to third parties (good luck securing that) or permanently storing that data with some random third party who you trust to restrict remote access to only those you specify (and pay them for the ongoing costs).
Sending a file - preferably encrypted against the errors, omissions and commissions of intermediaries - at least bounds the potential data loss. It's no worse a vector for dodgy executables than any of the other options.
I believe he may have been talking about corporate policies, where all company data is responsibly managed via controlled means. Laptops are locked down, with no locally accessible storage. The company IT dept sorts it all out. Dropbox would have no place in the workflow.
The problem in smaller organizations is that there's not usually a secure data transfer mechanism (or it's not well publicized) for communicating with outside entities, so employees default to using Dropbox or something like it. Maintaining one's own solution comes with its own pitfalls, so sending files via a third party service is often good enough for many use cases.
"The company IT dept sorts it all out"
Solutions like WeTransfer exist because for most people that isn't an option. It's particularly not an option in "creative" industries where underpaid freelancers are very often providing large files to their barely-corporate content contractors. Using proper access controls doesn't actually require corporate policies, it just requires an awareness of the potential threat and the mitigation (appropriate encryption) to deal with it.
"A what?" would be the standard answer of 99% of people out there. You can be a great [non-IT professional], and still not have the skills required to think of this solution and implement it.
That's why services like Dropbox or WeTransfer flourish: The vast majority of professionals out there are not IT professionals, they need something simple which "just works".
Whoever heard of Dropbox or WeTransfer until they first heard of them?
The question is how to keep US megacorps' fingers out of SMBs' data, so you recommend US megacorps on the basis that they have better marketing - because marketing is effectively what you're saying.
If transferring data is part of a business's operations then acquiring some knowledge about the options available should be essential. Would you say that knowledge of the other things it does, be it accounting, machine tools or design software, should also be neglected?
"Solutions like WeTransfer exist because for most people that isn't an option. It's particularly not an option in "creative" industries where underpaid freelancers are very often providing large files to their barely-corporate content contractors. Using proper access controls doesn't actually require corporate policies, it just requires an awareness of the potential threat and the mitigation (appropriate encryption) to deal with it."
Many freelancers run with a gmail and instapintatwitface account rather than their own domain. I have several domains and a gob of FTP space. I can upload a file to my host (who contractually can't do anything with it outside of storage) and send a link to a client. When I send a file, the link is my own domain so clients know it's legit. It also doesn't cost me a bloody cent for the volumes I do. I have a Dropbox account since some clients insist on using one, and they can send a file to mine instantly and I can download it from there. There are also a couple where I do that in reverse. I won't add any more accounts for things like that, and I don't tell anybody up front that I will do Dropbox transfers.
Just don't mention standard naming conventions where on NAS storage folders have to be structured "like this" and named "like this" and all files prefixed "like this" for "this" or "that" type of data/content.
Only saw it a couple of times and it looked a nightmare to use...no wonder it was universally ignored !
Picard: What do you mean the Borg have joined the Federation? This is insane! Don't you remember what they did to me?
Admiral: I know.. I know Jean Luc. But they have come a long way from forcible assimilation. They play by the rules now. They have lawyers, written policies and terms of service. They are civilized now.
No. You regret getting caught.
There was no confusion. There was you slipping in a bit of text written by a highly-paid besuited Porsche-driving lawyer to ensure that, in the future, all of your options were open and available.
Unfortunately, it would appear that there are people who actually bother to take the time to read those coma-inducing piles of legalese (no, it's not English, it's its own language) and who have enough knowledge themselves to decrypt and unravel the truth behind the waffling.
And it would appear that, this time, they are not your friends.
Who knew ?
> There was no confusion
Obviously not, this is a well-used formula of corporate arrogance. Instead of "we were wrong", they claim "we were misunderstood" (implied: "by those hopeless idiots out there")...
You hear this often after PR blunders like this one, when companies with arrogant management are involved.
" There was you slipping in a bit of text written by a highly-paid besuited Porsche-driving lawyer to ensure that, in the future, all of your options were open and available."
There was you slipping in a bit of text written by (or rather, copied by) a highly-paid besuited Porsche-driving lawyer to ensure that, in the future, all of your options were open and available.
The terms quoted are very boilerplate at this point and one should never agree to them.
I pasted that snippet into Copilot (yes, I know, sorry) & this is what it returned...
That’s quite a powerful clause. Let’s unpack it so it’s easier to understand:
What It Means
By agreeing to these terms:
- You’re giving the company permission to use any content you provide.
- That permission is forever ("perpetual"), worldwide, and free of charge.
- They can transfer or sub-license the rights to others.
- Your content may be used for:
  - Running and improving their service
  - Developing new services or technologies
  - Enhancing machine learning models (especially for content moderation)
Why This Matters
- You lose exclusive control over your content.
- Even if you stop using the service, they may continue to use your content indefinitely.
- It could be shared or repurposed beyond the original platform.
If the content you’re sharing includes personal info, creative works, or sensitive data, this type of clause definitely deserves a closer look.
So why wasn't/isn't there such a fuss over M$'s ToS? Their implied Services Agreement includes:
"...you grant to Microsoft a worldwide and royalty-free intellectual property license to use Your Content, for example, to make copies of, retain, transmit, reformat, display, and distribute via communication tools Your Content on the Services."
There's even more detail about your personal data in M$'s Privacy Statement.
That affects many, many more users than WeTransfer, and (almost) everyone blindly signs up by default...
Because for a lot of people a) Microsoft is the computer, like Edge is the internet, and b) work requires Microsoft.
WeTransfer is specifically for transferring sensitive information, so the people using it are a lot more clueful about privacy and security, otherwise they wouldn't be there.
That any data they supply to a third party will be used for AIs (even if just their own and supposedly only for internal use) and quite possibly find their way into the hands of bad guys. Sometimes you can't help it - you can't help that by showing up for your annual physical exam you are giving your medical provider data. You can't tell them "don't record my cancer diagnosis in my medical records I don't want that information falling into the wrong hands".
But where you do control the outflow of data, like if you are syncing/backing up your phone to the cloud or putting a file on a file sharing site to share with a friend or coworker, you can protect yourself / your data by encrypting it using a key that YOU control rather than relying on the third party's promises of their own encryption. If they can't read it, they can't use it meaning you don't have to rely on their promises or potential changes in policy, and bad guys can't steal it.
Yes that's wonderful in theory. Now try to share a file with your cousin the auto mechanic using PGP. If something isn't easily accessible to people of average/typical technical abilities then it doesn't exist as an option 98% of the time.
If you want something that is accessible to most people you'd use Windows native ZIP tool to encrypt the file before sharing it, and provide them the password via secure means (like encrypted chat or something). They might be annoyed at the extra steps versus just getting a file they can immediately use, but they'd be familiar with Windows ZIP and have done this process before.
There's a lot more infrastructure that would have to be changed than SMTP to make that work. You have to distribute the keys and for servers that's done via DNS but since this is per user you can't realistically use DNS but would need to create something new.
I agree we should have done something to standardize this, but doing so would present a far more complex problem than you seem to think.
SMTP transfers can be encapsulated by TLS. Both ends of the connection need to agree to use it so in my experience (run my own server for the family domains) it doesn't happen very often. This ensures the contents of the message remain private whilst in transit.
PGP or S/MIME allows the recipient to verify that the sender is who they say they are and as noted would require a whole PKI to be built out on a per user level. Anyone who has used ssh will know what that entails.
But who uses email nowadays? Its going the way of the landline. A lot of email services impose fairly low limits on the maximum size of a message so if you need to send say a bunch of photos to someone you have to break them up into batches. Its easy to see why cloudy file sharing services take over in that scenario especially when they are integrated right into your desktop. Along comes ML training and its not hard to predict what happens next.
Even if email did go the way of the landline (color me skeptical on that) if there was infrastructure to distribute validated per user public keys that would be valuable for any other personal communication systems to have at their disposal. For instance a closed messaging system like iMessage or WhatsApp does the end to end encryption by doing exactly the sort of key distribution system I mentioned the lack of in my earlier post. Except that rather than the sort of open worldwide directory of email addresses that would require, those are a private directory of just Apple users and just Meta users.
If there had already been a global system to distribute PGP keys for everyone's email account, then end to end encryption would be possible not just within a closed system like iMessage or WhatsApp, but possible from anyone to anyone regardless of the method of communication. The "key" would still be an email address, but that's actually true of a lot of these messaging systems where it is keyed either by your email (i.e. your Apple ID) or by a username that is associated with an email address (WhatsApp) so it could use the same infrastructure that had been set up for email (if one had been set up)
That would allow file sharing type stuff to use that system too - I want to give you a file, so I visit a file sharing site and upload a file which I mark as being for "munnoch@gmail.com" or whatever your email was. The file sharing site uses the standard API that would exist to look up your public key based on that email address, encrypt the file with it, then send you a notification that the file is waiting for you. You click on it and whatever device/software you used to click on it would have your stored private key at its disposal to automatically decrypt the file for you as it is downloaded.
Re: "But who uses email nowadays? Its (sic) going the way of the landline".
Vast numbers of people, outside the corporate or Whatsapp worlds (respectively).
Most small businesses use email. Schools and other public-facing institutions use email. My wife's communications from Girl Guiding come in emails, and her messages to the Brownies' parents go by email.
There's an awful lot of old fashioned email about still- and come to that landlines, even if they send the calls through IP and not copper now.
"I don't know but doesn't every social media giant have similar clauses in their T's & C's that they will use the content you upload to improve their moderation algorithms "
Pretty much all of them. Not only can they monetize PII, they can also monetize everything a user uploads. If a company wants to license a photo you posted on Instagram and you ask for too much money, the company can license it from Instagram as you have given them permission to do so. According to a story on FStoppers, the photographer lost the case. I haven't kept up to see if any more have been reported.
"....at some point someone will decide to take advantage of them."
It's not just the corp whose T&C's you've accepted. If you read a bit further down, that license extends to the entity that gobbles them up or buys the assets out of bankruptcy. That new company may decide that selling copies of your work to help pay for the acquisition is just the boost they need.
Digital documents qualify for the old saying these days. The contract overrides any verbal or other assurance that they'd never do such a thing. Once you click "accept", that's you on the BBQ.
If you are a creative and try to earn a living at it, you must learn how to spot these overreaches and avoid them. If I make a photo and exclusively license it to a company for a big ad campaign and also post a copy somewhere with the above terms, I could get in a load of trouble if that entity sub-licenses that image to somebody else and my customer sees it. The photo could also be one I've made in the past that a company wants to license so it isn't a matter of my keeping it under wraps to prevent its escape. I rarely do exclusive licenses, but I've had media lockouts where a magazine wanted no publication for 90 days so their article would come out before anybody else would see the image. That's very typical. It's more common for me to have a company see something I've made used someplace else and want to license it or they are looking for something specific in my genre and I'll curate a gallery for them. It would be perfectly legal for Instagram to start a stock photo service and whore out images users have posted for £1 each. Not moral, but legal. I expect it would crater the company, but..........
I looked at the mentioned DBs, DataEase and FileMaker.
One of these has a really complicated web page that shouts "unfriendly" in every line and costs at least £500 per user . Not even per business with up to N users. Per user. And yes I know Sid's Garage probably only has one PC and one person using it, but even so....
The other costs a chunk of cash per month which is both very costly and a hostage to fortune. Who's going to rent software from some company they've certainly never even heard of before?
And it's just about impossible to find out what they're like first. If there's a trial version I couldn’t see it, in either.
IOW both seem to be going out of their way to avoid having small business clients.
OTOH Everyone these days has Excel (or the LO equivalent,) to hand, effectively if not actually free. And relatively simple to use.
"OTOH Everyone these days has Excel (or the LO equivalent,) to hand, effectively if not actually free. And relatively simple to use."
But, a spreadsheet is not a DB. Both store information, but both handle it differently. Both are useful for the things they are good at, but not interchangeable. There used to be some inexpensive simple database apps that were still quite good. I've lost touch with them as I'm handed the DB I'll be working with and it's also far beyond what those simple ones are capable of. I don't even do much of that anymore.
I can export data from my accounting app, which is something of a database, into a spreadsheet to do further analysis. People have attempted to use a spreadsheet to do their accounting, but only they know how it works so it can't be handed off with any sort of ease. I could teach somebody how to do data entry into my accounting in less than a day and also lock them out of access that would let them make big mistakes. I don't think I could do that with a spreadsheet. At least not without going through a whole reconfiguration. It's then cost of software vs. what my time's worth to do that.
"a spreadsheet is not a DB. Both store information, but both handle it differently."
Yes, but you tell that to Sid, who's using his Excel spreadsheet to manage his customer list, and his stock list and who knows what else. Because it's free and easy for him.