Playing instruments, musical talent? Psh, this is the 2020s – Meta has models for that now

Meta on Wednesday released AudioCraft, a set of three AI models capable of automatically creating sound from text descriptions. As generative AI models that take written prompts and turn them into images or more text continue to mature, computer scientists are looking into making other forms of media using machine learning. …

  1. Leedos

    Agreed, the samples suck

    They should have the option to export as MIDI so it could be used with better sounding instruments and tweaked to sound less generic.

    On second thought...

    They should get some better sounding instruments and add some swing to the tempo. AI is supposed to make things better, not make me work harder.

    1. Andrew Hodgkinson

      Re: Agreed, the samples suck

      They should have the option to export as MIDI

      It can't. It's an ML system. It has no comprehension at all; it is a stochastic parrot (https://en.wikipedia.org/wiki/Stochastic_parrot). Given their description of the samples used for training, we know that this is trained on raw audio data and just recombines it in pattern-matchy ways.

      It is therefore just another boring, generic, bland ML system.

      AI is supposed to make things better

      Says who? The likes of Meta, OpenAI and so forth produce these things to make money. The fact that ChatGPT officially cannot be accurate according to its makers, yet has been put in front of a search engine that has the one job of producing accurate search results (by Microsoft - a major shareholder in OpenAI) should've made that abundantly clear.

      Generative AI in its current form exists entirely to make vast corporations even richer.

      1. Pascal Monett Silver badge

        Re: Says who?

        Well, not to be gratuitously contradictory, but in general I do expect that, if an effort is made to use ungodly amounts of electricity and cooling in order to produce something, that something had better be worth it.

        Of course, I'm willing to give it the time to train and get up to speed, but if, after all is said and done, the end product is barely good enough for elevator music, then shut the damn thing down now and stop wasting precious resources and time for nothing.

        If I want to hear bad music, I can already turn on the radio. Nobody needs a pseudo-AI to add to that mess.

        1. Anonymous Coward

          Re: Says who?

          " that something had better be worth it"

          It is. To their shareholders. That is all they care about.

          Who don't even care about functionality. At worst, they can get media exposure for this stunt. What do they care if the response on the Register is scorn? Just as long as it gets mentioned: "pioneers are always mocked" they can say to each other.

          Personally, I agree with you.

          But Capitalism doesn't. Unfortunately.

          1. SundogUK Silver badge

            Re: Says who?

            "But Capitalism doesn't. Unfortunately."

            Good thing too. If capitalists offer a product that nobody wants, or doesn't do what they say it does, it will fail and disappear. If socialists do so, well you have to accept it or it's the gulag for you.

            1. Anonymous Coward

              Re: Says who?

              == "But Capitalism doesn't. Unfortunately."

              = Good thing too

              Huh?

              Pascal: don't want

              AC : sadly, Capitalism do want

              Sundog: good for Capitalism

              So: Sundog *want* resources wasted on this? They likes muzak that much, want more?

              Sundog then say socialists send people to gulag for - what? Good health care?

              Sundog confused person, not know difference between socialists and Communist Party, who run gulags?

              1. FIA Silver badge

                Re: Says who?

                Sundog pointed out capitalism exists to serve demand.

                So no demand, no thing.

                AC disingenuous person, conveniently omitted that.

                AC also confused, not know difference between capitalism and corruption.

                (Capitalism is an economic system, corruption is what happens when people get involved, no matter the economic system).

                1. Anonymous Coward

                  Re: Says who?

                  > Sundog pointed out capitalism exists to serve demand.

                  No, Capitalism exists to create unnecessary demand and then whitewash itself by "satisfying" that demand.

                  > So no demand, no thing.

                  No. The Thing is created, and the resources already wasted on doing so. If no long term demand, or demand deemed not sufficiently high, Thing goes away again.

                  That is why so many new businesses are expected to fail, why venture capital exists in the first place, to take bets against new businesses. If Capitalism only reacted to genuine demands then simple bank loan schemes would be all that was required, at reasonable rates.

                  You will now explain why that is hopelessly naive:

                2. that one in the corner Silver badge

                  Re: Says who?

                  Have to admit, I am now intrigued.

                  FIA, this particular "thing" apparently exists already, so can you point us towards the reported customer need for an "AI" that can make a noise like a siren or some rather dreary music?

                  Preferably, a massive pent up demand which has been filling the relevant trade papers for months, if not years, with calls to replace existing foley artists, sound effects samples - and all those annoying session musicians. One that justifies the costs of creating and running a neural net on 20,000 hours of samples (presumably at a 44.1kHz sample rate).

                  And not just "me too" calls for the wonders of AI, this Miracle Of The Modern Age, to be applied to this industry sector just because all the other sectors are getting to play, we want to as well.
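For a sense of scale, here is a back-of-envelope sketch of how much raw audio 20,000 hours amounts to. The 44.1 kHz rate is the commenter's own presumption; mono 16-bit samples are an added assumption for illustration, not anything confirmed about the training data.

```python
# Rough size of 20,000 hours of raw audio.
# Assumptions (not from the article): mono, 16-bit (2-byte) samples.
SAMPLE_RATE_HZ = 44_100
HOURS = 20_000


def total_samples(hours, sample_rate_hz, channels=1):
    """Number of individual samples in `hours` of audio."""
    return hours * 3600 * sample_rate_hz * channels


def size_terabytes(samples, bytes_per_sample=2):
    """Uncompressed size in decimal terabytes."""
    return samples * bytes_per_sample / 1e12


mono = total_samples(HOURS, SAMPLE_RATE_HZ)
print(f"{mono:,} samples, about {size_terabytes(mono):.2f} TB uncompressed")
```

Stereo doubles both figures. What the models actually consume after any resampling or compression is not stated here.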

                  1. jake Silver badge

                    Re: Says who?

                    "a massive pent up demand"

                    This demand has existed in the minds of the people forced to sign checks for musicians, sound effects people, and other obviously useless folks on the payroll. If these people can go by the wayside, the bosses in charge of the studios can purchase a new business jet and/or dacha this year.

                    Doesn't matter if it's a million monkeys with a million microphones, or so-called "AI", the folks in charge are seeing massive dollar signs. And they are right.

                    Sadly, they are not looking long-term. And that is where they are wrong.

                    AI is just a fad as being sold to the studio bosses. It doesn't work, and indeed it can't work ... at least not as being sold to them. It's a sham, a fraud. And the bosses are being led down the garden path by their own greed. Fuck 'em.

          2. Boris the Cockroach Silver badge

            Re: Says who?

            Quote

            "It is. To their shareholders. That is all they care about."

            of course.

            In the bad old days of the '70s bands got 5p per album sale

            Now with AI generated muzak, the record labels will be able to save that 5p and give it to more deserving people (the execs and shareholders)

    2. Michael Strorm Silver badge

      Re: Agreed, the samples suck

      > They should [..] add some swing to the tempo.

      Indeed. As it currently stands, it don't mean a thing.

      1. Anonymous Coward

        Re: Agreed, the samples suck

        Great reference, I like the Stéphane Grappelli version myself!

  2. This post has been deleted by its author

  3. jake Silver badge

    My first thought was ...

    ... that it can't be any worse than autotune.

    And then I thought about it, and realized that it will be. Much, much worse.

    Has anybody ever used the phrase "popular music winter"?

    Not really a "git orf me lawn" moment. At least not yet. If you like music, be afraid. Very afraid.

    1. Korev Silver badge

      Re: My first thought was ...

      > ... that it can't be any worse than autotune.

      God yeah, the sad thing is that as everything is over-autotuned these days then that's what the model will have been trained upon...

      > Has anybody ever used the phrase "popular music winter"?

      No, but they should!

  4. TheMaskedMan Silver badge

    "They sound like repetitive and generic jingles for bad hold music or elevator songs rather than hit singles."

    Which is likely what they will be used for, at least at this stage.

    I wonder if anyone has tried training one of these things on sheet music rather than audio samples? There are centuries of that, mostly out of copyright, that could be used for training.

    1. that one in the corner Silver badge

      > I wonder if anyone has tried training one of these things on sheet music rather than audio samples? There are centuries of that, mostly out of copyright, that could be used for training.

      I believe they have (although don't press me for a citation right now): there is a long history of analysing music properly in maths and computer science (including all shades of AI). Given the stochastic performance of these neural net models you could probably get away with including the Musikalisches Würfelspiel dice-throwing parlour game from the 18th century (using the very best AI tech they had to hand).

      Although music is a "serious business", especially if you use old pieces, classical, baroque et al and the "pile it in with a shovel" approach has been fine for a student project (like the singing dog, we are not so much impressed by the quality but...) but serious people take a serious approach and like to hand-craft their rules or, at least, have a result that can explain to people how the piece works (which, of course, neural nets generally can not). That wasn't meant to be (too) snide, by the way: I thoroughly approve of musical analysis, without the shovel.
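The dice game mentioned above is simple enough to sketch: for each bar you roll two dice and look the total up in a table of pre-written measures. This is only an illustration, assuming a two-dice, 16-bar layout; `measure_table` is a hypothetical stand-in that returns labels rather than real music.

```python
import random

BARS = 16  # assumed length of the piece, in bars


def measure_table(bars=BARS):
    """Hypothetical lookup table: one pre-written measure per (dice total, bar).

    Real versions of the game hold composed measures; these are just labels.
    """
    return {
        total: [f"bar{bar + 1}-roll{total}" for bar in range(bars)]
        for total in range(2, 13)  # two dice give totals 2..12
    }


def roll_piece(table, bars=BARS, rng=None):
    """Assemble a piece by rolling two dice for every bar."""
    rng = rng or random.Random()
    piece = []
    for bar in range(bars):
        total = rng.randint(1, 6) + rng.randint(1, 6)
        piece.append(table[total][bar])
    return piece


print(roll_piece(measure_table()))
```

All of the musical sense lives in the hand-written table; the dice only choose between options a composer already vetted, which is rather the opposite of the shovel approach.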

      1. hoola Silver badge

        I would bet a large cake that the cost of all the people and tech trying to create this is far higher than the cost of having real performers actually make music.

        I exclude the over-hyped mega names for obvious reasons.

        What the ordinary musician gets paid is generally an insult to the training, experience and work needed to produce the result.

        I say this as someone who trained professionally but ended up working in IT so that I could pay the bills and enjoy making music.

    2. Michael Strorm Silver badge

      "There are centuries of that, mostly out of copyright, that could be used for training."

      Someone took up your suggestion, and have announced what they hope will be the first AI-generated #1 chart hit, "Summer is icumen in again".

  5. that one in the corner Silver badge

    > That said, the model weights are not open source. They are shared under a Creative Commons license that specifically forbids commercial use. As we saw with Llama 2, whenever Meta talks about open sourcing stuff, check the fine print.

    (Haven't checked reality - the repo - yet, but just going from the above sentence and taking it as gospel)

    > check the fine print

    You've just said that all the software tools are under the MIT licence. Complaining that the weights are not similarly licenced is like praising an oddball programming language's compiler for being open source then damning it because they haven't applied the same licence to all of the programs written in that language!

    Although that is not a strict analogy, of course, as the weights are less program and more of a data dump. Which at least makes the use of a Creative Commons licence a sensible usage (CC licences are not a good match for program source code but are fine for data dumps, whether that be text, images, any old pile of nadans, your database of top-ten hits since 1972, ...).

    There is a difference between being supportive of open source and trying to stoke the flame wars (and I'm aware of the irony that I'm probably doing exactly that).

  6. Missing Semicolon Silver badge

    It doesn't know about Thelonious Monk

    "in the style of thelonious monk with piano" just produces a random-walk chunk of piano noodling that sounds like Mozart on a bad day.

    A similar query for "Miles Davis" sounds more like a bunch of samples from very old recordings, spliced together.

    More experimentation would be needed to check it does not perform ChatGPT-levels of plagiarism.

    1. that one in the corner Silver badge

      Re: It doesn't know about Thelonious Monk

      > More experimentation would be needed to check it does not perform ChatGPT-levels of plagiarism.

      Ah, there is the clever step in this particular approach (not that I approve of what I'm about to describe; let's swap "crafty" for "clever").

      By training the model on audio samples, they can (and reportedly have) restricted themselves to material they own the copyright to. When they recorded the music, for which they (may) have had to pay performance rights for modern pieces, they get reproduction rights over that recording.

      If the model starts spitting out untransformed audio waveforms then those are simply reproductions of their own recordings. There is no performance going on, just pseudo-random reproduction of materials that they own the rights to reproduce. No plagiarism is possible!

      Now, if you want to argue against that interpretation, then you are going to weaken the argument against language-oriented models (i.e. LLMs and the ongoing copyright suits).

      So, crafty: they'll have us either coming or going. The sods.

    2. Anonymous Coward

      Re: It doesn't know about Thelonious Monk

      That's just downright wrong. Monk is a legend!

  7. abend0c4 Silver badge

    A few minutes sampled at 44.1 kHz

    Have they tried feeding it MusicXML? It's rather more information-dense.

    1. Dinanziame Silver badge

      Re: A few minutes sampled at 44.1 kHz

      Yeah, I find it curious that the 44.1 kHz sampling is relevant — music is made of notes. It's like saying that it's harder to generate text when the screen resolution is higher.

      1. lowwall

        Re: A few minutes sampled at 44.1 kHz

        It's just notes? So all musical instruments sound identical except for range?

        1. abend0c4 Silver badge

          Re: A few minutes sampled at 44.1 kHz

          The thing is, the work on text prediction wasn't done by processing a bunch of audiobooks. Ingesting the musical equivalent of text would presumably be a lot easier than processing audio files, though it would exclude a whole sector of music that isn't based on notation. There is also context in the notation to identify different instruments so the opportunity to "learn" is there.

          I'm sure working directly from digitized audio is an interesting problem and it may have results that are transferable to speech, but it just seems like there may be a more productive place to start.
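To put rough numbers on the density argument in this sub-thread, the sketch below compares how many values a model has to handle for a few minutes of raw audio against a note-level representation. The figure of four notes per second is a made-up assumption for illustration, and a note event of course carries no timbre, which is lowwall's point.

```python
def audio_samples(minutes, sample_rate_hz=44_100, channels=1):
    """Samples in `minutes` of raw audio."""
    return int(minutes * 60 * sample_rate_hz * channels)


def note_events(minutes, notes_per_second=4):
    """Note events in `minutes` of music at an assumed (hypothetical) note rate."""
    return int(minutes * 60 * notes_per_second)


minutes = 3
samples = audio_samples(minutes)
notes = note_events(minutes)
print(f"{samples:,} samples vs {notes:,} note events "
      f"(about {samples / notes:,.0f}x more values)")
```

The ratio moves with the assumed note rate, but the gap stays at several orders of magnitude either way.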

  8. 897241021271418289475167044396734464892349863592355648549963125148587659264921474689457046465304467

    Tried requesting Chopin in the style of electro pop, got a different very low fidelity result each time. I think this contraption needs a lot more work, but eventually musicians will become obsolete.

    1. GrumpenKraut

      Your handle equals 3 * 79 * 34741859 * 108970193390406341511991996777364618571790157695783210212070651472997691291195028786487349. You are welcome.

    2. Handel was a crank

      Musicians become obsolete? I doubt it, unless people suddenly prefer to see ChatGPT live instead of flesh and blood.

      1. 897241021271418289475167044396734464892349863592355648549963125148587659264921474689457046465304467

        Code is cheaper than people.

    3. jake Silver badge

      "I think this contraption needs a lot more work, but eventually musicians will become obsolete."

      Not until the contraption gains a self/id/soul/ego. Which is never going to happen.

  9. Anonymous Coward

    I like music with instruments and people.

    Anything else, just doesn't cut it for me. Sampling and looping has a place, but let's keep it real. AI doesn't actually know what the concept of a "note" is. It's ultimately just a random mashing of numbers.

    1. 897241021271418289475167044396734464892349863592355648549963125148587659264921474689457046465304467

      People, instruments and music are just bunches of numbers!

      1. Ken Hagan Gold badge

        Well you're just a bunch of numbers at any rate.

  10. Plest Silver badge

    Creativity is not always about the output

    I took up photography and playing guitars about 15 years ago as I was 25st and needed an excuse to get outside, and I also needed some hobbies that stopped me getting anxious and stressed about work. I still do them simply for the enjoyment of taking my mind off my stress. As it happens, I now make money from my photos, since with all the practice I got quite good at it, but the core reason I do it is still the enjoyment of the activity and my mental health; the pocket money I make is just a nice byproduct.

  11. Big_Boomer

    AI Sh!t

    I'm all for anything that makes life better, happier, safer, more productive, and even more profitable, but all I have seen so far from AI is sh!t that we already have a vast amount of. Stop wasting everyone's time producing sh!t music and sh!t literature and concentrate on ADDING VALUE. We have enough sh!t and don't need/want any more. If what you produce makes it easier to find cancer tumours, or improves a production process or makes a self-driving car safer then kudos to you, but we don't seem to hear much about those.

    1. Version 1.0 Silver badge

      Re: AI Sh!t

      So basically, "When St Patrick drove the snakes out of Ireland, they all started to work for Meta and became AI" - Brendan Behan's original quote updated. Whenever I see happy talks about AI, I fear what could happen in the future if AI takes over our entire working world. Will we be told that it's tremendously helpful because AI has made us all slaves with absolutely no skin colour factor because the entire world has been enslaved by AI?

    2. that one in the corner Silver badge

      Re: AI Sh!t

      > . If what you produce makes it easier to find cancer tumours, or improves a production process or makes a self-driving car safer then kudos to you, but we don't seem to hear much about those.

      Well, no, we wouldn't hear much about those (except perhaps the safer self-driving, if that ever comes to fruition).

      Aside from the old "if it works, it isn't AI anymore" aspect, which accounts for all the useful spinoffs quietly working away as "just normal stuff we use to get the job done", none of the rest are things that you would hear being shouted about in the general media (or even on a tech site, to be honest).

      Why?

      Mainly, because they work and don't need to be throwing tantrums in order to get some attention or any money from a sugar daddy flashing the cash and hoping to impress his pals with the stunning platinum blonde[1] AI hanging on his arm.

      The production process was improved by 9% and costs will be recouped in 18 months? There was a notice in the relevant glossy to announce a price freeze to customers (whilst competitors' prices went up) to use the savings by improving retention: a staid business, nobody trusts flash adverts about "revolutionary new methods".

      Helping find tumours? Well, there was a report in July just gone: "New AI tool can help treat brain tumors more quickly and accurately, study finds" but that is too boring for any more shouting: medical stuff like this needs years of testing and approvals, and in the end it will "only" improve outcomes, it won't raise Lazarus. No immediate monetary gains? No immediate screaming from the rooftops by marketing.

      [1] pssst: platinum blonde my backside! Rumour has it that model was trained using software intended for a different field. They reformatted the data to trick the program into reading it: the AI equivalent of dyeing your hair.

  12. DrollLeek

    I'm just gonna keep playing the guitar.

    If I don't get the sounds out of my head I go a bit funny! ZFG whether anyone else likes it, or even hears it.

    Also, I'll keep listening to "Human Music" because they keep coming up with chord sequences that put me in a dream state.

  13. bobdehn

    My response comes in the form of Postmodern Jukebox's video for the cover of Africa by Toto.

  14. Will Godfrey Silver badge

    Just... No!

    About 30 years ago in a pub, a girl in her late teens sang 'Streets of London' accompanying herself with a guitar. There was utter silence until she finished after which the pub near exploded with applause, and I don't think there was a dry eye in the place. I'll never forget it.

    I don't envisage AI ever getting anywhere near such an emotional performance.

    1. that one in the corner Silver badge

      Re: Just... No!

      > I don't envisage AI ever getting anywhere near such an emotional performance.

      Not from the drinkers, but you'll find your phone batteries are completely drained by the performance.

      Oh, those poor, sad, zeroes, left alone on the streets while we stay safely backed up in The Cloud.
