• LanguageIsCool@lemmy.world · 1 month ago

    I’ve heard that a Claude 4 model generating code for an infinite amount of time will eventually simulate a monkey typing out Shakespeare

    • MonkeMischief@lemmy.today · 1 month ago

      It will have consumed the GigaWattHours capacity of a few suns and all the moisture in our solar system, but by Jeeves, we’ll get there!

      …but it won’t be that impressive once we remember concepts like “monkey, typing, Shakespeare” were already embedded in the training data.

    • Match!!@pawb.social · 1 month ago

      LLMs are systems that output human-readable natural language answers, not true answers

  • coherent_domain@infosec.pub · 1 month ago

    The image is taken from Zhihu, a Chinese Quora-like site.

    The prompt asks for the design of a certain app, and the response seems to suggest some pages, so it doesn’t seem to reflect the text.

    But this in general aligns with my experience coding with LLMs. I was trying to upgrade my ESLint from 8 to 9, asked ChatGPT to convert my ESLint config file, and it proceeded to spit out complete garbage.

    I thought this would be a good task for an LLM because ESLint config is very common and well documented, and the transformation is very mechanical, but it just cannot do it. So I proceeded to read the docs and finished the migration in a couple of hours…
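
    For reference, most of that migration is mapping the old `.eslintrc.*` keys onto the flat `eslint.config.js` format that ESLint 9 defaults to. A minimal sketch of the kind of mapping involved (the rules and globals below are placeholder examples, not my actual config):

    ```js
    // eslint.config.js (ESLint 9 "flat config")
    // Rough equivalent of an old .eslintrc.json like:
    //   { "extends": "eslint:recommended",
    //     "env": { "node": true },
    //     "rules": { "no-unused-vars": "warn", "semi": ["error", "always"] } }
    import js from "@eslint/js";   // replaces "extends": "eslint:recommended"
    import globals from "globals"; // replaces the "env" shorthand

    export default [
      js.configs.recommended,
      {
        files: ["**/*.js"],
        languageOptions: {
          sourceType: "module",
          globals: { ...globals.node }, // was "env": { "node": true }
        },
        rules: {
          // placeholder rules, just to show where the old "rules" block lands
          "no-unused-vars": "warn",
          semi: ["error", "always"],
        },
      },
    ];
    ```

    Mechanical enough that you’d expect an LLM to handle it, which was the whole point.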

    • Lucy :3@feddit.org · 1 month ago

      I asked ChatGPT for help with bare-metal 32-bit ARM (for the Pi Zero W) C/ASM, emulated in QEMU for testing, and after the third iteration of “use printf for output” -> “there’s no printf with bare metal as target” -> “use solution X” -> “doesn’t work” -> “use printf for output” … I had enough.

    • MudMan@fedia.io · 1 month ago

      It’s pretty random in terms of what is or isn’t doable.

      For me it’s a big performance booster because I genuinely suck at coding and don’t do too much complex stuff. As a “clean up my syntax” and a “what am I missing here” tool it helps, or at least helps in figuring out what I’m doing wrong so I can look in the right place for the correct answer on something that seemed inscrutable at a glance. I certainly can do some things with a local LLM I couldn’t do without one (or at least without getting berated by some online dick who doesn’t think he has time to give you an answer but sure has time to set you on a path towards self-discovery).

      How much of a benefit it is for a professional I couldn’t tell. I mean, definitely not a replacement. Maybe helping read something old or poorly commented fast? Redundant tasks on very commonplace mainstream languages and tasks?

      I don’t think it’s useless, but if you ask it to do something by itself you can’t trust that it’ll work without significant additional effort.

      • wise_pancake@lemmy.ca · 1 month ago

        It catches things like spelling errors in variable names, does good autocomplete, and it’s useful to have it look through a file before committing it and creating a pull request.

        It’s very useful for throwaway work like writing scripts and automations.

        It’s useful, but not a 10x multiplier like all the CEOs claim it is.

        • MudMan@fedia.io · 1 month ago

          Fully agreed. Everybody is betting it’ll get there eventually and trying to jockey for position being ahead of the pack, but at the moment there isn’t any guarantee that it’ll get to where the corpos are assuming it already is.

          Which is not the same as not having better autocomplete/spellcheck/“hey, how do I format this specific thing” tools.

          • jcg@halubilo.social · 29 days ago

            I think the main barriers are context length (useful context, that is; GPT-4o has “128k context”, but it’s mostly sensitive to the beginning and end of the context and blurry in the middle, which is consistent with other LLMs) and the data simply not existing. How many large-scale, well-written, well-maintained projects are really out there? Orders of magnitude fewer than there are examples of “how to split a string in bash” or “how to set up validation in Spring Boot”. We might “get there”, but it’ll take a whole lot of well-written projects first, written by real humans, maybe with the help of AI here and there. Unless, that is, we build it with the ability to somehow learn and understand faster than humans.

      • vivendi@programming.dev · 1 month ago

        It’s not much use with a professional codebase as of now, and I say this as a big proponent of learning FOSS AI to stay ahead of the corpocunts

        • MudMan@fedia.io · 1 month ago

          Yeah, the AI corpos are putting a lot of effort into parsing big contexts right now. I suspect because they think (probably correctly) that coding is one of the few areas where they could get paid if their AIs didn’t have the memory of a goldfish.

          And absolutely agreed that making sure the FOSS alternatives keep pace is going to be important. I’m less concerned with hating the entire concept than with making sure they don’t figure out a way to keep every marginally useful application locked exclusively behind a corporate walled-garden ecosystem.

          We’ve been relatively lucky in that the combination of PR brownie points and the general crappiness of the commercial products has kept an incentive to provide a degree of access, but I have no doubt that the moment one of these things actually makes money they’ll enshittify the freely available alternatives they control and clamp down as much as possible.