• TheFogan@programming.dev · 18 points · 2 days ago

    I mean, it’s kind of obvious: they’re giving their LLMs simulators, access to tests, etc. ChatGPT, for example, can run code in a Python environment and detect errors. But obviously the model can’t know what the intention is, so it’s inevitably going to stop when it gets its first “working” result.

    Of course, I’m sure further issues will come from incestuous code, i.e. AIs training on all publicly listed GitHub code.

    Vibe coders are starting lots of “projects” that they upload to GitHub, so new AIs can now pick up all the mistakes of their predecessors on top of making their own new ones.

  • Kissaki@programming.dev · 11 points · 2 days ago

    “A task that might have taken five hours assisted by AI, and perhaps ten hours without it, is now more commonly taking seven or eight hours, or even longer.”

    What kind of work do they do?

    “…in my role as CEO of Carrington Labs, a provider of predictive-analytics risk models for lenders. My team has a sandbox where we create, deploy, and run AI-generated code without a human in the loop. We use them to extract useful features for model construction, a natural-selection approach to feature development.”

    I wonder what I’m supposed to imagine this doing, and how. How do they interface with the loop that has no human in it?

    Either way, they do seem to have a (small, narrow) systematic test case, and enough variance in the generated output for it to be useful at least anecdotally, as a sample case.
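
    If I had to guess at the setup, it’s a generate-score-select loop roughly like the sketch below. To be clear, this is pure speculation on my part: the article shows no code, and every name and number here is made up.

        import random

        def propose_candidates(n: int) -> list[str]:
            """Ask the LLM for n code snippets that each compute one candidate
            feature column. (Hypothetical stub; really this would call the model.)"""
            return [
                f"df['feature_{i}'] = df['balance'] / (df['income'] + 1)"
                for i in range(n)
            ]

        def score_candidate(snippet: str) -> float:
            """Run the snippet in the sandbox and return, say, the validation AUC of
            a risk model trained with that feature. (Hypothetical stub: random score.)"""
            return random.random()

        def evolve(rounds: int = 5, population: int = 20, keep: int = 5) -> list[str]:
            # "Natural selection": each round, top the population up with fresh
            # LLM-generated candidates, score everything, keep only the best.
            survivors: list[str] = []
            for _ in range(rounds):
                candidates = survivors + propose_candidates(population - len(survivors))
                scored = sorted(candidates, key=score_candidate, reverse=True)
                survivors = scored[:keep]
            return survivors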

    • lad@programming.dev · 2 points · 13 hours ago

      I have a feeling that their test case is also a bit flawed. Trying to get index_value instead of index value is something I can imagine happening, and asking an LLM to ‘fix this but give no explanation’ is asking for a bad solution.

      I think they’re still correct in the assumption that the output is getting worse, though.

      • VoterFrog@lemmy.world · 3 points · 10 hours ago

        It just emphasizes the importance of tests to me. The example should fail very obviously when you give it even the most basic test data.
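
        Something like this would catch it. (A sketch only; the article doesn’t show the actual code, so the function, column name, and values here are all hypothetical.)

            import pandas as pd

            def extract_feature(df: pd.DataFrame) -> pd.Series:
                # Hypothetical stand-in for the AI-generated helper: it was supposed
                # to use each row's index value, but grabs a column literally named
                # "index_value" instead.
                return df["index_value"]

            def test_extract_feature_uses_row_index():
                df = pd.DataFrame({"amount": [10, 20, 30]}, index=[100, 200, 300])
                # Fails loudly on the most basic fixture: there is no "index_value"
                # column, so this raises a KeyError long before subtly wrong numbers
                # could sneak into a model.
                result = extract_feature(df)
                assert list(result) == [100, 200, 300]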

          • VoterFrog@lemmy.world · 2 points · 6 hours ago

            This isn’t even a QA level thing. If you write any tests at all, which is basic software engineering practice, even if you had AI write the tests for you, the error should be very, very obvious. I mean I guess we could go down the road of “well what if the engineer doesn’t read the tests?” but at that point the article is less about insidious AI and just about bad engineers. So then just blame bad engineers.

            • lad@programming.dev · 1 point · 5 hours ago

              Yeah, I understand that this case doesn’t require QA, but in the wild, companies increasingly seem to think that developers are (still) necessary while QA surely isn’t.

              It’s not even bad engineers; it’s just productivity being squeezed as dry as possible, as I see it.

    • voodooattack@lemmy.world · 2 points · 11 hours ago

      I’d really love to read that, but Medium is just… not my thing. I hate that site so much.

      Have you considered writing on dev.to? I won’t promote it, extol any virtues, or try to convince you to go there. Just asking if you’re aware of it and others like it!