Newer AI Coding Assistants Are Failing in Insidious Ways

brianpeiris@lemmy.ca · 2 days ago

Newer AI Coding Assistants Are Failing in Insidious Ways

xep@discuss.online · 12 hours ago

As opposed to the older ones working perfectly well?

/s

themusicman@lemmy.world · 11 hours ago

As opposed to older ones failing in obvious and fixable ways

middlemanSI@lemmy.world · 2 days ago

Who would have thought…Oh wait, any sane person would. Don’t move, you might burst it.

TheFogan@programming.dev · 2 days ago

I mean it’s kind of obvious… they are giving their LLMs simulators. access to test etc…, IE chat gpt can run code in a python environment and detect errors. but obviously it can’t know what the intention is, so it’s inevitably going to stop when it gets it’s first “working” result.

of course I’m sure further issues will come from incestuous code… IE AIs train on all publicly listed github code.

Vibe coders begin working on a lot of “projects” that they upload to github. now new AI can pick up all the mistakes of it’s predicesors on top of making it’s new ones.

Kissaki@programming.dev · 2 days ago

A task that might have taken five hours assisted by AI, and perhaps ten hours without it, is now more commonly taking seven or eight hours, or even longer.

What kind of work do they do?

in my role as CEO of Carrington Labs, a provider of predictive-analytics risk models for lenders. My team has a sandbox where we create, deploy, and run AI-generated code without a human in the loop. We use them to extract useful features for model construction, a natural-selection approach to feature development.

I wonder what I have to imagine this is doing and how. How do they interface with the loop-without-a-human?

Either way, they do seem to have a (small, narrow) systematic test case and the product variance to be useful at least anecdotally/for a sample case.

lad@programming.dev · 13 hours ago

I have a feeling that their test case is also a bit flawed. Trying to get index_value instead of index value is something I can imagine happening, and asking an LLM to ‘fix this but give no explanation’ is asking for a bad solution.

I think they are still correct in the assumption that output becomes worse, though

VoterFrog@lemmy.world · 10 hours ago

It just emphasizes the importance of tests to me. The example should fail very obviously when you give it even the most basic test data.

lad@programming.dev · 8 hours ago

Yeah, if only QA vere not the first ‘replaced’ by AI 😠

VoterFrog@lemmy.world · 6 hours ago

This isn’t even a QA level thing. If you write any tests at all, which is basic software engineering practice, even if you had AI write the tests for you, the error should be very, very obvious. I mean I guess we could go down the road of “well what if the engineer doesn’t read the tests?” but at that point the article is less about insidious AI and just about bad engineers. So then just blame bad engineers.

lad@programming.dev · 5 hours ago

Yeah, I understand that this case doesn’t require a QA, but in the wild companies seem to increasingly think that developers are necessary (yet), but QA are surely not

It’s not even bad engineers, it’s just squeezing of productivity as dry as possible, as I see it

Horrabin@programming.dev · 2 days ago

Sorry for the self-reference but… https://medium.com/@nandofm/ai-the-danger-of-endogamic-programming-8114b90dffb2

voodooattack@lemmy.world · 11 hours ago

I’d really love to read that, but medium is just… not my thing. I hate that site so much.

Have you considered writing on dev.to? I won’t promote it, extol any virtues, or try to convince you to go there. Just asking if you’re aware of it and others like it!