LLM observations

I have a lot of thoughts on “AI”, but here are a couple quick observations from giving it a fair shake as a tool in the belt of a software developer:

  1. For well-documented tech (and think machine-friendly well documented, not necessarily non-technically-minded-human-approachable stuff… looking at you, TypeScript), GPT can really shine. A few examples I have personally had good results with (there's a sketch of the first one after this list):
    • help building a regex (look, could I go relearn regex syntax for the 60th time? yes… but maybe I never will again)
    • Apache ECharts configuration
    • TypeScript assistance
    • MUI theme settings / customization
  2. For tech that is a little more… I dunno, “move fast and break things”? It can really lead you astray by recommending things that are out of date or just wrong. I was looking for a way to have Next.js serve a local dev environment over HTTPS, and went on a wild goose chase of installing proxy servers and the like. After a while I thought “this feels ridiculous…”, fell back to plain search, and immediately found a new, but not all that new, setting in the Next.js docs that got me exactly what I was going for with a single startup parameter (my best guess at which one is noted below).
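
To make that first bullet concrete, here's the flavor of regex help I mean. This is a minimal, hypothetical sketch in TypeScript; the pattern and names are my own illustration, not output from any particular chat:

```typescript
// Hypothetical example: pull the pieces out of a semver-ish version string.
const semverish = /^(\d+)\.(\d+)\.(\d+)$/;

const match = "5.0.2".match(semverish);
if (match) {
  // Skip the full match at index 0; keep the three captured groups.
  const [, major, minor, patch] = match;
  console.log({ major, minor, patch }); // { major: "5", minor: "0", patch: "3-part version" -> { major: "5", minor: "0", patch: "2" }
}
```

The value isn't that the pattern is hard; it's that the LLM gets me from “I vaguely remember capture groups” to working code without a full syntax refresher.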
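
As for the Next.js item: from memory, I believe the startup parameter I eventually found was the `--experimental-https` flag on `next dev`, which serves the local dev server over HTTPS with an auto-generated certificate - but that's hedged, so check the current Next.js docs rather than trusting me (or an LLM) on it.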

Maybe these two observations boil down to one: you can trust LLMs a lot more for information about projects with robust, stable documentation. Specifically, if the tool/library has had a major revision, the model seems to just guess at the (usually older?) version. I just tried this with Svelte. All the suggestions were based on v3, whereas v5 is new but also very well documented, and has some fundamental differences (sketched below). It didn't tell me its suggestions assumed v3 as context, and, presumably because the v5 material has a lot less training data behind it, even when I gave it that context its suggestions were not as strong and it mixed in a few v3-type things that weren't correct.
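
For a minimal sketch of the kind of fundamental difference I mean (my own illustration, not code from that chat): in Svelte 3/4, reactivity comes from plain assignments and `$:` labels, while Svelte 5's runes make state and derivations explicit, and even the event-handler syntax changed.

```svelte
<!-- Svelte 3/4 style: implicit reactivity via assignment and the $: label -->
<script lang="ts">
  let count = 0;
  $: doubled = count * 2;
</script>

<button on:click={() => (count += 1)}>{count} doubled is {doubled}</button>
```

```svelte
<!-- Svelte 5 style: explicit runes ($state, $derived) and plain onclick -->
<script lang="ts">
  let count = $state(0);
  const doubled = $derived(count * 2);
</script>

<button onclick={() => (count += 1)}>{count} doubled is {doubled}</button>
```

A model leaning on v3-era training data will happily hand you the first version and present it as current.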

And of course, yes, providing more context, either as explicit context input or via the prompt, helps, for sure. But 1) you have to know and remember to do that whenever you're asking about something that might have that sort of variability or instability, and 2) (guessing here) because LLMs don't actually have any sort of solid data model of the world, just a statistical model built around language use, things like software version boundaries are much more permeable than you might expect.

Leaving aside all questions of ethics, which include questionably sourced training data and all the hidden energy costs, LLMs can be a useful tool. The caveat is that understanding their strengths and weaknesses is crucial to using them effectively - same as any tool, I suppose.

If you want more similarly ambivalent commentary, this was a good post.