I am looking for better ways to work with coding agents in a group learning setting. Today I was experimenting a bit with shortening the time to the next interaction. Glazing over a coding agent’s output, or task switching while it is doing its thing, is annoying enough when working alone, but it is deadly in a classroom setting.
A few things that I iterated further on today:
- Use the fastest programming language for iteration (the programming language is a means for the experiment; it is good fun and a bit different, and it sometimes helps if no one, including me, is that familiar with the target)
- Prompt for small steps, and see if the agent stops for interaction
- Observe the number of turns and the time from question to response
- Block the agent when it is taking too many turns
Blocking the agent after a set number of turns happens automatically. I made an extension for Pi, because I need this for regular work as well: sometimes models overthink, or get stuck calling tools in a loop, and counting the number of turns is a fairly simple way to address that. This exercise was a surprisingly quick way to come up with some improvements for the extension.
Limiting the number of turns is also a way to reduce the time to the next interaction: the time between us giving an instruction and the agent stopping, so we can steer it, calmly.
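The counting logic itself is tiny. Here it is sketched in Chicken Scheme, to match the rest of this post; the real extension is written against Pi’s extension API, which I am not reproducing here, and the names below are illustrative:

```scheme
(import (chicken base))

;; The idea, not the actual extension code: a counter with a cap,
;; reset on every new user instruction.
(define (make-turn-guard max-turns)
  (let ((turns 0))
    (lambda ()
      (set! turns (+ turns 1))
      (when (> turns max-turns)
        (error "turn limit exceeded" turns)))))

;; One guard per instruction; the agent loop calls it every turn.
(define next-turn! (make-turn-guard 25)) ; 25 is the current default
```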
What happened
I wasn’t planning to blog about it, hence only a short note on the first point. I found in an earlier experiment that Chicken Scheme lets an agent complete simple TDD and refactoring exercises almost twice as fast as some other languages and runtimes.
Prompting for small steps
I asked the Pi coding agent to work with me on a kata. It proposed some katas, and I chose one. I asked it to work in small steps, and to discuss with me. This worked fairly well. It asked me some questions; I answered, gave feedback, and explained my preferences. For a classroom exercise I might prepare the first round to reflect my taste, but critiquing the agent’s output might also have value.
And then… a surprise…
After two rounds, I realised we had not actually run the tests. So I asked the agent to run them. What happened next was a bit of a surprise: it took quite a bit longer than a few turns. Since I was doing it for my entertainment, I let it run for two minutes, in which Pi with a local model made its own mini testing framework, because the sandbox does not allow package installation!
The mini testing framework had some errors; the model eventually matched the parentheses (it is a Lisp after all), fixed something else, and came out with two passing tests.
In the nick of time, too: after 25 turns (the default) my extension blocked the model from running further.
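For scale: a hand-rolled test helper in Chicken Scheme fits in a dozen lines. Something in this spirit, where the helper names and the function under test are my own stand-ins, not what the model actually wrote:

```scheme
(import (chicken base))

;; A minimal hand-rolled test helper, for illustration only.
(define passed 0)
(define failed 0)

(define (check name expected actual)
  (cond ((equal? expected actual)
         (set! passed (+ passed 1))
         (print "PASS: " name))
        (else
         (set! failed (+ failed 1))
         (print "FAIL: " name " expected " expected " got " actual))))

;; A stand-in for whatever the kata function was.
(define (leap-year? y)
  (and (zero? (modulo y 4))
       (or (not (zero? (modulo y 100)))
           (zero? (modulo y 400)))))

(check "2000 is a leap year" #t (leap-year? 2000))
(check "1900 is not" #f (leap-year? 1900))
(print passed " passed, " failed " failed")
```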
What next?
I made an extension for Pi that limits the number of turns the agent can run. That works fairly well, but the default (25 turns) is set too high for an interactive session. I would like a widget that allows me to set the maximum number of turns in the chat, and that asks me whether to continue or not when the number of turns is about to be exceeded. Sometimes it is interesting to see the model continue.
On the other hand, aborting after the maximum number of turns clearly shows up as ‘Abort’ in the session transcript, so it might be handy for review.
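Sticking with the Scheme sketches, the ‘ask before exceeding’ behaviour I am after would look roughly like this (illustration only, not the extension’s actual code):

```scheme
(import (chicken base))

;; The "ask before exceeding" variant: pause at the limit and let
;; me decide, instead of aborting outright.
(define (make-interactive-guard max-turns)
  (let ((turns 0))
    (lambda ()
      (set! turns (+ turns 1))
      (when (> turns max-turns)
        (display "Turn limit reached; continue? (y/n) ")
        (if (eq? (read) 'y)
            (set! turns 0)  ; grant another round of max-turns
            (error "aborted at turn" turns))))))
```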
I will also let the tests fail first next time :-). The model correctly-ish listed the steps of TDD (Red - Green - Refactor), but there is a difference between knowing and doing: we went from nothing, to a test framework, to two passing tests, without ever seeing a test fail.
The first few rounds ran fast enough, so a lower limit on the number of turns should work better.
I might keep the discussion of Chicken Scheme at the beginning. You can see the full session transcript. What do you make of it?