[Buildroot] [git commit] support/testing: test_aichat: improve test reliability
Julien Olivain
ju.o at free.fr
Thu Mar 19 20:54:12 UTC 2026
commit: https://gitlab.com/buildroot.org/buildroot/-/commit/f163d200022e72aa8d9f9d37332cc7e2663503de
branch: https://gitlab.com/buildroot.org/buildroot/-/tree/master
Since the llama.cpp update in Buildroot commit [1], test_aichat can
fail for several reasons:

The loop that waits for llama-server to become available can fail if
curl succeeds but the returned JSON data is not formatted as expected.
This can happen when the server is up but the model is not yet
completely loaded. In that case, the server returns:
{"error":{"message":"Loading model","type":"unavailable_error","code":503}}
This commit ignores Python KeyError exceptions during the
server check, to avoid failing when this message is received.
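
As an illustration only, the readiness check described above can be
sketched as a standalone helper. The name model_is_ready is
hypothetical and is not part of the test. The expected reply shape is
taken from the lookup models['models'][0]['name'] in the patch. Unlike
the patch, which ignores only KeyError, this sketch also assumes that
an empty model list or a non-JSON reply should count as "not ready":

```python
import json


def model_is_ready(response_text, expected_name):
    """Return True only if the reply lists expected_name as first model.

    Hypothetical helper: any reply that does not have the expected
    shape (for example the "Loading model" error object) is treated
    as "not ready yet" instead of raising.
    """
    try:
        reply = json.loads(response_text)
        return reply["models"][0]["name"] == expected_name
    except (KeyError, IndexError, TypeError, ValueError):
        # KeyError: no "models" or "name" key (e.g. the error reply).
        # IndexError: "models" is an empty list.
        # TypeError: the reply is not shaped as dict -> list -> dict.
        # ValueError: the text is not valid JSON (JSONDecodeError).
        return False
```

A polling loop would call such a helper on each curl result and keep
retrying while it returns False.
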
Also, this new llama-server version introduced prompt caching, which
uses too much memory. This commit disables prompt caching completely
by adding "--cache-ram 0" to the llama-server options.

[1] https://gitlab.com/buildroot.org/buildroot/-/commit/05c36d5d875713521f99b7bad48be316dcde2510
Signed-off-by: Julien Olivain <ju.o at free.fr>
---
support/testing/tests/package/test_aichat.py | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/support/testing/tests/package/test_aichat.py b/support/testing/tests/package/test_aichat.py
index 5fc554bbb5..50c07f87dc 100644
--- a/support/testing/tests/package/test_aichat.py
+++ b/support/testing/tests/package/test_aichat.py
@@ -70,6 +70,8 @@ class TestAiChat(infra.basetest.BRTest):
         llama_opts = "--log-file /tmp/llama-server.log"
         # We set a fixed seed, to reduce variability of the test
         llama_opts += " --seed 123456789"
+        # We disable prompt caching to reduce RAM usage
+        llama_opts += " --cache-ram 0"
         llama_opts += f" --hf-repo {hf_model}"
         # We start a llama-server in background, which will expose an
@@ -91,9 +93,12 @@ class TestAiChat(infra.basetest.BRTest):
             if ret == 0:
                 models_json = "".join(out)
                 models = json.loads(models_json)
-                model_name = models['models'][0]['name']
-                if model_name == hf_model:
-                    break
+                try:
+                    model_name = models['models'][0]['name']
+                    if model_name == hf_model:
+                        break
+                except KeyError:
+                    pass
         else:
             self.fail("Timeout while waiting for llama-server.")