[Buildroot] Some analysis of the major build failure reasons
Thomas Petazzoni
thomas.petazzoni at bootlin.com
Mon Aug 2 21:46:47 UTC 2021
Hello,
Giulio, Bernd, Adam, there are questions for you below. Thanks for your
support!
On Mon, 02 Aug 2021 06:09:39 -0000
Thomas Petazzoni <thomas.petazzoni at bootlin.com> wrote:
> master | 1903 | 802 | 53 | 2758 |
So it's getting a bit better, but we still have lots of build failures.
> Classification of failures by reason for master
> -----------------------------------------------
>
> host-python3 | 50
So this is a timeout, which occurs only on James's machines (which have
very high parallelism), during the compile_all.py invocation in the
build of host-python3. It fails like this:
Traceback (most recent call last):
  File "/tmp/instance-3/output-1/host/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/tmp/instance-3/output-1/host/lib/python3.9/concurrent/futures/process.py", line 317, in run
    result_item, is_broken, cause = self.wait_result_broken_or_wakeup()
  File "/tmp/instance-3/output-1/host/lib/python3.9/concurrent/futures/process.py", line 376, in wait_result_broken_or_wakeup
    worker_sentinels = [p.sentinel for p in self.processes.values()]
  File "/tmp/instance-3/output-1/host/lib/python3.9/concurrent/futures/process.py", line 376, in <listcomp>
    worker_sentinels = [p.sentinel for p in self.processes.values()]
RuntimeError: dictionary changed size during iteration
and then apparently gets stuck.
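For illustration, the same class of error can be reproduced outside of
concurrent.futures. The sketch below is neither the upstream code nor the
proposed patch: `processes` is a stand-in dictionary and all function names
are hypothetical. In the real failure the dictionary is presumably modified
from another thread; here the mutation is done inline so that the result is
deterministic. The second function shows the generic remedy of iterating
over a snapshot taken with list().

```python
def sentinels_live(processes):
    """Iterate over the live dict while it grows (hypothetical stand-in)."""
    seen = []
    for pid in processes:
        seen.append(processes[pid])
        # Simulate a worker being registered while the iteration is running.
        processes[pid + 1000] = "sentinel-%d" % (pid + 1000)
    return seen


def sentinels_snapshot(processes):
    """Same loop, but over a snapshot of the keys taken before iterating."""
    seen = []
    for pid in list(processes):
        seen.append(processes[pid])
        processes[pid + 1000] = "sentinel-%d" % (pid + 1000)
    return seen


def demo():
    """Return (error message from the live loop, result of the snapshot loop)."""
    try:
        sentinels_live({1: "sentinel-1", 2: "sentinel-2"})
    except RuntimeError as exc:
        live_error = str(exc)
    else:
        live_error = None
    snapshot = sentinels_snapshot({1: "sentinel-1", 2: "sentinel-2"})
    return live_error, snapshot
```

Calling demo() should report the RuntimeError for the live loop, while the
snapshot loop completes and only sees the entries that existed when the
snapshot was taken.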
This issue had already been reported upstream several months ago:
https://bugs.python.org/issue43498
A pull request was posted: https://github.com/python/cpython/pull/24868
but not merged.
Also, a similar problem was found in another Python project that uses
much the same code:
https://github.com/joblib/loky/pull/273/commits/956e8ae84ce4d385ef3a61635db7370f2645c4ad.
The fix there is quite different from the one in the CPython pull request,
so it is not obvious which of the two is the correct fix.
I have submitted
https://patchwork.ozlabs.org/project/buildroot/patch/20210802211040.2535969-1-thomas.petazzoni@bootlin.com/
with the patch from the PR backported, and will let another maintainer
judge whether it's worth applying.
> libmodsecurity-3.0.5 | 47
I have applied
https://git.buildroot.org/buildroot/commit/?id=94b6fbd5823cbf94f2f76e402b0d73b473a8b64f
to fix this.
> libogg-1.3.5 | 39
This was a download issue affecting gcc159 only. It couldn't download
libogg-1.3.5.tar.xz from upstream due to an SSL issue, and the fallback to
sources.buildroot.net returned 404. I could reproduce it once; then, after
fetching sources.buildroot.net/libogg/libogg-1.3.5.tar.xz from another
machine, gcc159 stopped getting a 404 for this URL.
It feels like the CDN in front of sources.buildroot.net is playing tricks
on us.
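To make the failure mode concrete, here is a minimal sketch of a "try
upstream, then fall back to the mirror" download loop. It is not
Buildroot's actual download code: the function name, the `fetch` callable
and the upstream URL are all hypothetical, and the https scheme on the
mirror URL is an assumption. The build only fails when every site in the
list fails, which is what happened here (SSL error upstream, 404 on the
mirror).

```python
def download_with_fallback(filename, sites, fetch):
    """Try each site in order; return (site, data) for the first success.

    `fetch(url)` is a caller-supplied (hypothetical) function that returns
    the file contents or raises OSError. ssl.SSLError and
    urllib.error.HTTPError are both OSError subclasses, so a fetch based on
    urllib would fit this contract.
    """
    errors = []
    for site in sites:
        url = site.rstrip("/") + "/" + filename
        try:
            return site, fetch(url)
        except OSError as exc:
            # Remember why this site failed and move on to the next one.
            errors.append("%s: %s" % (url, exc))
    raise OSError("all download sites failed:\n" + "\n".join(errors))


def fake_fetch(url):
    """Stand-in for a real fetch: the made-up upstream site always fails."""
    if "upstream.example" in url:
        raise OSError("SSL error")
    return b"tarball contents"


SITES = [
    "https://upstream.example/libogg",       # hypothetical upstream URL
    "https://sources.buildroot.net/libogg",  # mirror; scheme is assumed
]
```

With these stand-ins, download_with_fallback("libogg-1.3.5.tar.xz", SITES,
fake_fetch) skips the failing upstream and returns the data from the
mirror. If the mirror also raised (as with the 404 above), the call would
raise OSError listing both failures.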
> pixman-0.40.0 | 31
I have investigated this. It fails only on sh4, due to an internal
compiler error. It only occurs at -Os; at -O0 and -O2 it builds fine. I
have reported gcc bug
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101737 for this. Since I
tested only with gcc 9.3.0 for now, I've started a build with gcc 11.x,
to see how it goes.
Based on the result, I'll send a patch adding a new
BR2_TOOLCHAIN_GCC_HAS_BUG_101737 option and disabling -Os for pixman on
SuperH when it is set.
> unknown | 31
I did not look into these for now.
> zeromq-4.3.4 | 30
Giulio: this is happening only on or1k, with a binutils assert. Do you
think this is solved by your or1k fixes?
> mpv-0.33.1 | 28
You manually enabled the feature 'vaapi-drm', but the autodetection check failed.
Bernd, could you have a look, perhaps ?
> erlang-jiffy-1.0.6 | 26
This should be addressed by
https://patchwork.ozlabs.org/project/buildroot/patch/20210802135203.3059317-1-fhunleth@troodon-software.com/
> libfuse3-3.10.4 | 23
This only happens on Microblaze, with:
../lib/fuse_loop_mt.c:361:1: error: symver is only supported on ELF platforms
Giulio, does this ring a bell? It's weird because Microblaze does use
ELF binaries.
> ffmpeg-4.4 | 22
/tmp/ccDU0nYS.s: Assembler messages:
/tmp/ccDU0nYS.s:2136: Error: opcode not supported on this processor: mips32 (mips32) `dmult $20,$20'
/tmp/ccDU0nYS.s:2138: Error: opcode not supported on this processor: mips32 (mips32) `dsrl $21,$21,32'
I tried looking into this, but I'm not entirely clear how the detection
of the MIPS variant is done in the ffmpeg configure script.
Bernd, perhaps ?
> gconf-3.2.6 | 22
make[4]: *** No rule to make target 'GConf-2.0.typelib', needed by 'all-am'. Stop.
Adam, isn't this GOI (gobject-introspection) related?
> libmediaart-1.9.4 | 21
make[3]: *** No rule to make target 'MediaArt-2.0.typelib', needed by 'all'. Stop.
Ditto. Adam, isn't this GOI related ?
> optee-client-3.13.0 | 20
I pushed fixes for these issues.
> nfs-utils-2.5.4 | 19
I assume this would be fixed by
https://patchwork.ozlabs.org/project/buildroot/patch/20210802172116.10073-1-petr.vorel@gmail.com/
> gobject-introspection-1.68.0 | 18
Adam ?
> gpsd-3.21 | 17
Giulio, the workaround to pass -O0 no longer works for some reason. -O0
is still passed, but -O2 is passed *after*, causing the build issue to
pop up again. Could you have a look ?
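The ordering matters because, per the GCC documentation, when several -O
options are given the last one is the one that takes effect. A small
sketch of that rule follows; the helper name and the command line in the
usage note are made up for illustration and are not taken from the gpsd
build log.

```python
import re

# Matches -O, -O0, -O1, -O2, -O3, -Os, -Og, -Ofast, ...
_OPT_FLAG = re.compile(r"^-O\w*$")


def effective_opt_flag(args):
    """Return the last -O<level> flag found in args, or None if absent."""
    effective = None
    for arg in args:
        if _OPT_FLAG.match(arg):
            # A later -O flag overrides any earlier one.
            effective = arg
    return effective
```

For a made-up command line such as ["gcc", "-O0", "-g", "-O2", "-c",
"foo.c"], the helper returns "-O2": the -O0 workaround is present but has
no effect.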
> qemu-6.0.0 | 17
../subprojects/libvhost-user/libvhost-user.c:1637:22: error: 'F_ADD_SEALS' undeclared (first use in this function)
Smells like a missing kernel headers dependency for a particular Qemu
feature, or something like that.
> harfbuzz-2.8.2 | 14
On ARC, toolchain issue: /tmp/ccYRTbeW.s:2996: Error: operand out of range (0x0000000000001036 is not between 0xfffffffffffff000 and 0x0000000000000fff)
On ARMv6 (only), a weird issue with mutexes:
../src/hb-mutex.hh:88:2: error: #error "Could not find any system to define mutex macros."
88 | #error "Could not find any system to define mutex macros."
> libgtk2-2.24.33 | 7
Bernd, these are also .typelib related.
Thomas
--
Thomas Petazzoni, co-owner and CEO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com