[Buildroot] Some analysis of the major build failure reasons

Thomas Petazzoni thomas.petazzoni at bootlin.com
Mon Aug 2 21:46:47 UTC 2021


Hello,

Giulio, Bernd, Adam, there are questions for you below. Thanks for your
support!

On Mon, 02 Aug 2021 06:09:39 -0000
Thomas Petazzoni <thomas.petazzoni at bootlin.com> wrote:

>     master   | 1903 | 802 | 53  | 2758 |

So it's getting a bit better, but we still have lots of build failures.
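
For a sense of scale, here is a small Python sketch that turns that row
into shares. It assumes the four numbers are successes, failures,
timeouts and total; the column headers are not quoted here, so that
reading is an assumption, and the function name is made up.

```python
def outcome_shares(successes, failures, timeouts):
    """Return (total, shares), where shares maps each outcome to its fraction."""
    total = successes + failures + timeouts
    counts = {"successes": successes, "failures": failures, "timeouts": timeouts}
    return total, {name: count / total for name, count in counts.items()}


# Numbers from the quoted "master" row; the meaning of each column is assumed.
total, shares = outcome_shares(1903, 802, 53)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The three counts add up to the fourth number in the row (2758), which
is consistent with that reading, and failures come out at roughly 29%
of the builds.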

> Classification of failures by reason for master
> -----------------------------------------------
> 
>                   host-python3 | 50

So this is a timeout, which occurs only on James's machines (which have
very high parallelism), during the compileall.py invocation in the
build of host-python3. It fails like this:

Traceback (most recent call last):
  File "/tmp/instance-3/output-1/host/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/tmp/instance-3/output-1/host/lib/python3.9/concurrent/futures/process.py", line 317, in run
    result_item, is_broken, cause = self.wait_result_broken_or_wakeup()
  File "/tmp/instance-3/output-1/host/lib/python3.9/concurrent/futures/process.py", line 376, in wait_result_broken_or_wakeup
    worker_sentinels = [p.sentinel for p in self.processes.values()]
  File "/tmp/instance-3/output-1/host/lib/python3.9/concurrent/futures/process.py", line 376, in <listcomp>
    worker_sentinels = [p.sentinel for p in self.processes.values()]
RuntimeError: dictionary changed size during iteration

and then apparently gets stuck.

This issue had already been reported upstream several months ago:
https://bugs.python.org/issue43498

A pull request was posted: https://github.com/python/cpython/pull/24868
but not merged.

Also, a similar problem was found in another Python project that uses
much the same code:
https://github.com/joblib/loky/pull/273/commits/956e8ae84ce4d385ef3a61635db7370f2645c4ad.
The fix is quite different from the one in the CPython pull request,
so it is not trivial to tell which fix is the correct one.
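
The RuntimeError itself is easy to reproduce outside of
concurrent.futures. The following is a minimal Python sketch, not the
CPython code: the function name, the dictionary contents and the
in-loop pop() that stands in for another thread removing a worker are
all made up for illustration. It also shows one generic way to avoid
this class of error (iterating over a snapshot), which is not
necessarily what either the CPython pull request or the loky commit
does.

```python
def collect_sentinels(processes, snapshot):
    """Collect the values of `processes` while the dict is being mutated.

    With snapshot=False the live dict view is iterated, so the mutation
    below triggers RuntimeError. With snapshot=True a list copy is
    iterated instead, so the mutation is harmless.
    """
    source = list(processes.values()) if snapshot else processes.values()
    collected = []
    for sentinel in source:
        collected.append(sentinel)
        # Stand-in for a concurrent thread removing a finished worker.
        processes.pop("worker-2", None)
    return collected


def make_processes():
    # Hypothetical worker table: name -> sentinel.
    return {"worker-1": 11, "worker-2": 12, "worker-3": 13}


if __name__ == "__main__":
    try:
        collect_sentinels(make_processes(), snapshot=False)
    except RuntimeError as exc:
        print("live view:", exc)
    print("snapshot:", collect_sentinels(make_processes(), snapshot=True))
```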

I have submitted
https://patchwork.ozlabs.org/project/buildroot/patch/20210802211040.2535969-1-thomas.petazzoni@bootlin.com/
with the patch from the PR backported, and will let another maintainer
judge whether it's worth applying.

>           libmodsecurity-3.0.5 | 47

I have applied
https://git.buildroot.org/buildroot/commit/?id=94b6fbd5823cbf94f2f76e402b0d73b473a8b64f
to fix this.

>                   libogg-1.3.5 | 39

This was a download issue affecting gcc159 only. It couldn't download
libogg-1.3.5.tar.xz from upstream due to an SSL issue, and the fallback
to sources.buildroot.net returned 404. I could reproduce this once; then,
after fetching sources.buildroot.net/libogg/libogg-1.3.5.tar.xz from
another machine, gcc159 stopped getting a 404 for this URL.

It feels like the CDN in front of sources.buildroot.net is playing tricks on us.
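
To make the failure mode explicit, here is a minimal Python sketch of a
"primary source, then mirror" download. It is not Buildroot's download
helper: the function name, the injected fetch callable and the example
hosts are all hypothetical. The build only fails when every source in
the list fails, which is what happened here (SSL error upstream, then
404 on the fallback).

```python
def download_with_fallback(filename, sources, fetch):
    """Try each source in order and return (url, data) from the first success.

    `fetch` is a caller-supplied callable taking a URL and returning bytes,
    or raising OSError (urllib and ssl errors are OSError subclasses).
    """
    errors = []
    for base in sources:
        url = f"{base.rstrip('/')}/{filename}"
        try:
            return url, fetch(url)
        except OSError as exc:
            # Remember why this source failed, then move on to the next one.
            errors.append(f"{url}: {exc}")
    raise OSError("all sources failed: " + "; ".join(errors))


def fake_fetch(url):
    # Hypothetical fetcher: the upstream host fails, the mirror works.
    if "upstream.invalid" in url:
        raise OSError("SSL handshake failed")
    return b"tarball-bytes"
```

For example, download_with_fallback("libogg-1.3.5.tar.xz",
["https://upstream.invalid/libogg", "https://mirror.invalid/libogg"],
fake_fetch) returns the mirror URL and its data.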

>                  pixman-0.40.0 | 31

I have investigated this. It fails only on sh4, due to an internal
compiler error. It occurs only at -Os; at -O0 and -O2 it builds fine. I
have reported gcc bug
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101737 for this. Since I
have tested only with gcc 9.3.0 so far, I've started a build with gcc
11.x to see how it goes.

Based on the result, I'll send a patch adding a new
BR2_TOOLCHAIN_GCC_HAS_BUG_101737 option and disabling -Os for pixman on
SuperH when it is set.
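
The intended logic can be modelled in a few lines of Python. This is
only a sketch of the decision, not Buildroot makefile syntax: the
function name is made up, and picking -O2 as the replacement is an
assumption (both -O0 and -O2 were seen to build fine).

```python
def pixman_cflags(target_cflags, has_gcc_bug_101737):
    """Return the CFLAGS to use for pixman.

    On toolchains flagged as affected by gcc bug 101737, -Os is swapped
    for -O2; all other flags are passed through unchanged.
    """
    if not has_gcc_bug_101737:
        return list(target_cflags)
    return ["-O2" if flag == "-Os" else flag for flag in target_cflags]
```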

>                        unknown | 31

I have not looked into these yet.

>                   zeromq-4.3.4 | 30

Giulio: this is happening only on or1k, with a binutils assert. Do you
think this is solved by your or1k fixes?

>                     mpv-0.33.1 | 28

You manually enabled the feature 'vaapi-drm', but the autodetection check failed.

Bernd, could you have a look, perhaps ?

>             erlang-jiffy-1.0.6 | 26

This should be addressed by
https://patchwork.ozlabs.org/project/buildroot/patch/20210802135203.3059317-1-fhunleth@troodon-software.com/

>                libfuse3-3.10.4 | 23

This only happens on Microblaze, with:

../lib/fuse_loop_mt.c:361:1: error: symver is only supported on ELF platforms

Giulio, does this ring a bell ? It's weird because Microblaze uses ELF
binaries.

>                     ffmpeg-4.4 | 22

/tmp/ccDU0nYS.s: Assembler messages:
/tmp/ccDU0nYS.s:2136: Error: opcode not supported on this processor: mips32 (mips32) `dmult $20,$20'
/tmp/ccDU0nYS.s:2138: Error: opcode not supported on this processor: mips32 (mips32) `dsrl $21,$21,32'

I tried looking into this, but it's not entirely clear to me how the
detection of the MIPS variant is done in the ffmpeg configure script.

Bernd, perhaps ?

>                    gconf-3.2.6 | 22

make[4]: *** No rule to make target 'GConf-2.0.typelib', needed by 'all-am'.  Stop.

Adam, isn't this GOI related ?

>              libmediaart-1.9.4 | 21

make[3]: *** No rule to make target 'MediaArt-2.0.typelib', needed by 'all'.  Stop.

Ditto. Adam, isn't this GOI related ?

>            optee-client-3.13.0 | 20

I pushed fixes for these issues.

>                nfs-utils-2.5.4 | 19

I assume this will be fixed by
https://patchwork.ozlabs.org/project/buildroot/patch/20210802172116.10073-1-petr.vorel@gmail.com/

>   gobject-introspection-1.68.0 | 18

Adam ?

>                      gpsd-3.21 | 17

Giulio, the workaround to pass -O0 no longer works for some reason. -O0
is still passed, but -O2 is passed *after*, causing the build issue to
pop up again. Could you have a look ?
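
GCC documents that when several -O options are given, the last one is
the effective one, which is why the ordering matters here. A minimal
Python sketch of that rule, with a made-up helper name and a simplified
pattern for what counts as an optimization flag:

```python
import re

# Simplified: matches -O, -O0..-O9, -Os, -Og and -Ofast.
_OPT_FLAG = re.compile(r"^-O(\d|s|g|fast)?$")


def effective_optimization(flags):
    """Return the optimization flag that wins, i.e. the last one given."""
    effective = None
    for flag in flags:
        if _OPT_FLAG.match(flag):
            effective = flag
    return effective
```

With ["-O0", "-g", "-O2"] this returns "-O2", so a -O0 workaround only
holds if nothing appends another -O option after it.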

>                     qemu-6.0.0 | 17

../subprojects/libvhost-user/libvhost-user.c:1637:22: error: 'F_ADD_SEALS' undeclared (first use in this function)

Smells like a missing kernel headers dependency for a particular QEMU
feature, or something like that.

>                 harfbuzz-2.8.2 | 14

On ARC, toolchain issue:

/tmp/ccYRTbeW.s:2996: Error: operand out of range (0x0000000000001036 is not between 0xfffffffffffff000 and 0x0000000000000fff)
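
Read as 64-bit two's-complement values, the bounds in that message are
-4096 and 4095 (the range of a signed 13-bit value), and the operand
0x1036 is 4150. A small Python sketch to check the arithmetic (the
helper name is made up):

```python
def to_signed64(value):
    """Interpret a 64-bit unsigned integer as a two's-complement signed one."""
    return value - (1 << 64) if value & (1 << 63) else value


# Values copied from the assembler error message.
low = to_signed64(0xFFFFFFFFFFFFF000)
high = to_signed64(0x0000000000000FFF)
operand = to_signed64(0x0000000000001036)

fits = low <= operand <= high
overshoot = operand - high
```

So the operand is past the upper bound by 55, not wildly out of range.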

On ARMv6 (only), a weird issue with mutexes:

../src/hb-mutex.hh:88:2: error: #error "Could not find any system to define mutex macros."
   88 | #error "Could not find any system to define mutex macros."


>                libgtk2-2.24.33 | 7 

Bernd, these are also .typelib related.

Thomas
-- 
Thomas Petazzoni, co-owner and CEO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

