[Buildroot] [PATCH] core/pkg-infra: restore completeness of packages files lists

Thomas De Schampheleire patrickdepinguin at gmail.com
Wed Feb 6 20:12:01 UTC 2019


Hello,

On Wed, 6 Feb 2019 at 15:38, Yann E. MORIN
(<yann.morin.1998 at free.fr>) wrote:
>
> In commit 7fb6e782542f (core/instrumentation: shave minutes off the
> build time), the built stampfile is used as a reference to detect files
> installed by a package.
>
> However, packages may install files keeping their mtime intact, and we
> end up not detecting this. For example, the internal skeleton package
> will install (e.g.) /etc/passwd with an mtime of when the file was
> created in $(TOP_DIR), which could be the time the git repository was
> checked out; that mtime is always older than the build stamp file, so
> files installed by the skeleton package are never attributed to that
> package, or to any other package for that matter.
>
> We switch to an alternate solution, which consists of storing some extra
> metadata per file, so that we can more easily detect modifications to
> the files. Then we compare the state before the package is installed (by
> reusing the existing list) with the state after the package is
> installed, and list any new or modified files (in effect, ignoring
> untouched and removed files). Finally, we store the file->package
> association in the global list and store the new stat list as the global
> list.
>
> Signed-off-by: "Yann E. MORIN" <yann.morin.1998 at free.fr>
> Cc: Peter Korsgaard <peter at korsgaard.com>
> Cc: Thomas Petazzoni <thomas.petazzoni at bootlin.com>
> Cc: Arnout Vandecappelle <arnout at mind.be>
> Cc: Thomas De Schampheleire <patrickdepinguin at gmail.com>
> Cc: Trent Piepho <tpiepho at impinj.com>
> ---
>  package/pkg-generic.mk | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> index f5cab2b9c2..c07cb32349 100644
> --- a/package/pkg-generic.mk
> +++ b/package/pkg-generic.mk
> @@ -63,13 +63,20 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
>  # $(2): base directory to search in
>  # $(3): suffix of file  (optional)
>  define step_pkg_size_inner
> +       @touch $(BUILD_DIR)/packages-file-list$(3).stat
>         @touch $(BUILD_DIR)/packages-file-list$(3).txt
>         $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
>         cd $(2); \
> -       find . \( -type f -o -type l \) \
> -               -newer $($(PKG)_DIR)/.stamp_built \
> -               -exec printf '$(1),%s\n' {} + \
> +       LC_ALL=C find . -printf '%T@:%i:%#m:%y:%s,%p\n' \
> +       |LC_ALL=C sort >$($(PKG)_BUILDDIR)/.files-list$(3).stat
> +       comm -13 $(BUILD_DIR)/packages-file-list$(3).stat \
> +               $($(PKG)_BUILDDIR)/.files-list$(3).stat \
> +               >$($(PKG)_BUILDDIR)/.files-list$(3).new
> +       sed -r -e 's/^[^,]+/$(1)/' \
> +               $($(PKG)_BUILDDIR)/.files-list$(3).new \
>                 >> $(BUILD_DIR)/packages-file-list$(3).txt
> +       mv $($(PKG)_BUILDDIR)/.files-list$(3).stat \
> +               $(BUILD_DIR)/packages-file-list$(3).stat
>  endef
>

I am testing this code by doing one reference build with this change
and one with the original-original situation (the md5sum-based one),
and comparing the output.
The build is not yet complete, so the observations below are
preliminary. Nevertheless, it looks very good so far.

Observations:

1. The call to 'comm' should also happen with LC_ALL=C, or comm may
complain that a file is not sorted. I noticed this with tzdata, where
there are two files whose names differ only in a '+' versus a '-'
sign. Depending on the locale, the sort order is different:

$ echo \
'1549481786.6863713840:20982773:0644:f:127,./usr/share/zoneinfo/posix/Etc/GMT+0
1549481786.6863713840:20982773:0644:f:127,./usr/share/zoneinfo/posix/Etc/GMT-0' | sort
1549481786.6863713840:20982773:0644:f:127,./usr/share/zoneinfo/posix/Etc/GMT-0
1549481786.6863713840:20982773:0644:f:127,./usr/share/zoneinfo/posix/Etc/GMT+0


$ echo \
'1549481786.6863713840:20982773:0644:f:127,./usr/share/zoneinfo/posix/Etc/GMT+0
1549481786.6863713840:20982773:0644:f:127,./usr/share/zoneinfo/posix/Etc/GMT-0' | env LC_ALL=C sort
1549481786.6863713840:20982773:0644:f:127,./usr/share/zoneinfo/posix/Etc/GMT+0
1549481786.6863713840:20982773:0644:f:127,./usr/share/zoneinfo/posix/Etc/GMT-0

The error given by 'comm' with such input is:
comm: file 2 is not in sorted order
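
In the recipe, the fix would amount to prefixing the 'comm' call with
LC_ALL=C, just like 'find' and 'sort'. A minimal standalone sketch of
that idea follows; the stat lines and the scratch directory are made up
for illustration and are not the real $(BUILD_DIR) files:

```shell
#!/bin/sh
# Hypothetical scratch directory; the real lists live under $(BUILD_DIR).
tmp=$(mktemp -d)

# State before the package is installed (made-up stat line).
printf '%s\n' \
    '1:10:0644:f:5,./etc/passwd' \
    | LC_ALL=C sort > "$tmp/before.stat"

# State after: two new entries whose names differ only in '+' versus '-'.
printf '%s\n' \
    '1:10:0644:f:5,./etc/passwd' \
    '2:11:0644:f:127,./usr/share/zoneinfo/posix/Etc/GMT-0' \
    '2:11:0644:f:127,./usr/share/zoneinfo/posix/Etc/GMT+0' \
    | LC_ALL=C sort > "$tmp/after.stat"

# Use the same locale for comm as for sort, so that comm considers its
# input sorted. -13 keeps only the lines unique to the second file.
new_files=$(LC_ALL=C comm -13 "$tmp/before.stat" "$tmp/after.stat")
printf '%s\n' "$new_files"

rm -rf "$tmp"
```

With both tools in the C locale, '+' sorts before '-' in both places,
so the two GMT entries come out as new without any complaint from comm.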



2. This is more an observation than a change request: the directories
where a package installs files, e.g. usr/bin, usr/lib, ... are
attributed to that package. This means that, for example, 'usr/lib' is
attributed to each and every library.
In a way I like this, because it means that with the output for one
package you have both all files and all directories that it touches,
regardless of who created the directory first.
But we should check whether other users of the output, like
graph-size, can cope with it.
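
If some consumer turns out not to cope with directory entries, one
possible way out (an assumption, not part of the patch) would be to drop
the lines whose type field is 'd' before the stat prefix is replaced by
the package name. A sketch with made-up stat lines and a hypothetical
package name 'libfoo':

```shell
#!/bin/sh
# Made-up lines in the '%T@:%i:%#m:%y:%s,%p' format used by the patch.
new_list='1:20:0755:d:4096,./usr/lib
1:21:0644:f:100,./usr/lib/libfoo.so.1
1:22:0777:l:11,./usr/lib/libfoo.so'

# Keep entries whose 4th colon-separated field (the find %y type) is
# not 'd', then replace the stat prefix with the package name, as the
# sed call in the patch does.
files_only=$(printf '%s\n' "$new_list" \
    | awk -F: '$4 != "d"' \
    | sed -E 's/^[^,]+/libfoo/')
printf '%s\n' "$files_only"
```

The filter has to run before the sed substitution, because the type
field is gone once the line has been reduced to 'package,path'.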


Best regards,
Thomas


