This week I stumbled across a footgun in the Dockerfile/Containerfile ARG instruction.

ARG defines a build-time variable, optionally with a default value embedded in the Dockerfile, which can be overridden at build time by passing --build-arg. The value of a variable FOO is interpolated into any following instructions that include the token $FOO.
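
As a minimal sketch of that mechanism (the base image, the variable name FOO and its values are placeholders, not taken from any real project):

```dockerfile
# Placeholder base image; any image with a shell would do.
FROM alpine:3.19

# Build-time variable with a default value embedded in the Dockerfile.
ARG FOO=default-value

# The ARG is visible to later instructions in this stage. For a shell-form
# RUN it is present in the command's environment, so the shell expands $FOO.
RUN echo "FOO is $FOO"
```

Building with docker build --build-arg FOO=override . should replace the default for that build; without the flag, the embedded default applies.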

This behaves a little like the older ENV instruction, which can also be interpolated (in RUN instructions at least), but which can't (I don't think) be set at build time, and which bleeds through into the resulting image's metadata.

ENV has been around longer, and the documentation indicates that, when both are present, ENV takes precedence. This fits with my mental model of how things should work. What the documentation does not make clear is that the ENV doesn't need to have been defined in the same Dockerfile: environment variables inherited from the base image also override ARGs.

To me this is unexpected and far less sensible: in effect, if you are building a layered image and want to use ARG, you have to be fairly sure that your base image doesn't define an ENV of the same name, either now or in the future, unless you're happy for its value to take precedence.
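
The inherited case can be sketched in a single file by using a multi-stage build to stand in for a separately published base image. This is an illustration under assumptions: the stage name base, the variable FOO and its values are made up, and a real base image would normally come from a different Dockerfile or a registry.

```dockerfile
# Stand-in for a base image maintained by someone else.
FROM alpine:3.19 AS base
# The base image happens to define an environment variable called FOO.
ENV FOO=from-base-env

# The downstream image, layered on top of the base.
FROM base
# The downstream author defines an ARG with the same name, unaware of the ENV.
ARG FOO=from-arg
# Going by the behaviour described in this post, the inherited ENV would be
# expected to win here, so the ARG default (and any --build-arg) is shadowed.
RUN echo "FOO is $FOO"
```

Nothing in the downstream stage mentions ENV, which is what makes the shadowing easy to miss.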

In our case, we broke a downstream build process by defining a new environment variable USER in our image.

To defend against the unexpected, I'd recommend using somewhat unique ARG names: perhaps add a prefix that is unusual and unlikely to be shadowed. Or don't use ARG at all, and push that kind of logic up the stack to a Dockerfile pre-processor like CeKit.
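
A sketch of the prefixing approach, assuming a hypothetical project prefix MYPROJECT_ (the prefix, variable name and default value are invented for illustration):

```dockerfile
# Placeholder for a base image you do not control.
FROM alpine:3.19

# A generic name such as USER could collide with an ENV in the base image,
# now or in a future release. A project-specific prefix makes that less likely.
ARG MYPROJECT_BUILD_USER=builder

RUN echo "building as $MYPROJECT_BUILD_USER"
```

This reduces the chance of a collision rather than ruling it out: a base image could still, in principle, define an ENV with the same prefixed name.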


Comments

comment 1
Did you report this bug upstream? Seems like a showstopper in many circumstances
Comment by Governmen T. Name,
comment 2

> This fits with my mental model of how things should work

Really? For me the precedence should be, from most to least:

  • command line
  • envvar
  • local (cwd/project's root dir) config file
  • user's config file
  • global config file
  • default

I'm mostly a Python programmer and I think there are many configuration libs that use this order. One thing I would like, though I haven't checked whether any lib has it, is the ability to dump the final calculated config with tags showing which of those sources each final value comes from. Sounds like something docker could do too. Maybe I should pick one and open an issue, but TBH I'm not using any at the moment.

Comment by Marcos Dione,
comment 3

> Did you report this bug upstream?

Not yet, no. It’s debatably a bug, in that I think it was an intended behaviour, just unexpected to me. It’s also codified in a spec, because at least Docker, Podman and OSBS behave the same way.

jon,