This article is a translation of How “go build” Works.
How does go build compile the simplest Golang program?
This article aims to answer that question.
Consider the simplest program below.
// main.go
package main
func main() {}
Running go build main.go produces a 1.1 MB executable named main that does nothing.
What did go build do to create this do-nothing binary?
The go build command offers some useful options:
-work: go build creates a temporary folder for its work files. This flag prints the location of that folder and does not remove it after the build.
-a: Golang caches previously built packages. -a causes go build to ignore the cache, so the build prints all steps.
-p 1: This limits the build to one command at a time, so the output is logged in order.
-x: go build is a wrapper for other Golang tools such as compile. -x prints the commands and arguments sent to these tools.
Running go build -work -a -p 1 -x main.go produces main along with a lot of log output, and that log tells us what build does when creating main.
The log first outputs the following contents.
WORK=/var/folders/rw/gtb29xf92fv23f0zqsg42s840000gn/T/go-build940616988
This is a working directory with a structure similar to the following.
├── b001
│ ├── _pkg_.a
│ ├── exe
│ ├── importcfg
│ └── importcfg.link
├── b002
│ └── ...
├── b003
│ └── ...
├── b004
│ └── ...
├── b006
│ └── ...
├── b007
│ └── ...
└── b008
  └── ...
go build defines an action graph for the task that needs to be completed.
Each action in this graph gets its own subdirectory (defined in NewObjdir).
The first node in the graph, b001, is the root task for compiling the main binary.
The dependent actions are numbered upward from there, ending with b008. (I don't know where b005 went, but I don't think it's a problem, so I'll omit it.)
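To make the idea of an action graph concrete, here is a minimal sketch, not go build's actual implementation. The action type and the schedule function are hypothetical names; the dependencies are the ones visible in the importcfg files of the log in this article, and the order in which b002 lists them is an arbitrary choice made for this example.

```go
package main

import "fmt"

// action is a hypothetical stand-in for one node of the build graph.
type action struct {
	name string    // work directory name, e.g. "b001"
	pkg  string    // package built by this action
	deps []*action // actions whose output this one needs
}

// schedule returns the actions in an order where every dependency
// comes before the action that needs it (depth-first post-order).
func schedule(root *action) []*action {
	var order []*action
	done := map[*action]bool{}
	var visit func(a *action)
	visit = func(a *action) {
		if done[a] {
			return
		}
		done[a] = true
		for _, d := range a.deps {
			visit(d)
		}
		order = append(order, a)
	}
	visit(root)
	return order
}

// exampleGraph rebuilds the dependencies shown in the log excerpts.
func exampleGraph() *action {
	b008 := &action{name: "b008", pkg: "runtime/internal/sys"}
	b007 := &action{name: "b007", pkg: "runtime/internal/math", deps: []*action{b008}}
	b006 := &action{name: "b006", pkg: "runtime/internal/atomic"}
	b004 := &action{name: "b004", pkg: "internal/cpu"}
	b003 := &action{name: "b003", pkg: "internal/bytealg", deps: []*action{b004}}
	b002 := &action{name: "b002", pkg: "runtime", deps: []*action{b008, b007, b006, b004, b003}}
	return &action{name: "b001", pkg: "main", deps: []*action{b002}}
}

func main() {
	for _, a := range schedule(exampleGraph()) {
		fmt.Printf("%s %s\n", a.name, a.pkg)
	}
}
```

With the dependency order chosen here, the sketch lists b008 first and b001 last, which is also the order in which the actions appear in the log.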
b008
The first action to be executed is b008, at the end of the graph.
mkdir -p $WORK/b008/
cat >$WORK/b008/importcfg << 'EOF'
# import config
EOF
cd /<..>/src/runtime/internal/sys
/<..>/compile
-o $WORK/b008/_pkg_.a
-trimpath "$WORK/b008=>"
-p runtime/internal/sys
-std
-+
-complete
-buildid gEtYPexVP43wWYWCxFKi/gEtYPexVP43wWYWCxFKi
-goversion go1.14.7
-D ""
-importcfg $WORK/b008/importcfg
-pack
-c=16
./arch.go ./arch_amd64.go ./intrinsics.go ./intrinsics_common.go ./stubs.go ./sys.go ./zgoarch_amd64.go ./zgoos_darwin.go ./zversion.go
/<..>/buildid -w $WORK/b008/_pkg_.a
cp $WORK/b008/_pkg_.a /<..>/Caches/go-build/01/01b...60a-d
b008 does the following:
It creates the action directory $WORK/b008/.
It creates the importcfg file for use with the compile tool (it is empty).
It changes to the source directory of the runtime/internal/sys package. This package contains constants used by the runtime.
It compiles the package.
It uses buildid to write (-w) the metadata to the package and copies the package to the go-build cache (all packages are cached, so this step is omitted hereafter).
Let's break down the arguments sent to the compile tool (also explained in go tool compile --help).
-o: The output file.
-trimpath: Removes the prefix $WORK/b008=> from source file paths.
-p: Sets the package import path.
-std: Compiling the standard library (I wasn't sure about this one at this point).
-+: Compiling the runtime (I didn't know this one either).
-complete: The compiler outputs a complete package, with nothing implemented in C or assembly.
-buildid: Adds a build id to the metadata.
-goversion: The Go version required for the compiled package.
-D: The relative path used for local imports (here "").
-importcfg: The import configuration file, which refers to other packages.
-pack: Creates the package as an archive (.a) instead of an object file (.o).
-c: How much parallel processing should be done at build time.
Most of these arguments are the same for all compile commands, so we'll omit this description below.
The output of b008 is an archive file called $WORK/b008/_pkg_.a that corresponds to runtime/internal/sys.
buildid
Let me explain what a build id is.
The format of a buildid is <actionid>/<contentid>.
It is used as an index to cache packages and improve the performance of go build.
<actionid> is a hash of the action (all calls, arguments, and input files). <contentid> is the hash of the output .a file.
For each go build action, the cache can be searched for content created by another action with the same <actionid>.
This is implemented in buildid.go.
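The lookup by action ID can be illustrated with a toy cache. This is a simplified sketch under my own assumptions, not the hashing scheme used by buildid.go: the input and cache types, the actionID and build functions, and the use of SHA-256 truncated to 20 characters are all hypothetical choices made for illustration.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// input is one source file that feeds an action.
type input struct {
	name    string
	content []byte
}

// actionID hashes everything that determines an action's result:
// the command line and the contents of the input files.
// Hypothetical stand-in, not Go's real action ID computation.
func actionID(args []string, inputs []input) string {
	h := sha256.New()
	for _, a := range args {
		fmt.Fprintf(h, "arg %q\n", a)
	}
	for _, in := range inputs {
		fmt.Fprintf(h, "file %q %x\n", in.name, sha256.Sum256(in.content))
	}
	return fmt.Sprintf("%x", h.Sum(nil))[:20]
}

// cache maps an action ID to the output that action produced.
type cache map[string]string

// build returns the cached output for the action if there is one;
// otherwise it calls run, stores the result, and reports a miss.
func (c cache) build(args []string, inputs []input, run func() string) (string, bool) {
	id := actionID(args, inputs)
	if out, ok := c[id]; ok {
		return out, true
	}
	out := run()
	c[id] = out
	return out, false
}

func main() {
	c := cache{}
	args := []string{"compile", "-p", "main", "./main.go"}
	src := []input{{name: "main.go", content: []byte("package main\nfunc main() {}\n")}}
	run := func() string { return "_pkg_.a" }

	_, hit := c.build(args, src, run)
	fmt.Println("first build, cache hit:", hit)
	_, hit = c.build(args, src, run)
	fmt.Println("second build, cache hit:", hit)

	// Editing the source changes the action ID, so the cache misses again.
	src[0].content = []byte("package main\nfunc main() { println(1) }\n")
	_, hit = c.build(args, src, run)
	fmt.Println("after editing main.go, cache hit:", hit)
}
```

The second build is served from the cache because nothing that feeds the hash has changed; the flag -a from earlier in this article corresponds to skipping that lookup.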
The buildid is stored in the file as metadata, so the file does not have to be hashed every time to get the <contentid>. You can read this ID with go tool buildid <file> (it also works on binaries).
In the b008 log above, the buildid is set by the compile tool to gEtYPexVP43wWYWCxFKi/gEtYPexVP43wWYWCxFKi.
This is just a placeholder; it is overwritten with the correct gEtYPexVP43wWYWCxFKi/b-rPboOuD0POrlJWPTEi by go tool buildid -w before the package is cached.
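As a small illustration of the <actionid>/<contentid> format, the sketch below splits a build ID into its two halves. splitBuildID is a hypothetical helper written for this article; taking the first and the last slash-separated element is my assumption, chosen so that longer IDs such as the one passed to link later in the log do not break it.

```go
package main

import (
	"fmt"
	"strings"
)

// splitBuildID returns the first and last slash-separated elements of a
// build ID. For the two-element form <actionid>/<contentid> these are
// exactly the action ID and the content ID.
func splitBuildID(id string) (actionID, contentID string) {
	first := strings.Index(id, "/")
	if first < 0 {
		// No separator: treat the whole string as the action ID.
		return id, ""
	}
	last := strings.LastIndex(id, "/")
	return id[:first], id[last+1:]
}

func main() {
	// The corrected build ID of b008 from the log above.
	a, c := splitBuildID("gEtYPexVP43wWYWCxFKi/b-rPboOuD0POrlJWPTEi")
	fmt.Println("action ID: ", a)
	fmt.Println("content ID:", c)
}
```

Applied to the placeholder gEtYPexVP43wWYWCxFKi/gEtYPexVP43wWYWCxFKi, both halves come out identical, which is an easy way to spot a package whose content ID has not been written yet.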
b007
Next is b007.
cat >$WORK/b007/importcfg << 'EOF'
# import config
packagefile runtime/internal/sys=$WORK/b008/_pkg_.a
EOF
cd /<..>/src/runtime/internal/math
/<..>/compile
-o $WORK/b007/_pkg_.a
-p runtime/internal/math
-importcfg $WORK/b007/importcfg
...
./math.go
b007 creates an importcfg that says packagefile runtime/internal/sys=$WORK/b008/_pkg_.a.
This indicates that b007 depends on b008.
It then compiles runtime/internal/math.
If you take a look inside math.go, it does indeed import runtime/internal/sys, which was built by b008.
The output of b007 is an archive file called $WORK/b007/_pkg_.a that corresponds to runtime/internal/math.
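The importcfg format seen so far is simple enough to read with a few lines of Go. The sketch below is a hypothetical parser written for this article, not the one inside the compile tool; it only understands comment lines and packagefile lines, which are the only two kinds that appear in the log.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseImportcfg reads "packagefile <import path>=<archive>" lines and
// returns a map from import path to archive file. Blank lines, comment
// lines starting with "#", and any other directives are skipped.
func parseImportcfg(text string) map[string]string {
	pkgs := map[string]string{}
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if !strings.HasPrefix(line, "packagefile ") {
			continue
		}
		rest := strings.TrimPrefix(line, "packagefile ")
		i := strings.Index(rest, "=")
		if i < 0 {
			continue
		}
		pkgs[rest[:i]] = rest[i+1:]
	}
	return pkgs
}

func main() {
	// The importcfg written for b007 in the log above.
	cfg := "# import config\n" +
		"packagefile runtime/internal/sys=$WORK/b008/_pkg_.a\n"
	for pkg, file := range parseImportcfg(cfg) {
		fmt.Printf("import %q is satisfied by %s\n", pkg, file)
	}
}
```

Each entry in the resulting map is one edge of the action graph: the action that owns the importcfg depends on the action that produced the archive.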
b006
cat >$WORK/b006/go_asm.h << 'EOF'
EOF
cd /<..>/src/runtime/internal/atomic
/<..>/asm
-I $WORK/b006/
-I /<..>/go/1.14.7/libexec/pkg/include
-D GOOS_darwin
-D GOARCH_amd64
-gensymabis
-o $WORK/b006/symabis
./asm_amd64.s
/<..>/asm
-I $WORK/b006/
-I /<..>/go/1.14.7/libexec/pkg/include
-D GOOS_darwin
-D GOARCH_amd64
-o $WORK/b006/asm_amd64.o
./asm_amd64.s
cat >$WORK/b006/importcfg << 'EOF'
# import config
EOF
/<..>/compile
-o $WORK/b006/_pkg_.a
-p runtime/internal/atomic
-symabis $WORK/b006/symabis
-asmhdr $WORK/b006/go_asm.h
-importcfg $WORK/b006/importcfg
...
./atomic_amd64.go ./stubs.go
/<..>/pack r $WORK/b006/_pkg_.a $WORK/b006/asm_amd64.o
Now we leave regular .go files behind and start processing low-level Go assembly .s files.
b006 does the following:
It creates the header file go_asm.h.
It changes to the directory of the runtime/internal/atomic package, a package of low-level functions.
It runs go tool asm (described in go tool asm --help) to create the symabis "Symbol Application Binary Interfaces (ABI)" file and then the object file asm_amd64.o.
It runs compile to create the _pkg_.a file, passing the symabis file and, via -asmhdr, the header.
It adds asm_amd64.o to _pkg_.a with the pack command.
The asm tool is called here with the following arguments:
-I: Includes the folder of the b006 action and the libexec/pkg/include folder. include has three files, asm_ppc64x.h, funcdata.h and textflag.h, all with low-level function definitions. For example, FIXED_FRAME defines the size of the fixed part of the stack frame.
-D: Defines predefined symbols.
-gensymabis: Creates a symabis file.
-o: The output file.
The output of b006 is an archive file called $WORK/b006/_pkg_.a that corresponds to runtime/internal/atomic.
b004
cd /<..>/src/internal/cpu
/<..>/asm ... -o $WORK/b004/symabis ./cpu_x86.s
/<..>/asm ... -o $WORK/b004/cpu_x86.o ./cpu_x86.s
/<..>/compile ... -o $WORK/b004/_pkg_.a ./cpu.go ./cpu_amd64.go ./cpu_x86.go
/<..>/pack r $WORK/b004/_pkg_.a $WORK/b004/cpu_x86.o
b004 is the same as b006 except that the target is internal/cpu.
It first creates the symabis and object files by assembling cpu_x86.s, compiles the Go files, and then combines them to create the archive _pkg_.a.
The output of b004 is an archive file called $WORK/b004/_pkg_.a that corresponds to internal/cpu.
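Since every action follows the same few patterns, it can help to summarize the -x log by tool. The following is a small sketch of that idea; toolName and countTools are hypothetical helpers, and they assume each command sits on a single line with the tool path as its first word, as in the b004 excerpt above.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// toolName returns the last path element of the first word on a log line,
// e.g. "compile" for "/<..>/compile ...". Empty lines yield "".
func toolName(line string) string {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return ""
	}
	return path.Base(fields[0])
}

// countTools tallies how often each tool or shell command is invoked.
func countTools(log string) map[string]int {
	counts := map[string]int{}
	for _, line := range strings.Split(log, "\n") {
		if name := toolName(line); name != "" {
			counts[name]++
		}
	}
	return counts
}

// b004Log is the b004 excerpt from this article, copied as-is.
const b004Log = `cd /<..>/src/internal/cpu
/<..>/asm ... -o $WORK/b004/symabis ./cpu_x86.s
/<..>/asm ... -o $WORK/b004/cpu_x86.o ./cpu_x86.s
/<..>/compile ... -o $WORK/b004/_pkg_.a ./cpu.go ./cpu_amd64.go ./cpu_x86.go
/<..>/pack r $WORK/b004/_pkg_.a $WORK/b004/cpu_x86.o`

func main() {
	counts := countTools(b004Log)
	for _, tool := range []string{"asm", "compile", "pack"} {
		fmt.Printf("%-8s %d\n", tool, counts[tool])
	}
}
```

For b004 this reports two asm runs, one compile, and one pack, which matches the description above: symabis and object file first, then the Go files, then the archive.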
b003
cat >$WORK/b003/go_asm.h << 'EOF'
EOF
cd /<..>/src/internal/bytealg
/<..>/asm ... -o $WORK/b003/symabis ./compare_amd64.s ./count_amd64.s ./equal_amd64.s ./index_amd64.s ./indexbyte_amd64.s
cat >$WORK/b003/importcfg << 'EOF'
# import config
packagefile internal/cpu=$WORK/b004/_pkg_.a
EOF
/<..>/compile ... -o $WORK/b003/_pkg_.a -p internal/bytealg ./bytealg.go ./compare_native.go ./count_native.go ./equal_generic.go ./equal_native.go ./index_amd64.go ./index_native.go ./indexbyte_native.go
/<..>/asm ... -o $WORK/b003/compare_amd64.o ./compare_amd64.s
/<..>/asm ... -o $WORK/b003/count_amd64.o ./count_amd64.s
/<..>/asm ... -o $WORK/b003/equal_amd64.o ./equal_amd64.s
/<..>/asm ... -o $WORK/b003/index_amd64.o ./index_amd64.s
/<..>/asm ... -o $WORK/b003/indexbyte_amd64.o ./indexbyte_amd64.s
/<..>/pack r $WORK/b003/_pkg_.a $WORK/b003/compare_amd64.o $WORK/b003/count_amd64.o $WORK/b003/equal_amd64.o $WORK/b003/index_amd64.o $WORK/b003/indexbyte_amd64.o
b003 works the same way as b004 and b006.
The main complication of this package is that there are multiple .s files, which produce many .o object files, each of which needs to be added to the _pkg_.a file.
The output of b003 is an archive file called $WORK/b003/_pkg_.a that corresponds to internal/bytealg.
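The sequence of steps for a package that mixes Go and assembly can be written down as a small planning function. This is only a sketch that mirrors the order seen in the b003 log; planPackage is a hypothetical name, and the arguments that the log elides with ... stay elided here as well.

```go
package main

import (
	"fmt"
	"strings"
)

// planPackage lists the tool invocations for a package with Go and
// assembly sources, in the order they appear in the b003 log.
func planPackage(work string, goFiles, asmFiles []string) [][]string {
	archive := work + "/_pkg_.a"
	var steps [][]string

	// 1. One asm run over all .s files produces the symabis file.
	steps = append(steps, append([]string{"asm", "...", "-o", work + "/symabis"}, asmFiles...))

	// 2. compile turns the Go files into the archive.
	steps = append(steps, append([]string{"compile", "...", "-o", archive}, goFiles...))

	// 3. One asm run per .s file produces one object file each.
	var objects []string
	for _, s := range asmFiles {
		base := strings.TrimSuffix(strings.TrimPrefix(s, "./"), ".s")
		obj := work + "/" + base + ".o"
		steps = append(steps, []string{"asm", "...", "-o", obj, s})
		objects = append(objects, obj)
	}

	// 4. pack appends every object file to the archive.
	steps = append(steps, append([]string{"pack", "r", archive}, objects...))
	return steps
}

func main() {
	steps := planPackage("$WORK/b003",
		[]string{"./bytealg.go", "./index_amd64.go"},
		[]string{"./compare_amd64.s", "./count_amd64.s"})
	for _, s := range steps {
		fmt.Println(strings.Join(s, " "))
	}
}
```

The number of steps grows with the number of .s files, which is why the b003 and b002 logs are so much longer than the one for b004.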
b002
cat >$WORK/b002/go_asm.h << 'EOF'
EOF
cd /<..>/src/runtime
/<..>/asm
...
-o $WORK/b002/symabis
./asm.s ./asm_amd64.s ./duff_amd64.s ./memclr_amd64.s ./memmove_amd64.s ./preempt_amd64.s ./rt0_darwin_amd64.s ./sys_darwin_amd64.s
cat >$WORK/b002/importcfg << 'EOF'
# import config
packagefile internal/bytealg=$WORK/b003/_pkg_.a
packagefile internal/cpu=$WORK/b004/_pkg_.a
packagefile runtime/internal/atomic=$WORK/b006/_pkg_.a
packagefile runtime/internal/math=$WORK/b007/_pkg_.a
packagefile runtime/internal/sys=$WORK/b008/_pkg_.a
EOF
/<..>/compile
-o $WORK/b002/_pkg_.a
...
-p runtime
./alg.go ./atomic_pointer.go ./cgo.go ./cgocall.go ./cgocallback.go ./cgocheck.go ./chan.go ./checkptr.go ./compiler.go ./complex.go ./cpuflags.go ./cpuflags_amd64.go ./cpuprof.go ./cputicks.go ./debug.go ./debugcall.go ./debuglog.go ./debuglog_off.go ./defs_darwin_amd64.go ./env_posix.go ./error.go ./extern.go ./fastlog2.go ./fastlog2table.go ./float.go ./hash64.go ./heapdump.go ./iface.go ./lfstack.go ./lfstack_64bit.go ./lock_sema.go ./malloc.go ./map.go ./map_fast32.go ./map_fast64.go ./map_faststr.go ./mbarrier.go ./mbitmap.go ./mcache.go ./mcentral.go ./mem_darwin.go ./mfinal.go ./mfixalloc.go ./mgc.go ./mgcmark.go ./mgcscavenge.go ./mgcstack.go ./mgcsweep.go ./mgcsweepbuf.go ./mgcwork.go ./mheap.go ./mpagealloc.go ./mpagealloc_64bit.go ./mpagecache.go ./mpallocbits.go ./mprof.go ./mranges.go ./msan0.go ./msize.go ./mstats.go ./mwbbuf.go ./nbpipe_pipe.go ./netpoll.go ./netpoll_kqueue.go ./os_darwin.go ./os_nonopenbsd.go ./panic.go ./plugin.go ./preempt.go ./preempt_nonwindows.go ./print.go ./proc.go ./profbuf.go ./proflabel.go ./race0.go ./rdebug.go ./relax_stub.go ./runtime.go ./runtime1.go ./runtime2.go ./rwmutex.go ./select.go ./sema.go ./signal_amd64.go ./signal_darwin.go ./signal_darwin_amd64.go ./signal_unix.go ./sigqueue.go ./sizeclasses.go ./slice.go ./softfloat64.go ./stack.go ./string.go ./stubs.go ./stubs_amd64.go ./stubs_nonlinux.go ./symtab.go ./sys_darwin.go ./sys_darwin_64.go ./sys_nonppc64x.go ./sys_x86.go ./time.go ./time_nofake.go ./timestub.go ./trace.go ./traceback.go ./type.go ./typekind.go ./utf8.go ./vdso_in_none.go ./write_err.go
/<..>/asm ... -o $WORK/b002/asm.o ./asm.s
/<..>/asm ... -o $WORK/b002/asm_amd64.o ./asm_amd64.s
/<..>/asm ... -o $WORK/b002/duff_amd64.o ./duff_amd64.s
/<..>/asm ... -o $WORK/b002/memclr_amd64.o ./memclr_amd64.s
/<..>/asm ... -o $WORK/b002/memmove_amd64.o ./memmove_amd64.s
/<..>/asm ... -o $WORK/b002/preempt_amd64.o ./preempt_amd64.s
/<..>/asm ... -o $WORK/b002/rt0_darwin_amd64.o ./rt0_darwin_amd64.s
/<..>/asm ... -o $WORK/b002/sys_darwin_amd64.o ./sys_darwin_amd64.s
/<..>/pack r $WORK/b002/_pkg_.a $WORK/b002/asm.o $WORK/b002/asm_amd64.o $WORK/b002/duff_amd64.o $WORK/b002/memclr_amd64.o $WORK/b002/memmove_amd64.o $WORK/b002/preempt_amd64.o $WORK/b002/rt0_darwin_amd64.o $WORK/b002/sys_darwin_amd64.o
You can see why the previous actions were needed by looking at b002.
b002 contains the whole runtime package needed to run Go binaries. For example, b002 also contains mgc.go, the Go GC implementation, which imports b004 (internal/cpu) and b006 (runtime/internal/atomic).
b002 may be the most complex package in the core library, but the build itself follows the same process as before. In other words, the files output by asm and compile are packed into _pkg_.a.
The output of b002 is an archive file called $WORK/b002/_pkg_.a that corresponds to runtime.
b001
cat >$WORK/b001/importcfg << 'EOF'
# import config
packagefile runtime=$WORK/b002/_pkg_.a
EOF
cd /<..>/main
/<..>/compile ... -o $WORK/b001/_pkg_.a -p main ./main.go
cat >$WORK/b001/importcfg.link << 'EOF'
packagefile command-line-arguments=$WORK/b001/_pkg_.a
packagefile runtime=$WORK/b002/_pkg_.a
packagefile internal/bytealg=$WORK/b003/_pkg_.a
packagefile internal/cpu=$WORK/b004/_pkg_.a
packagefile runtime/internal/atomic=$WORK/b006/_pkg_.a
packagefile runtime/internal/math=$WORK/b007/_pkg_.a
packagefile runtime/internal/sys=$WORK/b008/_pkg_.a
EOF
/<..>/link
-o $WORK/b001/exe/a.out
-importcfg $WORK/b001/importcfg.link
-buildmode=exe
-buildid=yC-qrh2sY_qI0zh2-NE7/owNzOBTqPO00FkqK0_lF/HPXqvMz_4PvKsQzqGWgD/yC-qrh2sY_qI0zh2-NE7
-extld=clang
$WORK/b001/_pkg_.a
mv $WORK/b001/exe/a.out main
b001 first creates an importcfg that includes the runtime built by b002, then compiles main.go to create _pkg_.a.
It then creates an importcfg.link that contains command-line-arguments=$WORK/b001/_pkg_.a in addition to all the packages that appeared before, and links them with the link command to create the executable a.out.
Finally, it renames the executable to main and moves it to the output destination.
A few notes on the arguments of link:
-buildmode: Builds an executable.
-extld: Specifies the external linker.
We finally have what we were looking for: the main binary is born from b001.
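The difference between importcfg and importcfg.link is that the linker needs every package reachable from main, not just the direct imports. A minimal sketch of that closure follows; linkPackages is a hypothetical helper, the import graph is the one visible in the importcfg files of this article, and the alphabetical ordering is my own choice to keep the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// linkPackages returns root plus every package reachable from it
// through the imports map, sorted alphabetically.
func linkPackages(imports map[string][]string, root string) []string {
	seen := map[string]bool{}
	var walk func(p string)
	walk = func(p string) {
		if seen[p] {
			return
		}
		seen[p] = true
		for _, dep := range imports[p] {
			walk(dep)
		}
	}
	walk(root)

	all := make([]string, 0, len(seen))
	for p := range seen {
		all = append(all, p)
	}
	sort.Strings(all)
	return all
}

func main() {
	// Direct imports, as listed in the importcfg files of the log.
	imports := map[string][]string{
		"command-line-arguments": {"runtime"},
		"runtime": {
			"internal/bytealg", "internal/cpu", "runtime/internal/atomic",
			"runtime/internal/math", "runtime/internal/sys",
		},
		"internal/bytealg":      {"internal/cpu"},
		"runtime/internal/math": {"runtime/internal/sys"},
	}
	// Work directory of the action that built each package.
	dirs := map[string]string{
		"command-line-arguments":  "b001",
		"runtime":                 "b002",
		"internal/bytealg":        "b003",
		"internal/cpu":            "b004",
		"runtime/internal/atomic": "b006",
		"runtime/internal/math":   "b007",
		"runtime/internal/sys":    "b008",
	}
	for _, p := range linkPackages(imports, "command-line-arguments") {
		fmt.Printf("packagefile %s=$WORK/%s/_pkg_.a\n", p, dirs[p])
	}
}
```

The sketch yields the same seven packagefile entries as the importcfg.link in the log, although in a different order.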
Creating action graphs for efficient caching is the same idea that the build tool Bazel uses for fast builds.
Golang's action id and content id correspond to the action cache and content-addressable store (CAS) that Bazel uses for caching.
Bazel is a product of Google, and so is Golang. It would be very reasonable for them to share a similar philosophy on how to build software quickly and correctly.
In Bazel's rules_go package, you can see how go build is reimplemented in the builder code.
This is a very clean implementation, as action graphs, folder management, and caching are handled externally by Bazel.
go build did a lot to compile a program like this one that doesn't do anything.
I didn't go into too much detail about the tools (compile, asm) and their input and output files (.a, .o, .s) this time around.
Also, this time I only compiled the most basic program.
You can make the compilation more complicated by doing the following:
Using fmt to output Hello world will add 23 more actions to the action graph.
Using go.mod to reference external packages.
Changing GOOS and GOARCH. For example, compiling for wasm will have completely different actions and arguments.
Running go build and inspecting the logs is a top-down approach to learning how the Go compiler works. If you want to learn from the basics instead, a great starting point is to dive into resources such as:
References