In #246 I came up with an idea to generate a JSON Compilation Database from GNU Make output. This concept has been realized before, and I found two existing, if crude, implementations. I came up with the name make2compdb, which is the name I'll use here, but I'm open to other names. In its simplest usage:
$ make -Bnw | make2compdb >compile_commands.json
The user can invoke make in whatever way is appropriate, choosing targets, overriding CFLAGS, etc. The three make options are necessary:
- -B (--always-make): force all build commands to print regardless of current build state
- -n (--dry-run): no build is actually desired at this time (this still invokes != and $(shell ...), which is a good thing)
- -w (--print-directory): The "directory" key in a compilation database is required, and despite what the "specification" says must be an absolute path. At least some LLVM tools (e.g. clang-tidy) do not support relative paths. This is highly unfortunate because it means compile_commands.json cannot be meaningfully checked into source control. -w supplies the "directory" key through "Entering directory" lines, from which make2compdb internally maintains a stack of directories. The stack is initialized with the absolute working directory.
The output of make will look something like:
make: Entering directory 'C:/Users/me/src/example'
cc -c -Iinclude -o example.o example.c
c++ -c -Iinclude -o util.o util.cpp
make[1]: Entering directory 'C:/Users/me/src/example/lib'
cc -c -I../include -o lib.o lib.c
make[1]: Leaving directory 'C:/Users/me/src/example/lib'
c++ -o example.exe example.o util.o lib/lib.o
make: Leaving directory 'C:/Users/me/src/example'
From this, the output of make2compdb would be (assumed UTF-8):
[
  {
    "directory": "C:/Users/me/src/example",
    "file": "example.c",
    "command": "cc -c -Iinclude -o example.o example.c"
  }, {
    "directory": "C:/Users/me/src/example",
    "file": "util.cpp",
    "command": "c++ -c -Iinclude -o util.o util.cpp"
  }, {
    "directory": "C:/Users/me/src/example/lib",
    "file": "lib.c",
    "command": "cc -c -I../include -o lib.o lib.c"
  }
]
The program does not need to access or examine any of these files; it only transforms data from one format to another. This should be easy, but since we're bothering to write a whole program to do it, it ought to be robust too. The "directory" fields are copied verbatim from '...' in the Entering line, but the rest must be parsed the same way a shell would, and with some understanding of what it sees (note the distinction between "must" and "should"):
- Must ignore lines that don't look like a compiler invocation. It should match cc, c++, gcc, g++, clang, clang++, and even fancier stuff like x86_64-w64-mingw32-gcc, which can be done by splitting on - and looking at the last element. It should match absolute paths, too, like C:/w64devkit/bin/gcc.exe. If you want to be fancy, handle cl and clang-cl, too, but specially.
- It must handle quoting and backslashes when word splitting:

    "C:/Users/John Falstaff/w64devkit/bin/gcc.exe" -c ...
    C:/Users/John\ Falstaff/w64devkit/bin/gcc.exe -c ...
    C:/Users/'John Falstaff'/w64devkit/bin/gcc.exe -c ...

- It must handle line continuations:

    gcc \
        -c ...

  The continuation backslash and newline should probably be omitted from "command". I don't know how well database-consuming tools handle this.
- It should navigate around subshell expressions:

    gcc -c $(pkg-config ...) ...

  That is, ignore the stuff inside $(...). In the case of pkg-config, a good Makefile will expand these on its own first (which is great, as the output will end up in the compilation database), and they won't appear as subshell expressions in make's output. So trying to actually interpret them is unimportant.
- It should understand common compiler flags like -o, -I, etc., so -o foo.c will not match foo.c. It should only match known file extensions.
- If a command contains multiple matching positional arguments, there must be a compilation database entry for each. cc x.c y.c will produce an entry for x.c and an entry for y.c.
- Optional (bonus) variable expansion:

    "$CC" -c ...

  This is not so easy (consider variations like ${...}). Typically these are handled as make variables, and are already expanded on output, so this is low importance. Probably better to not do this at all than do it poorly.
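
As a rough illustration of how several of these rules fit together, the sketch below recognizes a compiler word, skips $(...) spans, splits words shell-style, and picks out source files. It is Python for brevity, not the single-file C program requested. It leans on the standard shlex module, which approximates POSIX word splitting but is not a full sh parser. The names is_compiler, strip_subshells, and source_files are hypothetical, the extension and flag tables are assumed and incomplete, and cl/clang-cl and variable expansion are left out.

```python
import shlex

COMPILERS = {"cc", "c++", "gcc", "g++", "clang", "clang++"}
SOURCE_EXTS = (".c", ".cc", ".cpp", ".cxx")  # assumed list of known extensions
# Flags whose next word is an argument, never a source file (assumed, incomplete).
FLAGS_WITH_ARG = {"-o", "-I", "-D", "-U", "-include", "-isystem", "-MF", "-MT", "-x"}


def is_compiler(word):
    """Match cc, gcc, x86_64-w64-mingw32-gcc, C:/w64devkit/bin/gcc.exe, ..."""
    name = word.replace("\\", "/").rsplit("/", 1)[-1].lower()
    if name.endswith(".exe"):
        name = name[:-4]
    # Either the whole name matches, or the last "-"-separated element does.
    return name in COMPILERS or name.rsplit("-", 1)[-1] in COMPILERS


def strip_subshells(text):
    """Drop $(...) spans, tracking nested parentheses (ignores quoting)."""
    out, depth, i = [], 0, 0
    while i < len(text):
        if text.startswith("$(", i):
            depth += 1
            i += 2
        elif depth and text[i] == "(":
            depth += 1
            i += 1
        elif depth and text[i] == ")":
            depth -= 1
            i += 1
        else:
            if depth == 0:
                out.append(text[i])
            i += 1
    return "".join(out)


def source_files(command):
    """Return the source files named by a compiler command, or [] if none."""
    try:
        words = shlex.split(strip_subshells(command))
    except ValueError:  # e.g. unbalanced quotes: treat as not understood
        return []
    if not words or not is_compiler(words[0]):
        return []
    files, skip = [], False
    for word in words[1:]:
        if skip:
            skip = False  # this word is the argument of the previous flag
        elif word in FLAGS_WITH_ARG:
            skip = True
        elif not word.startswith("-") and word.lower().endswith(SOURCE_EXTS):
            files.append(word)
    return files
```

Each returned file would become one database entry sharing the same "command", so a link step such as c++ -o example.exe example.o util.o yields nothing, and cc x.c y.c yields two entries.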
This all assumes a unix shell (e.g. sh.exe), but some makefiles are written for cmd.exe, and perhaps the tool could be instructed via an option to parse like cmd.exe instead. Output should be properly-formatted JSON, which is easy. Make output that cannot be understood, or which cannot be represented as JSON, is silently ignored.
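
For the JSON side, a small sketch of what "properly-formatted" and "silently ignored" could mean in practice. It is Python for illustration only, with hypothetical names to_json_entry and write_database, and it assumes each field arrives as raw bytes, with "cannot be represented as JSON" taken to mean "is not valid UTF-8". Python's json module does the string escaping here; a C implementation would have to escape the double quote, the backslash, and control characters itself.

```python
import json


def to_json_entry(directory, file, command):
    """Return one entry as JSON text, or None if a field is not valid UTF-8."""
    try:
        fields = {
            "directory": directory.decode("utf-8"),
            "file": file.decode("utf-8"),
            "command": command.decode("utf-8"),
        }
    except UnicodeDecodeError:
        return None  # cannot be represented as JSON: silently ignored
    # ensure_ascii=False keeps non-ASCII paths as UTF-8 rather than \uXXXX.
    return json.dumps(fields, ensure_ascii=False)


def write_database(entries):
    """Join the representable entries into one JSON array."""
    encoded = (to_json_entry(*entry) for entry in entries)
    return "[\n" + ",\n".join(e for e in encoded if e is not None) + "\n]\n"
```

Quotes inside a command, such as cc -c "a b.c", come out escaped in the "command" string, and an entry containing undecodable bytes is dropped without affecting the others.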
The usual utility requirements stand:
- Must support Windows XP.
- Must not link MSVCRT (e.g. cc -nostartfiles -o make2compdb make2compdb.c -lmemory).
- Must be a single source file, like make2compdb.c. C++ is acceptable, but mind the previous item.
- Must be dedicated to the public domain. Still put your name in it and take credit of course.
As mentioned, I expect this is a pretty easy project. Inputs and outputs are relatively small, and will fit in memory in practical use, so it should slurp the entire input before starting. That means, as far as system interfaces go, it only needs to read from standard input, write (buffered!) to standard output, and retrieve the working directory (to seed the directory stack).
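
The three system interfaces can be sketched as a tiny driver. This is Python for illustration, not the C program described above. The names main and join_continuations are hypothetical, and convert stands in for the parser as an assumed callable taking the input bytes and the working directory and returning the output bytes. Joining continuations up front is a naive simplification, since a real shell leaves a backslash-newline alone inside single quotes.

```python
import os
import sys


def join_continuations(data):
    """Remove backslash-newline pairs so a continued command is one line."""
    return data.replace(b"\\\r\n", b"").replace(b"\\\n", b"")


def main(convert, stdin=None, stdout=None):
    """Slurp standard input, run `convert`, write the result in one go."""
    stdin = stdin if stdin is not None else sys.stdin.buffer
    stdout = stdout if stdout is not None else sys.stdout.buffer
    data = join_continuations(stdin.read())  # slurp the entire input first
    cwd = os.getcwd()                        # absolute; seeds the directory stack
    stdout.write(convert(data, cwd))         # a single buffered write
    stdout.flush()
```

Passing the streams as parameters keeps the driver testable with in-memory buffers, while the defaults give the make -Bnw | make2compdb pipeline shape.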
Looking at the various tools that consume compilation databases, I do not see a reason I'd personally use any of them. So I'm not really the right person to build this tool, as I wouldn't be using it. So I'm opening it up to anyone who would like to explore the idea.