Speculative application: make2compdb #251

@skeeto

In #246 I came up with an idea to generate a JSON Compilation Database from GNU Make output. This concept has been realized before, and I found two existing, if crude, implementations. I came up with the name make2compdb, which is the name I'll use here, but I'm open to other names. In its simplest usage:

$ make -Bnw | make2compdb >compile_commands.json

The user can invoke make in whatever way is appropriate, choosing targets, overriding CFLAGS, etc. All three make options are necessary:

  • -B (--always-make): force all build commands to print regardless of current build state

  • -n (--dry-run): no build is actually desired at this time (this still invokes != and $(shell ...), which is a good thing)

  • -w (--print-directory): The "directory" key in a compilation database is required and, despite what the "specification" says, must be an absolute path. At least some LLVM tools (e.g. clang-tidy) do not support relative paths. This is highly unfortunate because it means compile_commands.json cannot be meaningfully checked into source control. -w supplies the "directory" value through its "Entering directory" lines, from which make2compdb internally maintains a stack of directories. The stack is initialized with the absolute working directory.
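
As a rough illustration of the directory stack, here is a minimal C sketch. The names (`struct dirstack`, `dirstack_init`, `dirstack_top`, `dirstack_feed`), the fixed limits, and the use of the standard C library are all assumptions made for brevity, not part of the proposal. It also assumes make's English "Entering directory" and "Leaving directory" messages with the path in single quotes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH    64    /* illustrative limits, not from the proposal */
#define MAX_PATH_LEN 1024

/* Hypothetical directory stack fed by make's -w messages. */
struct dirstack {
    char dirs[MAX_DEPTH][MAX_PATH_LEN];
    int  depth;
};

/* Seed the stack with the absolute working directory. */
static void dirstack_init(struct dirstack *s, const char *cwd)
{
    snprintf(s->dirs[0], MAX_PATH_LEN, "%s", cwd);
    s->depth = 1;
}

/* The "directory" value to use for the next database entry. */
static const char *dirstack_top(const struct dirstack *s)
{
    assert(s->depth > 0);
    return s->dirs[s->depth - 1];
}

/* Update the stack from one line of make output. Returns 1 if the line
 * was an Entering/Leaving message, 0 if it should be parsed as a command. */
static int dirstack_feed(struct dirstack *s, const char *line)
{
    static const char enter[] = ": Entering directory '";
    static const char leave[] = ": Leaving directory '";
    const char *p = strstr(line, enter);

    if (p != NULL) {
        const char *beg = p + sizeof(enter) - 1;
        const char *end = strrchr(beg, '\'');   /* closing quote */
        size_t len;
        if (end == NULL) return 0;
        len = (size_t)(end - beg);
        /* Oversized input is dropped here; a real tool would also need
         * to keep later pops balanced with the dropped push. */
        if (len >= MAX_PATH_LEN || s->depth >= MAX_DEPTH) return 1;
        memcpy(s->dirs[s->depth], beg, len);    /* copied verbatim */
        s->dirs[s->depth][len] = '\0';
        s->depth++;
        return 1;
    }
    if (strstr(line, leave) != NULL) {
        if (s->depth > 1) s->depth--;           /* never pop the seed entry */
        return 1;
    }
    return 0;
}
```

Each line of make output would be passed to `dirstack_feed` first; any line for which it returns 0 is a candidate command whose "directory" is `dirstack_top`.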

The output of make will look something like:

make: Entering directory 'C:/Users/me/src/example'
cc -c -Iinclude -o example.o example.c
c++ -c -Iinclude -o util.o util.cpp
make[1]: Entering directory 'C:/Users/me/src/example/lib'
cc -c -I../include -o lib.o lib.c
make[1]: Leaving directory 'C:/Users/me/src/example/lib'
c++ -o example.exe example.o util.o lib/lib.o
make: Leaving directory 'C:/Users/me/src/example'

From this, the output of make2compdb would be (assumed UTF-8):

[
    {
        "directory": "C:/Users/me/src/example",
        "file": "example.c",
        "command": "cc -c -Iinclude -o example.o example.c"
    }, {
        "directory": "C:/Users/me/src/example",
        "file": "util.cpp",
        "command": "c++ -c -Iinclude -o util.o util.cpp"
    }, {
        "directory": "C:/Users/me/src/example/lib",
        "file": "lib.c",
        "command": "cc -c -I../include -o lib.o lib.c"
    }
]

The program does not need to access or examine any of these files; it only transforms data from one format to another. This should be easy, but since we're bothering to write a whole program to do it, it ought to be robust too. The "directory" fields are copied verbatim from '...' in the Entering line, but the rest must be parsed the way a shell would parse it, and with some understanding of what it sees (note the distinction between "must" and "should"):

  • Must ignore lines that don't look like a compiler invocation. It should match cc, c++, gcc, g++, clang, clang++, and even fancier stuff like x86_64-w64-mingw32-gcc, which can be done by splitting on - and looking at the last element. It should match absolute paths, too, like C:/w64devkit/bin/gcc.exe. If you want to be fancy, handle cl and clang-cl, too, but specially.

  • It must handle quoting and backslashes when word splitting:

      "C:/Users/John Falstaff/w64devkit/bin/gcc.exe" -c ...
      C:/Users/John\ Falstaff/w64devkit/bin/gcc.exe -c ...
      C:/Users/'John Falstaff'/w64devkit/bin/gcc.exe -c ...
    
  • It must handle line continuations:

      gcc \
        -c ...
    

    The continuation backslash and newline should probably be omitted from "command". I don't know how well database-consuming tools handle this.

  • It should navigate around subshell expressions:

      gcc -c $(pkg-config ...) ...
    

    That is, ignore the stuff between $(...). In the case of pkg-config, a good Makefile will expand these on its own first — which is great, as the output will end up in the compilation database — and they won't appear as subshell expressions in make's output. So trying to actually interpret them is unimportant.

  • It should understand common compiler flags like -o, -I, etc., so -o foo.c will not match foo.c. It should only match known file extensions.

  • If a command contains multiple matching positional arguments, there must be a compilation database entry for each. cc x.c y.c will produce one entry for x.c and one for y.c.

  • Optional (bonus) variable expansion:

      "$CC" -c ...
    

    This is not so easy (consider variations like ${...}). Typically these are handled as make variables, and are already expanded on output, so this is low importance. Probably better to not do this at all than do it poorly.
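
To make the word-splitting and compiler-matching items above more concrete, here is a minimal C sketch under stated assumptions. The function names `split_words` and `is_compiler` are hypothetical, the standard C library is used for brevity, and only a subset of shell syntax is covered: single quotes, double quotes, backslash escapes, and backslash-newline continuations. Subshells, variables, and operators such as `&&` are not handled, nor are cl, clang-cl, or an uppercase .EXE suffix.

```c
#include <assert.h>
#include <string.h>

/* Split one logical command line into words, unquoting in place.
 * Returns the word count, or -1 on an unterminated quote or when
 * more than max words are found. */
static int split_words(char *line, char **argv, int max)
{
    char *r = line;  /* read cursor */
    char *w = line;  /* write cursor, never ahead of r */
    int argc = 0;

    assert(line != NULL && argv != NULL);
    for (;;) {
        for (;;) {   /* skip blanks and continuations between words */
            if (*r == ' ' || *r == '\t' || *r == '\n') r++;
            else if (r[0] == '\\' && r[1] == '\n') r += 2;
            else break;
        }
        if (*r == '\0') return argc;
        if (argc >= max) return -1;
        argv[argc++] = w;

        while (*r != '\0' && *r != ' ' && *r != '\t' && *r != '\n') {
            if (*r == '\'') {            /* literal up to the next ' */
                for (r++; *r != '\0' && *r != '\''; ) *w++ = *r++;
                if (*r++ == '\0') return -1;
            } else if (*r == '"') {      /* backslash escapes only " \ $ ` */
                for (r++; *r != '\0' && *r != '"'; ) {
                    if (r[0] == '\\' && r[1] == '\n') { r += 2; continue; }
                    if (r[0] == '\\' && r[1] != '\0' && strchr("\"\\$`", r[1])) r++;
                    *w++ = *r++;
                }
                if (*r++ == '\0') return -1;
            } else if (*r == '\\') {
                r++;
                if (*r == '\n') r++;     /* continuation: drop both bytes */
                else if (*r != '\0') *w++ = *r++;
            } else {
                *w++ = *r++;
            }
        }
        /* Terminate the word; w may equal r, so advance r first. */
        if (*r == '\0') { *w = '\0'; return argc; }
        r++;
        *w++ = '\0';
    }
}

/* Does the first word look like a C or C++ compiler? Strips any directory
 * and a trailing ".exe", then checks the element after the last '-'. */
static int is_compiler(const char *word)
{
    static const char *const names[] = {"cc", "c++", "gcc", "g++", "clang", "clang++"};
    char buf[64];
    const char *base = word;
    const char *last;
    size_t len, i;

    for (const char *p = word; *p != '\0'; p++)
        if (*p == '/' || *p == '\\') base = p + 1;
    len = strlen(base);
    if (len >= 4 && strcmp(base + len - 4, ".exe") == 0) len -= 4;
    if (len == 0 || len >= sizeof(buf)) return 0;
    memcpy(buf, base, len);
    buf[len] = '\0';
    last = strrchr(buf, '-');
    last = last != NULL ? last + 1 : buf;
    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
        if (strcmp(last, names[i]) == 0) return 1;
    return 0;
}
```

Since `split_words` consumes the continuation backslash and newline, a "command" value rebuilt from the resulting words would not contain them. A line would only be considered further if `is_compiler(argv[0])` returns nonzero.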

This all assumes a unix shell (e.g. sh.exe), but some makefiles are written for cmd.exe, so perhaps an option could instruct make2compdb to parse like cmd.exe instead. Output should be properly-formatted JSON, which is easy. Make output that cannot be understood, or that cannot be represented as JSON, is silently ignored.
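
For the JSON side, the main detail is string escaping. The following is a minimal C sketch, assuming the input is already valid UTF-8 (it is not validated here). The name `json_quote` and its buffer-based interface are hypothetical, and the standard C library is used only for brevity.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write src into dst as a quoted, NUL-terminated JSON string.
 * Returns the number of bytes written (excluding the NUL), or
 * (size_t)-1 if dst is too small. Bytes >= 0x80 pass through
 * unchanged on the assumption that src is valid UTF-8. */
static size_t json_quote(char *dst, size_t cap, const char *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL);
    if (cap < 3) return (size_t)-1;      /* need at least "" plus NUL */
    dst[n++] = '"';
    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        char esc[8];
        size_t len;

        if (c == '"' || c == '\\') {     /* must be backslash-escaped */
            esc[0] = '\\';
            esc[1] = (char)c;
            len = 2;
        } else if (c < 0x20) {           /* control characters */
            len = (size_t)snprintf(esc, sizeof(esc), "\\u%04x", c);
        } else {
            esc[0] = (char)c;
            len = 1;
        }
        /* Keep room for the closing quote and the NUL. */
        if (n + len + 2 > cap) return (size_t)-1;
        memcpy(dst + n, esc, len);
        n += len;
    }
    dst[n++] = '"';
    dst[n] = '\0';
    return n;
}
```

Under this sketch, each of "directory", "file", and "command" would go through `json_quote` before being written, and an entry whose fields do not fit could simply be skipped, in line with the silent-ignore rule.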

The usual utility requirements stand:

  • Must support Windows XP.

  • Must not link MSVCRT (e.g. cc -nostartfiles -o make2compdb make2compdb.c -lmemory).

  • Must be a single source file, like make2compdb.c. C++ is acceptable, but mind the previous item.

  • Must be dedicated to the public domain. Still put your name in it and take credit of course.

As mentioned, I expect this is a pretty easy project. Inputs and outputs are relatively small and will fit in memory in practical use, so it should slurp the entire input before starting. That means as far as system interfaces go, it only needs to read from standard input, write (buffered!) to standard output, and retrieve the working directory (to seed the directory stack).
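
Slurping the input could look roughly like the following C sketch. It uses stdio and malloc purely for illustration; a build that avoids the C runtime, as required above, would call the operating system's read interface directly. The name `slurp` and the initial buffer size are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read everything from in into a NUL-terminated heap buffer.
 * Stores the byte count in *len. Returns NULL on read error or
 * allocation failure. The caller frees the result. */
static char *slurp(FILE *in, size_t *len)
{
    size_t cap = (size_t)1 << 16;   /* illustrative starting size */
    size_t n = 0;
    char *buf;

    assert(in != NULL && len != NULL);
    buf = malloc(cap);
    if (buf == NULL) return NULL;
    for (;;) {
        /* Always leave one byte free for the terminating NUL. */
        n += fread(buf + n, 1, cap - 1 - n, in);
        if (n < cap - 1) break;     /* short read: EOF or error */
        if (cap > (size_t)-1 / 2) { free(buf); return NULL; }
        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) { free(buf); return NULL; }
        buf = bigger;
        cap *= 2;
    }
    if (ferror(in)) { free(buf); return NULL; }
    buf[n] = '\0';
    *len = n;
    return buf;
}
```

The program would call `slurp(stdin, &len)` once at startup and then walk the buffer line by line, joining lines that end in a backslash before parsing them.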

Looking at the various tools that consume compilation databases, I do not see a reason I'd personally use any of them. So I'm not really the right person to build this tool, as I wouldn't be using it. So I'm opening it up to anyone who would like to explore the idea.


Labels: enhancement (New feature or request)
