Emacs Projectile in a Monorepo

In monorepos, Projectile determines the project root to be the monorepo, not the subproject that you're in. In this article I update Projectile to instead prioritize the most specific project it can find.

To jump to the solution, go to Solution. Or, see the Elisp file generated from this document on GitHub (which I use in my personal Emacs config).

Two Kinds of Monorepos

I've seen two kinds of monorepos:

  1. A single git repository to hold several self-contained projects.

    In this case, the repo is basically a dumping ground for smaller projects. The motivation here is to stave off an explosion of small repositories, and to collect issues and PRs in one place where a team can review them.

  2. A single git repository with common libraries and multiple independently deployable services.

    This is what the term "monorepo" is supposed to connote - a single commit describes a working version of several services that may need to work together and share code.

Always in case 1, and sometimes in case 2, I would rather have Projectile treat a subproject as the project root. One could still use Projectile on the whole monorepo by navigating to a file at the root of the monorepo.

Example of the Problem

Consider the following monorepo:

tree -F -a /tmp/repo
#+RESULTS:
/tmp/repo/
├── .git
├── go/
│   ├── projectA/
│   │   └── go.mod
│   └── projectB/
│       └── go.mod
└── python/
    ├── projectC/
    │   └── setup.py
    └── projectD/
        └── setup.py

6 directories, 5 files

Now when I open a file in go/projectA, Projectile says that the project root is the monorepo:

(let ((default-directory "/tmp/repo/go/projectA"))
  (projectile-project-root))
#+RESULTS:
/private/tmp/repo/

(Don't worry about the /private/ prefix: macOS symlinks /tmp to /private/tmp.)

I want Projectile to say that the project root is the Go subproject: /tmp/repo/go/projectA. But alas, Projectile instead found the monorepo root: /tmp/repo/.

How does Projectile determine the project root?

About Projectile Project Detection

The relevant documentation is here: Customizing Project Detection. Projectile has a few strategies for finding a project root, and it tries each strategy until one returns a result. The order is defined by this variable:

projectile-project-root-functions
projectile-root-local
projectile-root-bottom-up
projectile-root-top-down
projectile-root-top-down-recurring

In our example, the second function – projectile-root-bottom-up – is the culprit. We can try it out interactively:

(projectile-root-bottom-up "/tmp/repo/go/projectA")
#+RESULTS:
/tmp/repo/

Yup – it found the monorepo, not the subproject. To understand why this is, let's look at the source! Here it is, slightly reformatted:

(defun projectile-root-bottom-up (dir &optional list)
  "Identify a project root in DIR by bottom-up search for
files in LIST.

If LIST is nil, use `projectile-project-root-files-bottom-up'
instead. Return the first (bottommost) matched directory or nil
if not found."
  (cl-some
   (lambda (name) (projectile-locate-dominating-file dir name))
   (or list projectile-project-root-files-bottom-up)))

In plain words, this function does the following:

  • For each marker file in projectile-project-root-files-bottom-up
    • Is the file in this directory? No?
    • Is the file in the parent directory? No?
    • Is the file in the parent-parent directory? No?
    • … etc up to /

Two important takeaways:

  1. projectile-project-root-files-bottom-up is the variable that holds the list of marker files that signify a project root.
  2. The function looks for one file all the way up to root before looking for the next file.

And what are these "marker files"?

projectile-project-root-files-bottom-up
.projectile .git .hg .fslckout _FOSSIL_ .bzr _darcs

So, assuming we're somewhere in our monorepo, Projectile starts by looking for a .projectile between here and /, then looks for a .git between here and /, finds a .git at /tmp/repo/.git, and returns /tmp/repo.

To drive this point home, say we append go.mod to that list of marker files:

(setq projectile-project-root-files-bottom-up
      '(".projectile" ".git" ".hg" ".fslckout"
        "_FOSSIL_" ".bzr" "_darcs" "go.mod"))
.projectile .git .hg .fslckout _FOSSIL_ .bzr _darcs go.mod

Projectile still won't find our Go subproject, because .git comes earlier in the list of marker files.

(projectile-root-bottom-up "/tmp/repo/go/projectA")
#+RESULTS:
/tmp/repo/

Projectile found a .git two directories up before ever even looking for a go.mod. In this case we could rectify that by prepending go.mod instead of appending it, but the general problem would still remain (i.e. that marker files several directories up could be discovered before files in the current directory).
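The ordering effect can be reproduced without Projectile at all. The following is a minimal sketch, not Projectile's own code: it swaps in Emacs's built-in locate-dominating-file for projectile-locate-dominating-file, and the name my-sketch/file-major-root and the throwaway directory layout are hypothetical, chosen only for illustration.

```elisp
(require 'cl-lib)

(defun my-sketch/file-major-root (dir markers)
  "Search upward from DIR for each of MARKERS in turn.
Each marker is looked for all the way up the tree before the next
marker is tried, so the order of MARKERS decides the result."
  (cl-some (lambda (name) (locate-dominating-file dir name))
           markers))

;; Build a throwaway monorepo: ROOT/.git and ROOT/go/projectA/go.mod.
(let* ((root (file-name-as-directory (make-temp-file "monorepo-" t)))
       (sub (expand-file-name "go/projectA/" root)))
  (make-directory sub t)
  (write-region "" nil (expand-file-name ".git" root) nil 'silent)
  (write-region "" nil (expand-file-name "go.mod" sub) nil 'silent)
  (list
   ;; .git is listed first, so the monorepo root is returned.
   (my-sketch/file-major-root sub '(".git" "go.mod"))
   ;; go.mod is listed first, so the subproject is returned.
   (my-sketch/file-major-root sub '("go.mod" ".git"))))
```

Evaluating the let* form should yield two different directories for the same starting point, which is exactly the sensitivity to list order described above.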

This behavior of projectile-root-bottom-up is useful in one situation: when you want to force a project root to a higher level, despite the presence of marker files in the current directory.

For example: if we wanted /tmp/ to be the project root for some reason, we could just put a .projectile file there.

touch /tmp/.projectile
(let ((default-directory "/tmp/repo/go/projectA"))
  (projectile-project-root))
#+RESULTS:
/private/tmp/

That's useful, and works! Ok, we should retain that behavior. So what exactly are the problems we need to fix now?

Project Detection Issues

Two things prevent Projectile from discovering subprojects in our monorepo:

  1. Marker Files: The variable projectile-project-root-files-bottom-up doesn't have go.mod or setup.py in it.
  2. Lookup Behavior: Even if Projectile knows to look for setup.py and go.mod, the contents of directories above our project affect lookup. Projectile might find some other, totally unrelated file at a higher level before even looking for go.mod or setup.py.

Let's examine each problem.

Marker Files

The problem was:

The variable projectile-project-root-files-bottom-up doesn't have go.mod or setup.py in it.

We just need to add setup.py and go.mod to the list of marker files. While we're at it, let's add every other filename that indicates a project root. Projectile already has a variable for this, documented in File markers:

projectile-project-root-files
dune-project pubspec.yaml info.rkt Cargo.toml
stack.yaml DESCRIPTION Eldev Cask
shard.yml Gemfile .bloop deps.edn
build.boot project.clj build.sc build.sbt
application.properties gradlew build.gradle pom.xml
poetry.lock Pipfile tox.ini setup.py
requirements.txt manage.py angular.json package.json
gulpfile.js Gruntfile.js mix.exs rebar.config
composer.json CMakeLists.txt Makefile debian/control
WORKSPACE flake.nix default.nix meson.build
SConstruct GTAGS TAGS configure.ac

Decent start, but it doesn't have go.mod, so we should add that. Might as well also add all the files in projectile-project-root-files-bottom-up (which has .git, etc).

(defvar my-project-root-files
  (-concat
   '("go.mod")
   projectile-project-root-files-bottom-up
   projectile-project-root-files))

That creates a pretty complete list of marker files that can indicate project roots.

Lookup Behavior

The problem was:

Even if Projectile knows to look for setup.py and go.mod, the contents of directories above our project affect lookup. Projectile can find marker files at a higher level before looking for go.mod or setup.py.

Instead, we want to look for every marker file in the current directory before continuing to a parent directory.

Instead of looping over the marker files and running locate-dominating-file on each filename, we should loop over the directories (starting from the bottom) and check whether any marker file is in that directory.

The f library already has a perfect function for that: f-traverse-upwards.
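As an aside, the same directory-by-directory idea can be sketched with built-ins only, since locate-dominating-file also accepts a predicate that is called with each candidate directory. This is an illustrative alternative under that assumption, not the approach used in the Solution below; the name my-sketch/directory-major-root is hypothetical.

```elisp
(require 'cl-lib)

(defun my-sketch/directory-major-root (dir markers)
  "Return the lowest directory at or above DIR containing any of MARKERS.
MARKERS is a list of file names.  Every marker is checked in one
directory before the search moves up to the parent directory."
  (locate-dominating-file
   dir
   (lambda (candidate)
     ;; Non-nil as soon as any marker exists in CANDIDATE.
     (cl-some (lambda (name)
                (file-exists-p (expand-file-name name candidate)))
              markers))))

;; Hypothetical usage against the example monorepo:
;; (my-sketch/directory-major-root "/tmp/repo/go/projectA"
;;                                 '(".git" "go.mod"))
```

With this lookup order, the position of .git in the list no longer matters: a go.mod in the starting directory is found before any parent directory is inspected.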

Solution

(require 'cl-lib)
(require 'dash)
(require 'projectile)
(require 'f)

Here is the Elisp file generated from this section (which I use in my personal Emacs config).

We will define a new strategy for discovering projects. First, define a variable with the marker files that indicate a project root.

(defvar my-project-root-files
  (-concat
   '("go.mod")
   projectile-project-root-files-bottom-up
   projectile-project-root-files))

Define a new discovery function. In keeping with how existing Projectile project discovery functions work, allow elements to be predicates.

(defun any-file-exists? (files dir)
  "True if any of FILES exist in DIR.
FILES is a list of file names and/or predicates.

An element of FILES can also be a predicate taking one
argument (a directory) and returning a non-nil value if that
directory is the one we're looking for.

DIR is a path to a directory."
  (cl-some
   (lambda (name)
     (if (stringp name)
         (f-exists? (f-expand name dir))
       (funcall name dir)))
   files))

(defun my/projectile-root-bottom-up (dir &optional list)
  "Identify a project root.
Perform a bottom-up search for files in LIST starting from DIR.
Always return the lowest directory that has any file in LIST. If
LIST is nil, use `my-project-root-files' instead. Return the
first (bottommost) matched directory or nil."
  (let ((marker-files (or list my-project-root-files)))
    (f--traverse-upwards
     (any-file-exists? marker-files it) dir)))

Insert our new lookup function into projectile-project-root-functions.

;; Add my new function into Projectile's hierarchy of project
;; discovery functions.
(setq projectile-project-root-functions
      '(projectile-root-local
        projectile-root-bottom-up
        my/projectile-root-bottom-up  ;;  New function
        projectile-root-top-down
        projectile-root-top-down-recurring))
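Hard-coding the whole list overwrites whatever value projectile-project-root-functions had before. If you would rather splice the new function into the existing value, a small list helper is one option. This is only a sketch: my-sketch/insert-after is a hypothetical helper, not part of Projectile, dash, or f.

```elisp
(require 'cl-lib)

(defun my-sketch/insert-after (anchor new list)
  "Return LIST with NEW inserted right after ANCHOR.
If NEW is already in LIST, return LIST unchanged.  If ANCHOR is
not in LIST, append NEW at the end."
  (cond
   ((memq new list) list)
   ((memq anchor list)
    (cl-loop for item in list
             collect item
             when (eq item anchor) collect new))
   (t (append list (list new)))))

;; Hypothetical usage, once Projectile and the new function are loaded:
;; (setq projectile-project-root-functions
;;       (my-sketch/insert-after 'projectile-root-bottom-up
;;                               'my/projectile-root-bottom-up
;;                               projectile-project-root-functions))
```

The helper is idempotent, so re-evaluating the configuration does not add the function twice.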

Restrict Projectile's original bottom-up discovery to only work for .projectile files. This allows us to force a project root to a higher level by creating a .projectile file in a parent directory.

;; Only allow a .projectile file to force project
;; roots to higher levels.
(setq projectile-project-root-files-bottom-up
      '(".projectile"))

Test

After applying the functions above, let's see what Projectile says the project roots are for our monorepo.

Go

Can we correctly identify a Go project?

(let ((default-directory "/tmp/repo/go/projectA"))
  (projectile-project-root))
#+RESULTS:
/private/tmp/repo/go/projectA

✔ Works!

Python

Can we correctly identify a Python project?

(let ((default-directory "/tmp/repo/python/projectC"))
  (projectile-project-root))
#+RESULTS:
/private/tmp/repo/python/projectC

✔ Works!

Force Root Higher

Can we force the project root to a higher level by creating a .projectile file?

touch /tmp/repo/.projectile
(let ((default-directory "/tmp/repo/go/projectA"))
  (projectile-project-root))
#+RESULTS:
/private/tmp/repo/

✔ Works!