Ad-hoc static code analysis in Go

10 minute read

Michele Fiordispina

We recently discovered that we were using the wrong database pointers within database transactions. The issue was scattered throughout our Go codebase.

In this blog post, I'll show how we used a custom static code analyser to improve the quality of our production code.

How do we handle database transactions?

We use the GORM library to handle database operations in our Go code.

Regarding transactions, similar to what the documentation describes, GORM provides some function receivers to begin, roll back and commit transactions.

Here's an excerpt of documentation showing one of the possible uses of these receivers:

func CreateAnimals(db *gorm.DB) error {
	// Note the use of tx as the database handle
	// once you are within a transaction
	tx := db.Begin()

	if err := tx.Error; err != nil {
		return err
	}

	if err := tx.Create(&Animal{Name: "Giraffe"}).Error; err != nil {
		tx.Rollback()
		return err
	}

	if err := tx.Create(&Animal{Name: "Lion"}).Error; err != nil {
		tx.Rollback()
		return err
	}

	return tx.Commit().Error
}

In our case, we simplified this a bit further by having a few helper functions perform the transaction plumbing when needed:


database.RetryingTransaction(db, func(tx *gorm.DB) error {
	// code goes here
})
RetryingTransaction will begin the SQL transaction, execute the runner function and commit or roll back said transaction depending on the result. It also handles transaction retries; these are useful for handling transient errors.

As you might have noticed, the GORM documentation tells us to use the tx pointer rather than the db one.

Why is that? What could go wrong?

Why is this important?

Let's have a look at why this is important first. Then, we can start working on how to enforce this.

Issuing an INSERT or UPDATE query opens a connection to the DBMS. Some table or row locking might occur; every other database connection trying to acquire a lock on the same resource will have to wait until said resource is freed or unlocked.

Similarly, when we initiate a transaction with db.Begin(), we create a new connection to the database, with the same locking mechanisms at play.

What happens if we use a different pointer within the transaction? We are actively creating a new connection to the DBMS. A few dangerous scenarios might occur:

  • A race could happen: we are creating multiple connections to the DB trying to access the same resource. If the non-transactional database connection issues a SELECT ... FOR UPDATE, the rest of the code most likely won't run as the execution gets stuck. A different connection has locked the row we want to edit, thus resulting in a deadlock.
database.RetryingTransaction(db, func(tx *gorm.DB) error {
	user := users.GetUserForUpdate(tx, userID) // Locks the row

	user.LastLogin = time.Now()

	return user.Save(db) // Using the wrong db pointer here. Deadlock
})
  • Rolling back a transaction might not occur: the code might create some resources using the non-transactional database pointer db and then fail for any other reason. At this point, the rollback should revert all created/edited resources. However, that won't happen for this resource, as it is technically not part of the transaction.
database.RetryingTransaction(db, func(tx *gorm.DB) error {
	event := events.CreateEvent(db, EventType) // Saved an event

	user := users.GetUser(tx, userID)
	user.lastEvent = event

	// if this errors, the event won't get rolled back
	// and retrying multiple times will make it worse
	return user.Save(tx)
})

These are just a few things that can happen. They won't happen frequently, but we risk leaving the database in an incoherent state.


We wrote a custom go vet program to catch these improper uses of GORM before they reach our production environment.

What is go-vet

go vet is the static code analysis tool that ships with the Go toolchain.

It is available as part of Go and provides a number of checks to ensure good code quality. Here are a few of the built-in analysers, as an example:

assign: check for useless assignments
atomic: check for common mistakes using the sync/atomic package
bools: check for common mistakes involving boolean operators
httpresponse: check for mistakes using HTTP responses
loopclosure: check references to loop variables from within nested functions
lostcancel: check cancel func returned by context.WithCancel is called
printf: check consistency of Printf format strings and arguments
unreachable: check for unreachable code
unusedresult: check for unused results of calls to some functions

You can use this analyser by pointing it to your project's folder like so:

go vet my/project/...

There are options to select a subset of these checks or to emit machine-readable output; for example, passing -printf runs only the printf analyser, while -json switches the output to JSON.

Custom vet tool with custom analysers

The most interesting thing about this tool is that it allows users to write a custom vet program with an arbitrary number of custom analysers.

The multichecker package (available at golang.org/x/tools/go/analysis/multichecker) is a self-contained solution to this problem.

package main

import (
	"golang.org/x/tools/go/analysis/multichecker"
)

func main() {
	multichecker.Main(
		EnsureDBTransaction,
		// any other analysers
	)
}
The variadic multichecker.Main function takes an arbitrary number of *analysis.Analyzer values and turns them into a tool that validates the code you feed it. It also provides command-line help and option handling out-of-the-box; the resulting binary can be run directly or handed to go vet through its -vettool flag.

Let's create one that captures these non-transactional pointer pitfalls.

Abstract Syntax Tree

Before embarking on static code analysis, a brief mention of abstract syntax trees is due. However, we won't describe them in detail as it is out of the scope of this post.

We can see an AST as a tree structure that describes a collection of connected statements that define a program. Tools generate them when parsing the source code, often after the lexical analysis step.

Given this code that returns the absolute value of an integer:

if num < 0 {
	return -num
}
return num

the resulting syntax tree would look like this:

This kind of structure is preferred when we want to analyse and/or automatically modify a program.

Software optimisation is a good example of such modifications: a compiler is smart enough to figure out ways to optimise our code; it will make changes to the AST, rather than the code directly. It's also much easier and safer for a machine to navigate a data structure instead of reading text (aka, the source code).

These vet tools are internally using this tree to navigate the code.


Along with multichecker, the Go team gives us access to the tools used to create and navigate these ASTs. These are available in the standard library's go/ast package and in the golang.org/x/tools/go/analysis package.

To create a custom analyser, you need to define a variable of type *analysis.Analyzer. That will hold the definition and logic of the check that will scan our code.

import (
	"golang.org/x/tools/go/analysis"
)

const Doc = `find improper db pointer uses in transactional db calls

The ensuredbtransaction analysis reports incorrect uses of the
non-transactional db pointer within a transaction`

var EnsureDBTransaction = &analysis.Analyzer{
	Name: "ensuredbtransaction",
	Doc:  Doc,
	Run:  run,
}

The run function is the core of the analysis. We won't show it in full, but we'll provide and explain a few snippets.

It starts by fetching all function calls within our project,

import (
	"go/ast"
	"golang.org/x/tools/go/analysis"
)

func run(pass *analysis.Pass) (interface{}, error) {
	for _, f := range pass.Files {
		ast.Inspect(f, func(n ast.Node) bool {
			if funcCall, ok := n.(*ast.CallExpr); ok {
				return checkFunctionCall(pass, funcCall)
			}
			return true
		})
	}
	return nil, nil
}

then it filters out those we don't need to scan by checking their function names, number of parameters and types. Ultimately, it will look inside these function bodies for improper pointer usage by comparing the name of the selector expressions.

// This function will actually check the code
// within our `database.RetryingTransaction`
func ensureDBTransaction(pass *analysis.Pass, funcCall *ast.CallExpr, funcIdent *ast.Ident) bool {
	// The function RetryingTransaction has only 2 parameters
	if len(funcCall.Args) != 2 {
		return false
	}

	// Make sure the first parameter is the dbPointer selector
	// and the second one is the literal function.
	// Function identifiers won't be a problem here as
	// they will use the correct pointer anyway
	dbPointer, ok := funcCall.Args[0].(*ast.SelectorExpr)
	if !ok {
		return false
	}

	txFunction, ok := funcCall.Args[1].(*ast.FuncLit)
	if !ok {
		return false
	}

	// For every statement within the tx function, report all the
	// selector uses that match the name of the DB pointer
	for _, stmt := range txFunction.Body.List {
		ast.Inspect(stmt, func(n ast.Node) bool {
			selector, ok := n.(*ast.SelectorExpr)
			if ok && selector.Sel.Name == dbPointer.Sel.Name {
				pass.Report(analysis.Diagnostic{
					Pos: selector.Pos(),
					Message: fmt.Sprintf("use of non-transactional db pointer within %s",
						funcIdent.Name),
				})
			}
			return true
		})
	}
	return true
}

That is enough to cover the case we are trying to detect.


Immediately after creating it, we ran this custom go vet over our Go codebase.

A quick test on one of our projects shows we have a few issues to fix:

svc/cleanup.go:19:32: use of non-transactional db pointer within RetryingTransaction
svc/cleanup.go:23:35: use of non-transactional db pointer within RetryingTransaction
svc/cleanup.go:27:36: use of non-transactional db pointer within RetryingTransaction
svc/handler.go:72:32: use of non-transactional db pointer within RetryingTransaction
svc/handler.go:80:25: use of non-transactional db pointer within RetryingTransaction
svc/handler.go:90:49: use of non-transactional db pointer within RetryingTransaction

This result shows that it's an error that's easy to make and also easy to miss in code reviews.

Thanks to this tool, we are making our code more robust.

A more proactive approach

It would be nice to avoid this in the first place, so we added a step to our build process that runs this tool. This way, we ensure we use the correct database transaction pointer every time.

Not long after this addition to our CI, we could see that this solution was proving useful.


We might develop new analysers in the future, for example, one to enforce specific code styles in tests.

Another aspect we could expand upon is go vet's ability to automatically fix faulty code found by analysers, provided the analyser knows how to suggest a fix.


It's refreshing to see that we can use the same powerful tools that are core to the Go language to improve code quality with minimal effort.

There are also plenty of examples provided by the Go team; those proved valuable when we started this experiment.

That's all I have for now, 👋