C# 8.0 nullable references: getting started in an existing codebase
In an earlier post in this series, "non-nullable is the new default", I described how enabling C# 8.0's new nullable references feature changes a fundamental element of the language.
Variables, properties, parameters, and fields of reference type become non-nullable by default. We've been allowed to put nulls into these things for around two decades, so this is a dramatic change.
In this post I'll describe various C# 8.0 features that can help ease the transition into a nullable-aware world.
C# 8.0 smooths the adoption of nullable references by making it possible to enable the feature gradually. It doesn't have to be an all-or-nothing choice. This is very different from the async/await feature added in C# 5.0, which had a tendency to spread: asynchronous operations effectively oblige their callers to be async, which means their callers' callers must be async, and so on all the way to the top of the stack.
Fortunately, nullable references are not like this: it is possible to adopt them selectively and gradually. You can go one file at a time, or even line by line if necessary.
The most important transition-enabling aspect of nullable references is that they are off by default. If they weren't, the chances are most developers would opt out of using C# 8.0, because this change would cause warnings in more or less any codebase. But this also means that the perceived barrier to entry is pretty high: if this feature makes such a drastic change that it's off by default, you might think that you may as well leave it off, and that it will never be worth the pain of switching it on.
But that would be a shame, because it is a valuable feature: it can help you find bugs in your code before your users do.
So if you're contemplating using nullable references, the next most important thing to know is that you can enable the feature incrementally.
Warnings only
The most coarse-grained level of control beyond a simple project-wide on/off is that you can enable warnings independently from annotations. For example, if I fully enable nullability for the Corvus.ContentHandling.Json project in our Corvus.ContentHandling repo by adding <Nullable>enable</Nullable> to a property group in the project file, then in its current state I instantly get 20 warnings from the compiler.
However, if instead I use <Nullable>warnings</Nullable>, I end up with just a single warning. It's going to take much less work to get back to 0 warnings in that second case.
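For reference, this is roughly where the setting sits in the project file. It is a fragment rather than a complete project, and it assumes the project already compiles as C# 8.0 or later:

```xml
<PropertyGroup>
  <!-- 'enable' turns on both annotations and warnings; -->
  <!-- 'warnings' turns on the warnings but leaves annotations disabled. -->
  <Nullable>warnings</Nullable>
</PropertyGroup>
```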
But hold on a second! Why am I seeing fewer warnings? After all, the one thing I asked for here was warnings. The somewhat cryptic answer is that some variables and expressions can be null-oblivious.
Null obliviousness
In an earlier post in this series on inferred (non-)nullness I described how C# maintains two notions of nullability.
First, any variable of reference type can be declared as nullable or not, and second the compiler will, where it can, infer whether that variable could or could not be null at any particular point in the code. In this article, I'm looking purely at the first kind of nullability: the static type of the variable. (And in fact it's not just variables, nor their obvious relatives such as parameters and fields; both static and inferred nullability are determined for every expression in C#.)
The first kind of nullability, the kind we're looking at now, is effectively an extension of the type system.
But it turns out that even if we narrow down our focus just to the nullability of a type, things are not quite as straightforward as you might imagine. It is not a simple case of "nullable" vs "not nullable". There are in fact two more possibilities. There is "unknown", a category that needs to exist because of generics: if you have an unconstrained type parameter, it's not possible to know anything about its nullability: code using the relevant generic method or type could plug in either a nullable or a non-nullable type argument.
It is possible to add constraints, but in many cases such constraints are undesirable because they limit the applicability of the generic type or method. So variables or expressions of some unconstrained type parameter T are deemed to have unknown nullability: they might have some particular nullability disposition in any particular instance, but we can't know what that is in our generic code, because it will depend on the type argument.
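A small sketch may make the "unknown" category more concrete. The Sequences class and its FirstOrFallback method are hypothetical names invented for this illustration:

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical helper, used only to illustrate an unconstrained type parameter.
public static class Sequences
{
    // T has no constraints, so inside this method the compiler cannot know
    // whether a T value may be null: that depends on the type argument
    // supplied at each call site.
    public static T FirstOrFallback<T>(IEnumerable<T> items, T fallback)
    {
        foreach (T item in items)
        {
            return item;
        }

        return fallback;
    }
}
```

A caller could write Sequences.FirstOrFallback<string>(...) or Sequences.FirstOrFallback<string?>(...), and the same generic code has to serve both.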
The final category is "oblivious". This is the name for how things used to work before C# 8.0 came along, and how they continue to work if you do not enable nullable references. (This is essentially an act of retcon. Even though the idea of null-obliviousness was newly introduced in C# 8.0, by deeming it to be the natural state of all pre-nullable-reference code, the designers of C# have asserted that C# has in fact never been obliviousnessless.)
Arguably I don't need to explain what "oblivious" means, because it's how C# always used to work, so you already know. However, that's probably cheating. So here goes: in a nullable-aware world, the most important characteristic of null-oblivious expressions is that they don't cause nullability warnings.
You can assign a null-oblivious expression into either a nullable or a non-nullable variable. You can assign expressions inferred to be either "maybe null" or "not null" into a variable (or property or field etc.) that is null-oblivious.
This is why enabling warnings alone does not produce many new warnings. All of the code remains in a disabled nullable annotation context, so all of the variables, parameters, fields, and properties will be null-oblivious, meaning that there will be no warnings associated with attempts to use them in conjunction with anything that is null-aware.
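Here is a minimal sketch of that behaviour, using #nullable directives to put two classes in one file into different contexts. The Legacy and Consumer names are hypothetical:

```csharp
#nullable disable
// Declared in a disabled annotation context, so the return type of
// GetName is null-oblivious.
public static class Legacy
{
    public static string GetName() => "legacy";
}

#nullable enable
// Fully nullable-aware code consuming the null-oblivious method.
public static class Consumer
{
    public static int Use()
    {
        string nonNullable = Legacy.GetName(); // no warning
        string? nullable = Legacy.GetName();   // no warning either
        return nonNullable.Length + (nullable?.Length ?? 0);
    }
}
```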
So why do I get any new warnings at all? One common reason might be that I'm attempting to connect two nullable-aware pieces of code together in a way that is illegal. For example, suppose I have a library in which I've fully enabled the nullable references feature, and that it contains this deeply contrived class:
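A minimal sketch of such a class, consistent with the description that follows. GetNullable and RequireNonNull are the method names this post refers to; the class name NullableAwareClass and the method bodies are assumptions made for illustration:

```csharp
#nullable enable
using System;

// Hypothetical class name and method bodies.
public static class NullableAwareClass
{
    // The string? return type tells callers the result may be null.
    public static string? GetNullable() =>
        DateTime.Now.Hour >= 12 ? null : "morning";

    // The non-nullable parameter tells callers not to pass null.
    public static int RequireNonNull(string s) => s.Length;
}
```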
Next, in a different project I might write this code in an enabled nullable warning context, but a disabled nullable annotation context:
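Something along these lines would fit. To keep the sketch compilable as a single file, it repeats a stand-in for the library class and uses #nullable directives to simulate the second project's settings; the Wrapper and UseString names are hypothetical:

```csharp
#nullable enable
using System;

// Stand-in for the fully nullable-aware library (hypothetical name and bodies).
public static class NullableAwareClass
{
    public static string? GetNullable() =>
        DateTime.Now.Hour >= 12 ? null : "morning";

    public static int RequireNonNull(string s) => s.Length;
}

// Simulates the second project: warnings stay enabled, annotations are disabled.
#nullable disable annotations
public static class Wrapper
{
    // x is declared in a disabled annotation context, so it is null-oblivious
    // and passing it to RequireNonNull produces no warning.
    public static int UseString(string x) => NullableAwareClass.RequireNonNull(x);
}
```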
Because nullability annotations are disabled, the x parameter here is null-oblivious. That means that the compiler can't know if this code is right or wrong. If the compiler were to raise warnings when null-oblivious expressions mingle with null-aware ones, a high proportion of those warnings would be spurious, so it does not raise a warning.
With this wrapper I've effectively made the nullable-awareness invisible. It means I can now write this:
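As a sketch, with the same hypothetical names as before (the library class is repeated so that the snippet compiles on its own):

```csharp
#nullable enable
using System;

// Stand-in for the fully nullable-aware library (hypothetical name and bodies).
public static class NullableAwareClass
{
    public static string? GetNullable() =>
        DateTime.Now.Hour >= 12 ? null : "morning";

    public static int RequireNonNull(string s) => s.Length;
}

// Warnings enabled, annotations disabled.
#nullable disable annotations
public static class Caller
{
    // Null-oblivious wrapper around RequireNonNull.
    private static int UseString(string x) => NullableAwareClass.RequireNonNull(x);

    // Compiles without a nullability warning, although GetNullable may return
    // null, in which case this throws a NullReferenceException at run time.
    public static int Run() => UseString(NullableAwareClass.GetNullable());
}
```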
The compiler knows that GetNullable may return null, but because I've called a method with a null-oblivious parameter, it can't know whether that's right or wrong. By going via a null-oblivious wrapper, I've defeated the compiler's ability to detect a problem. However, if I were to combine these two methods directly, it's different:
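A sketch of the direct combination, again with the hypothetical NullableAwareClass repeated so the snippet stands alone:

```csharp
#nullable enable
using System;

// Stand-in for the fully nullable-aware library (hypothetical name and bodies).
public static class NullableAwareClass
{
    public static string? GetNullable() =>
        DateTime.Now.Hour >= 12 ? null : "morning";

    public static int RequireNonNull(string s) => s.Length;
}

// Warnings enabled, annotations disabled.
#nullable disable annotations
public static class DirectCaller
{
    // Warning CS8604 (possible null reference argument): both methods are
    // nullable-aware and nothing null-oblivious sits between them.
    public static int Run() =>
        NullableAwareClass.RequireNonNull(NullableAwareClass.GetNullable());
}
```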
Here, I pass the result of GetNullable directly into RequireNonNull. If I try this in an enabled nullable warning context, the compiler will generate a warning regardless of whether it is in an enabled or disabled nullable annotation context. In this particular case, the annotation context is irrelevant because there are no declarations with reference type.
If you enable nullable warnings but disable nullable annotations, any declarations will be null-oblivious, but that doesn't mean all expressions will be: the result of GetNullable is knowably nullable, so we get a warning.
To summarize, because all declarations in a disabled nullable annotation context are null-oblivious, enabling only warnings tends not to produce very many warnings because most of the expressions will be null-oblivious.
But the compiler will still be able to detect nullability errors in cases where expressions didn't go through a null-oblivious intermediary. And the most directly useful kinds of errors that this will detect are attempts to dereference possibly null values through the member access operator (.), for example by reading a property directly from the result of a method that may return null.
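A sketch of that kind of dereference, along with one possible fix. NullableAwareClass and its method body are hypothetical, as are the Lengths, Unsafe and Safe names:

```csharp
#nullable enable
using System;

// Stand-in for the fully nullable-aware library (hypothetical name and body).
public static class NullableAwareClass
{
    public static string? GetNullable() =>
        DateTime.Now.Hour >= 12 ? null : "morning";
}

// Warnings enabled, annotations disabled.
#nullable disable annotations
public static class Lengths
{
    // Warning CS8602 (dereference of a possibly null reference).
    public static int Unsafe() => NullableAwareClass.GetNullable().Length;

    // No warning: the null case is handled before the value is used.
    public static int Safe() => NullableAwareClass.GetNullable()?.Length ?? 0;
}
```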
If your code is in good shape there shouldn't be many errors of this kind. So this is a good, gentle way into nullability-awareness.
Gradually annotating your project
Once you've made the first step of just enabling warnings, you can gradually enable annotations one file at a time. A good way to approach this is to turn them on for the entire project, and see which files get warnings, and then to pick one with relatively few. Turn them back off at the project level, and then in the file you have chosen, write #nullable enable at the top.
This fully enables nullability (both warnings and annotations) for the whole file (unless you disable it again later on with another #nullable directive). You can then go through the file ensuring that anything that can reasonably be null is annotated as nullable (i.e., add a ?) and then address the warnings for this file that remain.
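As a minimal sketch, a file converted in this way might end up looking something like the following. The Customer class and its members are hypothetical:

```csharp
#nullable enable

// Hypothetical class, used only to illustrate per-file opt-in.
public class Customer
{
    public Customer(string name)
    {
        Name = name;
    }

    // Never null, so the declaration is left as a plain string.
    public string Name { get; }

    // Can reasonably be null, so it is annotated with ?.
    public string? Nickname { get; set; }

    // The ?? operator handles the null case, so no warning arises here.
    public string DisplayName => Nickname ?? Name;
}
```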
You might find that adding any necessary annotations is enough to remove all warnings. Conversely, you might find that having annotated one file for nullability, you now see some additional warnings in other files that use it. Typically there won't be many and it won't take long to fix them. But if for some reason this step means you're suddenly drowning in warnings you have a couple of choices.
You could just reverse your decision to tackle this particular file and pick a different one. Alternatively, you could selectively disable annotations for whichever member or members seem to be causing the most trouble. (You can use the #nullable directive as many times as you like, making it possible to control nullability settings on a line-by-line basis if you really want.) You might find that if you return to these later, having fully enabled nullability in more of the rest of the project, that you see fewer warnings than you did at first.
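Here is a sketch of disabling annotations for a single troublesome member. The Catalog class and its members are hypothetical:

```csharp
#nullable enable

// Hypothetical class, used only to illustrate per-member control.
public class Catalog
{
    // Nullable-aware member: the return type says a miss yields null.
    public string? FindTitle(int id) => id == 1 ? "First" : null;

#nullable disable annotations
    // Annotations are off for this member only, so its signature is
    // null-oblivious and it causes no nullability warnings for now.
    public string LegacyTitle(int id) => id == 1 ? "First" : null;
#nullable enable annotations
    // Using 'enable' rather than 'restore' here, because 'restore' reverts to
    // the project-level setting, which in this scenario has annotations off.

    public string TitleOrDefault(int id) => FindTitle(id) ?? "(untitled)";
}
```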
You can gradually work through your project one file at a time. If you repeat the step of enabling annotations temporarily at the project level in order to decide which file to tackle next, you may eventually find you reach a point where few enough problems remain that you're ready to switch over completely, at which point you can remove any per-file #nullable directives.
There are certain cases where it won't be as straightforward as this. In particular, certain serialization scenarios (e.g., using Json.NET or Entity Framework) can be trickier to deal with. I've got an article coming up later on that though, so don't worry, it's probably not as bad as it looks.
Nullable references improve the expressiveness of your code and increase the chances of the compiler detecting mistakes before your users run into them, so it's good to enable this feature if you can. And by enabling it selectively, you may be able to reap the benefits sooner.