Code generation in .NET with Roslyn (part 3)
12 July 2012
This post was originally published on the Softwire blog
This post is the third in a series on code generation on the .NET platform. In this post, we will look at how to package a code generator as a Visual Studio extension, and how best to share a single library between multiple extensions.
Creating a single file code generator
Visual Studio allows you to hook into code generators from within the IDE by associating files in your project with a ‘Custom tool’ that produces another file based on the first, which will automatically be included in the project as well. The original file might be a class file, XML (e.g. a dataset definition), or any other kind of project file. The output file will typically be some source code, and might define a new class, or contain a partial class definition for the same class as in the original file.
Creating a single file code generator actually involves quite a lot of boilerplate, including various COM interop assemblies, several poorly documented interfaces, and some tedious steps to add your extension to the registry so Visual Studio will pick it up. Fortunately, there’s an MSDN sample that covers all of this for you and includes an example code generator for you to crib from if you get stuck. The base classes implement all the necessary boilerplate and just leave you to override a single, fairly self-explanatory, abstract method:
protected abstract byte[] GenerateCode(string inputFileContent);
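As a rough illustration of what overriding that method can look like, here is a minimal sketch. The base class below is only a stand-in for the sample's real base classes (which need the Visual Studio interop assemblies to compile), and LineCountGenerator, InputInfo and the public Generate wrapper are hypothetical names of my own. I'm also assuming the returned bytes are simply the UTF-8 encoded text of the output file.

```csharp
using System;
using System.Text;

// Stand-in for the sample's base class, reduced to the one abstract method
// (a hypothetical simplification so this sketch compiles on its own).
public abstract class CodeGeneratorBase
{
    protected abstract byte[] GenerateCode(string inputFileContent);

    // Public entry point so the sketch can be run without Visual Studio.
    public string Generate(string inputFileContent)
    {
        return Encoding.UTF8.GetString(GenerateCode(inputFileContent));
    }
}

// Hypothetical generator: emits a class recording how many lines the input had.
public sealed class LineCountGenerator : CodeGeneratorBase
{
    protected override byte[] GenerateCode(string inputFileContent)
    {
        int lines = inputFileContent.Split('\n').Length;
        string output =
            "// <auto-generated />\n" +
            "public static class InputInfo\n" +
            "{\n" +
            "    public const int LineCount = " + lines + ";\n" +
            "}\n";
        // Assumption: the output file's contents are returned as encoded text.
        return Encoding.UTF8.GetBytes(output);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.Write(new LineCountGenerator().Generate("first\nsecond"));
    }
}
```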
This is all very nice, apart from the fact that the documentation says “To build and execute the sample, press F5 after the sample is loaded. This will launch the experimental hive which will demonstrate the sample’s function.” (The words “experimental hive” just mean that it makes a new place in the registry for all the settings used by the test instance of Visual Studio, so you can muck about with installing extensions without having any wider effect.) This didn’t work at all for me, and despite finding a blog post claiming to get it working in Visual Studio 2010, I couldn’t get to a point where I could develop my extension and just hit F5 to debug it in Visual Studio. Trying to get it to work involved digging into a lot of the work the sample is supposed to abstract away in the first place, and ended up being a bit of a time sink for me.
However, it’s not actually that important to be able to debug your extension in Visual Studio. In fact, if you’re just writing a code generator it’s pretty pointless. Every time you debug, you would have to create or open a project in the ‘debuggee’ Visual Studio instance and set your code generator as the custom tool on a file. It’s much quicker to just write a unit test for your code generator with a suitable input string (either read in from a file or just specified as a constant in the unit test code), and debug via this test instead. Once you’re happy with the behaviour of your code generator under test, it’s very easy to install your extension in Visual Studio to check that it still works (just double-click the .vsix file in your output directory and open a new instance of VS).
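To make the testing approach concrete, here is a small sketch of the kind of test I mean. Since GenerateCode is protected, one option (an assumption about design on my part, not something the sample requires) is to keep the generation logic in a plain static method that the override merely encodes as bytes, so a test can call it directly. GreetingGenerator and its output are hypothetical, and the hand-rolled check stands in for whatever assertion your unit testing framework provides.

```csharp
using System;

// Hypothetical generation logic, kept in a plain static method so a test can
// call it directly; a GenerateCode override would only need to encode the result.
public static class GreetingGenerator
{
    public static string Generate(string inputFileContent)
    {
        string name = inputFileContent.Trim();
        return "public static class Greeting\n" +
               "{\n" +
               "    public const string Text = \"Hello, " + name + "\";\n" +
               "}\n";
    }
}

public static class GreetingGeneratorTests
{
    // The input is specified as a constant in the test code.
    private const string Input = "World\n";

    public static void GeneratesConstantFromInput()
    {
        string output = GreetingGenerator.Generate(Input);

        // Hand-rolled assertion; a real test project would use its framework's Assert.
        if (!output.Contains("public const string Text = \"Hello, World\";"))
        {
            throw new Exception("Unexpected output:\n" + output);
        }
    }

    public static void Main()
    {
        GeneratesConstantFromInput();
        Console.WriteLine("Test passed");
    }
}
```

Setting a breakpoint inside Generate and debugging the test gives you the same view of the generator's behaviour as debugging it inside Visual Studio would, without the round trip through a second IDE instance.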
Sharing a library between extensions
Having created a Roslyn-based code generator, I wanted to create a shared library of any code that wasn’t specific to my code generation scenario but could be used by any Roslyn-based code generator. This would include some of the extension methods I’d written, and a base class for abstracting away some of the more general communication with Roslyn.
For VS 2010, extensions are packaged into VSIX files: zip packages described by a fairly neatly specified XML manifest, and the format used by the nice Extensions ‘gallery’ introduced in the latest version of Visual Studio. Unfortunately, packaging an extension containing multiple assemblies doesn’t ‘just work’ as one might hope it would. I also experimented with creating a ‘base extension’, but this was rather more complex and wasn’t necessary for my purposes. There’s a fairly thorough VSIX Best Practices blog post explaining the various approaches and when you should use each. It mentions packaging multiple assemblies into one extension, but doesn’t quite go into detail on how to achieve this.
It turns out that packaging multiple assemblies into a Visual Studio extension is actually very simple if you edit the VSIX manifest directly, but the Visual Studio ‘Designer’ for this file (as is the case with many Visual Studio ‘Designer’ tools) is a bit half-baked and doesn’t understand the complete spec of the file type it’s supposed to edit. The designer does, however, helpfully support some much more esoteric options that are explicitly ruled out by the official source on best practices (see the ‘Things to Avoid’ section in the blog post mentioned above).
You can find the final VSIX manifest I created, along with the rest of the source for this blog series, in my RoslynGenerators repository on GitHub. This repository includes a project template and a supporting library for creating your own Roslyn-based single file code generators as Visual Studio extensions. The repository also includes an example generator for creating an asynchronous version of a WCF service interface.
Postscript: Not forgetting…
While this series was primarily an explanation of what’s possible with Roslyn, I couldn’t conclude without mentioning a few other existing tools and libraries.
Existing third-party libraries
It’s worth noting some excellent third-party libraries that cover much of the same functionality as promised by Roslyn. In particular, NRefactory is a C#/VB parser (developed for the third-party SharpDevelop IDE) that does almost exactly what’s described in the section “Introducing Roslyn” in the first post of this series, while the Mono project’s CSharp namespace covers some of the other features available in Roslyn (e.g. compiler-as-a-service, and a REPL).
T4 is the Text Template Transformation Toolkit, which is part of Visual Studio and therefore the standard tool for code generation in .NET. If you’re already familiar with T4, you might have been wondering why it hasn’t come up in this series so far. The reason is that T4 is much more than just another code generation process. It’s a very general purpose tool for generating code (or indeed any text-based output). T4 templates can call arbitrary assemblies, so could make use of any of the options discussed in the first post of this series (although T4 naturally pushes you towards using plain text for your output model).
In fact, I could have written a T4 template that made use of Roslyn for code generation, but I went with writing a Visual Studio extension because it felt like a better fit. In particular, while T4 templates can be very powerful and I’ve found them useful in the past, the template code itself can become quite difficult to maintain. I prefer to write my code generation logic in a normal .NET project, benefiting from Visual Studio’s standard syntax highlighting, IntelliSense, and many useful design- and compile-time checks.
I certainly don’t mean to dismiss T4 entirely. In particular, there are some third-party Visual Studio plugins that make T4 templates much easier to work with, by providing syntax highlighting and IntelliSense among other features: Tangible T4 Editor and Clarius Visual T4 both offer similar feature sets and have a similar business model (a limited free version and a full-featured ‘pro’ version). Additionally, T4 is highly extensible; you can create your own text templating host and directive processors to control how your templates are compiled and run. For an excellent overview of the T4 architecture and many more T4 resources, see Oleg Sych’s blog.