Tuesday, December 19, 2006

F# Life

As I said before, I've been getting very into the idea (if not the practice) of functional programming recently, so I was really excited to read about F#. I'm not the only one; there seems to be a general buzz about this new programming language from Microsoft Research in Cambridge. The guy who designed it, Don Syme, also played a big part in designing the excellent generics implementation in the CLR and has implemented a number of new CLR features to support F# and functional languages in general. I guess in this respect you can see F# as a test bed for pushing the CLR in some exciting new directions. The great thing about F# is that it allows you to leverage all of the .net framework from a functional language with great Visual Studio integration, including an interactive shell that you can use in any project, not just F# ones. Of course, to use the .net libraries, F# has to include object oriented and imperative programming styles as well, and while this has the effect of polluting its pure 'functionalness' it also promises to make it a real swiss army knife of programming techniques. Don Syme has got some great examples of F# utilising the power of .net on his blog.

I'm a total novice at functional programming and as I'm still learning about OO programming even after 10 years, I'm sure I'm just starting off on a very long path of discovery. I really don't get a lot of the higher level concepts I've been reading about but some of the simple stuff does make a lot of sense. In particular I'm very excited about the concept of using functions as first class constructs and some of the cool patterns that this allows.

To illustrate how I'm getting on with F# and to show some of the cool stuff that I've found most interesting, I've implemented an F# version of John Conway's game of life. If you've never come across this before, it was one of the first (if not the first) cellular automata and it's great fun to have a play with one of the many online implementations. I've been using the game of life as a great way of trying out new languages for years. It was both my first java program and even one of my first VB ones (all those years ago). The rules are dead simple: there is a grid of cells of arbitrary size and each cell can be either on or off (alive or dead). If a live cell has fewer than two live neighbours it dies from loneliness; if it has more than three, it dies from overcrowding. If a dead cell has exactly three live neighbours it comes alive. You start off with a pattern of cells and then let the generations roll.

OK, let's have a look at the code. I'm going to paste it in segments with some commentary after each, and you can run each segment in F# interactive or put the whole thing in an F# code file and run it through the F# compiler. First let's define some data structures:

// the size of the grid
let size = 10

// make the grid of 10 x 10 cells, initialising each to 0
let theGrid = Array.create_matrix size size 0

// a square array of array with the pattern to run
let thePattern = [|
    [|1;1;1|];
    [|1;0;0|];
    [|0;1;0|]|]

The 'let' statement defines a value (which can also be a function). F# programs seem to be just a collection of let statements as you build up a hierarchy of values defined in terms of other values. It's this declarative style that is one of the hardest things to get your head around if you're an imperative programmer of many years like me. I've defined two arrays of arrays here, one for the initial grid and one for the initial pattern. There's a data structure called a 'list', which is a lightweight immutable linked list and more in the functional tradition, but I've used a mutable array here, not because I want the mutability, but because you can address the cells by co-ordinate. The pattern I've set here is the famous 'Glider' that should move across the grid.

// define a function for iterating through a matrix (thanks to DeeJay)
let iteri_matrix f = Array.mapi (fun i -> Array.mapi (f i)) 

This function, iteri_matrix, iterates through my array of arrays, applying a function that takes the co-ordinates of the current cell. I got it from a comment on 'a walk with a newbie' on HubFS, which is the community site for F#. That post is a great introduction to a lot of good F# techniques, but I couldn't help thinking that the programming style was too imperative. In the same way that you can write procedural code in OO languages, you can write imperative code in F#. You can do it, but it's not particularly elegant.

The 'iteri_matrix' function illustrates one of the core features of functional programming: functions as first class constructs. iteri_matrix takes a function 'f' as its argument and returns a function. The function it returns consists of a static function Array.mapi that for each item (which is an array, since we want to run it over an array of arrays) applies a second Array.mapi that applies the function 'f' that we originally passed in to each item of the array. When you call iteri_matrix passing it a function argument and a matrix, it constructs a new function that will iterate over the matrix with the given function and then applies that function to the matrix. OK, it's a bit mind boggling, but really powerful when you get the hang of it.

// add the pattern to the middle of the grid
let offset = (size - thePattern.Length)/2
let getPatternCell (i,j) =
    if i < 0 || j < 0 then 0 
    elif i < thePattern.Length && j < thePattern.Length then thePattern.(i).(j)
    else 0
let addPatternToGrid grid = grid |> iteri_matrix(fun i j _ -> grid.(i).(j) + getPatternCell(i-offset, j-offset))

Next we add the initial pattern to the centre of the grid. First we define a function to get a pattern cell at a particular co-ordinate. It returns zero if the co-ordinates are out of range because we want to use it to sum the pattern and the grid without having to make the pattern matrix the same size as the grid matrix. Of note here is the use of a 'traditional' if .. elif .. else expression; later on we'll see another way of expressing conditionals called 'pattern matching' (not to be confused with regular expressions!).

The second function, 'addPatternToGrid', takes a grid and uses iteri_matrix to sum the grid and the pattern (the offset value puts the pattern in the middle of the grid). A really cool feature of F# here is the '|>' operator, which secret geek explains very nicely. It allows you to chain functions very intuitively, a bit like the way the pipe operator '|' works in most unix shell environments.

// define neighbours
let prev i = if i = 0 then size - 1 else i - 1
let next i = if i = size - 1 then 0 else i + 1
let neighbours (i,j) = [
    (prev i, prev j);
    (prev i, j);
    (prev i, next j);
    (i, prev j);
    (i, next j);
    (next i, prev j);
    (next i, j);
    (next i, next j)]

The next step is to define the neighbours of a cell. We're using a feature of F# called a 'tuple' to define the co-ordinates '(i,j)'. It's a really simple data structure that just groups together some values. The functions 'prev' and 'next' return the previous and next indexes, 'wrapping around' the edges of the matrix. The function 'neighbours' takes a co-ordinate and returns a list of the co-ordinates of its neighbours.

// sum a list
let rec sum aList =
    match aList with
    | [] -> 0
    | first::newList -> first + sum newList

This function simply sums a list, but it shows several interesting features. The first is that recursion is the preferred way of doing looping in functional languages. We've defined the function with 'let rec', which tells F# that this function is recursive (there must be a good reason for this, but why can't F# just figure that out for itself?), and we get the function to call itself passing the list minus its first member. 'match aList with' is a match expression that defines a list of possible pattern matches for aList. The first match matches an empty list and returns zero; the second match strips off the first item from the list and then adds it to the sum of the remaining items.

// get the sum of the live neighbours
let cellAddNeighbours (i,j) grid = neighbours (i,j) |> List.map (fun (a,b) -> grid.(a).(b)) |> sum

// calculate neighbour sums for all cells
let addNeighbours grid = grid |> iteri_matrix (fun i j _ -> cellAddNeighbours (i,j) grid)

Next we work out the sum of all a cell's neighbours with the function 'cellAddNeighbours'. This takes a co-ordinate 'tuple' and a grid as arguments. We pass the cell co-ordinate to the neighbours function, get a list of neighbour co-ordinates back, pass that list to a static function of the List class, List.map, which runs a function on each item and returns a new list of results. Here we're just getting the cell value at the neighbour co-ordinate. The list of cell values is then passed to 'sum' which returns a single value of the sum of all the neighbours. The next function, 'addNeighbours', simply runs 'cellAddNeighbours' for each cell in the grid and returns a new grid of the sums. Once again it uses the iteri_matrix function we defined above.

// live or die rules for a single cell:
//      if the cell is alive and has 2 or 3 neighbours, then it lives, otherwise it dies
//      if the cell is dead and has 3 neighbours it comes alive
let cellLiveOrDie cellValue neighbourSum =
    match (cellValue, neighbourSum) with
    | (1,(2 | 3)) -> 1
    | (0, 3) -> 1
    | (_,_) -> 0

// calculate live or die for the whole grid
let liveOrDie grid neighbourSumGrid = grid |> iteri_matrix (fun i j _ -> cellLiveOrDie grid.(i).(j) neighbourSumGrid.(i).(j))

The next function, 'cellLiveOrDie', contains the main set of rules for the game of life. Once again we're using pattern matching, but this time on a tuple. This is a really neat feature because it allows you to just list a set of rules for matching each element in a tuple without writing loads of conditional logic. The first rule says 'if the cellValue is 1 and the neighbourSum is 2 or 3, set the cell value to 1'. The second rule says 'if the cellValue is 0 and the neighbourSum is 3, set the cell value to 1'. The third rule says 'for any other combination, set the cell value to 0'. The underscore '_' is a really useful bit of syntax that means 'any value'.

The second function here, 'liveOrDie', takes two matrices, one holding the current grid and one containing the neighbour sums that we calculated earlier. Once again it uses iteri_matrix to apply 'cellLiveOrDie' to each cell in the grid.

// print a cell
let printCell cell = 
    if cell = 1 then printf("X ") else printf("_ ")

// print the grid
let printGrid grid = 
    grid |> Array.iter (fun line -> (line |> Array.iter printCell); printf "\n");
    printf "\n\n"

The function 'printCell' prints a cell to the command line and 'printGrid' uses printCell to print the whole grid. There's a tiny bit of imperative programming in 'printGrid': '(line |> Array.iter printCell); printf "\n"'. You can use a semicolon to separate statements just like in C, and here we're using it to print a new line after each row of the grid.

// do n generations
let rec DoGenerations n grid =
    printGrid grid;
    match n with
    | 0 -> printf "end\n"
    | _ -> grid |> addNeighbours |> liveOrDie grid |> DoGenerations (n-1)

'DoGenerations' is the core loop of the application. Once again we're favoring recursion over looping in order to do the generations. I really like the clarity of the |> operator. Here we see how nice it is to simply chain together our previously defined functions 'addNeighbours' and 'liveOrDie' to create the next generation.

// run 10 generations
do theGrid |> addPatternToGrid |> DoGenerations 10

Finally, here is the program's 'Main()'. The 'do' statement simply runs the given expression. We take the initial grid, add the predefined pattern and do 10 generations. The output should be something like this:

_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ X X X _ _ _ _ 
_ _ _ X _ _ _ _ _ _ 
_ _ _ _ X _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 


_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ X _ _ _ _ _ 
_ _ _ X X _ _ _ _ _ 
_ _ _ X _ X _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 

........ missing out a few generations here ......

_ _ X _ _ _ _ _ _ _ 
_ X X _ _ _ _ _ _ _ 
_ X _ X _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 


_ X X _ _ _ _ _ _ _ 
_ X _ X _ _ _ _ _ _ 
_ X _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ 


Monday, December 04, 2006

How to write an XSD

Web services are all about communicating with XML messages. The great benefit of XML is that it is a platform neutral technology. These messages shouldn't have any dependency on a particular web service implementation technology (such as .net or java). Unfortunately many of the implementation toolkits (especially ASP.NET) encourage you to think of web services as Remote Procedure Calls (RPC) which can inject unwanted dependencies on the toolkit and often leads to sub-optimal 'chatty' interfaces. That's why it's always best to define your messages using XSD rather than by getting your implementation toolkit (such as visual studio) to spit out type definitions based on your technology specific types (such as .net classes).

The question then becomes how to write effective XSDs. In this document I'd like to give a few pointers. The example for this demonstration is the following XML document:

<Order 
  xmlns="uri:ace-ina.com:schemas:order" 
  xmlns:prd="uri:ace-ina.com:schemas:product" 
  Id="0">
	<OrderLines>
		<OrderLine Id="0">
			<Product Id="0">
				<prd:Name>Bread</prd:Name>
				<prd:Price>0.79</prd:Price>
			</Product>
			<Quantity>2</Quantity>
			<Total>1.58</Total>
		</OrderLine>
		<OrderLine Id="1">
			<Product Id="2">
				<prd:Name>Milk</prd:Name>
				<prd:Price>0.48</prd:Price>
			</Product>
			<Quantity>1</Quantity>
			<Total>0.48</Total>
		</OrderLine>
	</OrderLines>
	<Total>2.06</Total>
</Order>

It's a simple order with an id and a collection of order lines. Each order line defines a product and gives the quantity and total. The namespace of the order is 'uri:ace-ina.com:schemas:order'. A bit of added complication is introduced by defining the product in a separate namespace: 'uri:ace-ina.com:schemas:product'.

Now let's create an XSD that defines the schema for this XML document. The XSD meta-schema is defined in the namespace: 'http://www.w3.org/2001/XMLSchema', and an XSD's root element is always 'schema', so let's start with that:

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema'>
</xs:schema>

We also want to define the namespace of the target document which in this case is 'uri:ace-ina.com:schemas:order'. We need to include that namespace and reference it in the targetNamespace attribute. To enforce that all the defined elements in the XSD should belong to the target namespace we need to set elementFormDefault to 'qualified'.

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	elementFormDefault='qualified'>
</xs:schema>

Next we should define our types. Think of types in your XSD as entities, in the same way as you would think of classes in a .net application or tables in a database. In the order document there are two primary types: 'Order' and 'OrderLine'. 'Product' belongs to a separate namespace and XSD file and we'll be looking at that later. Types that contain attributes and/or elements are known as 'complex types' and are defined in a 'complexType' element. I like to name complex types '<name of target element>Type'. So let's add two complex types to our XSD, OrderType and OrderLineType:

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	elementFormDefault='qualified'>
	<xs:complexType name="OrderType">
	</xs:complexType>
	<xs:complexType name="OrderLineType">
	</xs:complexType>
</xs:schema>

We can add attributes to our types using the 'attribute' element. Both OrderType and OrderLineType have id attributes which we want to be required integer types:

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	elementFormDefault='qualified'>
	<xs:complexType name="OrderType">
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLineType">
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
</xs:schema>

Child elements can be defined as part of a 'sequence', 'choice' or 'all' containing element. 'sequence' requires that all its elements exist in the given order in the target document, 'choice' allows only one of its child elements to exist, and 'all' requires that each of its child elements appears at most once, in any order. Repeating elements are not allowed in an 'all' group. minOccurs and maxOccurs are used to define optional and repeating elements. In this case we want to define 'OrderLines' and 'Total' for 'OrderType', and 'Product', 'Quantity' and 'Total' for 'OrderLineType'. They are all required non-repeating elements, so we don't need to specify minOccurs and maxOccurs (the default for both is '1'), and we'll use 'sequence' for all of them. We also need to define the type of each element: both the OrderType and OrderLineType Totals are defined as 'double' and Quantity is defined as 'integer'. Note that inside a complexType the attribute declarations must come after the content model, so the 'Id' attributes now sit below the sequences. We'll leave the types of 'OrderLines' and 'Product' until later:

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	elementFormDefault='qualified'>
	<xs:complexType name="OrderType">
		<xs:sequence>
			<xs:element name="OrderLines" type=""/>
			<xs:element name="Total" type="xs:double"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLineType">
		<xs:sequence>
			<xs:element name="Product" type=""/>
			<xs:element name="Quantity" type="xs:integer"/>
			<xs:element name="Total" type="xs:double"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
</xs:schema>

Because Total has the same name and type in both OrderType and OrderLineType, we can factor out a global element called Total and reference it from inside OrderType and OrderLineType:

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	elementFormDefault='qualified'>
	<xs:element name="Total" type="xs:double" />
	<xs:complexType name="OrderType">
		<xs:sequence>
			<xs:element name="OrderLines" type=""/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLineType">
		<xs:sequence>
			<xs:element name="Product" type=""/>
			<xs:element name="Quantity" type="xs:integer"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
</xs:schema>

Now let's consider the OrderLines element in OrderType. In the target document, OrderLines contains a collection of OrderLine types, so we need to create a collection type for OrderLines. We can create a new complex type 'OrderLinesType' with a single repeating element 'OrderLine'. A repeating element is created by setting minOccurs to '0' and maxOccurs to 'unbounded'. We can then set the type of OrderLines to 'OrderLinesType'.

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	elementFormDefault='qualified'>
	<xs:element name="Total" type="xs:double" />
	<xs:complexType name="OrderType">
		<xs:sequence>
			<xs:element name="OrderLines" type="OrderLinesType"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLineType">
		<xs:sequence>
			<xs:element name="Product" type=""/>
			<xs:element name="Quantity" type="xs:integer"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLinesType">
		<xs:sequence>
			<xs:element name="OrderLine" type="OrderLineType" minOccurs="0" maxOccurs="unbounded"/>
		</xs:sequence>
	</xs:complexType>
</xs:schema>

We're still missing the product type. This is defined in a separate namespace, 'uri:ace-ina.com:schemas:product', in a separate XSD document:

<xs:schema 
	xmlns:xs="http://www.w3.org/2001/XMLSchema" 
	xmlns="uri:ace-ina.com:schemas:product" 
	targetNamespace="uri:ace-ina.com:schemas:product" 
	elementFormDefault="qualified">
	<xs:complexType name="ProductType">
		<xs:sequence>
			<xs:element name="Name" type="xs:string"/>
			<xs:element name="Price" type="xs:double"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required"/>
	</xs:complexType>
</xs:schema>

To reference a schema from another schema with a different namespace we use 'import'. To reference another XSD with the same namespace, use 'include'. Here the namespace is different so we need to add an 'import' element to our Order XSD. We also need to define the product namespace and give it a prefix, since we already have a default namespace (uri:ace-ina.com:schemas:order). We'll use 'prd' here. We can now define the Product element's type as 'prd:ProductType':

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	xmlns:prd='uri:ace-ina.com:schemas:product'
	elementFormDefault='qualified'>
	<xs:import namespace="uri:ace-ina.com:schemas:product" schemaLocation="Product.xsd" />
	<xs:element name="Total" type="xs:double" />
	<xs:complexType name="OrderType">
		<xs:sequence>
			<xs:element name="OrderLines" type="OrderLinesType"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLineType">
		<xs:sequence>
			<xs:element name="Product" type="prd:ProductType"/>
			<xs:element name="Quantity" type="xs:integer"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLinesType">
		<xs:sequence>
			<xs:element name="OrderLine" type="OrderLineType" minOccurs="0" maxOccurs="unbounded"/>
		</xs:sequence>
	</xs:complexType>
</xs:schema>

The last remaining task is to define our top level global element 'Order' with type 'OrderType':

<xs:schema 
	xmlns:xs='http://www.w3.org/2001/XMLSchema' 
	xmlns='uri:ace-ina.com:schemas:order' 
	targetNamespace = 'uri:ace-ina.com:schemas:order'
	xmlns:prd='uri:ace-ina.com:schemas:product'
	elementFormDefault='qualified'>
	<xs:import namespace="uri:ace-ina.com:schemas:product" schemaLocation="Product.xsd" />
	<xs:element name="Order" type="OrderType" />
	<xs:element name="Total" type="xs:double" />
	<xs:complexType name="OrderType">
		<xs:sequence>
			<xs:element name="OrderLines" type="OrderLinesType"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLineType">
		<xs:sequence>
			<xs:element name="Product" type="prd:ProductType"/>
			<xs:element name="Quantity" type="xs:integer"/>
			<xs:element ref="Total"/>
		</xs:sequence>
		<xs:attribute name="Id" type="xs:integer" use="required" />
	</xs:complexType>
	<xs:complexType name="OrderLinesType">
		<xs:sequence>
			<xs:element name="OrderLine" type="OrderLineType" minOccurs="0" maxOccurs="unbounded"/>
		</xs:sequence>
	</xs:complexType>
</xs:schema>

Defining your web service message types in terms of XSD decouples your web service from a particular implementation technology and aids interoperability. Also, understanding the XSD syntax allows you to read and understand WSDL, create your own client proxies and control the serialization between your implementation types and the XSD.
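
If you want to check that the schema really describes the document, a few lines of .NET 2.0 code will validate the sample order against it. This is just a sketch, assuming the two schemas are saved as Order.xsd and Product.xsd alongside an Order.xml containing the document above (it needs the System.Xml and System.Xml.Schema namespaces):

XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Add("uri:ace-ina.com:schemas:order", "Order.xsd");
settings.Schemas.Add("uri:ace-ina.com:schemas:product", "Product.xsd");
settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
{
    // report every validation problem rather than stopping at the first one
    Console.WriteLine("{0}: {1}", e.Severity, e.Message);
};
using(XmlReader reader = XmlReader.Create("Order.xml", settings))
{
    // reading the document to the end triggers the validation
    while(reader.Read()) { }
}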

For a much more complete and extensive discussion of writing XSD schemas, see the World Wide Web Consortium's XML Schema Part 0: Primer Second Edition.

Wednesday, November 29, 2006

XmlDiff

I had to compare two xml files today. This is trickier than you would at first think. Luckily Microsoft has a neat tool, XmlDiff, that makes it quite easy. It can compare two files, or two XmlReaders, or even fragments. It can also spit out an xml diffgram so that you can examine and re-apply any changes. Just use it like this...

[Test]
public void XmlDiffTest()
{
    string source = "<root><child1>some text</child1><child2>more text</child2></root>";
    // note some whitespace, child nodes in different order, comments
    string target = "<root> <!-- I'm a comment --> <child2>more text</child2> " + 
        "<child1>some text</child1>  </root>"; 

    XmlReader expected = XmlReader.Create(new StringReader(source));
    XmlReader actual = XmlReader.Create(new StringReader(target));
    StringBuilder differenceStringBuilder = new StringBuilder();
    XmlWriter differenceWriter = XmlWriter.Create(new StringWriter(differenceStringBuilder));

    XmlDiff diff = new XmlDiff(XmlDiffOptions.IgnoreChildOrder |
        XmlDiffOptions.IgnoreComments |
        XmlDiffOptions.IgnoreWhitespace);

    // Compare returns true when the two documents are considered identical
    bool areIdentical = diff.Compare(expected, actual, differenceWriter);
    Assert.IsTrue(areIdentical, string.Format(
        "expected response and actual response differ:\r\n{0}", differenceStringBuilder.ToString()));
}

Friday, November 24, 2006

Playing with the XmlSerializer

Have you ever worked on one of those projects where everything is a huge XML document and the code is littered with string literals containing XPath queries? It's a nasty hole to dig yourself into and it's easy to end up with a very brittle and fragile application where the structure of your data is baked into hundreds of string literal XPath queries that aren't checked until run time and are a nightmare to change if your data structure changes. You lose all the benefits of OO, no refactoring or encapsulation, and condemn yourself to a life of stepping through the debugger and examining the watch window trying to work out which of your hundreds of concatenated XPath queries aren't quite right.

There's a better way and that is to use xsd.exe to generate classes that match your XML schema and then deserialize your XML into the generated object graph. You can then work with .net types rather than an amorphous XML document, with all the compile time type checking, intellisense and other benefits that brings. Xsd.exe comes with Visual Studio; you can easily find out how to use it by opening the Visual Studio command prompt and typing 'xsd /?'. Serialization and deserialization is handled by the System.Xml.Serialization.XmlSerializer.
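
For example, given a schema that defines an Order element, running 'xsd.exe Order.xsd /classes' generates partial classes that the XmlSerializer can hydrate directly. Here's a sketch; the file name and the generated 'Order' class name are illustrative:

// deserialize an order document into the xsd.exe generated Order class
XmlSerializer serializer = new XmlSerializer(typeof(Order));
using(FileStream stream = new FileStream("Order.xml", FileMode.Open, FileAccess.Read))
{
    Order order = (Order)serializer.Deserialize(stream);
    // from here on you work with typed properties instead of XPath strings
}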

The project I'm currently working on requires our piece to communicate with a very complex web service whose WSDL describes more than 380 different types. We initially decided not to deserialize the XML because the serialization process was taking about 7 seconds, an unacceptably long time. Now of course we've dug ourselves into exactly the situation I've described above so I decided to look into the XmlSerializer in a little more depth.

The first thing I did was try and simplify the schema. Although the WSDL's XSD describes those 380 types, the message we actually send only uses a subset, so I've trimmed back the object model to just the types we actually need. It's easy to do, I just commented out the properties that we don't need and since many of these properties are complex types themselves, this often has the benefit of trimming whole branches of the object graph. Doing this I've managed to get the XmlSerialization process down to just under 5 seconds. But 5 seconds is still too long.

The next thing was to do some investigation into where the time was being taken up. I wrote a little test that timed the creation of the serializer and the serialization and deserialization processes:

[Test]
public void SerializationTest()
{
	// initialize
    XmlSerializer serializer = null;
    Time("Serializer create", delegate()
    {
        serializer = new XmlSerializer(typeof(MyComplexType));
    });
    
    string inputXmlPath = GetPath(_inputFileName);
    MyComplexType myComplexType = null;

    // deserialize
    Time("Deserialize", delegate()
    {
        using(FileStream stream = new FileStream(inputXmlPath, FileMode.Open, FileAccess.Read))
        {
            myComplexType = (MyComplexType)serializer.Deserialize(stream);
        }
    });

    string outputXmlPath = GetPath(_outputFileName);

    // serialize
    Time("Serialize", delegate()
    {
        using(FileStream stream = new FileStream(outputXmlPath, FileMode.Create, FileAccess.Write))
        {
            serializer.Serialize(stream, myComplexType);
        }
    });
}

private delegate void Function();
private void Time(string description, Function function)
{
    DateTime start = DateTime.Now;
    function();
    DateTime finish = DateTime.Now;
    Console.WriteLine("{0} elapsed: {1}", description, finish - start);
}

The results were as follows:

Serializer create elapsed: 00:00:04.6040060
Deserialize elapsed: 00:00:00.6242720
Serialize elapsed: 00:00:00.1872816

So you can see that the majority of the time is taken by the construction of the serializer itself. What's it doing? I read the docs and did a bit of Googling and found this excellent series of blog posts by Scott Hanselman all about the XmlSerializer.

It turns out that the XmlSerializer emits an assembly that contains a custom serializer for your type when you call its constructor. If you configure your tests with the following config section:

<configuration>
  <system.diagnostics>
    <switches>
      <add name="XmlSerialization.Compilation" value="1"/>
    </switches>
  </system.diagnostics>
</configuration>

Then step through the code above and stop after the XmlSerializer constructor is called; you can find the .cs file in your user temp directory (on my machine that's at C:\Documents and Settings\<username>\Local Settings\Temp). You can even load it into Visual Studio, set a breakpoint and debug into it. At first I thought, OK, so I'll just create one serializer and cache it for the lifetime of the application, but after reading Scott's posts I discovered that the XmlSerializer has caching built in. Here's a little test to demonstrate:

[Test]
public void SerializerCachingTest()
{
    XmlSerializer serializer = null;

    for(int i = 0; i < 5; i++)
    {
        Time(string.Format("Creating Serializer {0}", i), delegate()
        {
            serializer = new XmlSerializer(typeof(MyComplexType));
        });
    }
}

Which spits out:

Creating Serializer 0 elapsed: 00:00:05.2907052
Creating Serializer 1 elapsed: 00:00:00
Creating Serializer 2 elapsed: 00:00:00
Creating Serializer 3 elapsed: 00:00:00
Creating Serializer 4 elapsed: 00:00:00

Cached indeed!

The next thing that concerned us was possible contention from multiple threads all trying to use the same cached XmlSerializer concurrently. I wrote a test to kick off ten deserialization requests on ten threads, time them all and time the total elapsed time of the test, here it is:

[Test]
public void ConcurrencyTest()
{
    XmlSerializer serializer = new XmlSerializer(typeof(MyComplexType));
    RunDeserializerHandler deserializerDelegate = new RunDeserializerHandler(RunDeserializer);

    Time("Total", delegate()
    {
        List<IAsyncResult> asyncResults = new List<IAsyncResult>();
        for(int i = 0; i < 10; i++)
        {
            asyncResults.Add(deserializerDelegate.BeginInvoke(serializer, i, null, null));
        }
        foreach(IAsyncResult asyncResult in asyncResults)
        {
            deserializerDelegate.EndInvoke(asyncResult);
        }
    });
}

delegate void RunDeserializerHandler(XmlSerializer serializer, int id);
private void RunDeserializer(XmlSerializer serializer, int id)
{
    string path = GetPath(_inputFileName);

    Time(string.Format("Deserialize {0}", id), delegate()
    {
        using(FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            MyComplexType myObject = (MyComplexType)serializer.Deserialize(stream);
        }
    });
}

Which gave the following result:

Deserialize 1 elapsed: 00:00:00.9333840
Deserialize 0 elapsed: 00:00:00.9333840
Deserialize 3 elapsed: 00:00:00.3422408
Deserialize 2 elapsed: 00:00:00.8556020
Deserialize 5 elapsed: 00:00:00
Deserialize 6 elapsed: 00:00:00
Deserialize 4 elapsed: 00:00:00
Deserialize 7 elapsed: 00:00:00
Deserialize 8 elapsed: 00:00:00.0155564
Deserialize 9 elapsed: 00:00:00.0155564
Total elapsed: 00:00:00.9644968

Now this is very interesting: not only is the deserialization not contentious (is that the right technical term?), since the total time of the test is only slightly longer than the longest running individual deserialization, but the XmlSerializer also seems to recognise that it's being asked to do the same thing after the first few attempts and optimises appropriately.

So after this investigation, it seems that we can use the XmlSerializer in a natural fashion, just constructing it where needed and deserializing / serializing as required. The first time the constructor is called will hit performance, but subsequent uses should be pretty fast. It also looks like the XmlSerializer won't become a bottleneck as our application scales. All in all pretty impressive.

Thursday, November 02, 2006

Using MemoryStream and BinaryFormatter for reusable GetHashCode and DeepCopy functions

Here's a couple of techniques I learnt a while back to add two important capabilities to your objects: computing a hash code and executing a deep copy. I can't find the original source for the hash code example, but the deep copy comes from Rockford Lhotka's CSLA. Both examples are my implementation of the basic idea. Both techniques utilise the MemoryStream and BinaryFormatter by getting the object to serialize itself to a byte array. To compute the hash code I simply use SHA1CryptoServiceProvider to create a 20 byte hash of the serialized object and then xor an integer value from that.

public override int GetHashCode()
{
    byte[] thisSerialized;
    using(System.IO.MemoryStream stream = new System.IO.MemoryStream())
    {
        new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter().Serialize(stream, this);
        thisSerialized = stream.ToArray();
    }
    byte[] hash = new System.Security.Cryptography.SHA1CryptoServiceProvider().ComputeHash(thisSerialized);
    uint hashResult = 0;
    for(int i = 0; i < hash.Length; i++)
    {
        hashResult ^= (uint)(hash[i] << i % 4);
    }
    return (int)hashResult;
}

The most common use for a hash code is to make hash tables efficient and to implement Equals(). Note, there's roughly a one in 4,294,967,296 chance that this will report a false equality (thanks to Richard for pointing that out to me):

public override bool Equals(object obj)
{
    if(!(obj is MyClass)) return false;
    return this.GetHashCode() == obj.GetHashCode();
}

To do a deep copy I simply get the object to serialize itself and then deserialize it as a new instance. Be careful: this technique will serialize everything in the object's graph, so make sure you're aware of what is referenced by it and that all the objects in the graph are marked as [Serializable]. Here's a generic example that you can reuse in any object that needs a deep copy:

public T DeepCopy<T>()
{
    T snapshot;
    using(System.IO.MemoryStream stream = new System.IO.MemoryStream())
    {
        System.Runtime.Serialization.Formatters.Binary.BinaryFormatter formatter = 
            new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
        formatter.Serialize(stream, this);
        stream.Position = 0;
        snapshot = (T)formatter.Deserialize(stream);
    }
    return snapshot;
}
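
As a quick usage sketch (the Order class here is hypothetical; it carries the DeepCopy<T> method above and is marked [Serializable], as is everything it references):

[Serializable]
public class Order
{
    public string Customer;
    public System.Collections.Generic.List<string> Lines =
        new System.Collections.Generic.List<string>();

    // ... the DeepCopy<T>() method from above goes here ...
}

// taking a snapshot before making changes:
Order order = new Order();
order.Lines.Add("first line");
Order snapshot = order.DeepCopy<Order>();
order.Lines.Add("another line");    // snapshot.Lines still only contains "first line"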

Wednesday, November 01, 2006

Nested files with 'DependentUpon' in Visual Studio

Here's a really neat trick discovered by my colleague Preet that you can use to show a relationship between files in Visual Studio, just like Microsoft does with partial 'designer' classes. You've probably noticed when you add a new windows form in VS 2005 that two files are created: MyForm.cs and MyForm.designer.cs. The designer file contains all the code that's generated by the form designer and the other file is where you write your user code. Preet showed me that you can do the same thing by modifying the .csproj file (aka the MSBuild file) and add a DependentUpon child element to any item that you want to appear below another. Here's a snippet of a .csproj that has three files, all partial classes of 'Foo'. In the solution explorer Foo.1.cs appears nested below Foo.cs and Foo.1.1.cs appears below Foo.1.cs.

  
  <ItemGroup>
    <Compile Include="Foo.1.1.cs">
      <DependentUpon>Foo.1.cs</DependentUpon>
    </Compile>
    <Compile Include="Foo.1.cs">
      <DependentUpon>Foo.cs</DependentUpon>
    </Compile>
    <Compile Include="Foo.cs" />
    <Compile Include="Program.cs" />
    <Compile Include="Properties\AssemblyInfo.cs" />
  </ItemGroup>

The only way of doing this at the moment is by editing the .csproj file, but it's pretty cool, especially if you're writing code generators and GAT tools like I have been recently where you might want generated content to be visually related to user written code.

Friday, October 27, 2006

Finally getting around to using NMock

I don't like to admit it, but I really am quite conservative. I get set in my ways and even if I hear about a cool new way to do stuff it often takes me a while to get around to trying it. A great example is NMock. I've known about NMock for probably a couple of years, but I've stubbornly stuck with coding my own mock objects rather than giving it a spin. That is until today when I finally gave in and downloaded NMock2. It's amazing when you think that I've been a champion of TDD in at least three organisations and I've given presentations and mentored people in TDD techniques, but I haven't investigated such a core tool for doing TDD. But then again I only started using Test Driven this year which I couldn't possibly imagine working without now.

NMock is a mock object framework. You use mock objects and dependency injection in unit tests to allow you to test a single component rather than the whole stack. It's probably one of the most important core concepts behind TDD. NMock is a really neat OO framework that leverages the powerful .net reflection API to create concrete instances of interfaces at runtime. Say you've got an interface like this:

public interface IFoo
{
	int DoSomething(int id, string name);
}

And you've got a client class that uses IFoo to do something:

public class Client
{
	IFoo _foo;

	public Client(IFoo foo)
	{
		_foo = foo;
	}
	
	public int DoSomething(int id, string name)
	{
		return _foo.DoSomething(id, name);
	}
}

You can use NMock to create a mock object like this (I love the name 'Mockery':-):

Mockery mockery = new Mockery();
IFoo mockFoo = mockery.NewMock<IFoo>();

And then set expectations for your unit tests, so that when you run the test, if the correct parameters
aren't passed an exception is raised:

Expect.Once.On(mockFoo).Method("DoSomething").With(9, "the name").Will(Return.Value(4));
Client client = new Client(mockFoo);
int result = client.DoSomething(9, "the name");
Assert.AreEqual(4, result);

One thing got me for about half an hour before I was saved by my colleague Preet: you can't mix bare arguments with 'Is' parameters in the 'With' clause. In the 'With' clause you pass the argument values that you expect to be passed to your mock method, as I've done above. But you can also use the very convenient 'Is' class that returns a 'Matcher'. What I didn't realise was that 'With' is overloaded:

IMatchSyntax With(params Matcher[] otherArgumentMatchers);
IMatchSyntax With(params object[] equalArgumentValues);

So you can't mix bare values with 'Is' arguments, which means this won't work:

Expect.Once.On(mockFoo).Method("DoSomething").With(9, Is.Anything).Will(Return.Value(4));
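
One way around it is to make every argument a matcher so that the Matcher[] overload gets picked; something like this should work (a sketch using NMock2's Is.EqualTo):

Expect.Once.On(mockFoo).Method("DoSomething").With(Is.EqualTo(9), Is.Anything).Will(Return.Value(4));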

The only other thing that disappointed me about an otherwise excellent tool is that you can't mock classes, only interfaces. The code I'm currently working on uses several abstract base classes and it would be really neat if NMock could provide mocks for them. I see that it's already a feature request, and that it was available in NMock 1.0. Let's hope they add it soon.

Thursday, October 26, 2006

More than just code?

Jeff Atwood's Coding Horror is one of my favorite blogs. One of his recent posts argued that most of us are so involved with the detail of our applications that we don't step back enough and ask why we're writing a specific piece of software. He suggests that we should become business people and lawyers and concentrate less on coding, or maybe even stop coding altogether. Sure, what's the point in writing great code if it never ships or nobody ever knows about it? I often read good arguments that coders should become better writers, salesmen or business people, but the problem with that argument is that it misses two important points:

First, often the reason people become coders is that they are much stronger dealing with abstract ideas than with other humans. It's widely noted that the best coders are often borderline autistic, and in a way it's great that there is such a job as 'programmer' that means lots of these people can earn a good living without having too much painful interaction with other humans :)

Second, the complexity of modern society demands deep specialisation. There's simply too much knowledge out there for anyone to be a renaissance man in the 21st century. I have enough trouble just keeping up with what's going on in the world of .net development, let alone other languages and platforms. There's simply no way that I've got enough mental bandwidth to be a great lawyer or salesman too. So sure, it's good to be at least dimly aware of the reason why you're writing that code, but I expect for the vast majority of corporate developers, the reason they're writing that code is because their boss told them to.

Now, if you are one of those exceptional people who do have the mental bandwidth to be a great lawyer, salesman and programmer, then you'll probably end up being very successful anyway. One reads stories of Bill Gates out legalling (is that a word?) his lawyers and he's obviously a great salesman too, but I think for most of us lumpen programmeren just keeping up with our corner of the coding world is probably work enough.

Monday, October 23, 2006

Playing with Providers

'Inversion of control' (also known as 'dependency injection' or the 'dependency inversion principle') is a common OO pattern that allows you to decouple the layers of your application by removing the dependency of a client class on a server class. It is not only good practice, but essential for writing unit tests. Rather than hard coding the server's concrete type in the client, you make the client refer to a server interface or abstract class and inject the concrete server at runtime. Sometimes you'll have the server injected by a co-ordinator or service class, but in many cases you want to be able to configure the server so that you can change the server's type without having to recompile the application. This becomes essential if you're writing a reusable framework where you want to allow your users to provide their own servers and indeed, the .NET base class library uses this pattern extensively. In previous versions of .NET you had to roll your own code to read the type information from a config file and then load the correct assembly and instantiate the correct type, but now in .NET 2.0 there's a Providers framework that makes it child's play to load in servers at runtime. The classes live in the System.Configuration.Provider namespace.

OK, my client uses this interface with a single method 'DoIt()':

public interface IServer
{
	void DoIt();
}

I have to define a base provider that implements this interface and extends System.Configuration.Provider.ProviderBase:

public abstract class ServerProvider : ProviderBase, IServer
{
	public abstract void DoIt();
}

I also need a custom configuration section to place my providers in. Note that we need to provide a 'DefaultProvider' property and a 'Providers' property. The DefaultProvider tells us which provider we should use, but allows us to keep multiple providers on hand if we, for example, want to allow the user to select from them at runtime. The Providers property is of type ProviderSettingsCollection that's supplied for us in the System.Configuration namespace. The new custom configuration section feature of .NET 2.0 is also really nice, but that's another post...

public class CustomConfigSection : ConfigurationSection
{
	[ConfigurationProperty("DefaultProvider")]
	public string DefaultProvider
	{
		get { return (string)this["DefaultProvider"]; }
	}

	[ConfigurationProperty("Providers")]
	public ProviderSettingsCollection Providers
	{
		get{ return (ProviderSettingsCollection)this["Providers"]; }
	}
}

Now, in our client we just grab our custom config section and use the System.Web.Configuration.ProvidersHelper class to load our providers; it's that easy. You can then just select the default provider or maybe present a list for the user to choose from. I've left out all the error handling code to make it simpler, but you really should check that the providers you're expecting actually get loaded:

Configuration configuration = ConfigurationManager.OpenExeConfiguration(
    ConfigurationUserLevel.None);
CustomConfigSection section = configuration.Sections["CustomConfigSection"] as CustomConfigSection;
ProviderCollection providers = new ProviderCollection();
ProvidersHelper.InstantiateProviders(section.Providers, providers, typeof(ServerProvider));
ServerProvider provider = (ServerProvider)providers[section.DefaultProvider];
...
provider.DoIt();

Here's a sample provider called XmlServerProvider. Note the Initialize method that you have to implement. It takes the name of the provider and a name value collection 'config' that contains any properties that you might require to be set for your provider. In this case, apart from the common name and description properties, the provider also requires a 'filePath' property. You should also check that there aren't any superfluous properties in the configuration.

public class XmlServerProvider : ServerProvider
{
	string _filePath;

	public override void DoIt()
	{
		....
	}
	
	public override void Initialize(string name, System.Collections.Specialized.NameValueCollection config)
    {
        if(config == null) throw new ArgumentNullException("config");
        if(string.IsNullOrEmpty(name))
        {
            name = "XmlServerProvider";
        }
        if(string.IsNullOrEmpty(config["description"]))
        {
            config.Remove("description");
            config.Add("description", "A xml based server");
        }
        base.Initialize(name, config);

        // test that each property exists
        _filePath = config["filePath"];
        if(string.IsNullOrEmpty(_filePath))
        {
            throw new ProviderException("filePath not found");
        }

        // throw an exception if any unexpected properties are present
        config.Remove("filePath");
        if(config.Count > 0)
        {
            string propertyName = config.GetKey(0);
            if(!string.IsNullOrEmpty(propertyName))
            {
                throw new ProviderException(string.Format("{0} unrecognised attribute: '{1}'",
                    Name, propertyName));
            }
        }
    }	
}

And last of all, here's a snippet from the App.config or Web.config file. You have to register your custom config section in the configSections element, and the section element itself must use the same name (here 'CustomConfigSection', matching what the client code opens). Here we're loading the XmlServerProvider; note the name, type and filePath attributes.

<configSections>
  <section name="CustomConfigSection" type="MyAssembly.CustomConfigSection, MyAssembly" />
</configSections>
<CustomConfigSection DefaultProvider="XmlServerProvider">
  <Providers>
    <add
      name="XmlServerProvider"
      type="MyAssembly.XmlServerProvider, MyAssembly"
      filePath="c:/temp/myXmlFile.xml" />
  </Providers>
</CustomConfigSection>

Thursday, October 19, 2006

What is AzMan?

Does your application require a finer grained level of control than simply authorizing users to access a particular web directory or windows form? Do you have complex roles with overlapping tasks that consist of multiple operations? Do you want to be able to disable or enable individual user interface elements according to the user's role definitions? Are your roles complex and likely to change during the operational life span of your application?

The .net framework has a nice API for managing role based security which works on a simple but effective mapping of users to roles:

[User] -- has a --> [Role]

But with complex business requirements where different roles have overlapping tasks within the application and you need to be able to modify roles without recompiling, it's often necessary to have a more complex model that maps operations (individual functions within the application, like 'Add order line' for example) to tasks (like 'Order product for user') and tasks to roles (like 'Sales advisor'):

[User] -- has a --> [Role] -- is allowed to execute --> [Tasks] -- are made up of --> [Operations]

This means that the application can simply ask if a given user has permission to execute a certain operation and it can be left to an administration function, with a nice GUI, to assign the operations to tasks and the tasks to roles rather than baking it into the application code.

It's quite common for people to spin their own security sub systems that have this more complex model. I've seen some pretty involved home made security frameworks out in the wild and it creates a considerable development overhead. What's needed is a built-in API for managing this more complex authorization model.

AzMan is a COM based API for managing application security that originally shipped with Windows Server 2003, but is now also available for XP (with the Windows Server 2003 Administration Tools Pack). It allows you to define fine grained operations that can be grouped into tasks, which can in turn be assigned to roles, as I explained above. The backing store can be either an xml file or Active Directory (you can also use ADAM, a stand-alone Active Directory instance that can be created for individual applications). AzMan also adds a nice GUI MMC snap-in for user/group/role management.

Unfortunately it's a COM based API and as yet it's not supplied with a convenient managed wrapper; you have to use the interop assembly, and there's a good MSDN article on how to do that (Use Role-Based Security in Your Middle Tier .NET Apps with Authorization Manager).
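
To give a flavour of what the interop code looks like, here's a rough sketch of an operation level access check. The store path, application name and operation ID are all made up, and it assumes a reference to the Microsoft.Interop.Security.AzRoles interop assembly, so treat it as an outline rather than gospel:

using System;
using System.Security.Principal;
using Microsoft.Interop.Security.AzRoles;

public class OperationChecker
{
    // the operation ID as defined in the AzMan store (illustrative)
    const int PlaceOrderOperation = 61;

    public static bool CanPlaceOrder(WindowsIdentity identity)
    {
        AzAuthorizationStoreClass store = new AzAuthorizationStoreClass();
        store.Initialize(0, @"msxml://C:\Stores\MyStore.xml", null);
        IAzApplication application = store.OpenApplication("MyApp", null);

        // build a client context from the caller's windows token
        IAzClientContext context = application.InitializeClientContextFromToken(
            (ulong)identity.Token.ToInt64(), null);

        object[] scopes = new object[] { "" };    // the default (application) scope
        object[] operations = new object[] { PlaceOrderOperation };
        object[] results = (object[])context.AccessCheck(
            "place order check", scopes, operations, null, null, null, null, null);

        // a result of 0 (NO_ERROR) means the operation is allowed for this user
        return (int)results[0] == 0;
    }
}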

AzMan can also be used without any extra coding in the ASP.NET 2.0 security model, but since that model is role based you can't leverage any of the operation based features; for that you need to write to the interop API. To use AzMan in ASP.NET 2.0, simply configure your authorization role provider as the AuthorizationStoreRoleProvider class that's supplied with the framework (How To: Use Authorization Manager (AzMan) with ASP.NET 2.0)

Wednesday, October 11, 2006

log4net

Today I've been playing with a logging framework called log4net. It's a port of a really popular Java logging framework (log4j, would you believe). The nice thing about log4net is that it's really simple to use and configure, and it comes with a huge range of log sinks (known as 'appenders' in log4net speak) straight out of the box; you can even log to telnet. I think the sign of a good framework is one that lets you get up and running really quickly without having to study the full model, but has the extensibility to allow you to do the more complex stuff if you need to, as well as being fully configurable without having to recompile your application, and log4net seems to be that kind of framework. I haven't used the logging application block from the P&P group, so I can't compare it to that. For me, moving up from writing directly to a file or the event log is a big move forward.
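
As an illustration of how little code it takes, here's a minimal sketch; the class and messages are made up, and it assumes the appenders and levels are configured in App.config:

using System;
using log4net;
using log4net.Config;

public class OrderProcessor
{
    // the usual log4net idiom: one static logger per class
    static readonly ILog log = LogManager.GetLogger(typeof(OrderProcessor));

    public void Process(int orderId)
    {
        log.Debug("processing order " + orderId);
        try
        {
            // ... do the work ...
            log.Info("order processed");
        }
        catch(Exception e)
        {
            log.Error("order processing failed", e);
            throw;
        }
    }

    public static void Main()
    {
        // read the log4net configuration from the application's config file
        XmlConfigurator.Configure();
        new OrderProcessor().Process(42);
    }
}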

Friday, October 06, 2006

No to #region!

The #region directive in C# was invented so that a code file can be split up into collapsible regions to aid navigation. I can't stand them, and here's my list of region irritations...

  1. They're like comments: they don't execute, so it's easy to have regions which tell you something completely wrong. How about the region '#region public properties' that contains nothing but private methods? Yeah, I've seen that enough. Martin Fowler in his excellent book Refactoring says "When you feel the need to write a comment, first try to refactor the code so that any comment becomes superfluous". The same goes for regions: if you feel a region coming on, maybe you need to refactor, which brings me to...
  2. If your code file is so large that it needs regions to keep it organised, maybe your code file is too large. Visual Studio likes to have one class per file and classes shouldn't be so large that they need to be split into regions. Maybe when you feel a region coming on you should try refactoring your class into more, smaller classes. What about regions that split up a method into easier to understand segments? You have regions inside methods???? That's too far gone!
  3. What's wrong with the great tools that come with Visual Studio for helping you navigate around your code? I think one look at the Class View is worth a million stupid regions, and if what you see in the class view doesn't make any sense then you should really get a copy of 'Refactoring'. Well named methods and classes in a well designed object model should make your code easy to understand and navigate.

So just say no to regions, I'm sick of clicking on those stupid little + signs!

Wednesday, October 04, 2006

Nullability Voodoo

We had a good discussion in the office today about nulls. People often use null in a database to mean something in business terms. For example, if 'EndDate' is null it means that the task hasn't ended yet. But this kind of 'nullability voodoo' is bad: you're not being explicit about meaning, and someone looking at your database schema has to know implicit rules beyond what the schema itself can provide. Of course that's always going to be the case to some extent, but keeping explicitness (is that a word?) to a maximum will save you lots of time and money later. Nullability usually means something in a business sense that is better represented in some other way. Rather than using the nullability of EndDate to mean that the task hasn't completed, consider giving the task a status instead. I've maintained systems where complex rules about various attributes had to be interpreted to work out some kind of status, and I know how painful that can be.

If you must represent nullable types in managed code, avoid using the SqlTypes. I've found numerous problems with them: they don't implicitly cast or behave like the basic value types and, in any case, who wants to drag a reference to System.Data up into their domain layer? I haven't used the new nullable types in .net 2.0 so I can't really comment on them, but effectively they're a way of giving nullability to value types and they have a nasty hackish smell about them. In any case you should be very careful of equating TSQL null (an empty value) with C# null (a zero pointer); they mean different things and it can make code very tricky when you constantly have to test for null everywhere.
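
(For reference, the .net 2.0 nullable value types mentioned above look like this; 'endDate' is just an illustrative name:)

DateTime? endDate = null;           // shorthand for Nullable<DateTime>
if(endDate.HasValue)
{
    Console.WriteLine(endDate.Value);
}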

It's worth checking out the 'null object' pattern if you've got a business case for an entity that has to represent itself as a null value. It means that you can factor all your null processing into one class.
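
Here's a minimal sketch of what that might look like for the 'task hasn't ended yet' example above; all the names are illustrative:

using System;

public abstract class TaskEnd
{
    public abstract bool HasEnded { get; }
    public abstract DateTime Date { get; }
}

public class ActualEnd : TaskEnd
{
    readonly DateTime _date;
    public ActualEnd(DateTime date) { _date = date; }
    public override bool HasEnded { get { return true; } }
    public override DateTime Date { get { return _date; } }
}

// all of the 'not ended yet' behaviour lives in this one class,
// so client code never has to test for null
public class NoEndYet : TaskEnd
{
    public override bool HasEnded { get { return false; } }
    public override DateTime Date
    {
        get { throw new InvalidOperationException("The task has not ended yet"); }
    }
}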

Wednesday, September 27, 2006

Problems with the DataGridViewComboBoxColumn

I've been playing with the new DataGridView that comes with Windows.Forms 2.0. It's really nice because it fully supports in-place editing and supplies a range of built in cell editors. Unfortunately they don't behave exactly as you'd expect (or maybe it's just me) and I've just spent a very frustrating morning with much googling trying to get some simple functionality to work that should have been a ten minute job. The problem is with the combo box cell editor (the DataGridViewComboBoxColumn column type). Now, with a standard windows combo box you can assign its DataSource property to anything that implements IList; by default it displays the ToString() value in the drop down list and the SelectedItem property is the item that's selected, as you'd expect. This makes it really, really easy (one line of code easy) to get the user to choose a particular object from a list. There are also DisplayMember and ValueMember properties that allow you to set the property that's displayed and the property that is returned from SelectedValue, but I've never used these, since the default behaviour is exactly what I need 99% of the time.

However, the DataGridViewComboBoxColumn doesn't work like this. Although it will display the ToString value if you don't set the DisplayMember, something internally goes wrong when it tries to look up the SelectedItem, so you have to set DisplayMember to a public property of your class. Even worse, the default behaviour if you don't set the ValueMember property is to return the DisplayMember; there's no way of getting the actual item itself. The only workaround is to add a property to your class that returns itself and set that property as the ValueMember. Of course, if your item isn't something you are able to change (such as one of the framework classes) you'll have to kludge together a container object that holds a reference to your item (there's a sketch of this after the example below). It's really awkward. Here's a little example I used to try and get it to work...

First, here's the class I want to bind to my DataGridView, note that the Child property is of type Child:

using System;

namespace DataGridViewTest
{
    public class Thing
    {
        string _name;

        public string Name
        {
            get { return _name; }
            set { _name = value; }
        }

        Child _child;

        public Child Child
        {
            get { return _child; }
            set { _child = value; }
        }
    }
}

Here's the Child class; you can see that I've implemented a 'This' property that returns a reference to itself:

using System;

namespace DataGridViewTest
{
    public class Child
    {
        string _name;

        public string Name
        {
            get { return _name; }
            set { _name = value; }
        }

        public Child(string name)
        {
            _name = name;
        }

        public Child This
        {
            get { return this; }
        }
    }
}

And here's the form code. In the constructor we bind a list of things to the DataGridView using a P&P DataGridViewBinding, set _childColumn.ValueMember to the This property (so that the column returns a Child to the Child property of Thing), set _childColumn.DisplayMember to the Name property of Child (to display in the drop down list, because the default ToString() behaviour doesn't work) and finally bind the DataSource of the child column to a list of children. showToolStripMenuItem_Click is the event handler for a menu item that just displays the things that have been created, and GetChildren simply constructs a list of children.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using Microsoft.Practices.SoftwareFactories.ServiceFactory.Editors.Grid;

namespace DataGridViewTest
{
    public partial class Form1 : Form
    {
        List<Thing> _things = new List<Thing>();
        DataGridViewBinding _thingsBinding = new DataGridViewBinding();

        public Form1()
        {
            InitializeComponent();
            _childColumn.ValueMember = "This";
            _childColumn.DisplayMember = "Name";

            _thingsBinding.Bind(_dataGrid, _things);
            _childColumn.DataSource = GetChildren();
        }

        private void showToolStripMenuItem_Click(object sender, EventArgs e)
        {
            StringBuilder message = new StringBuilder();
            foreach(Thing thing in _things)
            {
                message.Append(string.Format("{0}, {1}\r\n", thing.Name, thing.Child.Name));
            }
            MessageBox.Show(message.ToString());
        }

        private Child[] GetChildren()
        {
            return new Child[]{
                new Child("freddy"),
                new Child("Alice"),
                new Child("Leo"),
                new Child("Ben")
            };
        }
    }
}
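
And for completeness, here's the sort of wrapper I had in mind for when the item is a framework class you can't add a 'This' property to (a sketch only, the names are mine):

using System;

namespace DataGridViewTest
{
    // wraps an item you can't modify so the column has something to point
    // ValueMember and DisplayMember at
    public class DateWrapper
    {
        DateTime _date;

        public DateWrapper(DateTime date)
        {
            _date = date;
        }

        // set ValueMember to "Item"
        public DateTime Item
        {
            get { return _date; }
        }

        // set DisplayMember to "Display"
        public string Display
        {
            get { return _date.ToShortDateString(); }
        }
    }
}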

I noticed that all the example code in MSDN is devoted to binding the DataGridView to DataSets; I wonder if it's because of this DataSet-centric mindset that the Windows.Forms team is making such basic implementation errors?

Monday, September 25, 2006

Discovering GetService

Whilst looking at the Patterns & Practices GAT code, particularly custom wizard pages, I kept seeing lines like this:

IDictionaryService dictionaryService = GetService(typeof(IDictionaryService)) as IDictionaryService;

I'd never come across this GetService method before; when I hit F1 I found out that it's a method inherited from System.ComponentModel.Component. Now, I thought I sort of understood the whole IComponent / IContainer thing: it's a pattern used by the IDE to allow you to drag and drop classes (components) onto surfaces (other classes), letting us graphically compose applications. This is classic RAD stuff and it's really easy to use: you just implement IComponent, your class appears in the Visual Studio toolbox, and if you double click on the class in the solution explorer it displays a design surface onto which you can drag any other class that implements IComponent. I must admit I hadn't really thought about it much, or used that functionality other than as it's provided out of the box by the forms designer. I'm more of a POCO guy myself; I couldn't really see what benefits implementing classes as components provided other than that neat drag and drop composition stuff.

After Googling for GetService I found this excellent post, Lightweight Containers and Plugin Architectures, by Daniel Cazzulino, where he compares the IContainer pattern with the lightweight container architectures described by Martin Fowler in this post. According to Mr Cazzulino the IContainer pattern provides the .net world with a very nicely designed component architecture that can be used for composing and sharing services at runtime, and that's where the GetService method comes in. I won't repeat all of Daniel's excellent article, which is definitely worth reading, especially the class model describing the relationship between IComponent, IContainer, ISite, IServiceContainer and IServiceProvider. Instead, here's a little practical example of how a dumb server can host components, some of which provide services to the other components, all wired up at runtime using the IContainer pattern. The core trick is to extend the built-in Container to provide an instance of ServiceContainer:

using System;
using System.ComponentModel;
using System.ComponentModel.Design;

namespace ComponentPlay
{
    public class ServiceProvidingContainer : Container
    {
        IServiceContainer _serviceContainer;

        public IServiceContainer ServiceContainer
        {
            get { return _serviceContainer; }
        }

        public ServiceProvidingContainer()
        {
            _serviceContainer = new ServiceContainer();
            _serviceContainer.AddService(typeof(IServiceContainer), _serviceContainer);
        }

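        // components sited in this container route their GetService calls through
        // their ISite to here, so hand the lookup on to the shared ServiceContainer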
        protected override object GetService(Type service)
        {
            return _serviceContainer.GetService(service);
        }
    }
}

Then you can provide a server with your custom Container:

using System;
using System.ComponentModel;
using System.ComponentModel.Design;

namespace ComponentPlay
{
    public class Server
    {
        IContainer _components = new ServiceProvidingContainer();

        public IContainer Components
        {
            get { return _components; }
        }
    }
}

Then I've defined a simple service interface:

using System;
using System.Collections.Generic;
using System.Text;

namespace ComponentPlay
{
    public interface IMessageProvider
    {
        string Message { get; }
    }
}

And a component that implements IMessageProvider. The important thing here is to note how it overrides the Site property of Component to add itself to the service container, thus registering itself as an available service to other components (this is straight out of Daniel's post):

using System;
using System.ComponentModel;
using System.ComponentModel.Design;

namespace ComponentPlay
{
    public class MessageProvider : Component, IMessageProvider
    {
        string _message;

        public MessageProvider(string message)
        {
            _message = message;
        }

        public string Message
        {
            get
            {
                return _message;
            }
        }

        public override ISite Site
        {
            get
            {
                return base.Site;
            }
            set
            {
                base.Site = value;
                
                // publish this instance as a service
                IServiceContainer serviceContainer = (IServiceContainer)GetService(typeof(IServiceContainer));
                if(serviceContainer != null)
                {
                    serviceContainer.AddService(typeof(IMessageProvider), this);
                }
            }
        }
    }
}

Here's a component that can use an IMessageProvider. Note that you cannot guarantee that the service will be available:

using System;
using System.ComponentModel;

namespace ComponentPlay
{
    public class Client : Component
    {
        public void SayHello()
        {
            IMessageProvider mp = (IMessageProvider)GetService(typeof(IMessageProvider));
            if(mp == null)
                Console.WriteLine("IMessageProvider is null");
            else
                Console.WriteLine(mp.Message);
        }
    }
}

And here's a little test program showing it all put together:

using System;
using System.Collections.Generic;
using System.Text;

namespace ComponentPlay
{
    class Program
    {
        static void Main(string[] args)
        {
            Server server = new Server();
            Client client = new Client();
            MessageProvider mp = new MessageProvider("Hello from the message provider");

            server.Components.Add(client);
            server.Components.Add(mp);

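            // MessageProvider registered itself with the service container when it was
            // sited by the Add call above, so this should print "Hello from the message provider"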
            client.SayHello();

            Console.ReadLine();
        }
    }
}

The cool thing about this pattern is that the server is simply a container and has to know nothing about what components and services it hosts. Services can register themselves without coupling to either the server framework or to any other components, and components can discover services without any intervention from the server framework. It's really nice to see yet another useful pattern that comes out-of-the-box with .net; I'm just surprised that it's not more widely advertised. It's really poorly documented, with no real explanation in the MSDN library (or am I just missing something?). I'd love to have found a good article explaining the design choices behind this pattern and some implementation examples.

Friday, September 22, 2006

Just what do you do all day?

I've just read an excellent post by Peter Hallam, who works on the C# compiler at Microsoft. He looks at the amount of time professional developers spend writing new code, modifying existing code and understanding existing code. How much time do you think you spend on each of these activities? I wouldn't have guessed the correct figures, which are:

  • Writing new code: 5%
  • Modifying existing code: 25%
  • Understanding code: 70%

I instinctively think that I spend most of my time writing new code, but that's really not true. I'm currently working on building a guidance automation package, an entirely new product, but I've probably been spending the vast majority of my time reading the documentation and reading the code in the Service Factory GAT to understand the GAT framework. And this is on a new product. I once spent two and a half years of my life maintaining a huge, buggy, spaghetti code base; during that time I probably spent 99% of my time trying to understand existing code and the remaining 1% making small changes.

These figures strongly reinforce my belief that the number one imperative for professional software developers is writing easy to understand, maintainable code, even if it doesn't perform as well and even if it takes longer to write. Think about it: if I spend five times as long writing a well factored, domain model based application versus someone who uses all the Visual Studio tools to hack together something with datasets, no tiers and business logic spread throughout, I only have to make a small dent in the 'understanding code' time in order to dramatically improve my productivity and the productivity of those poor unfortunate people who have to maintain my code long after I'm gone (from the project that is, not dead; I really hope I outlive my software creations :)
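
To put rough numbers on it, purely as an illustration using the figures above: suppose the well factored version really does take five times as long to write, pushing 'writing new code' from 5% to 25% of the original effort. If the extra clarity then shaves just a third off the 'understanding code' time (70% down to about 47%), the overall effort is already slightly lower than before, and every subsequent maintainer gets that understanding saving again without paying the extra writing cost a second time.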

I'm going to store Peter Hallam's blog in my top blog posts list so that next time someone argues that refactoring, layering or writing a proper domain model is taking too much time I'll have some great figures to back me up.

Thursday, September 21, 2006

How to examine code and write a class with EnvDTE

Further to my experiments with the Guidance Automation Toolkit, I've been playing with generating code from my custom guidance package. Looking at the Service Factory GAT that's been released by the Patterns and Practices group, they use three different techniques for code generation: T4 templates, EnvDTE and CodeDom. Since they use all three, I wondered which one I should be using. I've previously used the CodeDom in other projects and although it's very powerful (you describe the syntactic structure of the code and can then generate C#, VB or whatever from it), it is really long winded. T4 templates are at the opposite end of the spectrum, a bit like asp for code generation: you simply write a template of the code you want to generate and put code between <# #> marks that the template engine runs. The problem at the moment is that they are really new and the tools aren't there yet. There's no intellisense or code coloring for it and debugging isn't easy either.

So I decided to have a look at the EnvDTE Visual Studio automation class library for code generation. A lot of the GAT stuff seems to be built around it, so it's a natural fit for code generation duties. Unfortunately the documentation isn't that great, and this little demo of how to navigate a code file and write a class took much longer than it should have. But here it is. It gets the current Visual Studio environment and enumerates through all the projects and project items. It then examines itself, outputting all the code elements, and finally writes a new class inside its own namespace. If you try this out, make sure you name the file it's in 'HowToUseCodeModelSpike.cs'.

using System;
using NUnit.Framework;
using EnvDTE;
using EnvDTE80;

namespace Mike.Tests
{
    [TestFixture]
    public class DteSpike
    {
        [Test]
        public void HowToUseCodeModelSpike()
        {
            // get the DTE reference...
            DTE2 dte2 = (EnvDTE80.DTE2)System.Runtime.InteropServices.Marshal.GetActiveObject("VisualStudio.DTE.8.0");

            // get the solution
            Solution solution = dte2.Solution;
            Console.WriteLine(solution.FullName);

            // get all the projects
            foreach(Project project in solution.Projects)
            {
                Console.WriteLine("\t{0}", project.FullName);

                // get all the items in each project
                foreach(ProjectItem item in project.ProjectItems)
                {
                    Console.WriteLine("\t\t{0}", item.Name);

                    // find this file and examine it
                    if(item.Name == "HowToUseCodeModelSpike.cs")
                    {
                        ExamineItem(item);
                    }
                }
            }
        }

        // examine an item
        private void ExamineItem(ProjectItem item)
        {
            FileCodeModel2 model = (FileCodeModel2)item.FileCodeModel;
            foreach(CodeElement codeElement in model.CodeElements)
            {
                ExamineCodeElement(codeElement, 3);
            }
        }

        // recursively examine code elements
        private void ExamineCodeElement(CodeElement codeElement, int tabs)
        {
            tabs++;
            try
            {
                Console.WriteLine(new string('\t', tabs) + "{0} {1}", 
                    codeElement.Name, codeElement.Kind.ToString());

                // if this is a namespace, add a class to it.
                if(codeElement.Kind == vsCMElement.vsCMElementNamespace)
                {
                    AddClassToNamespace((CodeNamespace)codeElement);
                }

                foreach(CodeElement childElement in codeElement.Children)
                {
                    ExamineCodeElement(childElement, tabs);
                }
            }
            catch
            {
                Console.WriteLine(new string('\t', tabs) + "codeElement without name: {0}", codeElement.Kind.ToString());
            }
        }

        // add a class to the given namespace
        private void AddClassToNamespace(CodeNamespace ns)
        {
            // add a class
            CodeClass2 chess = (CodeClass2)ns.AddClass("Chess", -1, null, null, vsCMAccess.vsCMAccessPublic);
            
            // add a function with a parameter and a comment
            CodeFunction2 move = (CodeFunction2)chess.AddFunction("Move", vsCMFunction.vsCMFunctionFunction, "int", -1, vsCMAccess.vsCMAccessPublic, null);
            move.AddParameter("IsOK", "bool", -1);
            move.Comment = "This is the move function";

            // add some text to the body of the function
            EditPoint2 editPoint = (EditPoint2)move.GetStartPoint(vsCMPart.vsCMPartBody).CreateEditPoint();
            editPoint.Indent(null, 0);
            editPoint.Insert("int a = 1;");
            editPoint.InsertNewLine(1);
            editPoint.Indent(null, 3);
            editPoint.Insert("int b = 3;");
            editPoint.InsertNewLine(2);
            editPoint.Indent(null, 3);
            editPoint.Insert("return a + b; //");
        }
    }
}
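
For reference, the class that ends up being added to the namespace comes out looking roughly like this (the exact whitespace and comment placement depend on the Indent calls and the Comment property, so treat it as an approximation):

public class Chess
{
    // This is the move function
    public int Move(bool IsOK)
    {
        int a = 1;
        int b = 3;

        return a + b; //
    }
}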

Thursday, September 14, 2006

Are you a nerd?

I tried this are you a nerd test today. My result was 'Mid Level Nerd'. I was probably saved by the fact that I'm a family man and my wife kinda provides me with a social life :)

Wednesday, September 13, 2006

How to do database source control and builds

Most business applications are fundamentally database applications. A business application's database is as much a part of the application as its C# source code. Unfortunately, in many development shops versioning and controlling the database is done very differently from the in memory source code. For example:

  • Often a single 'development' database is shared between the developers. This means that if a developer makes a schema change he runs the risk of breaking his colleagues' environments, at least until the next build. Often this leads to a fear of changing the database schema.
  • The database is often versioned independently of the in memory source code. This makes it hard to deploy a specific build because the database backups and the in memory source are often not synchronised.
  • The database is often not source controlled, or if it is, it is done as a single build mega-script. This makes it impossible to package linked database and in memory source changes as a single 'feature' or 'fix'. It makes it hard or impossible to roll out a single failing feature. Also, because the mega-script is often created by choosing the 'script database' feature of the dbms, it is checked into source control under the build manager's login, or whatever login the build scripts are running under. This means that it is impossible to track database changes. I can't go to a single table's create script in source safe, look for the last time it was checked in, find out who made the change and then read the comments to find out why it was changed. The best I can manage is to look through every single version of a huge script file looking for a change, but then there's no way of knowing who made it or why it was made.

How should you manage your database development then? Well the best thing is to treat it as much as you can just like all your other source code:

  • Every developer should develop against their own database instance.
  • A database backup should be stored with the executables from the build. That database backup should be a backup of a database built from the sql scripts retrieved from source control at the same label as all the other source.
  • The database should be maintained as object scripts. Each table, stored procedure, view and function should have its own script. To make any change in their local database a developer should check out the object they want to change, apply that to their database, unit test the change and then check in the change as a package with any other source for that feature or fix. The developer should label the package and comment it with a link to a feature or defect number.
  • Before a developer starts work on the next feature or fix he should get the source from the last build label (which, with continuous integration should also be the latest version) and restore his local database from the same labelled backup.

In order to do this you need tools that allow you to do the following:

  • Build a database from object level source files. Most dbms will choke if you try to just run the scripts in without first working out a build order by examining the object references. The tool must automate this for you; it's far too onerous to try to do this manually.
  • Be able to upgrade a database to a new schema version by calculating 'alter' scripts.

Here's where I plug a product that a couple of guys I know and worked with at NTL have developed, DbGhost. It allows you to adopt all those good practices that I've listed above. I've got it adopted on several projects now, and they still haven't given me any commission or offered me a lucrative consultancy contract :(

Tuesday, September 12, 2006

Windows Live Writer

I've been using Windows Live Writer to write this post and the last one. In fact the last post was something I'd already written for our internal wiki and I was able to cut and paste it from there straight into the WLW window. It worked! It's a really cool tool and makes blogging a lot easier.

How to create a guidance package

After creating my first test guidance package using the Guidance Automation Toolkit (GAT), I've cobbled together some bullet point steps on how to do it. This is really rough at the moment, but I'll be adding to it as my knowledge grows. It took a while because there's nothing like a walkthrough available, and I didn't like using the meta guidance package because I wanted to understand how it all hung together first.

Creating the initial solution

  • Create a new solution with a class library project.
  • Add References:
EnvDTE
EnvDTE80
Microsoft.Practices.Common
Microsoft.Practices.ComponentModel
Microsoft.Practices.RecipeFramework
Microsoft.Practices.RecipeFramework.Common
Microsoft.Practices.RecipeFramework.Library
Microsoft.Practices.RecipeFramework.VisualStudio
Microsoft.Practices.WizardFramework
System
System.Windows.Forms
  • Tools->Guidance Package Manager->Enable/Disable Packages->Choose Guidance Package Development
  • Create a new class library project, name it Installer
  • Add References
Microsoft.Practices.RecipeFramework
Microsoft.Practices.RecipeFramework.VisualStudio
Microsoft.VisualStudio.TemplateWizardInterface
System
System.Configuration.Install
  • Set a project dependency of your guidance package project on the installer project
  • Create a new class called 'InstallerClass' in the installer project.
using Microsoft.Practices.RecipeFramework; 
namespace TestGuidancePackageInstaller
{
    /// <summary>
    /// Installer class for the guidance package
    /// </summary>
    [System.ComponentModel.ToolboxItem(false)]
    public class InstallerClass : ManifestInstaller
    {
    }
}
  • Create the guidance package xml configuration document, MyGuidancePackageName.xml, see documentation for details
<?xml version="1.0" encoding="utf-8" ?>
<GuidancePackage xmlns="http://schemas.microsoft.com/pag/gax-core"
Name="GuidancePackageName"
Caption="My Guidance Package"
Description="A test guidance package"
BindingRecipe="BindingRecipe"
Guid="fdd8f06f-6d6d-4228-96db-f842076764af"
SchemaVersion="1.0">
<Overview Url="Docs\Overview.htm"/>
<Recipes>
<Recipe Name="BindingRecipe">
<Types>
<TypeAlias Name="RefCreator" Type="Microsoft.Practices.RecipeFramework.Library.Actions.CreateUnboundReferenceAction, Microsoft.Practices.RecipeFramework.Library"/>
</Types>
<Caption>Creates unbound references to the guidance package</Caption>
</Recipe>
</Recipes>
</GuidancePackage>
  • Set the properties of MyGuidancePackageName.xml to BuildAction="Content", Copy to Output Directory="Copy if newer"
  • Build the solution
  • On the solution context menu select 'Register Guidance Package'
  • Open a new instance of VS, create a new project, go to Tools->Guidance Package Manager->Enable/Disable Packages
  • You should see your package

Create the Binding Recipe

  • Add a new Recipe element under Recipes
  • Set its name to 'BindingRecipe'
  • Add attribute BindingRecipe="BindingRecipe" to the GuidancePackage root element
  • Add types:
<Types>
<TypeAlias Name="RefCreator" Type="Microsoft.Practices.RecipeFramework.Library.Actions.CreateUnboundReferenceAction, Microsoft.Practices.RecipeFramework.Library"/>
</Types>
  • For each recipe you want to reference, add Actions:
<Action Name="<name of action>" Type="RefCreator" AssetName="<name of recipe to bind>" ReferenceType="<unboundRecipeReference class>" />

Create a recipe

  • Add a new Recipe element under Recipes
  • Set attributes Name="<its name>" Bound="false"
  • Add Caption
  • Add HostData; this adds the recipe to the Solution, Project or Item context menus
<HostData>
<Icon ID="<icon number>"/>
<CommandBar Name="Project"/>
</HostData>
  • Add Arguments to specify the arguments that this recipe requires
  • Add GatheringServiceData to define the wizard that gets the arguments
  • Add Actions to specify what the recipe should do.

How to create and execute a T4 template

  • Create a folder 'Templates' in the guidance package project
  • Create a folder 'Text' in the 'Templates' folder
  • Add a template file with the extension .t4
  • Set the properties of the .t4 file: 'Build Action = Content', 'Copy to output directory = Copy if newer'
  • Write your .t4 template (see documentation on this)
  • Create a new Recipe as above
  • Create Arguments for all the properties of the .t4 template
  • Add an argument for the TargetFileName that adds .cs to the class name argument
<Argument Name="TargetFileName">
<ValueProvider Type="Microsoft.Practices.RecipeFramework.Library.ValueProviders.ExpressionEvaluatorValueProvider, Microsoft.Practices.RecipeFramework.Library" 
Expression="$(ClassName).cs">
<MonitorArgument Name="ClassName"/>
</ValueProvider>
</Argument>
  • Add an argument for the currently selected project
<Argument 
Name="CurrentProject" 
Type="EnvDTE.Project, EnvDTE, Version=8.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a">
<ValueProvider Type="Microsoft.Practices.RecipeFramework.Library.ValueProviders.FirstSelectedProject, Microsoft.Practices.RecipeFramework.Library" />            
</Argument>
  • Add an action to execute the template:
<Action Name="<action name>" 
Type="Microsoft.Practices.RecipeFramework.VisualStudio.Library.Templates.TextTemplateAction, Microsoft.Practices.RecipeFramework.VisualStudio.Library"
Template="Text\<name of template>.t4">
<Input Name="<name of template property>" RecipeArgument="<recipe argument name>"/>
… as many input elements as you have properties
<Output Name="Content"/>
</Action>
  • Add an action to write a file to the currently selected project:
<Action Name="<action name>" Type="Microsoft.Practices.RecipeFramework.Library.Actions.AddItemFromStringAction, Microsoft.Practices.RecipeFramework.Library" Open="true">
<Input Name="Content" ActionOutput="GenerateHelloClassAction.Content" />
<Input Name="TargetFileName" RecipeArgument="TargetFileName" />
<Input Name="Project" RecipeArgument="CurrentProject" />
</Action>

How to use solution and project templates to create a new solution structure from the New->Project menu in VS

  • Add a project folder, 'Templates', to the Guidance Package project.
  • Add a sub folder to 'Templates' called 'Solutions'.
  • Add a file called Solution.vstemplate to the 'Solutions' folder
  • Add an icon named Solution.ico to the 'Solutions' folder
  • Write the Solution.vstemplate. This is a 'multi-project' vstemplate with additions for GAT, example below:
<VSTemplate
Version="2.0"
Type="ProjectGroup"
xmlns="http://schemas.microsoft.com/developer/vstemplate/2005">
<TemplateData>
<Name>Test Guidance Package</Name>
<Description>A guidance package created to learn how to create guidance packages</Description>
<ProjectType>CSharp</ProjectType>
<Icon>Solution.ico</Icon>
<CreateNewFolder>false</CreateNewFolder>
<DefaultName>GatTest</DefaultName>
<ProvideDefaultName>true</ProvideDefaultName>
</TemplateData>
<TemplateContent>
<ProjectCollection>
<ProjectTemplateLink ProjectName="$ProjectName$">Projects\Domain\Domain.vstemplate</ProjectTemplateLink>
</ProjectCollection>
</TemplateContent>
<WizardExtension>
<Assembly>Microsoft.Practices.RecipeFramework.VisualStudio, Version=1.0.60429.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a</Assembly>
<FullClassName>Microsoft.Practices.RecipeFramework.VisualStudio.Templates.UnfoldTemplate</FullClassName>
</WizardExtension>
<WizardData>
<Template xmlns="http://schemas.microsoft.com/pag/gax-template"
SchemaVersion="1.0"
Recipe="CreateSolution">
<References>
</References>
</Template>
</WizardData>
</VSTemplate>
  • Note the Recipe="CreateSolution" attribute of the template element under WizardData. This should point to a recipe defined in the MyGuidancePackage.xml file. This recipe is executed when the solution loads so you can use it to gather information from the user and execute any actions to build the solution items.
  • Under the 'Solutions' folder, create a folder called 'Projects'
  • Under the 'Projects' folder, create a folder for each project. Give it the project name
  • Add a ProjectName.vstemplate file to the project folder. Here's an example:
<VSTemplate
Version="2.0"
Type="Project"
xmlns="http://schemas.microsoft.com/developer/vstemplate/2005">
<TemplateData>
<Name>Domain model</Name>
<Description>A domain model for the application</Description>
<Icon>Domain.ico</Icon>
<ProjectType>CSharp</ProjectType>
<CreateNewFolder>false</CreateNewFolder>
<DefaultName>Domain</DefaultName>
<ProvideDefaultName>true</ProvideDefaultName>
</TemplateData>
<TemplateContent>
<Project File="Domain.csproj" ReplaceParameters="true">
<ProjectItem ReplaceParameters="true">Properties\AssemblyInfo.cs</ProjectItem>
</Project>
</TemplateContent>
<WizardExtension>
<Assembly>Microsoft.Practices.RecipeFramework.VisualStudio, Version=1.0.60429.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a</Assembly>
<FullClassName>Microsoft.Practices.RecipeFramework.VisualStudio.Templates.UnfoldTemplate</FullClassName>
</WizardExtension>
<WizardData>
<Template xmlns="http://schemas.microsoft.com/pag/gax-template"
SchemaVersion="1.0">
<References>
</References>
</Template>
</WizardData>
</VSTemplate>
  • Add a ProjectName.csproj file. This is a standard project template. Here's an example, but you can take any existing .csproj file as a template (just insert the appropriate $variables$ in the right places):

<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<PropertyGroup>
<Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
<Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
<ProductVersion>8.0.30703</ProductVersion>
<SchemaVersion>2.0</SchemaVersion>
<ProjectGuid>$guid1$</ProjectGuid>
<OutputType>Library</OutputType>
<AppDesignerFolder>Properties</AppDesignerFolder>
<RootNamespace>$safeprojectname$</RootNamespace>
<AssemblyName>$safeprojectname$</AssemblyName>
</PropertyGroup>
<PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
<DebugSymbols>true</DebugSymbols>
<DebugType>full</DebugType>
<Optimize>false</Optimize>
<OutputPath>bin\Debug\</OutputPath>
<DefineConstants>DEBUG;TRACE</DefineConstants>
<ErrorReport>prompt</ErrorReport>
<WarningLevel>4</WarningLevel>
</PropertyGroup>
<PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
<DebugType>pdbonly</DebugType>
<Optimize>true</Optimize>
<OutputPath>bin\Release\</OutputPath>
<DefineConstants>TRACE</DefineConstants>
<ErrorReport>prompt</ErrorReport>
<WarningLevel>4</WarningLevel>
</PropertyGroup>
<ItemGroup>
<Reference Include="System"/>
<Reference Include="System.Data"/>
<Reference Include="System.Xml"/>
</ItemGroup>
<ItemGroup>
<Compile Include="Properties\AssemblyInfo.cs" />
</ItemGroup>
<Import Project="$(MSBuildBinPath)\Microsoft.CSHARP.Targets" />
</Project>

  • Add a ProjectName.ico icon file
  • Set the properties of all the items added above to Build Action = "Content", Copy to output directory = "Copy if newer"
  • Add a new folder 'Properties' under the project folder
  • Add an AssemblyInfo.cs file under the Properties folder
  • Insert the appropriate $variables$ to replace the values that visual studio automatically provides. Here's an example:
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// General Information about an assembly is controlled through the following
// set of attributes. Change these attribute values to modify the information
// associated with an assembly.
[assembly: AssemblyTitle("$projectname$")]
[assembly: AssemblyDescription("")]
[assembly: AssemblyConfiguration("")]
[assembly: AssemblyCompany("$registeredorganization$")]
[assembly: AssemblyProduct("$projectname$")]
[assembly: AssemblyCopyright("Copyright © $registeredorganization$ $year$")]
[assembly: AssemblyTrademark("")]
[assembly: AssemblyCulture("")]

// Setting ComVisible to false makes the types in this assembly not visible
// to COM components. If you need to access a type in this assembly from
// COM, set the ComVisible attribute to true on that type.
[assembly: ComVisible(false)]

// The following GUID is for the ID of the typelib if this project is exposed to COM
[assembly: Guid("$guid1$")]

// Version information for an assembly consists of the following four values:
//
// Major Version
// Minor Version
// Build Number
// Revision
//
[assembly: AssemblyVersion("1.0.0.0")]
[assembly: AssemblyFileVersion("1.0.0.0")]

  • Set the properties of the AssemblyInfo.cs file to Build Action="Content", Copy to output directory = "Copy if newer"
  • Build the solution and Register the guidance package
  • Open a new instance of Visual Studio, select File->New->Project, Your Guidance Automation Package should now appear under 'Guidance Packages'.

Adding documentation to your guidance package

  • Create a new solution folder called 'Docs'.
  • Add a new HTML page called 'Overview.htm'
  • Set the properties of the Overview.htm file to Build Action="Content", Copy to output directory = "Copy if newer"
  • Add the following <Overview> element to your MyGuidancePackage.xml file under the document element:
<Overview Url="Docs\Overview.htm"/>
  • When you choose your guidance package, the Overview.htm page will display in the guidance navigator window.