One of the common themes throughout the DDD book is that much of the nuts and bolts of structural domain-driven design is just plain good use of object-oriented programming. This is certainly true, but DDD adds some direction to OOP, along with roles, stereotypes and patterns. Much of the direction for building entities at the class level can, and should, come from test-driven development. TDD is a great tool for building OO systems, as we incrementally build our design with only the behavior that is needed to pass the test. Our big challenge then is to write good tests.
To fully harness TDD, we need to be highly attuned to the design that comes out of our tests. For example, suppose we have our traditional Customer and Order objects. In our world, an Order has a Customer, and a Customer can have many Orders. The relationship is bidirectional because our application navigates it from both ends. In the last post, we worked to satisfy invariants to prevent an unsupported and nonsensical state for our objects.
We can start with a fairly simple test:
[Test]
public void Should_add_the_order_to_the_customers_order_lists_when_an_order_is_created()
{
    var customer = new Customer();
    var order = new Order(customer);
    customer.Orders.ShouldContain(order);
}
At first, this test does not compile, as Customer does not yet contain an Orders member. To make this test compile (and subsequently fail), we add an Orders list to Customer:
public class Customer
{
    public Customer()
    {
        // Initialize the list so the Order constructor can add to it
        Orders = new List<Order>();
    }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Province { get; set; }
    public List<Order> Orders { get; set; }
    public string GetFullName()
    {
        return LastName + ", " + FirstName;
    }
}
With the Orders now exposed on Customer, we can make our test pass from the Order constructor:
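public class Order
{
    public Order(Customer customer)
    {
        Customer = customer;
        customer.Orders.Add(this);
    }
    // Back-reference to the customer that owns this order
    public Customer Customer { get; private set; }
}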
And all is well in our development world, right? Not quite. This design exposes quite a bit of functionality that I don't think our domain experts need, or want. The design above allows some very interesting and very wrong scenarios:
[Test]
public void Not_supported_situations()
{
    // Removing orders?
    var customer1 = new Customer();
    var order1 = new Order(customer1);
    customer1.Orders.Remove(order1);
    // Clearing orders?
    var customer2 = new Customer();
    var order2 = new Order(customer2);
    customer2.Orders.Clear();
    // Duplicate orders?
    var customer3 = new Customer();
    var customer4 = new Customer();
    var order3 = new Order(customer3);
    customer4.Orders.Add(order3);
}
With the API I just created, I allow a number of rather bizarre scenarios, most of which make absolutely no sense to the domain experts:
Clearing orders
Removing orders
Adding an order from one customer to another
Inserting orders
Re-arranging orders
Adding an order without the Order's Customer property being correct
This is where we have to be a little more judicious in the API we expose for our system. All of these scenarios are possible in the API we created, but now we have some confusion on whether we should support these scenarios or not. If I'm working in a similar area of the system, and I see that I can do a Customer.Orders.Remove operation, it's not immediately clear that this is a scenario not actually coded for. Worse, I don't have the ability to correctly handle these situations if the collection is exposed directly.
Suppose I want to clear a Customer's Orders. It logically follows that each Order's Customer property would be null at that point. But I can't hook in easily to the List<T> methods to handle these operations. Instead of exposing the collection directly, I will expose only those operations which I support through my domain.
Moving towards intention-revealing interfaces
Let's fix the Customer object first. It exposes a List<T> directly, and allows wholesale replacement of that collection. This is the complete antithesis of intention-revealing interfaces. I will now only expose the sequence of Orders on Customer:
public class Customer
{
    private readonly IList<Order> _orders = new List<Order>();
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Province { get; set; }
    public IEnumerable<Order> Orders { get { return _orders; } }
    public string GetFullName()
    {
        return LastName + ", " + FirstName;
    }
}
This interface explicitly tells users of Customer two things:
Orders are read-only, and cannot be modified through this aggregate
Adding orders is done somewhere else
I now have the issue of the Order constructor needing to add itself to the Customer's Order collection. I want to do this:
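public class Order
{
    public Order(Customer customer)
    {
        Customer = customer;
        customer.AddOrder(this);
    }
    // Back-reference to the customer that owns this order
    public Customer Customer { get; private set; }
}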
Instead of exposing the Orders collection directly, I work through a specific method to add an order. But I don't want AddOrder available everywhere; I want to support the enforcement of the Order-Customer relationship only through this explicitly defined interface. I'll do this by adding an AddOrder method, but marking it as internal:
public class Customer
{
    private readonly IList<Order> _orders = new List<Order>();
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Province { get; set; }
    public IEnumerable<Order> Orders { get { return _orders; } }
    // internal: only code in the domain assembly can add orders
    internal void AddOrder(Order order)
    {
        _orders.Add(order);
    }
}
There are many different ways I could enforce this relationship, from exposing an AddOrder method publicly on Customer to the approach above. But either way, I'm moving towards an intention-revealing interface, and only exposing the operations I intend to support through my application. Additionally, I'm ensuring that all invariants of my aggregates are satisfied at the completion of the Create Order operation. When I create an Order, the domain model takes care of the relationship between Customer and Order without any additional manipulation.
If I publicly expose a collection class, I'm opening the doors for confusion and future bugs as I've now allowed my system to tinker with the implementation details of the relationship. It's my belief that the API of my domain model should explicitly support the operations needed to fulfill the needs of the application and interaction of the UI, but nothing more.
Lately I've been trying to return IEnumerable<T> whenever I need a collection that will only be enumerated or databound to something. This prevents me from making changes to the collection outside the context of the collection's parent entity. The problem with doing this is that I might need to write a unit test that looks for a specific item in the collection, checks the count of the collection or otherwise needs to do something that the IEnumerable<T> interface doesn't provide.
With tools like ReSharper, it's easy to change the return types of the methods that you're getting the collection from and use an IList<T> or some other collection type that allows you to get at the information you want. However, this can lead to broken encapsulation and other potential problems in code. After all, I wanted to keep the collection encapsulated within the parent entity, which is why I chose to use IEnumerable<T> in the first place.
The good news is that there's a super simple solution to this situation that does not require changing the IEnumerable<T> return type. Have your test code wrap the IEnumerable<T> in an IList<T>.
var myCollection = new List<MyObject>(myEnumerator);

[Test]
public void my_test()
{
    myCollection.Count.ShouldBe(1);
    myCollection[0].ShouldEqual(myObject);
    //etc.
}
If you're doing interaction testing with an interface and a mock object, where the interface receives an IEnumerable<T>, you can still use this trick. For example, if I have this method on an interface definition:
5: DisplayedProductCodes = new List<Lookup>(lookups);
6: return true;
7: });
8: return view;
Line 5 wraps up the IEnumerable<Lookup> into an IList<Lookup> object, letting me test the contents/count/etc on the collection.
Now you never need to worry about whether you can test the IEnumerable<T> when you are passing it around in your code. Just wrap it in an IList<T> at test time and call your tests the way you need to.
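As a self-contained illustration of the same trick, here is a minimal sketch that runs without a test framework. The Basket class, its Items property and the spec method are hypothetical names made up for this example; the assertions are plain exceptions rather than the ShouldBe/ShouldEqual extensions used above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity: it owns the list and exposes only a sequence.
public class Basket
{
    private readonly IList<string> _items = new List<string>();

    // Callers can enumerate, but get no Add/Remove/Clear through this property.
    public IEnumerable<string> Items { get { return _items; } }

    public void AddItem(string item)
    {
        _items.Add(item);
    }
}

public static class BasketSpec
{
    // Hypothetical test body, written with plain exceptions as assertions.
    public static void Items_can_be_inspected_through_a_list_copy()
    {
        var basket = new Basket();
        basket.AddItem("apple");

        // Copy the sequence into a List<T> so Count and the indexer are available.
        var items = new List<string>(basket.Items);

        if (items.Count != 1) throw new Exception("expected exactly one item");
        if (items[0] != "apple") throw new Exception("expected the item to be 'apple'");
    }
}
```

Keep in mind that new List<T>(sequence) takes a copy: the test sees a snapshot, so anything added to the entity after the copy is made will not show up in the wrapped list.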
I am proud to announce that John Petersen has joined LosTechies! We are all looking forward to his perspective and insights into software development.
I do most of my UI development – in ASP.NET WebForms and in WinForms – with a Model-View-Presenter setup. It helps me keep my application logic separate from my view implementations, makes it possible to unit test the presenters, etc. I also like to use custom controls – often with their own presenter – to help encapsulate UI related process and keep my UI implementations clean. The challenge with custom controls is getting them to converse with each other and getting the parent form to converse with the controls. My favorite way of solving this challenge is through simple messaging patterns. This gives you a lot of control and ensures your system is nice and decoupled. Of course, there is a cost/benefit tradeoff that needs to be considered. You may not need the indirection and potential complexities that come along with those solutions. The system in question may not need a messaging system, event aggregator, command pattern or whatever else. There are times when it's easier and makes more sense to forego these patterns and have the presenters talk directly to each other.
Role Specific Interfaces
When the cost of the messaging pattern architecture outweighs the benefits, stick to simple abstractions that still keep the presenters decoupled by one layer. This can easily be done with an interface or abstract base class in static languages like C#, Java and C++. However, don't take the easy way out in this abstraction by creating a one-to-one mapping between the abstraction and the implementation. Doing so will create a semantic coupling between the two presenters.
For example, the IProductCodeSelectionPresenter may have the following definition:
public interface IProductCodeSelectionPresenter
{
    void Initialize();
    void ProductCodeSelected(ProductCode code);
    ProductCode GetSelectedProductCode();
    void SelectionCancelled();
    void SelectionConfirmed();
}
Which of these methods should another presenter call in order to retrieve the ProductCode? Should GetSelectedProductCode be called? Does this method guarantee that the view for selecting a product code was shown and that the user has actually specified a product code? Or maybe the ProductCodeSelected method should be called instead, or Initialize or ... This easy-to-create interface may cause semantic coupling by forcing another developer to look at the implementation in order to know which methods should be called, and when.
It would be better to define a role that the presenter is playing in the communication and create an interface that is specific to that role. In this situation, the name of the presenter provides some insight to what role the presenter is playing - product code selection. A simple role specific interface for this presenter may look like this:
public interface IProductCodeSelector
{
    ProductCode GetProductCode();
}
With an interface defined like this, a developer calling this code will not have any confusion on what needs to be called. There is no need to look at the implementation of the interface, and semantic coupling has been avoided. Making a call to this interface is easy.
The Interface Segregation Principle (ISP)
The driving principle in making the decision to create the role specific interface is often the Interface Segregation Principle (ISP). This principle says that we should not force a client – the code that is calling out to our interface – to know about methods and properties that it does not need.
In this case, the client code does not need to know that the interface sits on top of a presenter. Therefore, take the name "presenter" out of the interface that the client calls. This gives the interface more flexibility for the future and prevents the client code from knowing that a view and user input is likely to be the implementation. The client code also doesn't need to know about the Initialize, ProductCodeSelected and other methods that the presenter has. These methods are specific to the interactions between the View implementation and the Presenter – a different role that the presenter is playing. By removing these methods from the interface, the client code is no longer bound to the knowledge of which methods should and should not be called, when.
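One way to picture the two roles is a single presenter class that implements two narrow interfaces, one per audience. This is only a sketch under assumptions: IProductCodeSelectionViewEvents is a hypothetical name for the view-facing role, ProductCode is reduced to a single Value property, and the presenter stores the selection in a field instead of driving a real view.

```csharp
using System;

// Simplified stand-in for the ProductCode type used in the post.
public class ProductCode
{
    public ProductCode(string value) { Value = value; }
    public string Value { get; private set; }
}

// Role seen by other presenters: "give me a product code".
public interface IProductCodeSelector
{
    ProductCode GetProductCode();
}

// Role seen by the view (hypothetical name): user-interaction callbacks.
public interface IProductCodeSelectionViewEvents
{
    void ProductCodeSelected(ProductCode code);
    void SelectionCancelled();
}

// One class, two roles. Each caller depends only on the role it needs.
public class ProductCodeSelectionPresenter : IProductCodeSelector, IProductCodeSelectionViewEvents
{
    private ProductCode _selected;

    public ProductCode GetProductCode()
    {
        // A real presenter would show its view here and wait for the user;
        // this sketch just returns whatever the view last reported.
        return _selected;
    }

    public void ProductCodeSelected(ProductCode code)
    {
        _selected = code;
    }

    public void SelectionCancelled()
    {
        _selected = null;
    }
}
```

A client that is handed the presenter as an IProductCodeSelector cannot see ProductCodeSelected or SelectionCancelled at all, so there is nothing for it to call at the wrong time.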
The Dependency Inversion Principle (DIP)
The Dependency Inversion Principle (DIP) may also be at play in this scenario. DIP is not just about creating an abstraction and passing it into a constructor. That would only be dependency abstraction and dependency injection. Rather, DIP talks about abstraction ownership. In the case of a role specific interface, the owner of that interface is the code that depends on it – the client code that calls out to it.
If another presenter, such as a ProductDefinitionPresenter, is the driving force behind the need to create the IProductCodeSelector interface, then this presenter should own that abstraction. This means that the ProductDefinitionPresenter determines what that interface looks like. What methods and properties are available, and the name of the interface, are all driven by the needs of the ProductDefinitionPresenter.
        var productCode = ProductCodeSelector.GetProductCode();
        //do something with the productCode, here
    }
}
There is no syntax or markup that declares ProductDefinitionPresenter as the owner of this interface, in this example. That responsibility is left to the standards, conventions and organizational means of the system in question and the team that maintains it.
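To make the ownership idea concrete, here is a minimal sketch of a client that declares exactly what it needs and receives an implementation from outside. It is not the original listing: the constructor injection, the AssignProductCode method, the AssignedCode property and the FixedProductCodeSelector stand-in are all assumptions made for illustration.

```csharp
using System;

// Simplified stand-in for the ProductCode type used in the post.
public class ProductCode
{
    public ProductCode(string value) { Value = value; }
    public string Value { get; private set; }
}

// Owned by the client: its shape follows what ProductDefinitionPresenter needs,
// not what any particular implementation happens to offer.
public interface IProductCodeSelector
{
    ProductCode GetProductCode();
}

public class ProductDefinitionPresenter
{
    private readonly IProductCodeSelector _productCodeSelector;

    public ProductDefinitionPresenter(IProductCodeSelector productCodeSelector)
    {
        _productCodeSelector = productCodeSelector;
    }

    // Hypothetical property so the result can be observed from outside.
    public string AssignedCode { get; private set; }

    // Hypothetical method: asks the selector role for a code and records it.
    public void AssignProductCode()
    {
        var productCode = _productCodeSelector.GetProductCode();
        AssignedCode = productCode == null ? null : productCode.Value;
    }
}

// Hypothetical implementation; in the application this role would be
// played by the product code selection presenter.
public class FixedProductCodeSelector : IProductCodeSelector
{
    private readonly ProductCode _code;

    public FixedProductCodeSelector(ProductCode code) { _code = code; }

    public ProductCode GetProductCode() { return _code; }
}
```

Because the interface belongs with the client, one convention a team might adopt is to keep IProductCodeSelector in the same namespace or assembly as ProductDefinitionPresenter, with implementations referencing it from elsewhere.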
Other Considerations
Model-View-Presenter scenarios are not the only place that roles need to be considered. Any time two or more objects interact and there is a need for them to be decoupled, the roles that the objects are playing need to be considered. There are likely other principles and patterns that come into play when considering a role specific interface, as well. Each scenario's needs must be considered for their own reasons, and ISP and DIP may not always be at play when defining an interface for an object. And role specific interfaces are not always needed. There are other benefits to creating interfaces or other abstractions that can be referenced in place of concrete implementations such as dependency injection, general decoupling, creating service layer or other context specific barriers, etc.
I'm writing a spec with a mock object, and that mock object returns data to the class under test. In these situations, I don't bother asserting that my mock object's method was called because I know that if it's not called, the data I need in the class under test won't be there and I'll end up having other unexpected behavior. This falls under the general guideline of 'test output and interactions, not input'.
In this specific situation, I am taking a value from user input and using that value to load up some data from a mock repository. I find myself wondering if I should specify the value that is being passed into the mock object's method so that the mock will only return the data I need if the method is called with the right value. To illustrate in code, here are the two different ways I could do this.
1. Always return the data from the stubbed method call:
The difference is on line 5 – the use of Is.Anything vs. a literal value of 1. It seems that the arguments should be specified when it matters what the arguments are... when the arguments are going to determine whether or not the right thing is being done. In this situation, it seems to me that the argument is important. If I'm not specifying the value that was selected when calling the GetGroupLookups, then my code has failed to account for the user's input and it will likely produce the wrong behavior. The counterpoint to this is that the test where this stub definition lives becomes a little more brittle.
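The trade-off can be sketched with two hand-rolled stubs in place of a mocking library. Everything here except the GetGroupLookups name is an assumption for illustration: the ILookupRepository interface, the int parameter, the string-list return type and both stub classes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical repository interface; only the method name comes from the post.
public interface ILookupRepository
{
    IList<string> GetGroupLookups(int groupId);
}

// Option 1: return the canned data no matter which argument is passed.
public class AnyArgumentStub : ILookupRepository
{
    private readonly IList<string> _data;

    public AnyArgumentStub(IList<string> data) { _data = data; }

    public IList<string> GetGroupLookups(int groupId)
    {
        return _data;
    }
}

// Option 2: return the canned data only when the expected argument is passed.
public class MatchingArgumentStub : ILookupRepository
{
    private readonly int _expectedGroupId;
    private readonly IList<string> _data;

    public MatchingArgumentStub(int expectedGroupId, IList<string> data)
    {
        _expectedGroupId = expectedGroupId;
        _data = data;
    }

    public IList<string> GetGroupLookups(int groupId)
    {
        // Any other argument gets an empty list, so a class under test that
        // ignores the user's selection ends up with no data to display.
        return groupId == _expectedGroupId ? _data : new List<string>();
    }
}
```

With the first stub, a spec still passes if the class under test passes the wrong id. With the second, the same bug leaves the class with an empty list and the spec fails, at the cost of tying the spec to the exact argument value.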
So the question is when should I return the data no matter what arguments are used vs. when should I only return the data when the right arguments are used? I know the answer is "it depends", as that's the only valid answer to any code question. :) But I'm looking for some input from the rest of the world on when they do / don't require the right arguments and why.
This past week, I attended a presentation on Object-Role Modeling (with the unfortunate acronym ORM) and its application to DDD modeling. The talk itself was interesting, but more interesting were some of the questions from the audience. The gist of the tool is to provide a better modeling tool for domain modeling than traditional ERM tools or UML class diagrams. ORM is a tool for fact-based analysis of informational models, information being data plus semantics. I'm not an ORM expert, but there are plenty of resources on the web.
One of the outputs of this tool could be a complete database, with all constraints, relationships, tables, columns and whatnot built and enforced. However, the speaker, Josh Arnold, mentioned repeatedly that it was not a good idea to do so, or at least it doesn't scale at all. It could be used as a starting point, but that's about it.
Several times at the end of the talk, the question came up, "can I use this to generate my domain model" or "database". Tool-generated applications are a lofty, but extremely flawed goal. Code generation is interesting as a one-time, one-way affair. But beyond that, code generation does not work. We've seen it time and time again. Even though the tools get better, the underlying invalid assumption does not change.
The fundamental problem is that visual code design tools can never and will never be as expressive, flexible and powerful as actual code. There will always be a mismatch here, and it is a fool's errand to try to build anything more. Instead, the ORM tool looked quite useful as a modeling tool for generating conversation and validating assumptions about the domain, rather than as a domain model builder.
Ultimately, the only validation that our domain is correct is the working code. There is no silver bullet for writing code, as there is always some level of complexity in our applications that requires customization. And there's nothing that codegen tools hate more than modification of the generated code. However, I'm open to the idea that I'm wrong here, and I would love to be shown otherwise.
I've been working in Ruby for my Albacore project over the last 6 or 8 months, and taking every chance I can find to learn how to really use the language effectively. One of the benefits I'm seeing in a dynamic language like Ruby is the ability to really DRY up your code through its dynamic/duck type system, and through metaprogramming.
I've noticed in my Ruby code that I tend to see repeated patterns of implementation in a different light. Rather than seeing the things that make each repetition of the pattern different, I tend to see the things that make each repetition of the pattern the same. I notice the same structure used with different variable names, the same method calls used with different parameters, and context specific method calls as the outliers that made me duplicate the code in the first place. When I see these patterns, my mind begins to run down the path of "this code is duplicated... how can I eliminate that duplication?" Whereas in C#, I almost immediately see the differences as "these are different calls based on the context and I can't eliminate this repeated pattern of code because of the unique calls each has to make."
I'm not sure why my mind has been operating this way with C#, but I know that it has been doing this for a very long time. I've often written the same pattern of code 6 or 8 times, or more in some cases – especially when it comes to UI code and event handlers from UI controls. I wrote a prime example of repeated patterns in C# just today, on a UI that has 4 ComboBox controls on it. Each combobox has a SelectedIndexChanged event handler that gets the selected value and pushes it to the presenter via a presenter method that is specific to the value being pushed. Here's the code in all its glorious duplication:
When I wrote this code and looked back at it, I had my usual feeling of "well, these presenter calls are specific to the context of the combo box being selected, so I can't really do anything to eliminate this repeated pattern of code." I even went so far as to think "man, if this were Ruby, I wouldn't have any issue killing this repeated pattern." That's when a little voice in the back of my head started shouting at me and I realized that I could eliminate the duplication in C# just as easily as I could in Ruby with the use of anonymous delegates.
Method Blocks And Anonymous Delegates
One of the techniques I often use in Ruby to help DRY up repeated code is Ruby's method blocks, which are basically the equivalent of an anonymous delegate in C#. These two code samples are functionally equivalent...
Ruby Method Block With Named Parameter
def my_method(&block)
  name = "derick"
  block.call(name) unless block.nil?
end

my_method do |name|
  puts "the name is: #{name}"
end
C# Anonymous Delegate (Lambda) With Named Parameter
public void MyMethod(Action<string> block)
{
    string name = "derick";
    if (block != null)
        block(name);
}

MyMethod(name => {
    Console.WriteLine("the name is: " + name);
});
Eliminating This Repeated Pattern
After I finally decided to listen to that little voice shouting at me and use my tools to their full extent, I rewrote the event handlers into the following code, using an Action delegate and anonymous lambda block to execute the context specific presenter calls.
    LookupSelected(cboGroups, l => _presenter.GroupSelected(l));
}
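For readers who want to see the whole shape of this refactoring, here is a minimal sketch that runs without WinForms. Only the LookupSelected call, cboGroups and GroupSelected appear in the fragment above; FakeComboBox, the Lookup class, RecordingPresenter, CategorySelected and the handler method names are hypothetical stand-ins, and the body of LookupSelected is a guess at what such a helper might do.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for a UI combo box; it has only what the example needs.
public class FakeComboBox
{
    public object SelectedItem { get; set; }
}

public class Lookup
{
    public Lookup(string name) { Name = name; }
    public string Name { get; private set; }
}

// Hypothetical presenter that records which context specific call was made.
public class RecordingPresenter
{
    public readonly List<string> Calls = new List<string>();

    public void GroupSelected(Lookup lookup) { Calls.Add("group:" + lookup.Name); }
    public void CategorySelected(Lookup lookup) { Calls.Add("category:" + lookup.Name); }
}

public class LookupForm
{
    private readonly RecordingPresenter _presenter;
    public readonly FakeComboBox cboGroups = new FakeComboBox();
    public readonly FakeComboBox cboCategories = new FakeComboBox();

    public LookupForm(RecordingPresenter presenter)
    {
        _presenter = presenter;
    }

    // The part every handler shares lives here exactly once.
    private static void LookupSelected(FakeComboBox comboBox, Action<Lookup> notifyPresenter)
    {
        var lookup = comboBox.SelectedItem as Lookup;
        if (lookup != null)
            notifyPresenter(lookup);
    }

    // Each handler supplies only the part that differs: which presenter call to make.
    public void GroupsSelectedIndexChanged()
    {
        LookupSelected(cboGroups, l => _presenter.GroupSelected(l));
    }

    public void CategoriesSelectedIndexChanged()
    {
        LookupSelected(cboCategories, l => _presenter.CategorySelected(l));
    }
}
```

The lambda is the C# counterpart of the Ruby block: the helper owns the repeated structure, and each caller passes in the one line that is specific to its context.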
Lessons Learned
This certainly isn't anything extraordinary, mind you. I've written methods with delegates and lambda blocks more often than I can remember. This code is not complex, it's not difficult to write, it's not difficult to read or understand. But that's the beauty of it. It's simple, elegant, and eliminates the repeated pattern that I was creating. There are probably some additional tweaks I could make, honestly, but I also want to keep in mind the readability and understandability of the code – not just how often a pattern is repeated.
The significance of this is not in the code that I wound up writing, but in how I came to that decision. My exposure to Ruby and my predisposition to see repeated patterns of code in Ruby as duplication that should be eliminated finally made a jump across the neural pathways of my brain into C# land. I was able to take a paradigm from a different language and different set of optimizations and capabilities, and redefine my own understanding of the current paradigms and capabilities of this situation. That kind of cross-breeding and transfer of knowledge is critical to our ability to come up with new and creative solutions in situations where we believe we already have mastery.
Do yourself a favor – learn a new paradigm of development or whatever your job entails. You'll never truly be able to say "use the right tool for the job" unless you actually know how to use the tools available, and you never know when the paradigms of one tool will cross the boundaries of your experience and begin to show you new solutions to existing problems.
Given my recent experiences with Ruby, my cursory knowledge of Java, and my past experiences with other object oriented systems, I find myself asking a lot of questions about why we do things the way we do in the C#/.NET space. Today's questioning is about one of those fundamental things that I have been preaching for a long time, yet suddenly find myself unable to answer ‘why':
field vs property: does it really matter?
As I sit and think about it in the back of my head while writing code or this post, I can't really say that I have any good reason for saying "you should use a property instead of a field" other than the de facto answer of encapsulation. But what if encapsulation doesn't matter when you just want to store and retrieve a simple piece of data? Why is it oh so important to use a property as the public API for a class?
Ruby doesn't really have properties. It just has methods that allow you to use an = sign, but those methods are not even required to "set" a value. It's only the conventional use of = methods that say you should. Java doesn't have properties at all. It's convention to use getWhatever and setWhatever methods, but it's better design to not use mutators and I like the general reasons behind that.
So why does it matter if it's a field vs. a property? Someone convince me that it's important. Someone convince me that it's not. Better yet – someone explain the contexts in which it is important and the contexts in which it's not, and someone point out where it's important to hide the data behind the process through a mutator-less API of service methods.
...
Question everything – especially your own assumptions.
Last week marked the one year anniversary of our team's first ASP.NET MVC application in production. We really have two different types of production: internal and external. While an internal application might get used by 2 to 100 people, our external applications get used by millions. After chatting with some members of the team and looking at the source code from a year ago, I'd like to share some thoughts with you. Let's take a look back into the past, all the way back to the year 2009.
What Did We Learn?
Sacrifices Were Made
I'll be the first to admit that the project wasn't where most of us wanted it a year ago, but the sacrifices we did make were concessions we were willing to live with. The easy one to point out is stuffing objects/data in ViewData. This seemed like a good idea at the time (and it was fast), but just gets messy in a hurry. You won't find this in many places anymore. We've been cleaning up that sort of thing while we're in those areas and things look much tidier now.
Flexibility Was Achieved
The ability to cover our code with tests in MVC was much easier than the previous code in WebForms. This has allowed our code to be quite malleable. There are still some areas that are harder to test, but they disappear each week as we simultaneously see our code coverage marks rising.
Treat Routing With Respect
Routes are great, but they're better when created earlier rather than later. If you're starting on something new, be sure you're thinking about your route conventions towards the beginning of the project and not as an afterthought. If you want to change these after the application is in production, you might end up breaking URLs or having to do some rewriting. The other big factor is realizing your site's information architecture and how important it is to your site's URL/directory structure. Big bonus points if you can nail this down right away and not have to worry about it after version 1.
Comments From the Team
With the ability to keep views small and concise I am left with significantly less (duplicated) code to wade through while working with CSS and jQuery. I've noticed a definite decrease in time spent working in individual files allowing me more time to spend on enhancements and improvements. This last year I've designed more pages and worked on far more new features than any other year before, and hoping to continue that trend into the next year. – Jessica
MVC provides the hooks to quickly and easily do what we want. Case in point, the other day in our dev session we were looking at overriding a "default" page with a custom page just by the existence of the page (view). With webforms this is possible but it's a pain with the page controller pattern dictating the flow of things. This allows us to more quickly respond to the needs of the business with less code in a more discoverable location; arguably the code we implemented is much clearer than an HttpModule (with webforms). - Tim
What am I Looking Forward to in the Coming Year?
Conventions, Conventions, Conventions
Nothing new here. The more time we spend in our code, the more certain parts of it look alike. Humans are good at spotting patterns and our group is very "humany". (If that word isn't invented yet... patent pending!) In all seriousness though, adhering to conventions can greatly reduce the time to market for new features, just ask the Ruby crowd. We're finding more and more opportunity for conventions all the time.
Builders/Templates
Essentially just more convention ideas. MVC2 will allow us to use model templates (like input builders in MvcContrib), which we're likely going to take advantage of. The typical usage I'm seeing is input forms, but we don't have tons of forms within our project. We do have a lot of code that can make use of jQuery and progressively enhanced displays as well as templating many of our repeated and/or similar models.
Thanks...
Obviously we're not the only ones out there who use MVC. There seem to be many who are blogging about the pros/cons as well as giving tips and tricks on how to make MVC work better. I just want to say thanks, and keep it up!
I have an obscure, personal blog to which I recently fled back. I posted the following entry:
Well whadaya know? I'm having existential doubts these days. There's a battle that rages inside of me every time I get a little bit of free time: what should I spend my precious little time on? Until recently, I had been trying to get an enterprise-y application going, but it filled me with dread. Every line of code had to be extracted from my body with forceps, put into place and tested. It was a fine equilibrium, a delicate balance that I maintained by making myself believe that this is what I wanted to do. It was more than just something to do on a rainy day; it was more than just a reason to go out and hang out in a coffee shop all afternoon on Sundays. It was what I wanted to do. But I hated it. Well, most of it.
Anyway, that balance tipped over when I lost control of the app. Slowly, I stopped unit testing features. Then I simply started hacking it without too much thought or process. I started drifting. Thoughts would come and distract me. What is the next big thing? How can I become hugely successful and never have to worry about money again? Should I remain a Windows developer, forever branded as a .Net guy, or should I venture to the free ecosystem that is Linux? What about Mono? Oh, I could write server apps with Mono! But what about Lisp? I've always wanted to learn Lisp! I could write a Command & Query type application and start with the Login system and write it in Lisp. Or even better yet, I should write a .Net version of Common Lisp, just like that dude that wrote Clojure on top of the JVM. And of course, it would have to be done entirely by using the command line and VIM... Is there a book on Amazon for VIM?..
And on, and on, and on it would go. Every Sunday. Torture. I would power on my laptop and sit transfixed in front of the monitor, my hands calmly positioned over the keyboard. There would be a shell prompt and a VIM window open, waiting. Oh, the possibilities! And yet, I couldn't come up with anything to do. Eventually, I'd give up and log on to twitter, facebook, news.ycombinator, arstechica, slashdot, news.google, nytime, programmer-looking-for-a-problem-to-solve-that-wont-bore-him-to-death.com... I'd then slam shut the laptop lid and be in a funky mood all day. Hell, that's exactly what happened today. Again!
Except that today I decided it was all over. I would stop wasting the rest of my life pursuing something I'm starting to feel weak at. Focus on my strength. That's what I need to do! But then I read a blog post about using object databases and cracked open my Common Lisp book. Damn it! I feel excited about programming again. It makes me feel like getting my laptop down from the shelf where it's quietly sitting and hack on something.
Why?
I'm clearly passionate about something! I just can't put my finger on it. So here's what I'm going to do. I'm gonna blog about that until I figure out what it is. Yeah. Blogging. That's so 2005! It's so not the next big thing.... oh... here I go again!
It's only been a month, and already my mind is on something else. You guys have read my posts, or if you haven't, go check my post history. I'm all over the place! Why the hell did I want to learn Lisp? Why go away from what I know best? I think we all want to do something significant in life. We all want to get better. But in my case, I clearly need some focus. Instead of learning something new and obscure (and hard!), why not get better at what I already do? I'll admit it: I'm not a spare time or weekend coder. I can't do it. I don't pull the all-nighters. I don't stay up until 2am. I actually try to get between 8 and 9 hours of sleep every night because that's what makes me feel great during the day. This sort of balance is what I need. When I try to focus too much on code, I end up developing this allergic reaction that causes the kind of posts I just showed you. It's not fun. Yet, I envy all of you who do code all the time. I feel that not doing it slows me down on my journey to become a better programmer.
I suppose the journey is what's important. The important thing is that we go forward. The pace doesn't matter. As for me, I'm not forcing myself to code anymore. I'm enjoying my time with my family and feel I'm a better developer at work because of it.
This being my first blog post at Los Techies, I want to first say how excited I am to be part of a vibrant community of developers. For some time now, I have been "going it alone" with my blog efforts. I believe that when afforded an opportunity to add to an ecosystem that has such a positive effect on the development community at large - you don't let that opportunity pass. You can find my former blog at www.johnvpetersen.com.
A little about me... I have been a professional software developer for about 20 years. Currently, my development focus is in the ASP MVC space - and I love it. I have also been dabbling in Ruby on Rails. I have a forthcoming article in Code Magazine where I discuss my efforts at porting Nerd Dinner to Rails. Most recently, I have been making the rounds to local Code Camps (Philadelphia, Harrisburg and New York City) giving talks on the BI Stack, NHibernate and ASP MVC. Over the years, the community has provided me with many opportunities. In that regard, I appreciate the opportunity, whenever possible, to give back. Of particular interest, I enjoy the opportunity to help new developers learn the ropes. To that end, I have been working with the Philadelphia Microsoft Developer Evangelist Dani Diaz in building up a new site called http://www.devready.net/. Should you have the desire to give back, I encourage you to contribute a 20-40 minute screen cast on a topic of your choosing. As of this blog post, the site is still in the ramp-up phase.
I've used a lot of different architectures, patterns and implementations that revolve around the core concept of command-query separation (CQS) and the more recent label of command-query responsibility segregation (CQRS). The ideas behind these principles help us create code that is targeted to a single purpose, generally side-effect free and easier to work with and maintain. In the last few days, though, I've begun to see how CQRS can be used for performance engineering as well.
Performance Problems With A Common Pattern
A few weeks ago, our product owner reported a performance problem with a control that is used on two screens in our handheld / Compact Framework application. This control is not terribly complicated – it has 4 drop down lists, each one loaded based on the data selected in the previous one. I'm pretty sure every developer has created a series of drop down lists like this at some point in their career. It's not difficult... it just takes a little time and effort to handle all the cases of no items found, auto-selecting if there's only a single item in the list, having a "Select One" or other default option, etc.
After digging into the offending control, I found that it was doing the following for every drop down list on the control:
Data Load / Display:
Load all data from the database into a DataTable
Convert each row of data into the full object it represents
Convert each object into a simple Name/Value Lookup object
Bind the Lookup objects to the drop down list
Data Select / Use (on selected index changed):
Get Value (ID) of the selected lookup item and load the full object for that ID
Run the Data Load / Display for the next drop down list based on the ID of the object
Publish the selected object on an event so the parent form could respond to it as needed
This is a pattern that I see a lot of – whether it's WinForms or WebForms development. It's especially common in a WebForms environment, though, where there is no state on the view implementation. Unfortunately, this pattern and implementation are very problematic when it comes to performance. The actual performance on the control in question was so bad that we resorted to using asynchronous commands to retrieve the data for the drop down lists. This let us keep the UI "responsive" to the user – it prevented the screen from locking up with strange artifacts for the 3 to 5 seconds that it took to load any given drop down list.
Separation Of Concerns
Why would I want to load the entire set of data from the database and deserialize that into the full object model just so I can bind the name and id of the objects to a drop down list and then re-load the same object from the database again? That doesn't make much sense to me – even in a web environment where I should bind nothing more than the name and id in the form. In a WinForms environment, though, I guess I can see "the easy way out" by loading up the objects with my existing data access infrastructure... but that just doesn't make any sense other than being lazy.
Here's the crux of a read-only or view model in this situation: if I'm only going to display the name and id of the objects, then that's all I should load.
Load View Model, Lazy Load Full Object When It's Needed
To solve the performance problems in this control, I decided to use the basic CQRS tenets of separating my view model, which is a read-only representation of my data, from the object model, which is a read/write representation. Here's the new approach I took to solve the performance problems, with each of the drop down lists:
Data Load / Display
Load the name and id only, from the database using a DataReader
Populate a generic Lookup object with the name / id of each record
Bind the drop down list to the Lookup objects
Data Select / Use
Get the id of the selected item in the drop down list
Run the Data Load / Display for the next drop down list based on the id of the selected item
Data Collection
After the entire selection process has been performed, then and only then load the full object that was selected and publish it to the parent form
There are a couple of key things to note in this solution... namely, I'm only loading the name and id for the drop down lists. I only need that information for the drop down list to work, so I'm not going to bother loading anything else. And I'm not loading the full object model until I'm actually ready to use it. If the user is constantly switching the drop down lists to figure out what they need, then loading the full object model after each individual selection will just use up a bunch of time and resources for no good reason. I'm waiting until some level of confidence in the selection can be established and the code is ready to use the object model before loading the full model.
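The Data Load / Display steps above can be sketched roughly like this. This is only an illustration, not the code from our control: the Lookup class, the LookupLoader helper and the "Id" and "Name" column names are all assumptions made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Read-only view model: just enough data to populate a drop down list.
// (Hypothetical class; the real Lookup object may look different.)
public class Lookup
{
    public int Id { get; private set; }
    public string Name { get; private set; }

    public Lookup(int id, string name)
    {
        Id = id;
        Name = name;
    }
}

public static class LookupLoader
{
    // Walks any IDataReader and keeps only the id and name of each record.
    // The column names "Id" and "Name" are assumptions for this sketch.
    public static List<Lookup> Load(IDataReader reader)
    {
        var lookups = new List<Lookup>();
        int idColumn = reader.GetOrdinal("Id");
        int nameColumn = reader.GetOrdinal("Name");

        while (reader.Read())
        {
            lookups.Add(new Lookup(reader.GetInt32(idColumn), reader.GetString(nameColumn)));
        }

        return lookups;
    }
}
```

The reader would come from a query that selects nothing but the id and name columns, filtered by the id chosen in the previous drop down list. The full object is not touched here at all; it would be loaded by id through the existing data access code once the whole selection is complete.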
The Performance Improvements
I don't have any scientific performance metrics for this, yet. I'm not sure if I'll need to do that, actually. I do have first hand experience with the existing performance and the new performance, though.
The original code tended to take anywhere from 3 to 5 seconds, on average, to load any given drop down list. The worst performance, though, was one particular query that returned nearly a thousand items for the drop list to display. This would take closer to 6 or 8 seconds to load. ... again, these are all based on my experiences, not actual timers... I can say with certainty, though, that I was never able to use keypad up/down arrows to select items in the drop down list. The control was simply too slow in responding so I would sit there and wait for it to finish loading before clicking the down arrow again.
With the new implementation in place, the control's performance is significantly enhanced. The average time it takes to load the drop down list has dropped to far below a second. Again, I haven't done any real timer / performance testing with this... but I can say with certainty that I can now use the up/down arrow keys on the keypad and the control keeps up with me no matter how fast I'm able to click the keys. Furthermore, the performance is good enough that I have not yet needed to use any asynchronous processing to load or display any data. Even with the one query that returns nearly a thousand records to the drop list, the time to load is less than a second – a barely noticeable stutter in the list being available for selection.
Conclusions And Other Considerations
The principles and patterns that comprise CQRS can be used for a number of different reasons – not the least of which is performance improvements in your code. Whether you are working on WinForms, WebForms, Compact Framework or another system or platform that has read vs. read/write needs, keeping CQRS in mind at all levels of the system can have a significant impact in many different ways.
Of course, this does not come free. There is an increase in the amount of code you have to maintain when you go down this path. You may end up writing two or more different types of data access code and you will have the same data represented in multiple objects and queries in your system. These costs are not to be taken lightly. However, when used judiciously and understood by the entire team, the impact of these costs can be mitigated. Keep your data access methods simple and have a clean separation between your full object model and your read only models. Communicate constantly with team members and work on well named and organized code. It's your team's communication, collaboration and standards that will help to cut the costs, keep your system clean and maintain its performance over time.
Some time ago, I noticed a CruiseControl.Net build report with thousands of unit tests passed, zero failed and a dozen or so skipped, suddenly showing that no tests were run:
I immediately thought somebody did something really bad. After some digging, I found an error in the CCNET log file that indicated an error was thrown and swallowed during the parsing of the test results xml file. It was choking on an NUnit Row Test with a null character in a string. Here is the exception:
2010-03-02 13:45:25,567 [Project.Web:DEBUG] Exception: System.Xml.XmlException: '.', hexadecimal value 0x00, is an invalid character. Line 5901, position 160.
   at System.Xml.XmlTextReaderImpl.Throw(Exception e)
   at System.Xml.XmlTextReaderImpl.Throw(String res, String[] args)
   at System.Xml.XmlTextReaderImpl.Throw(Int32 pos, String res, String[] args)
   at System.Xml.XmlTextReaderImpl.ThrowInvalidChar(Int32 pos, Char invChar)
   at System.Xml.XmlTextReaderImpl.ParseNumericCharRefInline(Int32 startPos, Boolean expand, BufferBuilder internalSubsetBuilder, Int32& charCount, EntityType& entityType)
   at System.Xml.XmlTextReaderImpl.ParseNumericCharRef(Boolean expand, BufferBuilder internalSubsetBuilder, EntityType& entityType)
   at System.Xml.XmlTextReaderImpl.HandleEntityReference(Boolean isInAttributeValue, EntityExpandType expandType, Int32& charRefEndPos)
   at System.Xml.XmlTextReaderImpl.ParseAttributeValueSlow(Int32 curPos, Char quoteChar, NodeData attr)
   at System.Xml.XmlTextReaderImpl.ParseAttributes()
   at System.Xml.XmlTextReaderImpl.ParseElement()
   at System.Xml.XmlTextReaderImpl.ParseElementContent()
   at System.Xml.XmlTextReaderImpl.Read()
   at System.Xml.XmlWriter.WriteNode(XmlReader reader, Boolean defattr)
   at ThoughtWorks.CruiseControl.Core.Util.XmlFragmentWriter.WriteNode(XmlReader reader, Boolean defattr)
   at ThoughtWorks.CruiseControl.Core.Util.XmlFragmentWriter.WriteNode(String xml)
Here's an example that again broke our results output the other day.
This version of CruiseControl.Net isn't the newest, and is older than the version of NUnit that is running. This may be fixed by upgrading CCNet, though I haven't tried that yet. This is just meant to be a "heads-up" in case you run into the same issue.
Unfortunately, my answer to getting the results to show back up was to remove both row tests. If anybody can add more details to this (affected versions, fixes, workarounds, etc), it would be greatly appreciated by me and, hopefully, by somebody else.
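If you want to see the parser failure in isolation, here is a minimal sketch. It is not the test that broke our build and not CCNet's code; the NullCharRepro helper and the test-case element in the usage note are made up for illustration. It relies on the fact that a .NET XmlReader with default settings rejects a numeric character reference to 0x00, which is what the ParseNumericCharRef frames in the stack trace point at.

```csharp
using System;
using System.IO;
using System.Xml;

public static class NullCharRepro
{
    // Returns true when the XML reads to the end, false when the reader
    // throws an XmlException (for example on an invalid character).
    public static bool CanParse(string xml)
    {
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
            {
                while (reader.Read())
                {
                    // Nothing to do; we only care whether reading succeeds.
                }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

Calling NullCharRepro.CanParse with an element such as <test-case name="Parse(&#x0;)" /> should come back false, while the same element with an ordinary character in the name parses fine. A results file containing one such attribute is enough to make the whole merge fail.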
Now that I’m starting to use Git a lot more, I’ve been thinking about and starting to use branch-per-feature a little bit. This morning I had a term pop into my head as I was thinking about branch-per-feature: “Composable Deployments/Releases”. Now I’m still getting the hang of this whole branch-per-feature thing, so this might be old hat for a lot of you, but I noticed something in some of my recent work with Git and how my workflow was starting to shape up. Here is an example of how using Git and Branch-Per-Feature is starting to make my life easier.
Disclaimer
I’ve only been using Git for a couple weeks, so go easy on me... ;)
Life With Subversion
I manage my church’s web site and have used subversion for managing the code for a long time. As I’m working on new projects for this web site, I deploy them to a “staging” site that is used to review and test out new changes with our Pastors and ministry leaders. In the meantime, small changes and content updates get done and deployed to “production” in parallel to these new, larger projects that get pushed to “staging”. One of the issues I’ve always had is how to easily work on these new projects separately and deploy new features in isolation from one another. Using the typical trunk/branches style of subversion, I’ve always done most work directly in trunk and sometimes created branches for larger projects. Unfortunately trying to do this with subversion is Not Fun(TM). Also this made it tough to separate “staging” deployments from “production” ones. Of course I could have managed environment-specific branches in subversion and done the merging dance with subversion, but again, Not Fun(TM).
Enter Git
I’ve been using Git for a couple weeks locally to manage local branches for my “day job”, where we still use subversion as our main VCS (for now...muwhahaha :)). But for my church site I decided to make the full move to use Git with Beanstalk [http://beanstalkapp.com], which is where I’ve been doing my subversion hosting for a while now. Beanstalk just recently added Git support and so far it’s working out great.
One of the first things I noticed as I started using Git was how easy it is to branch and merge code. So I figured I’d try out this whole Branch-Per-Feature [http://www.lostechies.com/blogs/derickbailey/archive/2009/07/15/branch-per-feature-source-control-introduction.aspx] thing all the cool kids are talking about these days. Needless to say I’m liking it a lot.
Ok, so getting to the point of this post, here is a peek at the branches I’m managing right now:
master (gets deployed to production)
staging (environment branch used to push to a staging site)
faith (feature branch for a new statement of faith page)
visitors (feature branch for a new visitors page)
<insert other temporary branches here for one-off changes that get merged into master and deployed>
One of the things I *really* like about this is that I can keep work nice and isolated in feature branches and easily merge them as needed into the appropriate environment-specific branch (staging or master) when it’s ready to be deployed. My current workflow is something like this:
test changes in staging branch and deploy to staging environment
The way I’m treating the staging branch is mainly as a “merge and deploy” branch. I’m not usually making any direct changes in the staging branch. So, in theory, once the feature in the faith branch is tested and ready to be deployed to production, I could just merge that feature branch straight into master and then do my normal production deployment. This seems like a really nice way to work. That’s why I called it “composable deployments/releases” because it’s really nice to be able to “compose” a deployment simply by merging in the appropriate feature branches as needed.
I’m still a Git n00b for the most part, so I’d be very interested in your feedback and improvements. I’m sure my thoughts on this will change the more I learn and use Git, but so far this is really working for me quite well.