LINQ to SQL: Single Data Context or Multiple Data Contexts?


One of the biggest questions that often arises when a LINQ to SQL newbie begins to think about how he/she is going to architect the data layer is whether to split the data model into multiple data contexts or put the entire model into a single context. This question is usually due to concerns about performance and/or difficulty of maintaining the model. The purpose of this article is to debunk those myths and explain why using at most one data context per strongly connected group of tables is not only the best solution, it is the only reasonable solution.

Myths About Large Data Contexts

Myth #1: Large data contexts have low performance

The first time a LINQ to SQL data context is created, the data model is loaded into memory: table mappings are created, foreign key relationships are analyzed, etc. This seems like a lot of overhead…and it is! But it only happens once in an application’s lifecycle! After the data model is loaded for the first time, it is cached in a static variable in the application’s memory. Subsequent instantiations of the data context use this cached variable and load almost instantly. To prove this, I installed two of Microsoft’s sample databases: Northwind (13 tables) and AdventureWorks (70 tables). I then ran a program that instantiated each data context 100,000 times in order to magnify any performance difference between the two contexts and recorded the total time. If the size of the data context affects performance, then the AdventureWorks context should have a higher time since it has 5x more tables than Northwind. Here is the code:

class Program
{
    const int MaxIterations = 100000;

    static void Main(string[] args)
    {
        DateTime start = DateTime.Now;

        for (int i = 0; i < MaxIterations; i++)
        {
            NorthwindDataContext nwdc = new NorthwindDataContext();
        }

        Console.WriteLine("Northwind: " + DateTime.Now.Subtract(start).TotalSeconds + " seconds");

        start = DateTime.Now;

        for (int i = 0; i < MaxIterations; i++)
        {
            AdventureWorksDataContext awdc = new AdventureWorksDataContext();
        }

        Console.WriteLine("AdventureWorks: " + DateTime.Now.Subtract(start).TotalSeconds + " seconds");
    }
}

…and here are the results:

Northwind: 3.025173 seconds
AdventureWorks: 3.0411739 seconds

After 100,000 iterations, they only had a total difference of 16 milliseconds!

Myth #2: Large data contexts are more difficult to maintain

OK, so I can’t disprove this one objectively. Everyone will have a different opinion about this. However, I have worked with LINQ to SQL on many different projects and have had the opportunity to manage small data models, large data models in one data context, and large data models split into multiple contexts. I’ve seen it all, folks…and here’s what I learned:

  1. The designer does not suffer from performance problems as the data context gets more complex.
  2. Although it is slightly more inconvenient to scroll around the designer to edit a larger model, the alternative (splitting it into multiple contexts) is more than just inconvenient: it causes extreme code complications and introduces additional load on the database (these will be discussed in more detail later).
  3. While large data contexts do result in more intellisense options, I have never found this to be a problem in practice. If you have a database with a large and complex schema, you should be splitting logical sections of the database into separate schemas in order to avoid name collisions and to make it easier to understand. If you are doing this, then you can name your LINQ to SQL entities like this: Schema_TableName (example: Person_Person, Person_Address, Order_Order, Order_OrderDetail, etc.). Although this deviates from standard naming conventions a little bit, it is extremely practical, and it makes navigating the intellisense list easy and convenient.
Myth #3: Using multiple data contexts improves modularity

This is not actually a myth since multiple contexts improve modularity by definition, but I am listing it as a myth because 99% of the applications out there have absolutely nothing to gain from this. The only reason this argument would be valid is if you are selling a product that does not always ship with all of its features. For example, if you are selling a blogging application and your customers have to pay extra for the “reader comments” module, then you might benefit from putting the comment-related entities in a separate data context and not shipping it to the people who don’t pay for it. However, you will suffer from increased code complexity, so make sure you think this through.

If your application does not fit this scenario, then what do you possibly stand to gain from modularity? If you truly need this modularity, then the tables you’re worried about probably shouldn’t be sharing the same database to begin with.

Problems Caused by Multiple Data Contexts

Problem #1: No cross-context joins

This is easily the biggest problem with using multiple data contexts. Take the following schema as an example:

User.User(Id, Username, Password)
User.Profile(UserId, FavoriteMusic, FavoriteMovies, FavoriteGames, About)
User.Friend(UserId, FriendId)
Blog.Entry(Id, UserId, Title, Content)
Blog.Comment(Id, BlogId, UserId, Text)

Imagine that you’ve split your data contexts by schema, so you have a UserDataContext that contains the User.User, User.Profile, and User.Friend tables, and a BlogDataContext that contains the Blog.Entry and Blog.Comment tables. Let’s try to print a list of comments along with the username of the commentors. Pretty simple, right? Not with multiple data contexts. Here is the code to do so:

// Get the ID of the blog we're looking at
int blogId = int.Parse(args[0]);

// Get a list of comments to display.
// Notice the call to ToList(). I added this to prevent the query from running
// multiple times later.
var comments;
using (BlogDataContext blogContext = new BlogDataContext()) {
    comments = blogContext.Comments.Where(comment => comment.BlogId == blogId).ToList();
}

// We need to display the USERNAME of the commentors, which is not accessible
// by the BlogDataContext, so now we need to get a list of users from the UserDataContext.
// Notice the use of ToList() again. I had to do this because otherwise the
// UserDataContext would try to compile my comments query as part of a dynamic
// SQL query and an exception would result since it is part of another data context.
var users;
using (UserDataContext userContext = new UserDataContext()) {
    users = userContext.Users.Where(user => comments.Select(comment => comment.UserId).ToList().Contains(user.Id)).ToList();
}

// Now I have a list of comments and a list of users. I need to pull them together
// into one list so I can print them. Notice that the users list will be iterated once
// for each comment. This algorithm is O(comments*users).
var commentsToPrint = comments.Select(comment => new { Username = users.Single(user => user.UserId == comment.UserId), Text = comment.Text);

// Print each comment
foreach (var comment in commentsToPrint) {
    Console.WriteLine(comment.Username + " said: " + comment.Text);
}

You should be completely turned off by that code. It is extremely verbose, difficult to read…and just completely ridiculous. Now take a look at the single context version:

// Get the ID of the blog we're looking at
int blogId = int.Parse(args[0]);

using (MyDataContext myContext = new MyDataContext()) {
    // Get a list of comments to display.
    // Notice the Schema_TableName naming convention.
    var comments = from comment in myContext.Blog_Comments
                   where comment.BlogId == blogId
                   select new {
                       Username = comment.User_User.Username,
                       Text = comment.Text
                   };

    // Print the comments. This algorithm is O(comments).
    foreach (var comment in comments) {
        Console.WriteLine(comment.Username + " said: " + comment.Text);
    }
}

This is much more elegant and very easy to follow even if you have little understanding of LINQ to SQL.

Problem #2: Increased database load

Consider the examples from the previous section. The multiple context example uses two nearly simultaneous database connections, two queries, and multiple iterations through the “users” result set. The single context example uses one connection, one query, and one iteration through the results to print them. Clearly, using multiple contexts results in a higher load on the database and on the application server.

Conclusion

The LINQ to SQL data context was designed to be a one-to-one mapping with your database. Using it in any other way is counter to its design and will result in architectural problems. Your application might benefit from splitting your database into multiple contexts as long as the tables represented by those contexts are completely disconnected from each other.

Emulating CSS Child Selectors in IE6


We all hate IE 6. If you don’t hate IE 6, then you just haven’t worked with it enough or you haven’t worked with newer browsers enough to realize what you’re missing. Alas, it still holds such a large market share that it cannot be ignored, so let’s put the whining aside and dig in…

The problem is simple: IE 6 does not support the CSS child selector. It will ignore the following CSS:

table.green > tr > td {
  background-color: green;
  padding: 20px;
}

If you cannot redesign your layout and/or stylesheet to avoid the need for child selectors, then your only option is to emulate the child selector in some way. There are several methods for doing this, and many of them are well documented across the internet, but I will explain them all right here along with their pros and cons. These are the methods I will cover:

  • jQuery
  • CSS Expressions
  • Use More CSS Classes
  • Dean Edwards’ Script
  • CSS Descendant Selector

jQuery

The first possibility is to use jQuery, a very well designed JavaScript library that uses CSS selectors to manipulate the DOM. You can import the jQuery library and then do something like this:

$("table.green > tr > td").css("background-color", "green").css("padding", "20px");
Drawbacks:
  • Requires jQuery or other JavaScript library that supports CSS selectors.
  • Requires the user to have JavaScript enabled.
  • Separates styling logic from CSS.

CSS Expressions

IE supports a CSS construct called a CSS expression that allows you to inject JavaScript into your CSS styles. You can use them to emulate all kinds of CSS selectors and properties that are not supported by IE 6, but they have some major drawbacks that are explained momentarily. This is how you would use CSS expressions to emulate the child selector:

/* CSS selector recognized by everything but IE 6 */
table.green > tr > td {
  background-color: green;
  padding: 20px;
}

/* CSS selector recognized only by IE 6 */
* html table.green tr td {
  background-color: expression(/green/.test(this.parentNode.parentNode.className)? "green" : "transparent");
  padding: expression(/green/.test(this.parentNode.parentNode.className)? "20px" : "0");
}
Drawbacks:
  • CSS expressions are difficult to read and understand.
  • Experts recommend that you avoid CSS expressions due to performance issues.
  • Requires you to know what the “fallback” style should be (in this case, a transparent background and 0 padding).

Use More CSS Classes

This is the brute force approach. You can simply apply a CSS class to every element that you want to style:

CSS:
table.green td.green {
  background-color: green;
  padding: 20px;
}
HTML:
<table class="green">
  <tr>
    <td class="green">A table cell.</td>
  </tr>
</table>
Drawbacks:
  • Tightly couples your CSS with your HTML, which defeats the purpose of decoupling them in the first place.
  • If your CSS developer is a different person than your HTML/content developer, then this can cause a lot more back-and-forth communication.

Dean Edwards’ Script

Dean Edwards, a respected expert in the JavaScript community, has written a script that attempts to turn IE 6 into a standards-compliant browser. The idea is that you write your CSS files as if you only had to deal with standards-compliant browsers like Firefox, then you import Dean’s script and your problems are solved. His script parses your CSS files for certain constructs (such as the CSS child selector) and manipulates the DOM with JavaScript in order to apply the styling. I welcome you to give it a try, but I have had quite a few issues with it and have decided not to use it.

Drawbacks:
  • Requires the user to have JavaScript enabled.
  • Doesn’t always work correctly, especially when it comes to PNG support.

CSS Descendant Selector

The final method for emulating the CSS child selector is to regress to the descendant selector and then “clear” the styling by using another descendant selector:

table.green tr td { /* Specificity: 0,0,1,3 */
  background-color: green;
  padding: 20px;
}

/* Clear the styling of any table cell that is not a direct descendant of table.green */
table.green tr * td { /* Specificity: 0,0,1,3 */
  background-color: transparent;
  padding: 0;
}

Notice that the specificity of the two selectors is the same. This helps to prevent the “clearing” selector from interfering with other CSS selectors.

This solution seems nice and clean, but it has some issues with nested classes. Take the following HTML as an example:

<table class="green">
  <tr>
    <td>
      <table class="green">
        <tr>
          <td>This is a table cell.</td>
        </tr>
      </table>
    </td>
  </tr>
</table>

This is the result (note that I have added a border to help you see the padding):

Not exactly what we were hoping for, right? That’s because the second CSS selector takes precedence over the first one even though they have the same specificity due to their order in the CSS file. Luckily, there is an easy fix for this. We can just switch them around:

/* Clear the styling of any table cell that is not a direct descendant of table.green */
table.green tr * td { /* Specificity: 0,0,1,3 */
  background-color: transparent;
  padding: 0;
}

table.green tr td { /* Specificity: 0,0,1,3 */
  background-color: green;
  padding: 20px;
}

Now let’s take another look:

Much better! Now we have a solution that works in all browsers!

Drawbacks:
  • Requires you to know what the “fallback” style should be (in this case, a transparent background and 0 padding).
  • Requires you to reverse the ordering of your CSS declarations, which is slightly less readable.

Conclusion

I have presented these options to you so you can choose the method that makes the most sense in your project. However, if you need some guidance, then I would suggest the final method (CSS Descendant Selector) because it only has one real drawback which I have always found to be completely manageable.

Good luck!