

ORM or not to ORM, that is the question

ORMs (object-relational mappers) have come a long way over the last few years.

ORMs used to be slow and hard to tune, but that is no longer true. Still, the decision of whether to use one weighs on every technology leader.

  • If you decide to use one, which one?
  • When you pick one, how much of its functionality do you use?
  • How do you tune the queries it generates?
  • Is a full ORM right for you, or is a micro ORM a better fit?

This blog entry is not going to answer any of the above questions 😉 Instead, it is more about what we did at BlogTalkRadio, what our pain points were, and how we dealt with them.

When we first built BlogTalkRadio’s data access layer 7 years ago, we looked to see if we wanted to use an ORM. Since we were a .NET shop, we looked at Entity Framework and nHibernate. Both were in their infancy, and both were slow and cumbersome.

So in 2006, we decided to build our own Data Access Layer and mapper that called stored procedures exclusively and did all of the result set mapping to classes with private functions that referenced columns by index. Why by column index and not by name? Because when we launched, we were on ASP.NET 2.0, where the reflection methods used under the covers to convert a column name to an index in ADO.NET were VERY expensive at the time. It wasn't until 3.5 that they became more efficient. I think this was also a contributing reason early ORMs for .NET were so slow.

// Column ordinals in the stored procedure's result set; the order must never change.
const int C_ID = 0;
const int C_WIDTH = 1;
const int C_HEIGHT = 2;
const int C_TYPE_ID = 3;

// Maps one row of the reader to an entity by column index, with no reflection.
private static MyEntity GetMyEntityFromReader(IDataReader reader)
{
    var entity = new MyEntity
    {
        ID = reader.GetInt32(C_ID),
        Width = reader.GetInt32(C_WIDTH),
        Height = reader.GetInt32(C_HEIGHT),
        // assumes the entity's Type property is an enum mapped from the TypeID column
        Type = (MyEntityType)reader.GetInt32(C_TYPE_ID)
    };
    return entity;
}
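
For context, here is a hedged sketch of how one of those DALC methods called a stored procedure and fed the reader to the mapper above; the procedure name, parameter, and connection string key are hypothetical:

// Hypothetical DALC method (uses System.Data, System.Data.SqlClient and System.Configuration).
public static MyEntity GetMyEntityById(int id)
{
    var connectionString = ConfigurationManager.ConnectionStrings["Main"].ConnectionString; // assumed key
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand("dbo.MyEntity_GetById", connection)) // assumed proc name
    {
        command.CommandType = CommandType.StoredProcedure;
        command.Parameters.AddWithValue("@ID", id);
        connection.Open();
        using (var reader = command.ExecuteReader())
        {
            // Map the single row by column index using the mapper shown above.
            return reader.Read() ? GetMyEntityFromReader(reader) : null;
        }
    }
}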

We grew to hundreds of stored procedures and dozens of entities. After a couple of years with this pattern, its shortcomings became obvious. Let's list the top few.

  1. We could not change the column order or insert a column in the middle of a result set. It forced us to be very careful about backward compatibility, because adding a field in the middle would throw off the column indexes. (Note: When we moved to the .NET 4 framework, I benchmarked mapping by index vs. by name and found no discernible difference anymore. After that we stopped using the index and started using the name; see the sketch after this list.)
  2. Having standard entities that are manually mapped forced us to return ALL fields on every query. Every stored procedure that returned show information returned every column, so if we only needed a couple of fields from an entity with 30 attributes, the procedure still returned them all. Occasionally we would create a custom mapping for a particular use case, but the overhead of managing that with a team of three developers was too great.
  3. By extension, if we wanted to add one field, we sometimes had to change 20 or 30 stored procedures that handled specific CRUD operations and specialized queries for parts of the system.
  4. We had to add tons of overloads to the DALC to handle different parameter sets, since nothing was dynamic.
  5. It made it very cumbersome to add new entities. We had to create the entity, create the DALC class, create all of the methods calling the stored proc, and create the mapping of fields to attributes.
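
As a rough illustration of the switch described in point 1, a name-based version of the mapper might look like the sketch below; the column names and the enum type are assumptions:

// Maps a row by column name instead of by hard-coded ordinal. On .NET 4+ the
// GetOrdinal lookup is cheap enough that we saw no discernible difference versus indexes.
private static MyEntity GetMyEntityFromReader(IDataReader reader)
{
    return new MyEntity
    {
        ID = reader.GetInt32(reader.GetOrdinal("ID")),
        Width = reader.GetInt32(reader.GetOrdinal("Width")),
        Height = reader.GetInt32(reader.GetOrdinal("Height")),
        Type = (MyEntityType)reader.GetInt32(reader.GetOrdinal("TypeID"))
    };
}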

So for the first four years we worked fine this way. We had a small, stable team, and we understood the shortcomings and worked around them by always ensuring backward compatibility: columns had to be added to the end of the result sets, and we did tedious updates to many stored procs.

Our justification was performance. ORMs were slow and we had a site with millions of page views per day. We did not want to slow down the site.

Then, a couple of years ago, we realized that the claim that we got better performance without an ORM no longer held water. Well, that was still true for some of them; some continued to be dogs. In addition, changes in the .NET 4.5 framework boosted LINQ query performance significantly.

So with the performance bottleneck no longer a concern, we set about implementing an ORM. We decided on nHibernate. I am embarrassed to say it was not for the right reasons: mainly, we chose it because some of my developers were very familiar with it and lobbied for it. My thinking was that I would prefer to use something my engineers already knew, so implementation would be smoother since they knew some of the gotchas and how to get around them. This reasoning did prove to be correct.

It is very common to choose technology based on familiarity rather than doing your due diligence.

If you are reading carefully, you will have noticed that everything was stored procedure based. Obviously nHibernate can call stored procedures just fine and map the results, but that is like using a sledgehammer to hang a picture. If all we wanted to do was call SPs and map the results, I would have used the micro ORM Dapper instead.

So the plan was to start small: implement nHibernate first on small, simple queries and CRUD operations, in parallel with the existing DALC, then expand it to joining complex entities and even try to replace some complex stored procedures.
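
For a sense of what that first step looked like, here is a minimal sketch of a simple nHibernate CRUD call; the entity and the session factory wiring are assumed:

// Minimal sketch: load an entity by primary key, change it, and save it back.
using (var session = sessionFactory.OpenSession())
using (var transaction = session.BeginTransaction())
{
    var entity = session.Get<MyEntity>(entityId); // single-row fetch by primary key
    entity.Width = 300;                           // the session tracks the change
    transaction.Commit();                         // flush writes the UPDATE to the database
}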

However, all was not peaches and cream. We hit, and had to solve, a few issues. We also had to settle on our going-forward patterns and decide how to handle the existing DALC.

Issues:

  1. In SQL Server, I am adamant about using (nolock) or setting the isolation level to Read Uncommitted on queries to keep reads from blocking inserts and updates. It was a requirement that whatever ORM we used had to support this for SQL Server. We solved it this way:
    // Open a session that never auto-flushes and, when asked, wrap the work in a
    // Read Uncommitted transaction (the session-level equivalent of nolock).
    private static ISession BeginSession(ISessionFactory sessionFactory,
                                         bool useTransaction)
    {
        var session = sessionFactory.OpenSession();
        session.FlushMode = FlushMode.Never;
        if (useTransaction)
        {
            session.BeginTransaction(IsolationLevel.ReadUncommitted);
        }
        return session;
    }
  2. The biggest issue with using an ORM is the query that gets generated. We initially used nHibernate in a project for our host tools; we were building an API layer, so it was a good place to start without affecting the old code. We built it and then did the performance check. It was slow. Very slow. So we profiled all of the queries, and there were some doozies in there. These are a few of the issues we had to deal with in the queries it generated:
    1. The ORM had created some queries that joined the same table three times.
    2. It used outer joins where inner joins would have done.
    3. My favorite was the lower() function wrapped around the column in the WHERE clause: WHERE lower(column) = lower(@value). It makes sense in C# to do this:
      queryable.Where(x => x.AspNetUser.Username.ToLower() == this.UserName.ToLower())

      But once that is converted to SQL, it keeps the index on the column from being used (see the sketch after this list).

    4. Turning off lazy loading. We had some entities with sub-entities and did not want nHibernate going off and loading everything; often it wasn't necessary. We wanted to explicitly load the entities and attributes we wanted, when we wanted them.
    5. We reverted some queries back to stored procedures because they were quite complex and we could not get nHibernate to do what we wanted it to do.
    6. It is easy to get into a situation where you are doing n+1 queries: you request one entity, and then the ORM fetches additional data for each item loaded into a collection. We had to eradicate a couple of those (see the sketch after this list).
  3. Maintenance of queries. In our old stored procedure pattern, if there was a performance issue or we needed to change a query, we just changed it in the DB, which is 100 times simpler than making the C# code change, compiling, deploying, etc. The DB we can hot deploy with zero downtime; not so with compiled .NET code. We had to adjust our expectations and process to deal with it.
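
To make sub-points 3 and 6 of the query issues above concrete, the sketch below shows the flavor of the fixes: dropping the ToLower() calls so SQL Server can use the index on the username column (this assumes the column keeps SQL Server's default case-insensitive collation), and eagerly fetching a child collection with nHibernate's LINQ provider so a listing issues one query instead of n+1. The Show/Episodes entities and HostId property are illustrative, not our actual model; Query<T>() and FetchMany() come from the NHibernate.Linq namespace.

// Index-friendly comparison: no function wrapped around the column in the generated WHERE clause.
var user = session.Query<AspNetUser>()
                  .Where(x => x.Username == this.UserName)
                  .FirstOrDefault();

// n+1 fix: fetch the child collection in the same round trip instead of one lazy query per parent row.
var shows = session.Query<Show>()
                   .Where(x => x.HostId == hostId)
                   .FetchMany(x => x.Episodes)
                   .ToList();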

This, of course, invites the debate about using SPs vs. not using them at all. I will reserve that for another post.

So where we are now is a mix of old DALC and new DALC. As we rewrite pages we will be making the old DALC obsolete. I look forward to hitting the delete button on the project.

I expect that we will move towards all CRUD operations being done purely through nHibernate (or a different ORM if we decide to change; gotta love dependency injection (DI)).
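
To illustrate the DI seam that makes such a swap possible, here is a hypothetical repository abstraction; the interface and class names are illustrative, not our actual code:

// Callers depend on the interface, which is registered with the DI container;
// the nHibernate-backed implementation can later be swapped for another ORM.
public interface IShowRepository
{
    Show GetById(int id);
    void Save(Show show);
}

public class NHibernateShowRepository : IShowRepository
{
    private readonly ISessionFactory _sessionFactory;

    public NHibernateShowRepository(ISessionFactory sessionFactory)
    {
        _sessionFactory = sessionFactory;
    }

    public Show GetById(int id)
    {
        using (var session = _sessionFactory.OpenSession())
        {
            return session.Get<Show>(id);
        }
    }

    public void Save(Show show)
    {
        using (var session = _sessionFactory.OpenSession())
        using (var transaction = session.BeginTransaction())
        {
            session.SaveOrUpdate(show);
            transaction.Commit();
        }
    }
}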

Simple queries will move to nHibernate more and more, but the more complex ones may remain as stored procedures for quite a while.

Lately, though, I have been thinking that using an ORM gives you the ability to become DB agnostic. If we want to use something like MemSQL, where the entire DB sits in memory but there are no stored procs, we can. If we want to keep everything in Mongo, there are ORMs for that. It gives a lot of flexibility. Just remember, with great flexibility come performance hits, so it depends on your application.

In summary, writing your own gives you control, but then you need to do everything yourself. Using a tool like an ORM gives the developer the power to stop thinking about the DB and just work with objects.

 

Posted on December 7, 2013 in Web Development, Web Performance

 


Why tools that monitor and profile your website are so important

Over the last 12 months, we have been able to grow the unique visitor volume for BlogTalkRadio.com by 10X AND cut the average page load time from 10 seconds to 5 seconds.

The architecture we used to scale the site is a topic for another blog post. This post is about some critical tools that helped us cut the average page load time in half even as volume increased dramatically.

If you are not measuring your site's speed and your users' experience, you are throwing darts at a wall in the dark. There are countless studies showing how page performance directly drives visitor engagement and revenue.

The image below shows some of the famous stories where page performance increased revenue.

Our story is very similar. We have definitely seen an increase in revenue and visitor activity with faster page performance.

So what tools do we use to understand how our site is performing?

The key ones that we use are NewRelic and Dynatrace. We have always used Firebug and webpagetest.org, but those only show you one-off loads, which often are not representative of your whole user base.

NewRelic shows us how each page performs, what each part of the page life cycle is doing, and even drills down to the back-end and DB calls.

Dynatrace has two parts to it.

  • There is a free client piece that can be installed on your machine that gives extremely detailed information about how your page loads and about JS execution. It will show you where your JS is running poorly.
  • There is an expensive server-side piece that can show you, down to the line of code, how your pages are performing.

Using these tools, we worked release by release, making changes to the hotspots we found in order to improve performance.

These tools also come in very handy when you release code, to ensure that your changes did not cause an unanticipated performance issue. We had such an issue where we released something that was supposed to improve performance but, under real conditions, triggered a bug that greatly slowed down the site. We knew within an hour that we had an issue and released a hotfix. Without NewRelic, we would not have known.

A HUGE win for us, one which showed the value of the Dynatrace server tool, happened the first day we used it. We kept seeing a helper class method call for reading config variables that was taking longer than expected: 10-50ms each time, and on some pages it did this a couple of times.

We make a lot of use of config settings on almost every page of the site. The helper class we were using was written years ago and is used everywhere. Nobody had looked at that class since it was written.

Since Dynatrace was showing this class in the traces, I decided to look into it.

public static class myAppSettings
{
    public static string GetKeyAsString(string keyName)
    {
        try
        {
            var config = WebConfigurationManager.OpenWebConfiguration("~");
            var appSection = config.GetSection("appSettings") as AppSettingsSection;

            try
            {
                return appSection.Settings[keyName].Value;
            }
            catch (NullReferenceException)
            {
                return keyName + " is missing";
            }
        }
        catch
        {
            return WebConfigurationManager.AppSettings[keyName];
        }
    }
}

Do you see the issue in the above code? This method is called every time we read a config value. This happens more than once per page load.

The issue was: var config = WebConfigurationManager.OpenWebConfiguration("~");
This opens the config file. So every time a config variable was read, instead of reading from an in-memory dictionary as expected, we were opening and parsing web.config. Not good.

Since this is a static class, we simply changed it to open the file once, when the class is initialized, as such:

public static class myAppSettings
{
    // Open the configuration once, when the type is initialized, instead of on every read.
    static Configuration config = WebConfigurationManager.OpenWebConfiguration("~");

    public static string GetKeyAsString(string keyName)
    {
        try
        {
            var appSection = config.GetSection("appSettings") as AppSettingsSection;

            try
            {
                return appSection.Settings[keyName].Value;
            }
            catch (NullReferenceException)
            {
                return keyName + " is missing";
            }
        }
        catch
        {
            return WebConfigurationManager.AppSettings[keyName];
        }
    }
}
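
For comparison, here is a hedged sketch of an even simpler variant. The appSettings section is already cached in memory by the framework, so reading through WebConfigurationManager.AppSettings (as the fallback branch above already does) never touches the file; whether the "is missing" fallback below matches the original helper's exact contract is an assumption:

public static class myAppSettings
{
    public static string GetKeyAsString(string keyName)
    {
        // AppSettings is an in-memory NameValueCollection populated at startup,
        // so this never re-opens or re-parses web.config.
        return WebConfigurationManager.AppSettings[keyName] ?? keyName + " is missing";
    }
}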

This seemingly small change yielded the following change in the NewRelic stats, which show the average time it takes the server to generate the HTML.

You can see the dramatic drop in time on the day we released the change. Finding silver bullets like this is very exciting.

This also explained a couple of odd behaviors we had noticed over the last couple of years but could not understand until we started using these tools.
1) I always wondered why, when we pushed out a new config file, the site immediately picked up the changes. I expected it would only pick them up when the app pool was recycled. Now we know: it was opening the file constantly.

2) A change we put in the header caused CPU on the web servers to jump. That change included a couple of lookups in the config. It led us to think that reading the config was slow, even though every article we read said it should just be a read from an in-memory dictionary. Now we know why it was slow.

Some other hot spots that we found with these tools were around how we were loading third-party widgets:
1) Facebook Comments needed to be made asynchronous and moved to after DOM ready. We learned that you cannot push it all the way to after the load event, because then it won't render.
2) The AddThis share bar loads Twitter, Google Plus, Facebook Like, etc. Even though the JS they provide is supposedly non-blocking, it was only non-blocking up to a point and loaded oddly on the page. By changing how it loads and moving it later in the page cycle, we made sure the page was rendered before these third-party tools loaded.
3) On some of our pages we pull in JSON via an AJAX request on DOM ready and merge it with a template on the page to display information. Merely putting JS at the bottom of the page still lets it execute before DOM ready, so some third-party JS was blocking what we wanted to do on DOM ready.

The key learning is not to just load third-party code where the vendors recommend. If we had done that, our pages would still take 10+ seconds. Experiment with moving it around and controlling when it loads via $(window).load and $(document).ready.

NOTE: Be very careful of third-party code that has document.write in it. If you move it to after the DOM has been created, it will break in most browsers. If Chrome sees it, it creates a new DOM and blanks the page.

So without these tools (as well as Firebug, Chrome Developer Tools, and webpagetest.org) we were flying blind as to how our site was performing out in the wild. Very often it is different from your lab setting, and definitely different from a one-off page load in Firebug or webpagetest.org. Don't get me wrong, Firebug and webpagetest.org are important tools. However, when you look at actual performance en masse, in the wild, you see a ton of things you normally wouldn't. Understanding what your site is doing, why it is doing it, and how your users' experience is affected is critical to the success of your web property.

 
