SELECT * FROM XML

This is a 6 year old post which I originally posted in my previous blog site.

May 30 2010 11:49PM

Most people find it very difficult to deal with XML documents in TSQL as there is no way to run a ‘blind’ SELECT * query on an XML document to get a quick view of the content stored in it. A “select TOP N *” query can quickly give you a few records from the table which will give you an idea about the structure of the table and the type of values stored in the columns.  One of the common queries that I run on a table that I am not familiar with is

SELECT TOP 1 * FROM tablename

This query will give me one record that I can review and understand the structure of the table. However, it is really hard to do something similar for an XML document. The “*” operator does not work for XML and hence I can write a query on the XML document only if I know the structure of the XML.

To make this easier, I have come up with a function that can give you a “SELECT * FROM XML” kind of functionality. You can pass an XML document to the function and it will return a tabular representation of the XML data. Here is an example that shows how you can use this function.

declare @x xml
select @x = '
<employees>
    <emp name="jacob"/>
    <emp name="steve">
        <phone>123</phone>
    </emp>
</employees>
'
SELECT * FROM dbo.XMLTable(@x) 

/*
NodeName  NodeType  XPath                        TreeView      Value XmlData
--------- --------- ---------------------------- ------------- ----- -------------
employees Element   employees[1]                 employees     NULL  &amp;lt;employees&amp;gt;..
emp       Element   employees[1]/emp[1]              emp       NULL  &amp;lt;emp name=&amp;quot;..
name      Attribute employees[1]/emp[1]/@name            @name jacob NULL
emp       Element   employees[1]/emp[2]              emp       NULL  &amp;lt;emp name=&amp;quot;..
name      Attribute employees[1]/emp[2]/@name            @name steve NULL
phone     Element   employees[1]/emp[2]/phone[1]         phone 123   &amp;lt;phone&amp;gt;123&amp;lt;..
*/

The ‘XPath’ column may be very helpful as it shows the XPath expression that you can use to retrieve a specific value from the XML document. For example, to retrieve the phone number, you can copy the XPath expression from the above result and directly put it in a query such as:

SELECT @x.value('employees[1]/emp[2]/phone[1]','VARCHAR(20)') AS Phone

/*
Phone
--------------------
123
*/

Here is the complete listing of the function.

I have posted the code to gist so that people playing with this code can extend it and submit revisions.

Next Steps

  1. The function currently does not support namespaces. The next version will add support for namespaces
  2. Let me know your comments and feedback on this function.

What is your favorite Browser?

Which browser is the best? Is it Chrome, IE, EDGE, Firefox or Safari? This is a topic of discussion that I hear frequently around me. Usually people getting into such a discussion highlight the pros of their favorite browser and the cons of the browsers they do not like.

I think it is more of a personal preference and convenience rather than the features or specific quality of a given browser.

I use Chrome as my primary browser. Almost 80% of my browser time is on Chrome. The second most frequently used browser is Firefox. I am probably using Firefox for approximately 15% of my total web time. Then I use IE for the remaining 5% of browser activity.

I am not saying that Chrome is great and IE is nasty. I think I am more comfortable using those 3 browsers the way I do currently. May be a matter of personal preference.

Microsoft Employees using Chrome for Demos

What triggered this post is something I observed when going through a few microsoft videos recently. I was watching the presentation on Azure Redis Cache and noticed that the PM demonstrating Azure web portal is using Chrome browser, not Microsoft’s IE or EDGE.

See the videos at https://azure.microsoft.com/en-in/documentation/videos/index/?services=redis-cache

Then I looked at close to a dozen videos in the series. All of them were showing demos using Chrome. Well, is that a crime? No, certainly not. It is absolutely a personal choice. But the fact I am trying to highlight is that more and more people I see around are  comfortable with Chrome more than they are with IE.

I am not saying that all Microsoft PMs are using Chrome. I did see some demos where the presenter was showing web applications on IE/EDGE. So, it is again a personal choice. I suppose Microsoft does not have a policy to insist its employees to use its own browser. I think that is very good. ‘Freedom of Browser’ is very well appreciated.

New IE versions are always painful

One pain-point I observed over the past many years is that every new version of IE breaks applications. Every time a new version of IE was released, we had to start a project to make our application compatible with the new version of IE. This has been happening from IE7 till today.

Is it their fault? Probably not. But it was painful and time consuming to do those updates every time we had to make the application ‘compatible’ with a new IE version.

However, the same did not happen with other browsers. I do not remember any scenarios where something broke with a new update of Chrome or Firefox. May be it is just a coincidence or may be they were very careful NOT to break the existing applications when releasing a new browser version.

What is your favorite Browser?

I think it will be interesting to hear about your favorite browser. Are you a Chrome addict or IE/EDGE supporter? Or is it FireFox, Safari or any other browser that you mostly use?

A Book A Week Challenge #3 – Test Driven Development (TDD)

Just completed the 3rd week in my ABAW challenge and the book I picked was Professional Test Driven Development with C#: Developing Real World Applications with TDD by James Bender and Jeff McWherter.

If you are new to TDD (Test Driven Development), this is a great resource to understand TDD and get started. I highly recommend this book for TDD beginners.

If you already have some experience with TDD, then this may not help you much. This might, however, introduce you to a few new tools and frameworks that you might find helpful / informative.

tdd

I used to be a big fan of TDD, back in the old days when I used to do a lot of development. The last time I touched it myself was in 2006. In the past 10 years, a lot changed in and around TDD methodologies, tools and frameworks. This book certainly helped me to fill that gap reasonably well.

Back in my development days, I always had difficulty in clearly defining a unit test to my team members. Often I found integration tests being written as unit tests because the boundary was not very clear. This book does a good job in defining the scope and boundaries of a unit test and differentiates it from other type of tests.

The chapter covering mock frameworks is very informative as well. Mock frameworks are largely ignored by many development teams that I interact with, and this book does a very good job in explaining the value of using a mock framework and how it makes everything work together within the TDD process.

A very good understanding of Object Oriented Programming (OOP) concepts is required to be able to implement TDD successfully within any project. While most other books that I looked at assumed that the readers are already familiar with OOP, the authors of this book have put a lot of efforts into helping readers refresh their OOPs understanding and gradually guide them to practical TDD.

Next Week

My plan for next week is one of the following:

See you next week!

ABAW Challenge #2 – Site Reliability Engineering: How google runs production systems

SITE RELIABILITY ENG

It is the second week of ABAW Challenge and the book I picked for this week was Site Reliability Engineering: How Google Runs Production Systems.

I would like to encourage people working on DevOps, Server Administration, DBA and System/Software Architecture and similar roles to read this book.

Probably nobody knows how to run a production system better than the google team. I live in a country where most of us type http://www.google.com on the web browser to check whether the internet is working or not. We trust the availability of the google.com website more than the availability of the internet data connection. Even if http://www.google.com is down, we still believe it is an internet problem, because we trust google to be always up and running.

This book is written by engineers (actually dozens of those engineers) who run the google production systems, the team that is responsible for the availability and performance of google products. This team is called SRE (Site Reliability Engineering Team) within Google.

The main attraction for buying this book and spending a week reading, is the fact that this is written by those engineers who are running my favorite google products. This was part of the efforts to listen to them and to understand their vision, approach, thinking and the way of working. I went over the whole book with full attention and focus to see what I can adopt from what those SRE engineers do within Google.

site reliability engineering

I bought a Kindle version of the book and that is what I read this week. I thought this is going to be a good item to read for my DevOps team, and therefore I bought a printed copy of this book as well.

Overall, I liked this book very much. A lot of the tools, systems and environments described in this book exist only within Google and therefore they did not help much directly. However, this book helped me to understand how those engineers work, create and track SLOs, handling outages, processes and methodologies in place etc. Moving forward, I am going to encourage my team to put more focus on the postmortem reports after every outage, organize and manage those reports and use them as a reference point for training as well as future fixes.

I would like to encourage people working on DevOps, Server Administration, DBA and System/Software Architecture and similar roles to read this book. I would also recommend the DevOps/SRE engineers at Amazon read this as well, because their login page is down for the last 15 minutes and I can’t look at my wish list I compiled there!

amazon login

Next Week

Now that I am feeling more confident about being able to continue this challenge, I am trying to pick the book for next week. The following are on the top of my list and I will pick one of them.

If any of you have read one or more of the above books, I would love to hear your feedback and will make my choice based on that 🙂

ABAW Challenge – The Second Machine Age: Work, Progress and Prosperity in a Time of Brilliant Technologies

Last week I started a challenge, which I call ABAW Challenge (A book a Week) to motivate myself to read a book every week. It was a serious challenge for me, because my hectic work schedule left me little time to focus on anything else. Interestingly, it is the difficulty level which inspired me to go ahead and attempt this almost impossible mission.

The book I picked last week was The Second Machine Age written by Erik Brynjolfsson and Andrew McAfee.

second machine age

The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies

“The Second Machine Age” is a New York Times, Wall Street Journal and Washington Post Bestseller!

I decided to buy an Audible version of the book so that I can listen to it. This allowed me to efficiently use my time for this project, when a normal reading was naturally not possible: such as when driving. The audio recording of the book was 8 hours and 50 minutes and I was able to complete it within a week.

The audible version of the book was narrated by Jeff Cummings and I must say that I loved narration.

Even thought the whole exercise looked like a challenge when I started with this, the journey quickly became very enjoyable and exciting. I did not want to write about this until I was sure that I can continue this exercise for quite some time.

I am a great fan of the industrial revolution, which the authors of the book call ‘First Machine Age’. Years ago, I had read several books on industrial revolution and its history, progress and explosion, watched many movies, documentaries and videos; including the famous Charlie Chaplin Movie Modern Times. Some of the remarks in this book about the first machine age reminded me about all those and it indeed multiplied the fun!

This book is a very interesting read, not only for technology professionals, but also for anyone having an interest in science, technology and computers.

 

 

The secret performance tuning button

secret button

Over the years, I have felt at several occasions that people think there is secret performance tuning button hidden somewhere in the application/environment that a performance tuning expert can locate with his magic wand and turn ON to boost the performance of their application. This is especially evident when they reach out to you indicating how soon they want the problem fixed and how much improvement they expect to see after you turn the button on.

Please note that I do not intend to disrespect their expectations. When someone is responsible for a serious application and he/she is hit with a performance problem, there is certainly an emergency and the rules of emergency are completely different. A performance expert can show some value additions only if he/she can address the emergency in a timely and satisfactory manner. So the expectation of a magic button may be well acceptable in this context.

The fact is that there is no such button or short cut available to turn a performance starving application to performance rich application (in most cases). I have seen such buttons in video games (especially in car racing) but have not seen in real world applications yet.

A performance turning expert usually achieve the desired goals by performing either one or both of the following:

  1. Cut down any unnecessary operations that adds overhead and slows down the application.
  2. Improve the efficiency of the operations by a combination of hardware, software and architecture rework.

Most of the activities involved in a tuning project can be broadly classified into one of the two categories above (usually). But there can always be exceptions which varies from case to case. So I don’t want draw concrete lines here. The goal of this post is to touch this from a very high altitude.

Do-it-yourself options

You don’t need to be a performance tuning expert to get started with some basic troubleshooting. You can start a performance tuning exercise by checking the best practices, check lists and “don’t-do’s”. This is a check list that anyone can use to identify possible traps or shortfalls. Each organization or team may have its own check lists such as database configuration best practices, database development best practices, application programming best practices etc. The best practices may be evolved over period of time gaining knowledge from mistakes done in the past as well as influence from the gradual learning the team is going through. A new hire may bring in a bunch of additional check lists and best practices in to the team which may be the result of his/her experience with the previous projects.

So, in most cases, before an expert can step in and help you (except for emergencies), you can help yourself by quickly checking your code, application, environment, configuration against the known/available best practices and check lists. Most products/platforms publish basic guidelines and check lists to get the best out of their products/platforms.

In many cases, you may be able to resolve some of the performance problems by following the best practices. Keep in mind that there is no single ‘best’ approach for all problems or environments. An approach that works for John may not work for Michael. Each environment, use case, architecture and challenge may be different and the best practices may be used as a set of basic guide lines.

Discovering tuning opportunities

When a performance tuning expert starts looking into your problem, usually he/she will be focusing on identifying possible ‘tuning opportunities’. I believe this is half an art and half a science. So different people can come up with solutions providing different levels of performance results, primarily due to the ‘art’ part of it. I believe, to be able do a good job in performance tuning, one may need to develop an attitude and way of thinking that is performance oriented. It is important to realize that even milliseconds matter. A tuning opportunity that reduces an operation by 1 millisecond may result in huge performance improvement if that operation is used heavily. I was hardly lucky to find single tuning opportunity that saved many seconds with a single change. Instead most of the tuning opportunity that I dealt with were within milliseconds.

Examining the Hardware and Software Layers

I wrote my first program in early 90’s with DBase, which was single user, single threaded, single tier and ran on a single computer. DBase managed the data and user interface of my application. Everything was tightly bound into DBase engine. When a user performs an application function, the command goes directly into the DBase engine and the route was pretty short between the user and the core application code/data.

This is not the scenario today because most applications are accessed by millions of users over the internet simultaneously. The performance of your application depends on a number of hardware, software and network components within and outside your application. On the far end outside your reach exists the end user’s computer, its configuration, the capabilities of the browser running your application, internet speed of the user etc. Within your data centre, a number of hardware components affect the performance of your application such as the data centre bandwidth, capabilities of the router and switches, intranet speed, capabilities of the load balancers, server hardware etc. Within your application, the various components and subsystems such as the presentation layer, business layer, caching layer, database layer, external APIs and any additional layers you may have affects the overall performance of the application.

A tuning expert usually will look at the different layers and subsystems to identify the area where he/she can find the most tuning opportunity. Based on that he/she may plan and execute a tuning plan to achieve the desired performance goals. Sometimes the rework or changes to a single layer may help to achieve the expected performance goals. Many times it may also need rework on multiple layers, introduction or removal of one or more layers as well as major rework on the overall architecture.

Zooming into the layers

In the next several posts, I would like to drill down into each of those layers and share my thoughts and comments on various check lists, best practices, tuning opportunities, common mistakes etc. None of the ideas and approaches I am presenting through these series of posts are claimed to be the ‘best’ or ‘the only’ of its kind.  These ideas and thoughts are shaped through my experience, what I have seen, heard and done over the years and what I have learned from reading, listening or speaking to other experts in the industry.

To be able to serve as ‘Quick Reads’ and avoid boredom, I will try to limit the posts to be around 1000 words. Whoops, It is already 1128 words. Bye, see you soon.