Getting Ready for Cassandra Summit 2016

Apache Cassandra is a highly scalable, high-performance distributed database designed to handle large amounts of data across multiple servers, providing high availability with no single point of failure.

It has been almost a year since we started exploring Apache Cassandra. In the last 12 months, we created several prototypes and spent countless hours testing the various features Cassandra offers. So far we are very satisfied with the results and are moving ahead with migrating one of our databases to Cassandra in production.

Cassandra Summit 2016 is happening in San Jose from 6th to 9th September 2016. I see this as an opportunity to learn more about Cassandra, see real-world case studies and meet a bunch of Cassandra experts. This is going to be my first experience attending a conference on Cassandra and I am hopeful that it will be a different learning experience, quite different from the various resources we have used so far to build our Cassandra expertise.

From SQL Server to Cassandra

This is the first time I am getting really serious about a database management system other than SQL Server. I have been actively focusing on Microsoft SQL Server for over 15 years. I started with SQL Server 6.5/7.0 in the late nineties and it was then a non-stop journey for close to two decades. During this period, I worked on several high-volume SQL Server-centric applications, wrote two books (and contributed to a third one), wrote numerous SQL Server articles and blog posts, and presented at several SQL Server conferences in India, USA and Europe. I was a Microsoft MVP for 8 years, got several opportunities to be at Redmond, had close interactions with the core SQL Server team and was given access to several insider resources. After spending time and energy on SQL Server for so long, shifting attention to another database management system was certainly painful.

The tests and experiments we have done with Cassandra so far show very promising results for addressing some of the challenges we face in our environment. We found the scalability features built into Cassandra to be amazing. We already have a distributed database environment built with Microsoft SQL Server, so we have ‘sort of’ addressed the scaling issue. However, we see that Cassandra can make our life much easier, remove a lot of database administration overhead and significantly reduce licensing costs (we all know that SQL Server EE (Enterprise Edition) is usually mocked as Expensive Edition).

Cassandra Summit 2016

Cassandra Summit 2016 is happening at the San Jose Convention Center from September 6 to 9. I have already selected the sessions I am going to attend :). Here is my wish list:

  1. Monitoring Cassandra: Don’t miss a thing – by Alain Rodriguez
  2. Cassandra Internals: The Read Path – by Tyler Hobbs
  3. Moving from Experiment to Production – by Christos Kalantzis
  4. Monitoring Cassandra at Scale – by Jason Cacciatore
  5. Cassandra Backups and Restorations using Ansible – by Joshua Wickman
  6. Hey Relational Developer, Let’s go crazy – by Patrick McFadin
  7. Troubleshooting Cassandra – by J. B. Langston
  8. Everything you wanted to learn about Tunable Consistency – by Edward Capriolo
  9. Securing Cassandra for Compliance (or Paranoia) – by Nate McCall
  10. Building a Multi-region cluster at Target – by Aaron Ploetz
  11. Always On: Building highly available applications on Cassandra – by Robbie Strickland
  12. Apache Cassandra multi-datacenter essentials – by Julien Anguenot
  13. Scalable Data Modeling by example – by Carlos Alonso
  14. The Promise and Perils of encrypting Cassandra Data – by Ameesh Divatia

 

Cassandra Summit runs several parallel tracks, which put me in a dilemma a number of times while selecting the sessions I want to attend. I have very high expectations of this conference and I will share my experience once I am back.

ABAW Challenge #5 – Alibaba – The House That Jack Ma Built

Another week into the A Book A Week Challenge and the book I picked was Alibaba – The House That Jack Ma Built by @duncanclark

This book narrates how Jack Ma built Alibaba, one of the world’s most valuable companies.

As mentioned in my previous post, I wanted to ensure that I read a biography every month as part of my ABAW challenge. A book on Jack Ma and Alibaba was the obvious first choice because I have been hearing a lot about Alibaba and following it closely for quite some time.

Today, Alibaba is one of the most valuable companies in the world and it is no longer just an e-commerce portal (as previously thought). This book helped me form a better picture of Jack Ma and Alibaba. The account of Alibaba's humble beginnings and of how Jack Ma reacted to the challenges is very inspiring.

Alibaba – The House That Jack Ma Built is certainly one of the best books I have read this year.

My Plan for Next Week

The following books are in my reading list for next week. I will pick one of the books from the list below.

A Book A Week Challenge – Adding some structure and discipline to the reading list

As I am done with the 4th week of the A Book A Week challenge, I am trying to add some structure to the reading list. So far the books were picked from random categories, and no long-term goals were defined for the type of books to add to my reading list.

I spent the last 4 weeks reading the following books:

Technology

  1. The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies
  2. The Industries of the Future

Software Engineering

  1. Site Reliability Engineering: How Google Runs Production Systems
  2. Professional Test Driven Development with C#: Developing Real World Applications with TDD

Structure of my future reading list

As this challenge is getting more and more interesting, I am trying to set up a structure that pushes me to read books from specific categories/topics each month. Here is what I have in mind.

  • Software Engineering – A book covering a branch of software engineering, software development process/methodology etc.
  • Technology – A book covering innovations and advances happening (as well as predicted) in various technology areas.
  • Biography – Biography of a legendary person.
  • Wildcard – A book that does not fall into any of the categories listed above.

I will be happy to hear suggestions from people who may have done similar activities.

 

ABAW Challenge #4 – The Industries of the Future

Another week into the ABAW Challenge and the book I picked was The Industries of the Future by Alec Ross. 

This is a very interesting book covering the innovations happening around the world. The author predicts that Genomics, Robotics and Cyber Security are going to be the most prominent industries for the years to come.

 

This book is written by Alec Ross who was Senior Advisor for Innovation to Secretary of State Hillary Clinton. During his tenure as Senior Advisor for Innovation he visited a large number of countries and closely studied the technological advances happening around the world.

This book examines the innovations happening in several industries and highlights Robotics, Genetics, Big Data and Cyber Security as the promising industries of the future.

I bought the Audible version of the book and that was a good choice. The Audible version is narrated by the author himself, so I could hear Alec talking passionately about the innovations around the industries of the future.

Did you read this book?

If you have read this book, I am eager to hear what you think about it.

My Plan for Next Week

The following books are in my reading list for next week. I will pick one of the books from the list below.

Any other recommendations?

SELECT * FROM XML

This is a 6-year-old post which I originally published on my previous blog site.

May 30 2010 11:49PM

Most people find it very difficult to deal with XML documents in T-SQL as there is no way to run a ‘blind’ SELECT * query on an XML document to get a quick view of the content stored in it. On a table, a “SELECT TOP N *” query can quickly give you a few records, which gives you an idea about the structure of the table and the type of values stored in the columns. One of the common queries that I run on a table that I am not familiar with is

SELECT TOP 1 * FROM tablename

This query will give me one record that I can review to understand the structure of the table. However, it is really hard to do something similar for an XML document. The “*” operator does not work for XML and hence I can write a query on the XML document only if I know the structure of the XML.

To make this easier, I have come up with a function that can give you a “SELECT * FROM XML” kind of functionality. You can pass an XML document to the function and it will return a tabular representation of the XML data. Here is an example that shows how you can use this function.

declare @x xml
select @x = '
<employees>
    <emp name="jacob"/>
    <emp name="steve">
        <phone>123</phone>
    </emp>
</employees>
'
SELECT * FROM dbo.XMLTable(@x) 

/*
NodeName  NodeType  XPath                        TreeView      Value XmlData
--------- --------- ---------------------------- ------------- ----- -------------
employees Element   employees[1]                 employees     NULL  <employees>..
emp       Element   employees[1]/emp[1]              emp       NULL  <emp name="..
name      Attribute employees[1]/emp[1]/@name            @name jacob NULL
emp       Element   employees[1]/emp[2]              emp       NULL  <emp name="..
name      Attribute employees[1]/emp[2]/@name            @name steve NULL
phone     Element   employees[1]/emp[2]/phone[1]         phone 123   <phone>123<..
*/

The ‘XPath’ column may be very helpful as it shows the XPath expression that you can use to retrieve a specific value from the XML document. For example, to retrieve the phone number, you can copy the XPath expression from the above result and directly put it in a query such as:

SELECT @x.value('employees[1]/emp[2]/phone[1]','VARCHAR(20)') AS Phone

/*
Phone
--------------------
123
*/
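
Once the XPath column has revealed the structure, the same paths can be turned into a set-based query. The following is a minimal sketch using the standard nodes() and value() methods of the xml data type on the same sample document; the column names Name and Phone and the alias t(e) are chosen here purely for illustration.

```sql
DECLARE @x XML;
SET @x = '
<employees>
    <emp name="jacob"/>
    <emp name="steve">
        <phone>123</phone>
    </emp>
</employees>';

-- nodes() returns one row per <emp> element;
-- value() then reads the attribute and the first <phone> child of each row.
SELECT
    e.value('@name', 'VARCHAR(20)')      AS Name,
    e.value('(phone)[1]', 'VARCHAR(20)') AS Phone  -- NULL when there is no <phone>
FROM @x.nodes('/employees/emp') AS t(e);
```

An employee without a phone element, such as jacob in this sample, comes back with NULL in the Phone column.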

Here is the complete listing of the function.

I have posted the code to gist so that people playing with this code can extend it and submit revisions.
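
As a rough illustration only, and not the XMLTable function itself, a much simpler query can already list every element and attribute of a document without knowing its structure. This sketch assumes nothing beyond the built-in nodes() and value() methods, and it returns only a name, a type and a value; it does not produce the XPath, TreeView or XmlData columns shown earlier.

```sql
DECLARE @x XML;
SET @x = '
<employees>
    <emp name="jacob"/>
    <emp name="steve">
        <phone>123</phone>
    </emp>
</employees>';

-- Elements: //* matches every element at any depth.
SELECT
    n.value('local-name(.)', 'NVARCHAR(128)') AS NodeName,
    'Element'                                 AS NodeType,
    n.value('(text())[1]', 'NVARCHAR(MAX)')   AS Value   -- first text node, if any
FROM @x.nodes('//*') AS t(n)

UNION ALL

-- Attributes: //@* matches every attribute at any depth.
SELECT
    n.value('local-name(.)', 'NVARCHAR(128)'),
    'Attribute',
    n.value('.', 'NVARCHAR(MAX)')
FROM @x.nodes('//@*') AS t(n);
```

The rows are not guaranteed to come back in document order, which is one of the reasons a fuller function that tracks the position of each node is more useful in practice.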

Next Steps

  1. The function currently does not support namespaces. The next version will add support for namespaces.
  2. Let me know your comments and feedback on this function.

What is your favorite Browser?

Which browser is the best? Is it Chrome, IE, Edge, Firefox or Safari? This is a topic of discussion that I hear frequently around me. Usually people getting into such a discussion highlight the pros of their favorite browser and the cons of the browsers they do not like.

I think it is more of a personal preference and convenience rather than the features or specific quality of a given browser.

I use Chrome as my primary browser. Almost 80% of my browser time is on Chrome. The second most frequently used browser is Firefox. I am probably using Firefox for approximately 15% of my total web time. Then I use IE for the remaining 5% of browser activity.

I am not saying that Chrome is great and IE is nasty. I am simply more comfortable using those 3 browsers the way I currently do. Maybe it is just a matter of personal preference.

Microsoft Employees using Chrome for Demos

What triggered this post is something I observed when going through a few Microsoft videos recently. I was watching the presentation on Azure Redis Cache and noticed that the PM demonstrating the Azure web portal was using the Chrome browser, not Microsoft’s IE or Edge.

See the videos at https://azure.microsoft.com/en-in/documentation/videos/index/?services=redis-cache

Then I looked at close to a dozen videos in the series. All of them were showing demos using Chrome. Well, is that a crime? No, certainly not. It is absolutely a personal choice. But the fact I am trying to highlight is that more and more people I see around me are more comfortable with Chrome than they are with IE.

I am not saying that all Microsoft PMs are using Chrome. I did see some demos where the presenter was showing web applications on IE/Edge. So, it is again a personal choice. I suppose Microsoft does not have a policy requiring its employees to use its own browser. I think that is very good. ‘Freedom of Browser’ is very well appreciated.

New IE versions are always painful

One pain-point I observed over the past many years is that every new version of IE breaks applications. Every time a new version of IE was released, we had to start a project to make our application compatible with the new version of IE. This has been happening from IE7 till today.

Is it their fault? Probably not. But it was painful and time consuming to do those updates every time we had to make the application ‘compatible’ with a new IE version.

However, the same did not happen with other browsers. I do not remember any scenario where something broke with a new update of Chrome or Firefox. Maybe it is just a coincidence, or maybe they were very careful NOT to break existing applications when releasing a new browser version.

What is your favorite Browser?

I think it will be interesting to hear about your favorite browser. Are you a Chrome addict or an IE/Edge supporter? Or is it Firefox, Safari or any other browser that you mostly use?

A Book A Week Challenge #3 – Test Driven Development (TDD)

Just completed the 3rd week in my ABAW challenge and the book I picked was Professional Test Driven Development with C#: Developing Real World Applications with TDD by James Bender and Jeff McWherter.

If you are new to TDD (Test Driven Development), this is a great resource to understand TDD and get started. I highly recommend this book for TDD beginners.

If you already have some experience with TDD, then this may not help you much. This might, however, introduce you to a few new tools and frameworks that you might find helpful / informative.

I used to be a big fan of TDD, back in the old days when I used to do a lot of development. The last time I touched it myself was in 2006. In the past 10 years, a lot has changed in and around TDD methodologies, tools and frameworks. This book certainly helped me to fill that gap reasonably well.

Back in my development days, I always had difficulty clearly defining a unit test for my team members. Often I found integration tests being written as unit tests because the boundary was not very clear. This book does a good job of defining the scope and boundaries of a unit test and differentiating it from other types of tests.

The chapter covering mock frameworks is very informative as well. Mock frameworks are largely ignored by many development teams that I interact with, and this book does a very good job of explaining the value of using a mock framework and how it makes everything work together within the TDD process.

A very good understanding of Object Oriented Programming (OOP) concepts is required to be able to implement TDD successfully within any project. While most other books that I looked at assumed that the readers are already familiar with OOP, the authors of this book have put a lot of effort into helping readers refresh their OOP understanding and gradually guiding them to practical TDD.

Next Week

My plan for next week is one of the following:

See you next week!