How I Became a…SQL Server DBA

Kevin Hill mentioned this idea/series on a SQL community slack channel back in April and I thought it would be a good way to get back to blogging. The timing worked out well as I had just started a new job, my first with the official title of “SQL Server DBA.” So how’d I get here?

In college, I took a single database course. I’d messed around with Microsoft Access a bit, but wanted to get a better handle on what I was doing. The course was not at all what I was expecting. I passed and did OK, but I didn’t completely grasp the material. The class was mostly deep RDBMS theory including “how do we store this on disk” – I wrote minimal amounts of SQL in this course because it wasn’t required.

I graduated and took my shiny new Computer Science diploma to my first job, and within a few months I had a solid handle on Classic ASP, building apps with it and handling some of the server admin stuff on the NT4 boxes that hosted them. I spent a little over 5 years there and got minimal exposure to databases as that wasn’t what my job function demanded – I’d write some queries against DB2 on the mainframe or a SQL Server instance, but that was about it. The DBAs took care of everything else.

After a few years, I moved on from that position as I wanted to relocate for personal reasons. I found a job doing some Java work on an in-house application and system customization/integration for a purchased application that was used as the hub for the company’s core business. In the course of working on those systems, I started doing a lot more SQL work, but at the time I only knew enough to be dangerous.

During a project to upgrade that system, I got a crash course in writing good SQL from Allen White (b|t), and learned much more about how SQL Server works from both him and Kendal Van Dyke (b|t). Allen and Kendal also introduced me to the SQL Server community and my eyes were opened. This was huge.

Over the next several years, I discovered that I was a developer who had DBA tendencies that I just hadn’t realized yet. I started to get involved with the SQL Server community. Talked to so many people. Subscribed to dozens of blogs. Attended SQL Saturdays and PASS Summits.

Then, one evening after we finished unpacking equipment and supplies from one of our Rochester SQL Saturdays, Matt Slocum (b|t) just asked me, point-blank. “So do you wanna be a DBA or what?” Ding! The lightbulb flicked on. I’m already doing a whole bunch of this stuff, and enjoying it – why not go for it?

I refocused my efforts on really understanding how SQL Server works. Looked for ways to leverage my programming experience with a slant toward managing databases. Did a lot more non-production DBA type work (I didn’t a lot of access to production, which was probably a good thing). After searching for a while, I landed a job as a full-time production DBA with a company operating a SaaS platform. It was a bit of a leap but one that I had to take as it was the right thing that came along at the right time. I’m nearly 2 months in now and I’ve learned a ton already. Made a few slip-ups, but that’s to be expected – just have to learn from that and move forward.

Advertisements

I Finally Get Cross Apply!

For years I’ve looked at various queries online in sample code, diagnostic queries using DMVs, and the like and seen CROSS APPLY in the FROM clauses. But I’ve never really managed to comprehend what it was for or how it worked because I never saw a case where it was directly applied to something I was doing.

Finally, this week I had a breakthrough. I was working on updating a bunch of data but it was breaking on a small subset of that data. In this case, I was attempting to JOIN two tables on fields that should have been INTs, but in a very small number of cases one side was using a comma-delimited string. The user told me that someone else had done these updates in the past and didn’t encounter the problem I was having (so I knew that it was something i was doing “wrong”), but given that it was only a handful of broken updates she was OK with manually doing the updates (we were scripting it because we were updating potentially tens of thousands of records).

I am not OK with manually fixing this in the future. I wanted to know how the other DBA had done it before. I dug into some history and found CROSS APPLY. My nemesis. I was determined to figure out how to use it this time.

Setting the Stage

Let’s set up three simple tables to keep track of airports and what state each airport is in. But our developer doesn’t totally get database design and in his state-to-airport mapping table, he allows for a comma-separated list of airports associated with each state.

    CREATE TABLE #States
    ([Id]      INT IDENTITY(1, 1),
    StateName NVARCHAR(30) NOT NULL
    );
    CREATE TABLE #Airports
    ([Id]     INT IDENTITY(1, 1),
    IATACode CHAR(3) NOT NULL
    );
    CREATE TABLE #StateAirports
    (StateId  INT PRIMARY KEY NOT NULL,
    Airports NVARCHAR(50)
    )

This makes getting a list of airports and their associated state names tricky at best if we don’t know about CROSS APPLY. With CROSS APPLY, it’s pretty straightforward.

Solution

Here’s the finished query.

    SELECT s.statename,
        a.iatacode
    FROM #StateAirports SA1
        CROSS APPLY string_split(SA1.airports, ',') AS SA2
        JOIN #Airports A ON A.Id = SA2.value
        JOIN #states S ON S.Id = SA1.stateid

string_split() is a Table Valued Function which we finally got in SQL Server 2016 after far too many years of having to write (or, let’s face it, copy from someone’s blog post) inefficient string splitting functions. Important note: even if your database engine is SQL Server 2016, the database you’re operating in must be at CompatibilityLevel 130

Breaking it down

If we take CROSS APPLY and break it down into its parts, it finally starts to make sense.
* APPLY the string_split() function to the Airports field of the #StateAirports table
* Append the each row of string_split()‘s output to the corresponding row on #StateAirports (similar to a CROSS JOIN but not exactly)

So now I have N rows for each StateId in #StateAirports, where N is the number of values in the comma-separated field. And JOINed to each row is one of the rows from the output of string_split().

    SELECT *
    FROM #StateAirports SA1
        CROSS APPLY string_split(SA1.airports, ',') AS SA2

From there, the query is pretty normal otherwise, JOINing to the other two tables to translate the state & airport ID numbers to their text values.

Hopefully this helps others get a handle on CROSS APPLY and find useful places for it. This had been a head-scratched for me for years, but only because I didn’t have an example that clearly broke down how to use it and what was going on. In hindsight, I probably could have used it in some analysis I did at a previous job but instead resorted to parsing & processing comma-separated data in a PowerShell script.