T-SQL Tuesday #92: Lessons Learned the Hard Way

This month’s T-SQL Tuesday is hosted by Raul Gonzalez and he’s asked everyone to share things we might be a bit embarrassed about:

For this month, I want you, peers, to write about those important lessons that you learned the hard way: for instance, something you did that put your systems down, or maybe something you didn’t do that took your systems down. It can also be a bad decision you or someone else made back in the day that you’re still paying for…

  • In the stress/performance testing portion of an upgrade of a critical system, we were short on disk space. So, rather than standing up a separate set of VMs for the performance testing (we needed to be able to get back to functional testing quickly), we decided to just take VM snapshots of all the servers. Testing was delayed a day or two – but we didn’t switch off the snapshots. Then we started testing and performance was terrific…for about five minutes. Then everything came to a screeching halt. Panicked, we thought we were going to need a pile of new hardware, until the VMware admin realized that our disks were getting hammered because we still had those active snapshots.
    Lesson learned: If you take VM-level snapshots of your database server and let them “soak” for an extended period, you’re gonna have a bad time. Unless you need to capture the host OS or instance configuration itself, use a database snapshot instead of a VM-level snapshot (there’s a quick sketch of the database snapshot approach after this list).

  • A couple of times, I’ve had under-performing VMs running SQL Server. As I hadn’t been involved in the configuration, I thought everything had been provisioned properly. Turns out…not so much. Memory reservations, storage configuration, power profiles, all set up for suboptimal performance.
    Lesson learned: Ask your VMware admin if they’ve perused the best practices guide, and review things yourself before going down the rabbit hole of SQL Server configuration & query tuning (a couple of the quick host-level checks I run are sketched after this list). If the underlying systems aren’t configured well, you’ll spin your wheels for a long time.

  • In doing a configuration review of a rather large (production) instance, I noted that at least one configuration option was still set to the default value – Cost Threshold for Parallelism was stuck at 5. Running sp_BlitzCache, I found quite a few simple queries going parallel along with huge CXPACKET waits. CXPACKET isn’t bad per se, but if you’ve got low-cost queries going parallel and waiting on threads when they could run faster single-threaded overall (I verified this was the case for several of the top offenders), increasing the cost threshold can help. I did some checking, verified that it was a configuration change I could make on the fly, and set the value to 50 (the change itself, plus a check of how much plan cache is at stake, is sketched after this list).
    And then everything. Slowed. Down.
    When I made this configuration change on the test instance, it wasn’t much of a problem. But that was a much smaller instance, with much less traffic. What I failed to fully comprehend was the impact of the operation: changing this setting (like a number of others I wasn’t aware of) blows out the plan cache. In the case of this instance, that was about 26 GB of plan cache. Not only was performance impacted while the plan cache refilled, we also took a hit while all the old plans were being evicted from cache.
    Lesson learned: Even if it seemed OK in test, those “low impact” changes can have a much larger impact on production unless you can make test mirror production in every way. So plan accordingly when you make these changes.
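For anyone who hasn’t used one, a database snapshot is a one-statement affair. Here’s a minimal sketch run through Invoke-SqlCmd; the instance, database, logical file, and path names are placeholders, not the real system:

```powershell
# Minimal sketch: take a database snapshot before testing instead of a VM-level snapshot.
# Instance, database, logical file, and path names are placeholders.
# Note: the ON clause needs one entry per data file in the source database.
Invoke-Sqlcmd -ServerInstance 'SQLTEST01' -Query @"
CREATE DATABASE MyAppDB_PreTest_Snapshot
ON ( NAME = MyAppDB_Data,  -- logical name of the source data file
     FILENAME = 'D:\Snapshots\MyAppDB_PreTest.ss' )
AS SNAPSHOT OF MyAppDB;
"@

# Rolling back is a single restore, with no lingering snapshot delta hammering the disks:
# RESTORE DATABASE MyAppDB FROM DATABASE_SNAPSHOT = 'MyAppDB_PreTest_Snapshot';
```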
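While I’m at it, here are a couple of the quick host-level checks I now run before diving into query tuning. This is just a sketch and no substitute for reading the best practices guide itself; the instance name is a placeholder:

```powershell
# Two quick sanity checks before blaming SQL Server itself (instance name is a placeholder).

# Is Windows on the High Performance power plan, or is the CPU being throttled?
powercfg /getactivescheme

# What do the instance-level memory settings look like?
Invoke-Sqlcmd -ServerInstance 'SQLPROD01' -Query @"
SELECT name, value_in_use
FROM sys.configurations
WHERE name IN ('min server memory (MB)', 'max server memory (MB)');
"@
```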
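As for that cost threshold change, the statements themselves are trivial, which is exactly why the impact is easy to underestimate. A sketch of the change, plus a query to see how much plan cache is about to be thrown away (again, the instance name is a placeholder):

```powershell
# How much plan cache is on the line? (Instance name is a placeholder.)
Invoke-Sqlcmd -ServerInstance 'SQLPROD01' -Query @"
SELECT SUM(CAST(size_in_bytes AS BIGINT)) / 1024 / 1024 AS plan_cache_mb
FROM sys.dm_exec_cached_plans;
"@

# The change itself; cost threshold for parallelism is an advanced option.
Invoke-Sqlcmd -ServerInstance 'SQLPROD01' -Query @"
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;
"@
```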

We learn the most from our mistakes. We can learn almost as much from the mistakes of others. Learn from mine.

T-SQL Tuesday #86: SQL Server Bugs & Enhancement Requests

This month’s T-SQL Tuesday is hosted by Brent Ozar and he’s asked everyone to find interesting bugs or enhancement requests in Microsoft Connect related to SQL Server.

The Connect item doesn’t have to have anything to do with T-SQL – it could be about the engine, SSRS, R, the installer, whatever. Now, more than ever, Microsoft has started to respond to Connect requests and get ’em fixed not just in upcoming versions of SQL Server, but even in cumulative updates for existing versions.

Confession time: At the moment, I’m not a DBA (I’m a dev who occasionally does some DBAish stuff), and I don’t spend a lot of time on Microsoft Connect.

However, I do spend a bunch of time in Powershell, fetching data to process/analyze/mangle for people on an ad-hoc basis, running scheduled tasks, or doing research trying to find things in an application’s code (stored in the database). And I frequently find myself using Invoke-SqlCmd.

I found two open items which, although addressed by Invoke-SqlCmd2, would be of benefit to the community at large if resolved in the SqlServer module.

  1. Powershell invoke-sqlcmd should support parameterized queries – It’s 2017 and we’re still reading about SQL injection attacks caused by developers improperly escaping text input. Parameterized queries have been around for a very long time and go a long way toward mitigating this vulnerability. Unfortunately, in its current state Invoke-SqlCmd can only accept a string as the query to be executed (or an input file), and that query is frequently built by Powershell users through string concatenation or text replacement. Even in cases where I’ve created all of the content being concatenated or replaced, I still don’t fully trust my input (my usual System.Data.SqlClient workaround is sketched after this list).
  2. Invoke-Sqlcmd does not seem to meaningfully support the -Verbose or -Debug switches – Most Powershell cmdlets allow you to get additional information about the cmdlet’s execution sent to the Verbose and Debug output streams, but Invoke-SqlCmd does nothing. Seeing the connection string and other details about the query’s execution would be helpful in some cases. I’d like to see this request expanded to add the -WhatIf switch as well, so that when the cmdlet is called from scripts that properly support SupportsShouldProcess, the query isn’t actually executed if that switch is specified (I’ve sketched a wrapper along those lines after this list).
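Until the first item is addressed, my workaround is to drop down to System.Data.SqlClient. A minimal sketch; the server, database, table, and parameter names are placeholders:

```powershell
# Parameterized query via System.Data.SqlClient, since Invoke-SqlCmd can't do this today.
# Server, database, table, and parameter names are placeholders.
$userSuppliedType = "O'Brien"   # note the apostrophe: no escaping gymnastics needed

$connection = New-Object System.Data.SqlClient.SqlConnection('Server=SQLPROD01;Database=AppDB;Integrated Security=SSPI;')
$connection.Open()

$command = $connection.CreateCommand()
$command.CommandText = 'SELECT EventId, EventDate FROM dbo.Events WHERE EventType = @EventType;'
$param = $command.Parameters.Add('@EventType', [System.Data.SqlDbType]::VarChar, 50)
$param.Value = $userSuppliedType

# Fill a DataTable with the results, then tidy up.
$table = New-Object System.Data.DataTable
$adapter = New-Object System.Data.SqlClient.SqlDataAdapter($command)
$adapter.Fill($table) | Out-Null
$connection.Close()

$table | Format-Table
```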
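And for the second item, nothing stops us from wrapping the cmdlet ourselves in the meantime. A rough sketch of what I mean; the function and its parameters are mine, not part of the SqlServer module:

```powershell
# Hypothetical wrapper (not part of the SqlServer module): honors -WhatIf/-Confirm
# and writes some Verbose detail before handing off to Invoke-SqlCmd.
function Invoke-MySqlCmd {
    [CmdletBinding(SupportsShouldProcess = $true)]
    param(
        [Parameter(Mandatory)] [string] $ServerInstance,
        [Parameter(Mandatory)] [string] $Query,
        [string] $Database = 'master'
    )

    Write-Verbose "Connecting to $ServerInstance, database $Database"
    Write-Verbose "Query: $Query"

    if ($PSCmdlet.ShouldProcess("$ServerInstance.$Database", 'Execute query')) {
        Invoke-Sqlcmd -ServerInstance $ServerInstance -Database $Database -Query $Query
    }
}

# With -WhatIf the query is never sent; with -Verbose you can at least see what would run.
Invoke-MySqlCmd -ServerInstance 'SQLPROD01' -Query 'EXEC dbo.SomeCleanupProc;' -WhatIf -Verbose
```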

I had planned on logging a Connect item very similar to Mike Fal’s (b|t) request for passing a connection string to Invoke-SqlCmd, but logging a duplicate would be pointless as that’s already been implemented (I warned you that I don’t spend a lot of time on Connect; I didn’t even remember it was out there)! I still wouldn’t mind being able to pass in just an Application Name to append to the connection string, but that may lead down the path of giving the cmdlet too many options to fiddle with. It’s a tough balancing act when you have an audience as large as the SQL Server team’s.

T-SQL Tuesday #83: Why Leave Well Enough Alone?

It’s 2016. So why are we still dealing with T-SQL code and design patterns that date back seven versions?

In the 15 years I’ve been working with databases professionally, we’re still dealing with:

  • People’s names split into first name, last name, and middle initial fields. Ignoring that this falls afoul of several of the myths programmers believe about names, the first name column was defined as CHAR(10) in a standard installation. How many characters are in the name Christopher (hint: I had to take off a shoe to count them all)?
  • Other arbitrarily short column sizes which cause problems as the system scales out in usage. For example, an event ID field that’s 8 characters: 2 letters and a 6-digit number which is used as a sequence. Guess what happens when you hit the millionth event in that sequence.
  • Processes originally developed as transactions (for good reasons), but not designed in such a way that they scale to today’s demands.
  • NOLOCK hints everywhere. It’s even in newly-developed code for this application.
  • Cursors used any time a set of records has to be updated with a small bit of conditional logic. A set-based operation with appropriate CASE expressions would work much better (there’s a trivial example after this list).
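To illustrate that last bullet, here’s the shape of the set-based alternative, pushed through PowerShell’s Invoke-SqlCmd. The instance, table, and columns are invented for the example:

```powershell
# Illustration only: the instance, table, and columns are made up. One set-based UPDATE
# with a CASE expression replaces a row-by-row cursor with embedded IF logic.
Invoke-Sqlcmd -ServerInstance 'SQLPROD01' -Database 'AppDB' -Query @"
UPDATE dbo.Orders
SET    Status = CASE
                    WHEN ShippedDate IS NOT NULL THEN 'Shipped'
                    WHEN PaidDate    IS NOT NULL THEN 'Paid'
                    ELSE 'Open'
                END
WHERE  Status <> 'Cancelled';
"@
```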

The primary system I deal with on a daily basis was originally developed as a DOS application and several of the above examples are drawn from it. Looking at the core tables and columns, it’s easy to identify those that began life in those early days – they all have 8-character names. Time moved on and the system grew and evolved. DOS to Windows. Windows to the web. But the database, and the practices and patterns used in the database, haven’t come along for the ride.

Data schema conversions can be hard and disruptive – you need to update your application and your stored procedures, and provide customers/users with a clean migration path. Code changes require testing. Complexity and cost grow every time you introduce changes. I get that.

But by not keeping up with the advancements of the platform your data resides on and ignoring the evolution of how to work with your data, you do your customers, users, partners, colleagues and yourself a disservice.

How do you improve this? I’m certainly not advocating for scrapping everything and rewriting all of your code. Complete rewrites are generally a bad idea.

What I am saying, however, is:

  • You need to be constantly watching the state of the platforms your software runs on. If you drop support for a particular version (say, dropping SQL Server 2005 support as Microsoft no longer supports it), start evaluating the 2008+ features that are now open to you.
  • Drop support for old versions of SQL Server. Don’t let the past shackle your future.
  • Get outside opinions from trusted sources, whether from your local user group, a short consulting engagement, or bringing in new people. Most importantly, when you seek advice, make use of it. Don’t ask for advice and then ignore it.
  • Don’t accept the status quo. Anytime you’re working in a piece of code, review the whole thing. Can this section be cleaned up? Is it even needed anymore? Has the system scaled so much in usage or data volume that it needs to be rethought entirely? Have you learned something new from your favorite SQL Server blog or a SQL Saturday event that you can apply to it?

That last point is where everyone responsible for an application or database can make the most impact. To co-opt Baden-Powell’s last message to the Boy Scouts of the world: leave the code a little better than you found it. If you do this every time you touch a component of your database, you’ll make enough incremental updates that these 15-year-old problems will go away and stay away.

T-SQL Tuesday #61 – Giving Back

Wayne Sheffield (b|t) is hosting this month’s T-SQL Tuesday and his topic is Giving Back to the SQL Community. More specifically, he’s asking how each of us is planning on giving something back to the SQL Community in 2015. He offers up a few suggestions, so I’ll start by addressing those and then move on to additional ideas.

  • Are you going to start speaking at your local user group?
    Yes, I expect that by the end of 2015 I will have spoken to our local chapter at least once. I spoke to various groups at work in 2014 and plan to continue doing so in 2015 as well.
  • Perhaps step up and help run your local user group?
    I was named the Vice President of our local chapter a couple months ago, and I will continue in that capacity.
  • Do you want to start becoming an active blogger – or increase your blogging?
    Yes! At the time of this writing I’ve only published 7 posts here, and I have 6 others in various stages of preparation. I have some ideas brewing, I just need to get things written and then actually press that Publish button. Part of it is fear/insecurity, and I need to get out of my comfort zone a little and Just Do It.
  • Do you plan on volunteering your time with larger organizations (such as PASS), so that SQL Training can occur at a larger level?
    If I have the opportunity to attend PASS Summit in 2015, I will volunteer at the event. When the call for pre-event volunteers goes out, I’ll look at what’s needed and try to step a little out of my comfort zone & do something there as well.
  • Other ways of contributing
    • For the 3rd year, I will be helping to organize and run SQL Saturday Rochester in 2015. If you’re reading this, you probably know about SQL Saturday, and have probably even been to one. Next time, bring a friend!
    • I’ve been promoting PASS and our local chapter for a while at work and will be more vocal in 2015. There are a lot of people with knowledge and experience they can share who aren’t even aware that PASS and the local and virtual user groups exist. I want to help bring those people into the community.

T-SQL Tuesday #58 – Passwords

This month’s T-SQL Tuesday topic is passwords. I’m neither a DBA nor a server/system admin, so the only passwords I get to manage are my own. But there’s still lots to talk about. Passwords (or rather, weak passwords) have been in the news a lot over the past two weeks, so it’s timely.

This is the password story I’d like to tell my kids, but they’re too young to understand yet.

What’s Your Password?

I can count on both hands the number of accounts I have actually memorized the password for.

  • Personal Laptop
  • Personal email (2 accounts)
  • Active Directory (work)
  • One non-production service account
  • 2 non-AD-integrated applications
  • Amazon
  • 1Password

Pretty much everything else is a very random string that I have no hope of memorizing. For example, n9;r27LBL8x2x6=X. It’s that last item above that lets me get away with it. Between password complexity rules, the increasing sophistication of attackers, and the frequency of major data breaches, it’s almost impossible to get by without some kind of password manager. You need to be changing your passwords regularly, and using strong ones. 1Password helps me with both of those; it shows me how old each password is, and it generates good, random passwords as seen above. All I have to remember is my master password. I rarely type a password now; it’s copy/pasted from 1Password, or automatically entered thanks to the browser extension.
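If you ever need a throwaway random string and don’t have a generator handy, PowerShell can oblige. This isn’t how 1Password does it, and Get-Random isn’t cryptographically secure, so treat it as a sketch rather than a recommendation:

```powershell
# Quick-and-dirty 16-character random string from printable ASCII.
# Not cryptographically secure; use a real password manager for anything that matters.
$charSet = 33..126 | ForEach-Object { [char]$_ }
-join (1..16 | ForEach-Object { Get-Random -InputObject $charSet })
```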

But that’s not enough. A lot of websites insist on collecting extra information for account “security”, such as the name of my first pet or what my first car was. But answering those questions truthfully doesn’t provide as much security as one might think, so I use 1Password to generate random strings for these answers and store those Q&A pairs along with the credentials.

Is that enough? No! Even if you did all of that, your credentials were still easy to capture for about two years thanks to the Heartbleed SSL bug.

More Layers

Security should be like an ogre…er…onion. You need layers. Passwords aren’t enough. More and more websites and services are offering Two-Factor Authentication (2FA) now, but they aren’t making it very well-known. Google, Dropbox, WordPress, Evernote, Facebook, Microsoft and GitHub (and that’s just the list I’ve got registered on my phone) will let you further secure your account by requiring you to enter a second code after your password, either one sent to your phone via text message (or phone call) or automatically generated via an app like Google Authenticator (not unlike a SecurID token). It’s an extra step, but it makes things a lot safer. Even if someone were to get your credentials in a Heartbleed-type attack, they’d be pretty useless with 2FA enabled – at least on that site.

Be Careful Out There

Many years ago when I was in college, I thought my password was safe. I wasn’t sharing it with anyone, and it was reasonably complex – for the time. Then, over one Christmas break, my account was compromised. I had been checking my mail over an unencrypted POP3 connection; the password got sniffed, and someone got into my account. Fortunately, no damage was done, but the lesson was learned. With so much of our identities and our personal & financial lives kept behind these virtual doors, it’s vital that we take every possible precaution in securing those accounts (unfortunately, I know too many people who are still not doing this). Sure, POP3 was quick & convenient, but it was extremely dangerous.

There are still websites/services that aren’t so careful. If you see a website that limits your password to anything less than about 30 characters, is not case-sensitive, or doesn’t allow you to use certain characters, there’s a chance they’re not storing passwords safely (I won’t get into that with this post, as I’m sure another T-SQL Tuesday author will dive in and do it better than I would). And if any website sends you your password (the one you created for the site) via email, run away (and drop a line to Plain Text Offenders) because they aren’t doing anything to protect you.

Wrap Up

  • Make sure the login form you’re using is properly secured
  • Use strong passwords
  • Use Two-Factor Authentication everywhere you can
  • Watch out for careless site operators
  • Don’t use weak passwords & skip other security measures because “it’s not convenient”

But just so we’re clear: Unlike the hundred-plus celebrities who were compromised, there’s nothing on my iCloud account that anyone should be subjected to.

T-SQL Tuesday #39 – Here’s what my PoSH is cooking

My first official entry for T-SQL Tuesday (my actual first was a guest post hosted by Kendal Van Dyke (b|t), so I’m not really counting it) is brought to you by PowerShell, or PoSH. Ever since I discovered PoSH and really dove into learning it a couple of years ago, my co-workers have gotten a bit annoyed by my insistence upon using it for everything. It is my favorite hammer, and around me I see nothing but acres and acres of nails.

I’m not a DBA, so I don’t do as much managing of databases with it as most of the people joining this party, but I still use PoSH with SQL Server pretty often. I spend a lot of time pulling data from SQL Server & crunching/analyzing it, sometimes merging it with some filesystem data as well.

So what have I done lately?

  • Space usage analysis. I work with a system which generates PDF documents, saves them to the server filesystem, and records the details to a table. But how quickly are we consuming space? When will we run out? I whipped up a quick PoSH script to pull the details from the table, locate the files on the filesystem, and record everything to another table for slicing & dicing (a rough sketch of the approach is at the end of this list).
  • Quick ad-hoc data dumps. Sometimes we just need to pull some data out of the database, crunch a few numbers, and send it off to someone upstairs. Before PoSH, I’d run the query in SSMS, copy the data, paste it into Excel, drop in a few formulas and maybe a graph, and be done. But I’d spend more time fighting Excel on formatting & getting the columns right than I did getting the data into it. Invoke-SQLCmd piped to Export-CSV solves that really quickly (there’s a short sketch of that after this list, too).
  • I’ve been working on upgrading a system we purchased from a vendor and migrating everything to a new server at the same time. Moving & updating XML configuration files, thousands of check images, restarting services, migrating databases. And this isn’t a one-time event – we have to do this over 200 times! The SQL Server portion of this process isn’t particularly involved, but it’s vitally important:
    • After most of the heavy lifting of moving things around is complete, one table has to be updated to point at the new path for the images that were migrated.
    • When all of the migrations are finished, we have to validate that everything moved over properly. A few carefully selected queries compare critical tables between the old version of the database and the new one, putting minds at ease that all of our data has come over cleanly. Those queries, along with the other components of the validation, are run via the PoSH script & output to a file for review.
  • For reporting purposes, we load a small subset of data from Active Directory into a pair of SQL Server tables on a nightly basis. Previously, it was only 2 fields but recently this has been expanded to about 8. Once again, PoSH to the rescue! Pull the AD user accounts, select the properties we need, and insert it all into the tables. Originally I used my old standby Invoke-SQLCmd, but with the addition of new fields I got concerned about building queries via string concatenation using arbitrary data retrieved from another system. System.Data.SqlClient & prepared statements to the rescue! It’s more code, but it’s a lot safer.
  • Dave Ballantyne reminded me that I have two PowerShell scripts for SSRS.
    • The first is based upon this script from Randy Aldrich Paulo and deploys reports into SSRS. My addition was the option to pull an RDL file straight from our Subversion repository instead of requiring the person doing the deployment to check it out.
    • The second is based upon this StackOverflow post and is used to render reports to file. In one case, the generated report is then SFTPed to a vendor immediately afterward.
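Here’s roughly the shape of that space-usage script from the first bullet. The server, database, tables, and columns are placeholders for illustration:

```powershell
# Rough sketch of the space-usage collection; all object names are placeholders.
$query = 'SELECT DocumentId, FilePath FROM dbo.GeneratedDocuments;'
$docs  = Invoke-Sqlcmd -ServerInstance 'SQLPROD01' -Database 'AppDB' -Query $query

foreach ($doc in $docs) {
    # Look up the file on disk; missing files get recorded with a size of 0.
    $file   = Get-Item -LiteralPath $doc.FilePath -ErrorAction SilentlyContinue
    $sizeKB = if ($file) { [int]($file.Length / 1KB) } else { 0 }

    # Numeric values only here, so a string-built INSERT is tolerable; anything
    # text-based should go through a parameterized command instead.
    Invoke-Sqlcmd -ServerInstance 'SQLPROD01' -Database 'AppDB' -Query "
        INSERT INTO dbo.DocumentSpaceUsage (DocumentId, SizeKB, CollectedDate)
        VALUES ($($doc.DocumentId), $sizeKB, GETDATE());"
}
```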
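And the ad-hoc dump really is just a few lines. Again, the server, database, query, and output path are placeholders:

```powershell
# Ad-hoc dump straight to CSV, no fighting with Excel formatting.
# Select-Object keeps the DataRow bookkeeping properties (RowError, ItemArray, etc.)
# out of the output file.
$query = @"
SELECT DocumentId, CreatedDate, PageCount
FROM dbo.Documents
WHERE CreatedDate >= DATEADD(MONTH, -1, GETDATE());
"@

Invoke-Sqlcmd -ServerInstance 'SQLPROD01' -Database 'AppDB' -Query $query |
    Select-Object DocumentId, CreatedDate, PageCount |
    Export-Csv -Path 'C:\Temp\documents-last-month.csv' -NoTypeInformation
```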

These are pretty mundane compared to what I’m sure a lot of people participating will be posting, but the key is this: PoSH is saving me time and by scripting everything I can, I’m significantly reducing the chance for errors. Of course, when there is an error, it’s magnified tremendously – so it all gets tested against a non-production server first.