T-SQL Tuesday #92: Lessons Learned the Hard Way

tsql2sday-300x300This month’s T-SQL Tuesday is hosted by Raul Gonzalez and he’s asked everyone to share things we might be a bit embarrassed about:

For this month, I want you peers to write about those important lessons that you learned the hard way, for instance something you did and put your systems down or maybe something you didn’t do and took your systems down. It can be also a bad decision you or someone else took back in the day and you’re still paying for it…

  • In the stress/performance testing portion of an upgrade of a critical system, we were short on disk space. So, rather than having a separate set of VMs for the performance testing (as we needed to be able to get back to functional testing quickly), we decided to just take VM snapshots of all the servers. Testing was delayed a day or two – but we didn’t switch off the snapshots. Then we started testing and performance was terrific…for about five minutes. Then everything came to a screeching halt. Panicked, we thought we were going to need a pile of new hardware until the VMWare admin realized that our disks were getting hammered and we still had those active snapshots.
    Lesson learned: If you take VM-level snapshots of your database server and let them “soak” for an extended period, you’re gonna have a bad time. Unless you need to take a snapshot of the host OS or instance configuration itself, use a database snapshot instead of a VM-level snapshot.

  • A couple of times, I’ve had under-performing VMs running SQL Server. As I hadn’t been involved in the configuration, I thought everything had been provisioned properly. Turns out…not so much. Memory reservations, storage configuration, power profiles, all set up for suboptimal performance.
    Lesson learned: Ask your VMWare admin if they’ve perused the best practices guide and review things yourself before going down the rabbit hole of SQL Server configuration & query tuning. If the underlying systems aren’t configured well, you’ll spin your wheels for a long time.

  • In doing a configuration review of a rather large (production) instance, I noted that at least one configuration option was still set to the default value – Cost Threshold for Parallelism was stuck at 5. Running sp_BlitzCache, I found that I had quite a few simple queries going parallel and huge CXPACKET waits. CXPACKET isn’t bad per se, but if you’ve got a low-cost query that’s going parallel and waiting on threads where it could be running faster overall single-threaded (verified this was the case for several of the top offenders), increasing the cost threshold can help. I did some checking, verified that it was a configuration change I could make on the fly, and set the value to 50.
    And then everything. Slowed. Down.
    When I made this configuration change on the test instance, it wasn’t much a problem. But that was a much smaller instance, with much less traffic. What I failed to fully comprehend was the impact of this operation. I overlooked that changing this setting (and a number of others I wasn’t aware of) blows out the plan cache. In the case of this instance, about 26Gb of plan cache. Not only was performance impacted while the plan cache was re-filled, we took a hit while all the old plans were being evicted from cache.
    Lesson learned: Even if it seemed OK in test, those “low impact” changes can have a much larger impact on production unless you can make test mirror production in every way. So plan when you make these changes accordingly.

We learn the most from our mistakes. We can learn almost as much from the mistakes of others. Learn from mine.

Advertisements

Spell-checking dbatools with Visual Studio Code

Earlier this week I was working on adding a new feature to Update-DbaTools and while looking at another cmdlet to check syntax/conventions, I noticed an ugly typo in some of the help for it. 100% perfect prose isn’t necessary in the comment-based help for PowerShell cmdlets, but seeing misspellings and such kind of bugs me. Fortunately this is something I can help fix since the module is on Github.

First I needed to find a spell-checker that works with Visual Studio Code to help me spot misspellings. This was slightly trickier than expected, as I use macOS at home and at least one of the first plugins I found was Windows-only. I finally settled on Code Spellchecker.

But as you can see from the marketplace page there, by default this plugin doesn’t know PowerShell. In my user settings file settings.json, I added PowerShell to the cSpell.enabledLanguageIds section so it’s always recognized:

"cSpell.enabledLanguageIds": [
        "c",
        "cpp",
        "csharp",
        "go",
        "javascript",
        "javascriptreact",
        "json",
        "latex",
        "markdown",
        "php",
        "plaintext",
        "powershell",
        "python",
        "text",
        "typescript",
        "typescriptreact",
        "yml",
        "powershell"
    ],

And with that, VSCode was giving me green squiggles under lots of words – both misspelled and not. Code Spellchecker doesn’t understand PowerShell in its default setup, it doesn’t have a dictionary for it. Just to get things started, I added a cSpell.userWords section to my settings.json and the squiggles started disappearing. The list I’m working with so far is posted as a gist on Github:

I’ll keep this updated as I encounter more strings that need to be recognized, whether they’re PowerShell tokens or specific to the dbatools project. In addition to actual PowerShell syntax in there, I’m dropping in strings that are commonly found throughout the module. Eventually I suppose I should get a proper dictionary file or two together, but this works well for a quick & dirty way to get going with a spellcheck & language cleanup for the module.