Selectively Locking Down Data – Gracefully
I have a situation where I need to retrieve the data in an encrypted column, but I don’t want to give all my users access to the symmetric key used to encrypt that column. The data is the sort where it’s important for the application to produce the required output, but if a user runs the stored procedure directly to see what the application is getting, it’s not critical that they see this one field.
The catch is that the stored procedure is written with the assumption that the caller has permission to access the encryption key and its certificate; callers without that permission will get an error. After a bit of research and pondering, I came up with two options:
- Create the stored procedure with EXECUTE AS OWNER (the owner in this case is dbo). This would let all users see the decrypted data – not an ideal solution.
- Use SQL Server’s TRY/CATCH construct to gracefully handle the error thrown when the user attempts to open the key, but doesn’t have permission to do so.
Let’s check out option 2. This example is simplified from my actual scenario to demonstrate the idea.
declare @BankId varchar(6) = '123456';

SELECT cast('' as varchar(50)) AS AccountNum
    ,AccountName
    ,AccountOwner
INTO #AccountData
FROM dbo.Accounts
WHERE OriginatingBank = @BankId
    AND AccountType = 'Checking';

DECLARE @AcctNo VARCHAR(30);

BEGIN TRY
    OPEN SYMMETRIC KEY MyKey DECRYPTION BY CERTIFICATE My_Cert;
    SELECT @AcctNo = CONVERT(VARCHAR(30), decryptbykey(AccountNum))
    FROM dbo.Accounts
    WHERE OriginatingBank = @BankId
        AND AccountType = 'Checking';
    CLOSE SYMMETRIC KEY MyKey;
END TRY
BEGIN CATCH
    SET @AcctNo = 'Access Restricted';
END CATCH

UPDATE #AccountData SET AccountNum = @AcctNo;

SELECT * FROM #AccountData;

DROP TABLE #AccountData;
TRY/CATCH in T-SQL works similarly to how it does in languages like C# or PowerShell. It allows you to attempt an operation and take care of any error conditions fairly easily.
In this case, I’m attempting to open the encryption key. But if the user doesn’t have permission to do so, it doesn’t terminate the stored procedure with an error. Instead, it jumps to the CATCH block, where I’ve defined an alternate way of handling the situation. Here, if the user doesn’t have the appropriate permissions, they’ll just get “Access Restricted” for the account number, and access to that sensitive data is a little more tightly controlled – while still letting users access the data they do need.
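For contrast, here’s a sketch of how the users who should see the decrypted values would be granted access. The key and certificate names match the simplified example; AppUser is a hypothetical database principal. Assuming SQL Server’s standard permission model, a principal needs VIEW DEFINITION on the symmetric key and CONTROL on the certificate protecting it in order to open the key:

```sql
-- AppUser is a hypothetical principal standing in for the
-- users/roles that should be able to decrypt AccountNum.
GRANT VIEW DEFINITION ON SYMMETRIC KEY::MyKey TO AppUser;
GRANT CONTROL ON CERTIFICATE::My_Cert TO AppUser;
```

Everyone else falls through to the CATCH block and sees “Access Restricted”.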
Don’t Trust the Wizard
If you need to move data from one table into a new table, or even tables in a database into another database, the Import/Export Wizard in SQL Server Management Studio looks pretty tempting. Set up a source & destination, click a few buttons, kick back with a cup of tea and watch the progress bars, right?
It turns out that the wizard just isn’t as smart as it may seem. If you’re not careful, you won’t get what you’re expecting. Let’s check it out.
We’ll start by creating a real simple table in a database, containing a primary key and a computed column.
Create table sample.dbo.SourceTable (
    RowId int identity(1,1) not null primary key,
    Num1 int not null,
    Num2 int not null,
    Total as (Num1 + Num2)
);
Let’s populate it with a few rows of data, then update some of that data to make sure the computed column is working. Remember, this is just to demonstrate the idea.
insert into sample.dbo.SourceTable (Num1, Num2) values (1,2);
go 100

select top 5 * from sample.dbo.SourceTable order by RowId;

update sample.dbo.SourceTable set Num1 = Num1 * RowId where RowId <= 3;

select top 5 * from sample.dbo.SourceTable order by RowId;
Great! We’ve got data and the computed column is working, so let’s copy it over to a new table in another database. We’re just going to click Next, Next, Next through the wizard this time around.
Success! Our table has been copied and the data’s all there.
select top 5 * from Sample2.dbo.SourceTable order by RowId;
Let’s do some work on our new table and check out the results.
select top 5 * from Sample2.dbo.SourceTable order by RowId;

update Sample2.dbo.SourceTable set Num2 = Num2 * RowId where RowId < 3;

select top 5 * from Sample2.dbo.SourceTable order by RowId;
Whoa! That’s not right. The Total column is supposed to be Num1 + Num2, and last time I checked, 2 + 4 was not 4. Let’s keep going and try adding a new record the same way it was done earlier.
insert into Sample2.dbo.SourceTable (Num1, Num2) values (100,200);
Cannot insert the value NULL into column 'RowId', table 'Sample2.dbo.SourceTable'; column does not allow nulls. INSERT fails.
Huh. Now that’s really odd, isn’t it? RowId is supposed to be an identity – we shouldn’t have to populate it. What is going on here? Let’s script out the table.
USE [Sample2]
GO
/****** Object: Table [dbo].[SourceTable] Script Date: 2015-11-10 22:36:23 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[SourceTable](
    [RowId] [int] NOT NULL,
    [Num1] [int] NOT NULL,
    [Num2] [int] NOT NULL,
    [Total] [int] NULL
) ON [PRIMARY]
GO
This is all kinds of wrong! What happened to the primary key? Or the computed column? Well, it turns out that the wizard isn’t that smart, and if you just take all the default values, you’re going to get burned. Let’s go back to the wizard and click that Edit Mappings button in the Select Source Tables and Views screen.
Well…that looks like what we got above. And it’s not what we wanted. If we click Edit SQL, this is confirmed – the table being created is not defined the same way the source table is being defined.
Fortunately, we can edit the SQL here and make it match the source table definition, then finish the wizard.
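In my case, that meant replacing the wizard’s generated DDL with a definition matching the source table created at the top of this post – a sketch of what goes into the Edit SQL dialog:

```sql
CREATE TABLE [dbo].[SourceTable](
    [RowId] int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [Num1] int NOT NULL,
    [Num2] int NOT NULL,
    [Total] AS ([Num1] + [Num2])
);
```

With the identity and computed columns restored, you’ll likely also need to adjust the column mappings: check Enable identity insert so the existing RowId values copy over, and set the Total column’s destination to ignore, since you can’t insert into a computed column.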
OK, data’s copied – what do we have?
select top 5 * from Sample3.dbo.SourceTable order by RowId;

update Sample3.dbo.SourceTable set Num2 = Num2 * RowId where RowId < 3;

select top 5 * from Sample3.dbo.SourceTable order by RowId;

insert into Sample3.dbo.SourceTable (Num1, Num2) values (100,200);

select * from sample3.dbo.SourceTable where rowid >= 100 order by RowId;
Everything’s there, and it’s working the way it’s supposed to. Lesson learned: don’t blindly trust the defaults, especially the ones in a wizard. Double-check everything, and then verify that your schema works the way you expect it to before doing any business with it.
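One way to do that verification without eyeballing the table designer is to query the catalog views – a quick sketch, using the table name from the example above:

```sql
-- Confirm that the identity and computed columns survived the copy
SELECT c.name,
       c.is_identity,
       c.is_computed,
       cc.definition
FROM sys.columns AS c
LEFT JOIN sys.computed_columns AS cc
    ON cc.object_id = c.object_id
   AND cc.column_id = c.column_id
WHERE c.object_id = OBJECT_ID('dbo.SourceTable');
```

If is_identity and is_computed come back 0 everywhere, the wizard has flattened your schema.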
SQL New Blogger Challenge, November Edition, Week 1 Digest
Ed Leighton-Dick has renewed his New Blogger Challenge this month. Here are all (I think) the posts for this week after Ed posted his announcement. If I’ve missed any, please let me know and I’ll update.
SQL New Blogger Challenge Digest – Week 4
This week marks the end of Ed Leighton-Dick’s New Blogger Challenge. It’s terrific seeing everyone sticking with the challenge all month and I’m looking forward to catching up with all the posts. Great job, everyone! Keep going!
Are You Backing Everything Up?
We hear the common refrain among DBAs all the time. Back up your data! Test your restores! If you can’t restore the backup, it’s worthless. And yes, absolutely, you have to back up your databases – your job, and the company, depend upon it.
But are you backing everything up?
Saturday night was an ordinary night. It was getting late, and I was about to put my computer to sleep so I could do likewise. Suddenly, everything on my screen was replaced with a very nice message telling me that something had gone wrong and my computer needed to be restarted.
Uh oh.
In 7 1/2 years of using OS X, I’ve had something like this happen maybe 4 times.
After waiting what felt like an eternity, the system finished booting and I got back into my applications. I opened up PowerPoint, as I had it open before the crash so I could work on my SQL Saturday Rochester slide deck whenever inspiration struck. I opened my file, and was greeted by nothingness. I flipped over to Finder and saw zero bytes displayed as the file size.
Uh oh.
“But Andy,” you say, “you use CrashPlan, right? Can’t you just recover the file from there?” Well, you’re half right. I do use CrashPlan. I even have a local, external hard drive (two, actually) that I back up to in addition to CrashPlan’s cloud service. But I couldn’t recover from any of those.
Because Dropbox is already “in the cloud”, I had opted not to back it up with CrashPlan when I first set it up. After all, it’s already a backup, right? It’s not my only copy, it’s offsite, it’s all good.
Not so fast. When my system came back up, Dropbox dutifully synced everything that had changed – including my now-empty file.
So, now what? Fortunately, Dropbox allows you to revert to older versions, and I was able to select my last good version and restore it.
Lessons Learned
I broke The Computer Backup Rule of Three and very nearly regretted it. For my presentation:
- I had copies in two different formats – Dropbox & my local (internal) hard drive
- I had one copy offsite (Dropbox)
- I only had two copies, not three (local and Dropbox).
Even scarier, if Dropbox didn’t have a version history or it had taken me more than 30 days to realize that this file had been truncated, I’d have lost it completely.
Everything else on my computer was in compliance with the Rule Of Three; I just got lazy with the data in my Dropbox and Google Drive folders. I’ve since updated my CrashPlan settings to include my local Dropbox and Google Drive folders so that my presentation should now be fully protected:
- Five copies
  - Local drive
  - Two external drives w/ CrashPlan
  - CrashPlan cloud service
  - Dropbox/Google Drive (different content in each)
- Three formats
  - Spinning platters in my possession
  - Dropbox/Google Drive
  - CrashPlan
- Two copies offsite
  - CrashPlan cloud
  - Dropbox/Google Drive
And don’t forget to test those backups before you need to use them. Dropbox, Google Drive and other online file storage/sync solutions are very useful, but you cannot rely upon them as backups. I don’t think you’ll ever regret having “extra” backups of your data, as long as that process is automatic.
SQL New Blogger Digest – Week 3
Here are the posts collected from week three of the SQL New Blogger Challenge. It’s been compiled the same way previous weeks’ posts were. Everyone’s doing a great job keeping up with the challenge!
Padding Fields for Fixed-Position Data Formats
Fixed-position data formats will seemingly be with us forever. Despite the relative ease of parsing CSV (or other delimited formats), or even XML, many data exchanges require a fixed-position input. Characters 1-10 are X, characters 11-15 are Y and if the source data is fewer than 5 characters, we have to left-pad with a filler character, etc. When you’re accustomed to working with data that says what it means and means what it says, having to add “extra fluff” like left-padding your integers with a half-dozen zeroes can be a hassle.
I received a draft of a stored procedure recently which had to do exactly this. The intent is for the procedure to output the data almost entirely formatted as required, one record per line in the output file, and dump the result set to a file on disk. As it was given to me, the procedure was peppered with CASE statements like this (only more complex) in the SELECT clause:
-- Method 1
select case len(cast(logid as varchar))
        when 9 then '0' + cast(logid as varchar)
        when 8 then '00' + cast(logid as varchar)
        when 7 then '000' + cast(logid as varchar)
        when 6 then '0000' + cast(logid as varchar)
        when 5 then '00000' + cast(logid as varchar)
        when 4 then '000000' + cast(logid as varchar)
        when 3 then '0000000' + cast(logid as varchar)
        when 2 then '00000000' + cast(logid as varchar)
        when 1 then '000000000' + cast(logid as varchar)
        when 0 then '0000000000' + cast(logid as varchar)
    end as logid
    ,logtext
from cachedb.dbo.logs;
It’s perfectly valid, it works, and there’s nothing inherently wrong with it. But I find it a bit tough to read, and it could become trouble if the format changes later, as additional (or fewer) cases will have to be accounted for. Fortunately, the day I received this procedure was right around the day I learned about the REPLICATE() T-SQL function. Maybe we can make this simpler:
select replicate('0', 10 - len(cast(logid as varchar))) + cast(logid as varchar) as logid
    ,logtext
from cachedb.dbo.logs;
Not bad. But it leaves us with a magic number and similar to the previous example, if the file format changes we have to seek out these magic numbers and fix them. This is easily remedied by defining these field lengths at the beginning of the procedure, so that they’re all in one place if anything needs to change.
-- Method 2
declare @paddedlength int = 10;
select replicate('0', @paddedlength - len(cast(logid as varchar))) + cast(logid as varchar) as logid
    ,logtext
from cachedb.dbo.logs;
Yet another approach would be to pad out the value beyond what we need, then trim the resulting string back to the required length. Again, we have to be careful to not leave ourselves with magic numbers; the solution is the same as when using REPLICATE():
-- Method 3
select right('0000000000' + cast(logid as varchar), 10) as logid
    ,logtext
from cachedb.dbo.logs;

-- Or, with more flexibility/fewer magic numbers
-- Method 4
declare @paddedlength int = 10;
select right(replicate('0', @paddedlength) + cast(logid as varchar), @paddedlength) as logid
    ,logtext
from cachedb.dbo.logs;
All four methods yield the same results, as far as the data itself is concerned. But what about performance? For a table with 523,732 records, execution times were:
- Method 1: 2,000ms CPU time, 261,785ms elapsed
- Method 2: 2,265ms CPU time, 294,399ms elapsed
- Method 3: 2,000ms CPU time, 297,593ms elapsed
- Method 4: 2,078ms CPU time, 302,045ms elapsed
Each method had an identical execution plan, so I’m probably going to opt for the code that’s more readable and maintainable – method 2 or 4.
As with any tuning, be sure to test with your own data & queries.
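One more alternative worth knowing about – not one of the four methods timed above – is the FORMAT() function, available in SQL Server 2012 and later. It takes .NET format strings ('d10' means a decimal zero-padded to 10 digits), which reads nicely, but FORMAT() is CLR-backed and is commonly reported to be slower than plain string manipulation, so test it against your own data before adopting it:

```sql
-- FORMAT() sketch (SQL Server 2012+); 'd10' zero-pads to 10 digits
declare @paddedlength int = 10;
select format(logid, 'd' + cast(@paddedlength as varchar)) as logid
    ,logtext
from cachedb.dbo.logs;
```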
SQL New Blogger Digest – Week 2
I didn’t intend for last week’s digest to also be my post for week two of the challenge, but life got in the way and I wasn’t able to complete the post that I really wanted in time. So, that post will be written much earlier in week three and completed well ahead of the deadline.
Here are the posts collected from week two of the SQL New Blogger Challenge. It’s been compiled the same way last week’s was.
SQL New Blogger Challenge Weekly Digest
Watching all of the tweets as people posted their first entries in the SQL New Blogger Challenge earlier this week, I quickly realized that keeping up was going to be a challenge of its own. Fortunately, there are ways to rein it in.
My first stop was IFTTT (If This Then That). IFTTT allows you to create simple “recipes” to watch for specific events/conditions, then perform an action. They have over 175 “channels” to choose from, each of which has one or more triggers (events) and actions. I have IFTTT linked to both my Google and Twitter accounts, which allowed me to create a recipe which watches Twitter for the #sqlnewblogger hashtag, and writes any tweets that match it to a spreadsheet on my Google Drive account (I’ll make the spreadsheet public for now, why not?).
The next step is to export the spreadsheet to CSV. I don’t have this automated, and may not be able to (I may have to find another workaround). Once it’s a CSV, I can go to PowerShell to parse my data. I want the end result to be an HTML table showing each post’s author (with a link to their Twitter stream) and a link to the post (using the title of the post itself).
Once I import the CSV file into an object in my PowerShell script, I need to do some filtering. I don’t want to collect all the retweets (posts starting with RT), and I should probably exclude any post that doesn’t contain a URL (looking for the string http).
To extract the post URLs, I ran a regular expression against each tweet. Twitter uses their own URL shortener (of course), which makes this pretty easy – I know the hostname is t.co, and after the slash is an alphanumeric string. The regex to match this is fairly simple: https?://t\.co/[a-zA-Z0-9]+
Then, for each URL found in the tweet, I use Invoke-WebRequest to fetch the page. This cmdlet automatically follows any HTTP redirects (I was afraid I’d have to do this myself), so the object returned is the real destination page. Invoke-WebRequest also returns the parsed HTML of the page (assuming you use the right HTTP method), so I can extract the title easily instead of having to parse the content myself. It’ll also give me the “final” URL (the destination reached after all the redirects). Easy!
My full script:
#requires -version 3
[cmdletbinding()]
param ()
set-strictmode -Version latest;
Add-Type -AssemblyName System.Web;

$AllTweets = import-csv -path 'C:\Dropbox\MiscScripts\Sqlnewblogger tweets - Sheet1.csv' |
    where-object {$_.text -notlike "RT *" -and $_.text -like "*http*"} |
    select-object -property "Tweet By",Text,Created |
    Sort-Object -property created -Unique;
$TweetLinks = @();
foreach ($Tweet in $AllTweets) {
    $Tweet.text -match '(https?://t\.co/[a-zA-Z0-9]+)' | out-null;
    foreach ($URL in $Matches) {
        $MyURL = $URL.Item(0);
        # Invoke-WebRequest automatically follows HTTP redirects. We could override this
        # with -MaximumRedirection 0, but in this case, we want it!
        $URLCheck = Invoke-WebRequest -Method Get -Uri $MyUrl;
        $OrigUrl = $URLCheck.BaseResponse.ResponseUri;
        write-debug $Tweet.'Tweet By';
        Write-debug $URLCheck.ParsedHtml.title;
        write-debug $URLCheck.BaseResponse.ResponseUri;
        $TweetLinks += new-object -TypeName PSObject -Property @{"Author"=$Tweet.'Tweet By';"Title"=$URLCheck.ParsedHtml.title;"URL"=$URLCheck.BaseResponse.ResponseUri;};
    }
}
Write-debug $TweetLinks;
$TableOutput = "<table><thead><tr><td>Author</td><td>Post</td></tr></thead><tbody>";
foreach ($TweetLink in $TweetLinks) {
    $TableOutput += "<tr><td><a href=""https://twitter.com/$($TweetLink.Author.replace('@',''))"">$($TweetLink.Author)</a></td><td><a href=""$($TweetLink.URL)"">$([System.Web.HttpUtility]::HtmlEncode($TweetLink.Title))</a></td></tr>";
}
$TableOutput += "</tbody></table>";
$TableOutput;
And now, my digest of the first week of the SQL New Blogger Challenge. This is not a complete listing because I didn’t think to set up the IFTTT recipe until after things started. I also gave people the benefit of the doubt on the timing (accounting for timezones, etc.) and included a few posted in the early hours of April 8th. For week 2, it will be more complete.
Limitations
There are a couple limitations and gotchas with this process:
- The IFTTT recipe only runs every 15 minutes (all IFTTT triggers run on 15 minute intervals) and only fetches 15 tweets each time it runs (again, IFTTT’s configuration). So if there’s a flood of tweets, they won’t all be captured.
- I don’t really check the final destination of a link. For example, one of the first tweets captured contained a link to another tweet, which then linked to a blog post. Could I detect this & resolve the true final destination? Probably. But it’d involve a lot more logic, and it’s getting late.
- I also don’t pick up every duplicate link/post. Again, I can probably get by this with some extra logic, but I don’t think it’s necessary right now.
- It doesn’t post automatically to my blog, or anywhere else. I have to manually paste the HTML into my blog post(s).
- I had to manually remove one link as it didn’t actually point to a post written for the challenge; it was tweeted with the hashtag, so my IFTTT recipe caught it.
- I started collecting these tweets mid-day April 7th. If you posted before that, I’ve likely missed your post. You will be picked up for Week Two!