Wednesday, July 9, 2014
Excel Needs a Custom Formula Sandbox
Of course it can be also done with a macro, which is what I ended up doing. Since I have a personal macro workbook with a lot of custom functions, so I'm not adverse to creating new ones whenever I need to.
A lot of people are adverse to this approach, as the StackOverflow thread I linked to above indicates. And I can understand why, since macros come with a lot of overhead (compatibility, security, etc.).
That gave me an idea. MS should create a "formula sandbox" area in Excel where users can create custom formulas. For the language, I would suggest limited-functionality versions of C#.NET and VB.NET. The limited functionality would be necessary to ensure that the formulas don't present any security risk.
The custom formula sandbox would be linked to the user, any any custom formulas that he/she uses in the spreadsheet would be saved with that spreadsheet, so that the spreadsheet works for any user.
I think a lot of Excel users would welcome something like that.
Thursday, July 3, 2014
Dual Monitors and Excel
In any case, the program where I most often find myself manually resizing the window to span my two monitors is Excel. A while ago I got sick of doing that, so I wrote a simple macro to do it. Here's the code:
'the control parameter can be omitted. I included it in my macro because that signature is needed for macros 'invoked from the ribbon Sub MaximizeAcrossTwoMonitors(control As IRibbonControl) Dim TargetWidth As Integer TargetWidth = 2880 'fine-tune as needed for your specific setup Dim TargetHeight As Integer TargetHeight = 780 'fine-tune as needed for your specific setup 'if the window is already maximized across two monitors, we're going to assume the user wants to return to regular maximized window state If Application.Width = TargetWidth And Application.Height = TargetHeight Then Application.WindowState = xlMaximized Else With Application .WindowState = xlNormal .Left = 1 .Top = 1 .Width = TargetWidth .Height = TargetHeight End With End If End Sub
You can put that macro in your personal macro workbook and assign it to your quick access toolbar for easy access.
Once Excel is maximized across your workspace, you may want to have two or more workbooks displayed side-by-side. You probably know that you can do that by going to View --> Arrange All --> Vertical.
However that's three steps I found myself struggling to recall the specifics of. So, another simple macro for your quick access toolbar would be:
Sub ArrangeWindowsVertically(control As IRibbonControl) Windows.Arrange ArrangeStyle:=xlVertical End Sub
I hope someone finds this useful.
Sunday, February 10, 2013
Excel as an ETL Tool
This is probably going to seem extremely basic to some people, but I think it’s a very useful technique that many SQL developers might not know about.
You can use Excel as a quick and precise ETL tool. Not for 10GB data files. Not for data that’s not already in Excel. But when you have data in Excel – especially if it’s messy – and you want to get in into SQL Server quickly, there’s a good way of doing that.
Let’s start with some messy Excel data. Here are the current NBA standings that I copied from the NBA’s website.
Now, the data isn’t hideous. It’s organized in a consistent manner. It’s just not normalized. The technique I’m going to describe works on data in all kinds of formats.
Create the Table
The first thing I’m going to do is define the table in SQL. I’m going to write the create table statement in Excel, so that my entire script will be in one place. So in cell M2, I’m going to write:
create table dbo.NbaStandings (Id integer identity,TimePeriod varchar(100), Conference varchar(50), Division varchar(50), Team varchar(50), Wins integer, Losses integer, WinPercent decimal(4, 3), GamesBack decimal(10, 2), ConfRecord varchar(10), DivRecord varchar(10), HomeRecord varchar(10), RoadRecord varchar(10), Last10 varchar(5), Streak varchar(5));
The column definitions are derived from the data, of course. That’s the easy part.
Insert the Data, Part 1
Next I’m going to create the insert statements in Excel, referencing the data in the spreadsheet. In cell M6, I’m going to write:
="insert into dbo.NbaStandings (TimePeriod, Conference, Division, Team, Wins, Losses, WinPercent, GamesBack, ConfRecord, DivRecord, HomeRecord, RoadRecord, Last10, Streak) values ("
But I'm not ready to write more just yet. SQL Server string values have to be enclosed in single quotes. Also, when you concatenate cell references in an Excel formula, you have to use double quotes. I could write the rest of the insert statement that way, but that’s going to be one cumbersome, unreadable formula. Some Excel macros will make it much easier to write.
Write the Macros (or just use mine)
Here are the macros I’m going to create in the Excel VBA editor. I recommend keeping these macros in your personal macro workbook, so you don’t have to add them to every workbook you want to use them in.
This macro will escape single quotes in in a string, and enclose a string in single quotes:
Public Function SqlEncode(TheString) As String
Public Function SqlEncode(TheString) As String
Dim ReturnValue As String
ReturnValue = TheString
ReturnValue = Replace(ReturnValue, Chr(39), Chr(39) & Chr(39)) 'escape apostrophe
ReturnValue = Chr(39) & ReturnValue & Chr(39) 'enclose in apostrophes
SqlEncode = ReturnValue
End Function
This macro will take a single cell, and apply the macro above to its value, optionally adding a comma:
Public Function SqlValue(TheRange As Range, Optional IncludeComma As Boolean = True) As String
Dim ReturnValue As String
ReturnValue = ReturnValue & SqlEncode(TheRange.Text)
If IncludeComma Then ReturnValue = ReturnValue & ", "
SqlValue = ReturnValue
End Function
And this macro will take a range of cells, apply the SqlEncode macro to all of them, and concatenate them with commas:
Public Function SqlValues(TheRange As Range) As String
Dim TheCell As Range
Dim ReturnValue As String
Dim counter As Integer
counter = 0
For Each TheCell In TheRange.Cells
If counter > 0 Then ReturnValue = ReturnValue & ", "
ReturnValue = ReturnValue & SqlEncode(TheCell.Text)
counter = counter + 1
Next
SqlValues = ReturnValue
End Function
With those macros created, I’m ready for the next step.
Insert the Data, Part 2
The first real record on the spreadsheet is on line 6, but the values in the three rows above will also be part of the record. This is where the data not being normalized makes things messy. Here’s my continued formula:
="insert into dbo.NbaStandings (TimePeriod, Conference, Division, Team, Wins, Losses, WinPercent, GamesBack, ConfRecord, DivRecord, HomeRecord, RoadRecord, Last10, Streak) values (" & SqlValue(B$3) & SqlValue(B$4) & SqlValue(B$5)
Notice that I made the row reference for those three values absolute, because I don’t want them to change when I copy the formula to new rows.
The next part of the formula uses the data on the same row my formula is on, so I can use the SqlValues macro to take care of them all:
="insert into dbo.NbaStandings (TimePeriod, Conference, Division, Team, Wins, Losses, WinPercent, GamesBack, ConfRecord, DivRecord, HomeRecord, RoadRecord, Last10, Streak) values (" & SqlValue(B$3) & SqlValue(B$4) & SqlValue(B$5) & SqlValues(B6:L6) & ");"
My formula is now complete. Here's the resulting TSQL:
insert into dbo.NbaStandings (TimePeriod, Conference, Division, Team, Wins, Losses, WinPercent, GamesBack, ConfRecord, DivRecord, HomeRecord, RoadRecord, Last10, Streak) values ('2012-2013 Division Regular Season Standings', 'Eastern Conference', 'Atlantic', 'New York', '32', '16', '0.667', '0', '20-9', '41398', '19-6', '13-10', '41458', 'W 1');
Now I can go ahead and copy the formula into all of the other data rows on the spreadsheet:
I have to change the formula’s references to the conference and division cells several times when copying the formula, because this data set isn’t normalized, as previously pointed out. I could have used an OFFSET formula in Excel to get around having to do that, but I wanted to keep this example relatively simple.
Execute the Script
The entire script to create and populate the table in now in column M of my spreadsheet. I’m going to go ahead and copy that column into new query window in SQL Server:
Once I execute the script, the data is now in a SQL Server table:
Conclusion
Given how de-normalized that data was, this wouldn’t have been possible with a standard ETL engine such as SSIS, without first normalizing the data in Excel.
At my job, I get a lot of data in a format like this or worse, so I’ve found this technique very helpful over the years. I hope others will fine it helpful as well.