Having a disaster recovery site or plan in place is crucial today, whether for compliance or to ensure that if anything does happen, your business will still be able to operate. Today’s episode of Azure Every Day focuses on Azure Site Recovery.
Azure Site Recovery is Microsoft’s business continuity and disaster recovery service. With this service, you can move your VMs to the cloud, back them up, or replicate site to site. To utilize the service, you’ll need to coordinate and set up replication between the sites and/or servers.
You have some options for how to do this. You can replicate an Azure VM to another Azure VM in a different geographic region, or replicate a physical server, VMware infrastructure or Hyper-V environment up to Azure. Physical and VMware sources replicate in near real time, as opposed to Hyper-V, where you can get replication down to about a 30-second window.
Azure Site Recovery has many great features, such as:
Application Awareness – It knows what you’re running (i.e. SharePoint, SQL Server, Exchange, Active Directory, etc.). Because of this, it’s able to cleanly stop in one location and start in another in the event of a disaster.
Region to Region Replication – If you want to take your replication from the East Coast to the West Coast, this is built into the service, so it’s easily done.
Encryption – From a security standpoint, it supports encryption at rest and encryption in transit. This is extremely helpful when you’re replicating from one Azure virtual machine to another, or from your local VMware infrastructure to Azure: everything is encrypted in transit, and then encrypted at rest once it lands in Azure.
Some other key features are the auto failover and auto failback capabilities, as well as continuous replication, so your RTO and RPO (recovery time and recovery point objectives) are easily met on this platform. You can also run automated recovery scenarios, so you can test your disaster plan without any impact on your environment.
Are you new to Azure, or looking to make the move and curious about what Azure Data Factory is? Azure Data Factory is Microsoft’s cloud data integration and orchestration tool. It allows you to move your data from one place to another and transform it along the way. Here, I’d like to introduce the 4 main steps, or components, of how Azure Data Factory works.
Step 1 – Connect and Collect
Connect and collect is where you define where you’re pulling your data from, such as SQL databases, web applications, SaaS services, etc. You then collect that data into one centralized location, like Azure Data Lake or Azure Blob Storage.
Step 2 – Transform and Enrich
In this step, you take the data from your centralized storage and enrich it, further expanding on your data using HDInsight, Spark or Data Lake Analytics, for example.
Step 3 – Publish
Next is to publish the data to a place where it can be better used and consumed by the end users. Any BI tool, such as Power BI or Reporting Services, is a great choice.
Step 4 – Monitor
This last step is to monitor the data to be sure jobs are running and data is flowing properly. It’s also important to monitor to ensure data quality. Monitoring can be done with tools like PowerShell, Microsoft Operations Manager or Azure Monitor, which lets you monitor from inside the Azure portal.
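To tie the four steps together, a minimal pipeline definition for the connect-and-collect stage might look something like the JSON sketch below. The pipeline, dataset and source/sink names are hypothetical, and the schema is simplified from what the service actually expects:

```json
{
  "name": "CopyFromSqlToLake",
  "properties": {
    "activities": [
      {
        "name": "CollectSourceData",
        "type": "Copy",
        "inputs": [ { "referenceName": "SqlSourceTable", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "LakeStagingArea", "type": "DatasetReference" } ],
        "typeProperties": {
          "source": { "type": "SqlSource" },
          "sink": { "type": "BlobSink" }
        }
      }
    ]
  }
}
```

A transform-and-enrich step would simply chain another activity, such as an HDInsight or Spark job, after the copy activity.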
Over the past 7 or 8 years, I’ve gone from “0 to 60” when it comes to database design, development, ETL and BI. Most of the skills I’ve learned were a result of mistakes I’ve made, and I’ve been mostly self-taught, with the exception of some more recent “formalized” learning through programs from Pragmatic Works. As I’ve found more and more satisfaction in the process, I’ve gone on to start working toward my MCSA in SQL 2012 (2 of 3 complete), as well as speaking at SQL Saturdays and local user groups in New England. It’s become one of the most rewarding, exciting and challenging aspects of my career. As a result, I’ve posted some blog articles about some of the challenges I’ve overcome, though not frequently enough, and attempted to become more active in some SQL forums. The list below is far from complete when it comes to all of the best practices I’ve learned over the years; however, many of these lessons and best practices have really helped me stay organized when it comes to good BI architecture. I hope to provide at least one item that will benefit a newbie, or even a seasoned pro. So, without further ado, here are the top 10 best practices I’ve learned the hard way…
- Use Source Control
- For anyone who was a developer in a past life, or is one now, this is a no-brainer, no-alternative best practice. In my case, because I come from a management and systems background, I had to learn this the hard way. If this is your first foray into development, get ready, because you’re in for some mistakes, and you’re going to delete or change some code you really wish you didn’t. Whether it’s for reference purposes on something you want to change, or something you do by accident, you’re going to need that code you just got rid of yesterday, and we both know you didn’t back up your Visual Studio jobs… Hence, source control. GitHub and Microsoft offer great solutions for Visual Studio, and Redgate offers a great solution for SSMS. I highly recommend checking them out and using the tools! There are some other options out there that are free, or that save your code to local storage, but the cloud is there for a reason, and many of us are on the go, so having your code available from all locations is very helpful.
- Standardize Transform Names
- Leaving the default names on the OLE DB connector or the Execute SQL Task seems like something very silly a person might do, because it’s so easy to label them for what they actually are, but I have to admit, I’m guilty of doing so. I’ve found myself in situations where I’ve thrown together a quick ETL package for the purpose of testing, or for further work on the database side, and then forgotten to go back and fix the names. Fast-forward 6 months, and I’m saying to myself, “I think I had a package like that once before”, only to find it, open it, and not have a clue what it’s actually doing. That of course requires me to go through each component and refresh my memory on what I did. Truth be told, in this day of cheap resources, memory, etc., there is absolutely no reason not to give all the details needed within the title of the tool being used. Don’t be lazy; you never know when it might bite you!
- Document inside your code and ETL workflow
- If you’re using the script transforms, they open a pretty standard development window that anyone familiar with Visual Studio will recognize. As with any development, good code comes with good documentation. Use it! Your successors, if not you yourself, will be very appreciative down the road. Name your functions appropriately, and explain what you’re doing throughout the code. Further, as you build your workflows, you have the ability to document what each step of the process is doing; use it! This also goes back to point 2: with standardized names for your transforms, alongside documenting the workflow as you go, it paints a very nice picture of what your workflow is doing.
- Setup detailed alerts for failures
- The traditional workflows in SSIS allow users to create mail notifications for successful and unsuccessful steps within the workflow. Of course, depending on how your packages are being run, you could get the same type of notification directly from the SQL Server running the SQL job. However, why use SQL alerts that only tell you “Job A” failed, with no really good information, when you can have your package tell you exactly which transform or component failed, and what the error was when it failed? There are some limitations to the canned SSIS component as to the methods you can use to send an email, but there are also some workarounds, like the one here on Stack Overflow that shows how to use the script task to connect to and send mail through Gmail. Either way, there is plenty of functionality that will help you stay informed of exactly what is happening inside a job, and where the warnings and errors are for each component. Taking the time to do this is much better than getting a phone call with complaints about data not being updated properly!
- Standardize File and Folder Names and Locations
- Ok, ok, I know this is getting a bit redundant… but remember, these are the mistakes I made as a newb, and I really want to help you out as you start to use the software more and more, and your workflows get more and more complex. This one was a really big one for me. Because I do all of my work in Visual Studio, and all of the BI jobs look the same at the file level, I really needed to be able to show the separation between my ETL jobs, SSRS jobs and SSAS jobs. This also helped me out with my source control structure. I separated SSIS, SSRS and SSAS jobs (you can even go as far as separating Tabular and Multidimensional if necessary) into separate folders, then labeled each type with “ETL” or “reports” as part of the file name. It saves me time when I’m opening a recent job, because typically I’m tweaking the ETL at the same time as developing the reports to get just the right mix.
- Have a “Report Request Form”
- When you first start writing reports, it’s really exciting. You’re delivering value to the business and really helping people do their jobs, especially when you start transforming and aggregating data… but soon, you become more and more relied upon for those reports, and no two reports are alike, it would seem. A common best practice for people who are spitting out report after report is to have a report requirements request form, like the one here from SQL Chick. This request form is pretty in-depth, so tweak as necessary, but it will really help you to prioritize and design reports going forward.
- Experiment to fine tune and improve performance
- So, this best practice item is really a whole blog post unto itself, but it’s something to be aware of. As a quick for-instance, the “OLE DB Command” transform is a great tool in theory; however, because of some of the nuances of the tool, on a large dataset it can be significantly slower than using the “Execute SQL Task”. The only way to know this is to compare them side by side, which I had to do: the Execute SQL Task took about 3 minutes, and the OLE DB Command took about 45 minutes. Moral of the story: if something seems to take a long time, there may be, and most likely is, a better way to do it, so go out and play!
- Set your recovery model to Simple or Bulk-logged on staging databases
- Ok, so normally I would stress the importance of backups (always take them automatically, including logs, and before you change anything), but that would be more of a blog post about maintenance and configuration, and this one is focused on ETL and reporting. So let’s talk about the effect loading lots of data will have on your database, or better yet, check out this article, which has a very good description of the 3 different recovery models. The basics are just this: if it’s a staging table or database, you can probably get away with the Simple recovery model. If it’s a critical database but you’re doing lots of data loads, the Bulk-logged model minimally logs SELECT INTO, bcp, INSERT...SELECT and BULK INSERT operations, so your transaction logs don’t get huge, fast.
- Temp Tables will likely make your ETL run faster than staging tables
- I can’t really take credit for this little nugget. When doing a project with one of my coding buddies, he came into my office one day and said, “Hey, did you know that using temp tables in SQL will allow you to use multiple processors on the server at one time?” I did not know this, and man, did it make a difference. What a huge performance boost it was for my project. Now, like everything else, there are exceptions to the rule. Some guys who are much smarter than me had a discussion on a forum that sheds some more light on the topic, and here are some more scenarios where it might not make sense. I think it’s good information overall, and the more information you have, the better off you’ll be when you’re designing your BI architecture.
- Have a development and testing platform
- As with some of the other best practices listed above, for some, this is a no-brainer. For others… we have no brains when it comes to this stuff and need to have it beaten into our heads (I’m the latter, if that wasn’t already clear). I can’t stress enough how much this will save you. You should never, ever be doing development in a production environment. There are just too many things that can go wrong. Even those “quick” or “minor” changes can really cause a calamity and ruin your day quickly. Now, there can be challenges if you don’t have a proper production/dev/test environment at your office or your client’s location; however, with SQL Server Developer Edition now being free, and PCs these days having tons of resources, you should be able to do your testing on even the simplest of computers and get a warm and fuzzy feeling that you’re going to be able to deploy that latest package, report or code successfully. True performance comparisons against a beefy production server might not be possible, but you should be able to establish a baseline and have a general idea of how performance will be for various configurations.
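Coming back to the recovery model tip for a moment: switching models is a one-line change per database. Here is a quick T-SQL sketch (the database names are placeholders):

```sql
-- Staging database: minimal logging, point-in-time recovery not needed
ALTER DATABASE StagingDB SET RECOVERY SIMPLE;

-- Critical database that still takes heavy bulk loads:
-- switch to bulk-logged for the load window...
ALTER DATABASE WarehouseDB SET RECOVERY BULK_LOGGED;

-- ...then back to full logging (and take a log backup) once the load is done
ALTER DATABASE WarehouseDB SET RECOVERY FULL;
```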
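And to make the temp table tip concrete, here is a minimal sketch of the pattern (the table and column names are made up for the example). SELECT INTO a local temp table is minimally logged in tempdb and eligible for a parallel plan, which is where the speedup tends to come from:

```sql
-- Pull the working set into tempdb; SELECT INTO a #temp table is
-- minimally logged and eligible for a parallel execution plan
SELECT CustomerID, OrderDate, LineTotal
INTO   #OrderStage
FROM   dbo.FactOrders
WHERE  OrderDate >= DATEADD(MONTH, -1, GETDATE());

-- Do the heavy lifting against the temp table
SELECT   CustomerID, SUM(LineTotal) AS MonthlyTotal
FROM     #OrderStage
GROUP BY CustomerID;

-- Clean up (this also happens automatically when the session ends)
DROP TABLE #OrderStage;
```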
Well, that’s it for this post. I really hope I was able to provide even the slightest hint of something new here, because that’s always my goal. If you have questions, you can follow me or send me a message on Twitter @bizdataviz, as I’m always happy to hear how I can write better blog posts to help people who are just getting their feet wet.
Recently I was presenting a session on Microsoft’s Power View in what I had intended to be my final time presenting it before retiring it to the archives. Unfortunately, the presentation didn’t go as planned. For the life of me, I couldn’t find my Power View icon in the Excel ribbon, and began to go out of my mind! Note, I’ve presented this topic at various SQL Saturdays and user groups for the past couple of years, and doing demo after demo of the various components had become somewhat second nature. So, it’s only natural that phrases like “it was just here” and “what did Microsoft do this time!!” came out of my mouth in front of the group, and I rapidly changed my focus and did what components of the demo I could in Power BI as an alternative. I later found out that Microsoft did, in fact, do something. In what I assume was an effort to get people to use Power BI and move away from Silverlight, Microsoft made the decision to remove the button from the ribbon in both Excel 2013 and Excel 2016, as described in the following article by John P. White:
He also included a nice “How To” to get it back into Excel if that’s how you want to use it here:
A couple of lessons learned from this and, in a sense, re-learned: 1) every time you think you know what Microsoft is doing, you don’t; 2) do at least one quick dry run of your presentation immediately before the presentation, or as close to it as possible; 3) Power View appears to be transitioning to a “legacy” product… I guess it was a good time to retire the session.
Now that I’m doing some more development of dashboards in the Power View interface, I thought it might be helpful to post about some of the date “gotchas” I’ve run into with the development interface. Power View is very nice about handling date filters if you can specify a date to use as a start point, an end point, or a range between two dates; however, if you’re looking to determine whether a date falls in, say, “this week”, “this month” or “this quarter”, you’ll have quite a bit more trouble. The workaround I’ve found is to create a new column that returns a Boolean (true/false) answer. Then, within your Power View filtering, you can choose “IsPastWeek” = True, and so on. For example, in your formula bar (or whatever tool you’re using: Excel, Power BI or SSAS Tabular):
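As a sketch of that calculated column in DAX (the table and column names here are placeholders; swap in your own):

```dax
IsPastWeek = IF ( 'Sales'[OrderDate] > TODAY () - 7, TRUE (), FALSE () )
```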
The above formula, simply put, says: if the date is later than today minus 7 days, return true; otherwise, return false. Below is a view of the filters in my Power View dashboard:
By setting this filter, the ability to look at specific dates or time periods makes it easier to report graphically. Also, the ability to filter on separate objects within the same canvas allows us to easily compare similar time periods side by side.
Some other options for comparing date ranges are:
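Following the same Boolean flag pattern as IsPastWeek (again, the table and column names are placeholders):

```dax
IsPastMonth = 'Sales'[OrderDate] > TODAY () - 30

IsThisYear = YEAR ( 'Sales'[OrderDate] ) = YEAR ( TODAY () )

IsThisQuarter =
    ROUNDUP ( MONTH ( 'Sales'[OrderDate] ) / 3, 0 )
        = ROUNDUP ( MONTH ( TODAY () ) / 3, 0 )
        && YEAR ( 'Sales'[OrderDate] ) = YEAR ( TODAY () )
```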
And many more can be found here: Time Intelligence in Power Pivot in Excel
I was attempting to create a database diagram with the SSMS 2012 diagramming tool today and ran into the following issue:
“Attempted to read or write protected memory. This is often an indication that other memory is corrupt. (Microsoft.VisualStudio.OLE.Interop)”
with the details:
Attempted to read or write protected memory. This is often an indication that other memory is corrupt. (Microsoft.VisualStudio.OLE.Interop)
at Microsoft.VisualStudio.OLE.Interop.IOleCommandTarget.QueryStatus(Guid& pguidCmdGroup, UInt32 cCmds, OLECMD prgCmds, IntPtr pCmdText)
at Microsoft.VisualStudio.Platform.WindowManagement.DocumentObjectSite.QueryStatus(Guid& pguidCmdGroup, UInt32 cCmds, OLECMD prgCmds, IntPtr pCmdText)
at Microsoft.VisualStudio.Platform.WindowManagement.WindowFrame.QueryStatus(Guid& pguidCmdGroup, UInt32 cCmds, OLECMD prgCmds, IntPtr pCmdText)
at Microsoft.VisualStudio.Platform.WindowManagement.WindowManagerService.viewManager_ActiveViewChanged(Object sender, ActiveViewChangedEventArgs e)
at System.EventHandler`1.Invoke(Object sender, TEventArgs e)
at Microsoft.VisualStudio.PlatformUI.ExtensionMethods.RaiseEvent[TEventArgs](EventHandler`1 eventHandler, Object source, TEventArgs args)
at Microsoft.VisualStudio.PlatformUI.Shell.ViewManager.SetActiveView(View view, ActivationType type)
at Microsoft.VisualStudio.Platform.WindowManagement.WindowFrame.ShowInternal(ShowFlags showFlags)
at Microsoft.VisualStudio.ErrorHandler.CallWithCOMConvention(Func`1 method)
at Microsoft.VisualStudio.Shell.ThreadHelper.Invoke[TResult](Func`1 method)
at Microsoft.SqlServer.Management.UI.VSIntegration.Editors.VirtualProject.CreateDesigner(Urn origUrn, DocumentType editorType, DocumentOptions aeOptions, IManagedConnection con, String fileName)
at Microsoft.SqlServer.Management.UI.VSIntegration.Editors.VirtualProject.Microsoft.SqlServer.Management.UI.VSIntegration.Editors.ISqlVirtualProject.CreateDesigner(Urn origUrn, DocumentType editorType, DocumentOptions aeOptions, IManagedConnection con, String fileName)
at Microsoft.SqlServer.Management.UI.VSIntegration.Editors.ISqlVirtualProject.CreateDesigner(Urn origUrn, DocumentType editorType, DocumentOptions aeOptions, IManagedConnection con, String fileName)
at Microsoft.SqlServer.Management.UI.VSIntegration.Editors.VsDocumentMenuItem.CreateDesignerWindow(IManagedConnection mc, DocumentOptions options)
at Microsoft.SqlServer.Management.UI.VSIntegration.Editors.VsDocumentMenuItem.InvokeDesigner(IManagedConnection connection)
at Microsoft.SqlServer.Management.UI.VSIntegration.ObjectExplorer.ToolsMenuItemBase.MenuInvokedHandler(Object sender, EventArgs args)
So, after some digging, I found a post that suggested upgrading my tools to SSMS 2014, which yielded the same error and did not fix the issue.
The next post suggestion I found recommended replacing the DLLs in the following location:
C:\Program Files (x86)\Common Files\microsoft shared\Visual Database Tools\
C:\Program Files (x86)\Common Files\microsoft shared\MSDesigners8\
Which also did not work.
If anyone knows of a better work around, I’d like some input!
While rehearsing my demo for my upcoming session at SQL Saturday Boston this weekend, #sqlsat500, where I will be lecturing and demonstrating Scratching the Surface of Power View, I noticed a quirky issue: I wasn’t able to drill through my column data, but row data was fine. Further, after some playing, I was able to drill down if I changed the order of my column values, or tried using other fields. Everyone hates NULL data, and I think this is just another reason why. As it turns out, some NULLs happen to show up as the first column when you drill down, so the ability to drill “up” is lost, and the only way to get back to your top-level data is to close and re-open the report.
Using the AdventureWorksDW2014 Data, and building a basic example of creating a matrix, then adding the Drill Down properties, my selections look like this:
As you can see, it’s a fairly simple example, just trying to capture the essence of drilling down and back out again. When I click to drill down into the “Black” column, I get the following:
Note that the first column heading has no title, and there isn’t an available “Drill Up” arrow either. At this point, I’m stuck. I can’t go back up to my original report, and now need to close and re-open to start over. If I change the order of the columns so that “Style” is listed first, above “Color”, the drill down and drill up work fine. To work around the issue, I replaced all the NULLs in the table with “NA” and voilà! It works perfectly. I’m not sure whether this is by design from Microsoft in order to force the data to have values, but since it’s their dataset to begin with, I’m assuming it’s a bug.
Hope this helps anyone else running into this issue!
Have you run into issues where your SSRS column headers just won’t scroll with the data? How about losing the column headers when you go to the next page? Well, there seems to be some ambiguity around this topic, and anyone who has searched for an answer has probably run into a bunch of rudimentary fixes for said issues. I’m living proof that those “fixes” don’t always work consistently, so I have gone ahead and created my own set of steps to address the issue. I can’t promise it will fix your problem, but in every instance where I had it, this worked.
For starters, I would read the following page from MSDN to get an understanding of how row and column headings work:
As you can see, this should be pretty simple, but in reality, I can’t get it to work consistently, and it seems many others have tried and failed.
First off, these settings in the Tablix Properties just seem to cause more issues than they help:
If you modify the “Row Headers” and “Column Headers” properties here (i.e. select them), you will get the following error when you attempt to run your report later, so don’t use them with this method:
This error will ruin your day when you’re using the advanced properties… trust me… I know!
As referenced in the MSDN page, the way to gain better control over the column headers is to use the advanced properties for Row and Column Groups:
After changing to “Advanced Mode”, you’ll see essentially 2 different types of top row. Either way, the top row will contain the word “Static”; the major differentiation will be whether or not the word is in parentheses.
What makes the parentheses critical is that if they’re missing, it means the header row has been deleted. If the header row has been deleted, my advice is to re-create the report. You will spend much less time recreating it than you will chasing your tail trying to fix it!
On the flip side, if that top row does have the parentheses, you should be able to tackle this issue.
Select your top row and go to your properties window:
The key properties to look at here are “FixedData”, “KeepTogether”, “KeepWithGroup” and “RepeatOnNewPage”. As a starting point, I typically set FixedData = True, KeepTogether = True, KeepWithGroup = After, and RepeatOnNewPage = True.
From there, if it’s not quite right, you can play with those 4 settings to resolve your issue. Also, I would pay close attention to the bottom section of the MSDN article with regard to “Renderer Support” as there are only a few particular types of renderings for which the repeating rows and headers will work.
I hope this helps!
Recently I came across a challenging issue with calculating a time span within a group inside an SSRS report. I am working on a report where I need to calculate labor pace (operations per hour) and the time a person works on a particular labor task. My report data looks like the following:
As you can see, this person submitted their first weight at 7:34 AM, with a weight of 13.805 lbs. In SSRS, it’s fairly straightforward to total the weight; a total for the shift, however, proved to be a little more challenging. I was hoping I could simply use the SUM() function, in hopes that SSRS would know how to handle the time format and just give me a total time. No such luck. Through a series of trials and errors with the TimeSpan function, I realized that I could use the Min() and Max() functions with some simple math. Ultimately, my formula wound up being the following:
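In SSRS expression terms, it came down to something along these lines (the field name SubmitDateTime is a stand-in for the datetime field in your own dataset); subtracting two DateTime values yields a TimeSpan:

```vb
' Elapsed time for the group: latest submission minus the earliest
=Max(Fields!SubmitDateTime.Value) - Min(Fields!SubmitDateTime.Value)
```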
Applying the function yielded a total of:
My next challenge was formatting the total so that it could be used in an equation with the total weight, in order to determine my average lbs per hour. After several hours of trial and error with various TimeSpan and DateDiff/DatePart functions, I found that simple was best, and ended up with this formula to get my results:
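The shape of that final expression was roughly the following (field names are again stand-ins): total weight divided by elapsed minutes, scaled up to an hourly rate:

```vb
' Average lbs per hour: total weight / elapsed minutes * 60
=Sum(Fields!Weight.Value)
    / DateDiff("n", Min(Fields!SubmitDateTime.Value), Max(Fields!SubmitDateTime.Value))
    * 60
```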
This got me an average of 34.58 lbs/hr, and I was able to use the formula throughout the report.