All posts by cseferlis

Overview of Power BI Embedded

Everyone is familiar with Power BI Desktop, the cloud service and the on-premises offering, but not as many are familiar with Power BI Embedded. So, what is it? Power BI Embedded allows your company to embed dashboards and reports in your in-house developed applications, and you only need one Power BI account to run a Power BI Embedded environment.

This Azure service is separate from Power BI Premium and Pro, and it's billed by compute capacity rather than per user, as other Power BI offerings are. The design lets you focus on your applications and your customers instead of on management and maintenance.

You have options when setting up your Azure tenant: you can use your existing tenant ID, create a new application within that tenant, or create a tenant for a specific customer. There are three straightforward steps to get up and running:

1.  Set up your Power BI Embedded environment within Azure, then set up your tenants, user requirements and workspaces.

2.  Embed your content: in your back end, set up your application and connect to Azure through the REST API it provides. All of this traffic is secure and encrypted over SSL. When you authenticate users to display reports and dashboards, you do so through your back-end application's authentication system rather than through Azure's (a code sketch follows these steps).

3.  Lastly, release your reports and dashboards to production. Decide what compute capacity you need, set up your tiered pricing, pick your plan and you're ready to go.
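As a rough illustration of step 2, here's a minimal Python sketch that authenticates a back-end application with MSAL and asks the Power BI REST API for an embed token. The tenant, client, workspace and report IDs are placeholders, and it assumes an Azure AD app registration that has been granted access to the workspace; treat it as a sketch, not a reference implementation.

```python
# Sketch: back-end app gets an Azure AD token, then a Power BI embed token.
# Requires: pip install msal requests. All IDs/secrets below are placeholders.
import msal
import requests

TENANT_ID = "<your-azure-ad-tenant-id>"          # placeholder
CLIENT_ID = "<your-app-registration-client-id>"  # placeholder
CLIENT_SECRET = "<your-app-client-secret>"       # placeholder
WORKSPACE_ID = "<power-bi-workspace-guid>"       # placeholder
REPORT_ID = "<power-bi-report-guid>"             # placeholder

# 1) Authenticate the back-end application (service principal) against Azure AD.
app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)
token = app.acquire_token_for_client(
    scopes=["https://analysis.windows.net/powerbi/api/.default"]
)
access_token = token["access_token"]  # check token.get("error") in real code

# 2) Ask the Power BI REST API for an embed token for one report.
resp = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}"
    f"/reports/{REPORT_ID}/GenerateToken",
    headers={"Authorization": f"Bearer {access_token}"},
    json={"accessLevel": "View"},
)
resp.raise_for_status()
embed_token = resp.json()["token"]
print("Embed token starts with:", embed_token[:20], "...")
```

The embed token (not the user's Azure AD token) is what your front end hands to the Power BI JavaScript client to render the report inside your application.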

Why Your Infrastructure Belongs in the Cloud

You haven’t moved to the cloud yet? In this Azure Every Day installment, I’d like to tell you the top 5 reasons why you may want to move your infrastructure to the cloud.

1. Cost – Many people can take advantage of operational cost savings by not having to invest in a bunch of hardware that sits unused. In the cloud, you only pay for what you use.

2.  Business Continuity – With the cloud, you get better, SLA-backed uptime without having to worry about in-house appliances, infrastructure or servers. You also get easier administration: Azure's cloud regions are set up so you can easily maintain and migrate your systems. And there's no need for a second data center of your own, which gives you high availability as well as more cost savings.

3.  Agility – You don't have to spend money keeping something running all the time; it's easy to spin resources up and down as you need them. You also have the ability to scale quickly as demand grows: start small, then add capacity for traffic, performance or whatever else you need.

4.  Management and Maintenance – You can drastically reduce the time needed to maintain and manage your environment, with one central place to monitor your systems. You'll save the time you used to spend running backups and maintaining servers.

5.  Improved Security – Cloud providers have every incentive to be secure. There are over 300,000 open security jobs in the US alone. Where do you think those people want to work when top-quality companies are paying top dollar? You guessed it – cloud providers.

Most Important Components of Azure Data Factory

Are you new to Azure and unsure what Azure Data Factory is? Azure Data Factory is Microsoft's cloud ETL/ELT tool: it helps you get your data from one place to another and transform it along the way. Today, I'd like to walk through the high-level components of Azure Data Factory. Together they make up a data factory that moves your data from its source to an end product ready for consumption.

  • Pipeline – A pipeline is a logical grouping of activities that together perform a unit of work. For example, you might copy on-premises data to the cloud (Azure Data Lake, for instance), run it through an HDInsight Hadoop cluster for further processing and analysis, and land it in a reporting area. Those activities live inside the pipeline and are chained together into a sequence of events, depending on your specific requirements (see the sketch after this list).
  • Linked Service – This is very similar to a connection string in SQL Server: it defines where the source or destination of your data lives.
  • Trigger – A trigger is a unit of processing that determines when a pipeline needs to be run. These can be scheduled or set off (triggered) by a different event.
  • Parameter – Essentially, a read-only value defined on a pipeline and passed in as an argument at run time, for example to tell an activity which dataset or linked service to use.
  • Control Flow – The control flow is what orchestrates how the pipeline is sequenced. It covers the way activities are arranged within the pipeline: sequencing, branching and looping.
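To make these pieces concrete, here's a minimal sketch using recent versions of the azure-mgmt-datafactory Python SDK. The subscription, resource group, factory, pipeline and dataset names are placeholders, and the two datasets (and the linked services behind them) are assumed to already exist in the factory:

```python
# Sketch: define a pipeline with one copy activity and run it on demand.
# Requires: pip install azure-identity azure-mgmt-datafactory
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    BlobSink,
    BlobSource,
    CopyActivity,
    DatasetReference,
    PipelineResource,
)

SUBSCRIPTION_ID = "<subscription-guid>"  # placeholder
RG_NAME = "rg-demo"                      # placeholder resource group
DF_NAME = "adf-demo"                     # placeholder data factory

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# One activity: copy from an input dataset to an output dataset. Each dataset
# points at a linked service (the "connection string") defined separately.
copy_step = CopyActivity(
    name="CopyRawToStaging",
    inputs=[DatasetReference(type="DatasetReference", reference_name="InputBlobDataset")],
    outputs=[DatasetReference(type="DatasetReference", reference_name="StagingBlobDataset")],
    source=BlobSource(),
    sink=BlobSink(),
)

# The pipeline is the logical grouping (and control flow) of activities.
pipeline = PipelineResource(activities=[copy_step], parameters={})
adf_client.pipelines.create_or_update(RG_NAME, DF_NAME, "DemoPipeline", pipeline)

# Kick off a run on demand; in production a schedule or event trigger does this.
run = adf_client.pipelines.create_run(RG_NAME, DF_NAME, "DemoPipeline", parameters={})
print("Started run:", run.run_id)
```

Normally a schedule or event trigger would be attached to the pipeline instead of calling create_run by hand; the manual run is just the quickest way to watch the control flow execute.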

What is Internet of Things (IoT) and Why It Matters to IT

The Internet of Things (IoT) has become a growing topic both inside and outside of the workplace. It has the ability to change how we live and how we work. Many are already on board with IoT, and global IoT revenues are projected to reach over a trillion dollars by 2020. If you’re not there yet, I’d like to talk today about what IoT is and how it’s being used.

The Internet of Things, or IoT, refers to devices that used to be stand-alone but are now connected to the internet. Consumer devices include Google Home, Alexa, smart watches, Fitbits and home thermostats, and these products are already changing the way their owners live.

From a business standpoint, with the help of services like Azure IoT Hub, this gets much bigger, with a much larger impact on how people work. Engine-monitoring devices on trains and planes, for example, track millions of components all the time, streaming real-time statistics about what is happening on those machines.
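To make the device side concrete, here's a minimal sketch (the connection string and sensor values are placeholders) that sends one telemetry message to Azure IoT Hub with the azure-iot-device Python package:

```python
# Sketch: a device sends one telemetry reading to Azure IoT Hub.
# Requires: pip install azure-iot-device
import json

from azure.iot.device import IoTHubDeviceClient, Message

# Placeholder: the per-device connection string copied from the IoT Hub portal.
CONNECTION_STRING = (
    "HostName=<your-hub>.azure-devices.net;DeviceId=<device-id>;SharedAccessKey=<key>"
)

client = IoTHubDeviceClient.create_from_connection_string(CONNECTION_STRING)
client.connect()

# One reading; a real engine monitor would stream thousands of these per second.
reading = {"engineId": "demo-001", "temperatureC": 612.4, "vibrationMm": 0.08}
client.send_message(Message(json.dumps(reading)))

client.shutdown()
```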

Chevron is using Microsoft Azure IoT Hub on the back end as they build out their IoT infrastructure for monitoring their oil and deep-well drilling equipment. John Deere is mounting IoT devices on their equipment that track things such as where, how far apart and how deep seeds are being planted, so farmers can get planting information right from these devices.

Yes, IoT is a big deal, and it's helping manufacturers and other companies, as well as consumers, by expanding the capabilities of the things that are, or can be, connected to the internet.

Top 5 Takeaways from the Microsoft ICA Boot Camp

I was a recent attendee at the Microsoft International Cloud Architect Boot Camp, where I had the opportunity to participate in hands-on sessions, working closely with Microsoft teams and specialists, as well as other Microsoft Partners. This boot camp contained exclusive content that Pragmatic Works gets access to as a partner and as a preferred service within the Microsoft stack.

Here, I’d like to share my top 5 takeaways from this event:

1. Commitment to Security – As a cloud solution architect, I'm asked many questions about security in Microsoft Azure. One thing that amazed me was the commitment Microsoft has made to security: they spend over a billion dollars each year to stay secure against threats. Microsoft is also the #1 attack surface in the world. They are truly committed to making sure that your data and services are secure.

2. Security Certifications – Microsoft has earned over 70 regulatory and government certifications for security and standardized processes. Their second-place competitor, AWS, has completed only 44 of these certifications. Earning these certifications and adhering to security and regulatory standards can be expensive, but there is a significant benefit for enterprise, government and small/medium-sized businesses.

3. Right-sizing Their Environment – This can be a challenge for many companies. Microsoft's internal teams have moved completely to Azure and manage their platforms there for SQL databases, virtual machines and all the other services Azure offers. By right-sizing and keeping watch on what's offered, they trimmed their workloads and kept CPU utilization at the 95th percentile, and more importantly, they were able to cut spending for their internal needs – to the tune of over 3 million dollars a month!

4. Differentiators from AWS – AWS is currently the #1 cloud platform in revenue and volume, but Microsoft is quickly catching up and has identified several differentiators from AWS. Some key ones, such as Azure Recovery Zones and other services that have been slow to arrive, will reach general availability by the end of 2018. Microsoft doesn't see any other differentiators that will allow AWS to continue to hold that lead.

5. Connections/Partnerships – Having Office 365, Dynamics 365, and the Skype and LinkedIn connections, along with its commitments to partners and ISVs, gives Microsoft a competitive advantage over AWS in what its ecosystem looks like. A common complaint is that AWS doesn't work well with, or cater to, partners, leaving them to figure things out themselves.

Azure Site Recovery in 3 Minutes

Having a disaster recovery site or plan in place is crucial today, whether for compliance or simply to ensure that if anything does happen, your business can still operate. Today's episode of Azure Every Day focuses on Azure Site Recovery.

Azure Site Recovery is Microsoft's business continuity and disaster recovery service. With it, you can move your VMs to the cloud, back them up, or replicate site to site. To use the service, you coordinate and set up replication between the sites and/or servers.

You have some options for how to do this. You can replicate an Azure VM to another Azure region, or replicate a physical server, VMware infrastructure or Hyper-V environment up to Azure. Physical and VMware replication is effectively real-time, whereas Hyper-V replication can be brought down to about a 30-second window.

Azure Site Recovery has many great features, such as:
  • Application Awareness – It knows what you're running (e.g. SharePoint, SQL Server, Exchange, Active Directory), so in the event of a disaster it can cleanly stop an application in one location and start it in another.
  • Region-to-Region Replication – If you want to replicate from the East Coast to the West Coast, this is built into the service, so it's easily done.
  • Encryption – From a security standpoint, it supports encryption at rest and in transit. This is extremely helpful when you're replicating from one Azure virtual machine to another, or from your local VMware infrastructure to Azure: the traffic is encrypted in transit and the data is encrypted at rest when it lands in Azure.

Some other key features are automatic failover and failback, as well as continuous replication, so your recovery time objective (RTO) and recovery point objective (RPO) are easily met on this platform. You can also run automated recovery drills, so you can test your disaster plan without any impact on your environment.

Introduction to Azure Data Factory

Are you new to Azure, or looking to make the move and curious about what Azure Data Factory is? Azure Data Factory is Microsoft's cloud data integration and orchestration tool. It lets you move your data from one place to another and transform it along the way. Here, I'd like to introduce the four main steps in how Azure Data Factory works.

Step 1 – Connect and Collect

Connect and collect is where you define where you’re pulling your data from, such as SQL databases, web applications, SSAS, etc. You collect that data into one centralized location like Azure Data Lake or Azure Blob Storage.

Step 2 – Transform and Enrich

In this step, you take the data from your centralized storage and transform and enrich it, using HDInsight, Spark or Data Lake Analytics, for example.

Step 3 – Publish

Next, publish the data to a place where it can be readily used and consumed by end users. Any BI tool, such as Power BI or Reporting Services, is a great choice.

Step 4 – Monitor

This last step is to monitor your pipelines to be sure jobs are running and data is flowing properly; it's also important for keeping an eye on data quality. Monitoring can be done with tools like PowerShell, Microsoft Operations Manager or Azure Monitor, which lets you monitor right inside the Azure portal.
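As one example of this last step, here's a minimal sketch (names are placeholders, and it assumes the same SDK setup as the earlier pipeline sketch) that uses the azure-mgmt-datafactory SDK to list the pipeline runs from the last 24 hours, which is roughly what the portal's Monitor view shows:

```python
# Sketch: query the last day's pipeline runs for a data factory.
# Requires: pip install azure-identity azure-mgmt-datafactory
from datetime import datetime, timedelta

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import RunFilterParameters

SUBSCRIPTION_ID = "<subscription-guid>"  # placeholder
RG_NAME = "rg-demo"                      # placeholder
DF_NAME = "adf-demo"                     # placeholder

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Ask the service for every pipeline run updated in the last 24 hours.
filters = RunFilterParameters(
    last_updated_after=datetime.utcnow() - timedelta(days=1),
    last_updated_before=datetime.utcnow() + timedelta(minutes=1),
)
runs = adf_client.pipeline_runs.query_by_factory(RG_NAME, DF_NAME, filters)

# Print one line per run: which pipeline, its status, and when it ran.
for run in runs.value:
    print(run.pipeline_name, run.status, run.run_start, run.run_end)
```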

Top 10 Best Practices about SSIS and SSRS I’ve learned the hard way

Over the past 7 or 8 years, I've gone from "0 to 60" when it comes to database design, development, ETL and BI.  Most of the skills I've learned were the result of mistakes I've made, and I've been mostly self-taught, with the exception of some more recent "formalized" learning through programs from Pragmatic Works.  As I've gotten more and more satisfaction from the process, I've gone on to start working towards my MCSA (2 of 3 complete) in SQL 2012, as well as speaking at SQL Saturdays and local user groups in New England.  It's become one of the most rewarding, exciting and challenging aspects of my career.  As a result, I've posted some blog articles about challenges I've overcome, though not frequently enough, and attempted to become more active in some SQL forums.  The list below is far from complete when it comes to all of the best practices I've learned over the years; however, many of these lessons have really helped me stay organized when it comes to good BI architecture.  I hope to provide at least one item that will benefit a newbie or even a seasoned pro.  So, without further ado, here are the top 10 best practices I've learned the hard way…

  1. Use Source Control
    For anyone who was a developer in a past life, or is one now, this is a no-brainer, no-alternative best practice.  In my case, because I come from a management and systems background, I had to learn it the hard way.  If this is your first foray into development, get ready, because you're in for some mistakes, and you're going to delete or change some code you really wish you didn't.  Whether it's for reference on something you want to change, or something you delete by accident, you're going to need that code you just got rid of yesterday, and we both know you didn't back up your Visual Studio jobs… Hence, source control.  GitHub and Microsoft offer great solutions for Visual Studio, and Redgate offers a great solution for SSMS.  I highly recommend checking them out and using the tools!  There are other options out there that are free, or that save your code to local storage, but the cloud is there for a reason, and many of us are on the go, so having your code available from anywhere is very helpful.
  2. Standardize Transform Names
    Leaving the default names on the OLE DB connector or the Execute SQL Task seems like something very silly to do, because it's so easy to label them for what they actually are, but I have to admit I'm guilty of it.  I've found myself throwing together a quick ETL package for testing, or for further work on the database side, and then forgetting to go back and fix the names.  Fast-forward six months, and I'm saying to myself, "I think I had a package like that once before," only to find it, open it, and not have a clue what it actually does.  That of course requires me to go through each component and refresh my memory on what I did.  Truth be told, in this day of cheap resources, memory, etc., there is absolutely no reason not to put all the details you need in the name of the tool being used.  Don't be lazy; you never know when it might bite you!
  3. Document inside your code and ETL workflow
    If you're using the script transforms, they open a pretty standard development window that anyone familiar with Visual Studio will recognize.  As with any development, good code comes with good documentation, so use it!  Your successors, if not you yourself, will be very appreciative down the road.  Name your functions appropriately, and explain what you're doing throughout the code.  Further, as you build your workflows, you have the ability to document what each step of the process is doing, so use that too!  This also goes back to point 2: standardized transform names, alongside documentation added as you go, paint a very nice picture of what your workflow is doing.
  4. Setup detailed alerts for failures
    The traditional workflows in SSIS let you create mail notifications for successful and unsuccessful steps within the workflow.  Depending on how your packages are run, you could get the same kind of notification from the SQL Server job itself, but why use a SQL alert that tells you only that "Job A" failed, with no real detail, when you can have your package tell you exactly which transform or component failed and what the error was?  There are some limitations in the built-in SSIS mail component as to the methods you can use to send an email, but there are workarounds, like the one on Stack Overflow that shows how to use the script task to connect to and send mail through Gmail.  Either way, there's plenty of functionality to keep you informed of exactly what's happening inside a job and where the warnings and errors are for each component.  Taking the time to do this is much better than getting a phone call complaining that the data wasn't updated properly!
  5. Standardize File and Folder Names and Locations
    Ok, ok, I know this is getting a bit redundant… but remember, these are the mistakes I made as a newb, and I really want to help you out as you use the software more and more and your workflows get more and more complex.  This one was a really big one for me.  Because I do all of my work in Visual Studio, and all of the BI jobs look the same at the file level, I really needed a way to show the separation between my ETL jobs, SSRS jobs and SSAS jobs.  This also helped with my source control structure.  I separated SSIS, SSRS and SSAS jobs (you can even go as far as separating Tabular and Multidimensional if necessary) into separate folders, then labeled each type with "ETL" or "reports" as part of the file name.  It saves me time when I'm opening a recent job, because typically I'm tweaking the ETL at the same time as developing the reports to get just the right mix.
  6. Have a “Report Request Form”
    When you first start writing reports, it's really exciting.  You're delivering value to the business and really helping people do their jobs, especially when you start transforming and aggregating data… but soon you become more and more relied upon for those reports, and no two reports are alike, it would seem.  A common best practice for people who are churning out report after report is to have a report requirements request form like the one from SQL Chick.  That form is pretty in-depth, so tweak it as necessary, but it will really help you prioritize and design reports going forward.
  7. Experiment to fine tune and improve performance
    This best practice is really a whole blog post unto itself, but it's something to be aware of.  As a quick for-instance, the OLE DB Command transform is a great tool in theory; however, because of some of the nuances of the tool, it can be significantly slower on a large dataset than the Execute SQL Task.  The only way to know is to compare them side by side, which I had to do, and found that the SQL Task took about 3 minutes while the OLE DB Command took about 45 minutes.  Moral of the story: if something seems to take a long time, there may be, and most likely is, a better way to do it, so go out and play!
  8. Set your recovery model to Simple or Bulk-logged on staging tables
    Ok, so normally I would stress the importance of backups (always take them automatically, including log backups, and before you change anything), but that belongs in a post about maintenance and configuration; this one is focused on ETL and reporting, so let's talk about the effect that loading lots of data has on your database.  Better yet, check out the article that gives a very good description of the three recovery models.  The basics are this: if it's a staging table or database, you can probably get away with the simple recovery model.  If it's a critical database but you're doing lots of data loads, the bulk-logged model will minimally log SELECT INTO, bcp, INSERT…SELECT and BULK INSERT operations, so your transaction logs don't get huge, fast.
  9. Temp Tables will likely make your ETL run faster than staging tables
    I can't really take credit for this little nugget.  While working on a project with one of my coding buddies, he came into my office one day and said, "Hey, did you know that using temp tables in SQL will allow you to use multiple processors on the server at one time?"  I did not know that, and man, did it make a difference; it was a huge performance boost for my project.  Now, like everything else, there are exceptions to the rule.  Some people much smarter than me have had this discussion on a forum that sheds more light on the topic, and there are other scenarios where it might not make sense.  It's good information overall, and the more information you have, the better off you'll be when you're designing your BI architecture.
  10. Have a development and testing platform
    As with some of the other best practices listed above, for some people this is a no-brainer.  For others… we have no brains when it comes to this stuff and need to have it beaten into our heads (I'm the latter, if that wasn't already clear).  I can't stress enough how much this will save you.  You should never, ever do development in a production environment; there are just too many things that can go wrong, and even those "quick" or "minor" changes can cause a calamity and ruin your day fast.  It can be a challenge if you don't have a proper production/dev/test environment at your office or your client's location, but with SQL Server Developer Edition now free and PCs these days having tons of resources, you should be able to do your testing on even the simplest of computers and get a warm and fuzzy feeling that you'll be able to deploy that latest package, report or piece of code successfully.  True performance comparisons against a beefy production server might not be possible, but you should be able to establish a baseline and get a general idea of how performance will look for various configurations.

Well, that's it for this post.  I really hope you were able to pick up even the slightest hint of something new here, because that's always my goal.  If you have questions, you can follow me or send me a message on Twitter @bizdataviz, as I'm always happy to hear how I can write better blog posts to help people who are just getting their feet wet.

Where the **** did my Power View go?

Recently I was presenting a session on Microsoft's Power View, in what I had intended to be the final time presenting it before retiring it to the archives.  Unfortunately, the presentation didn't go as planned.  For the life of me, I couldn't find the Power View icon in the Excel ribbon, and I began to go out of my mind!  Note, I've presented this topic at various SQL Saturdays and user groups for the past couple of years, and it had become second nature for me to do demo after demo of the various components and move on to the next.  So it's only natural that phrases like "it was just here" and "what did Microsoft do this time!!" came out of my mouth in front of the group, and I quickly changed focus and showed what components of the demo I could in Power BI as an alternative.  I later found out that Microsoft did, in fact, do something.  In what I assume is an effort to get people to use Power BI and move away from Silverlight, Microsoft decided to remove the button from the ribbon in both Excel 2013 and Excel 2016, as described in the following article by John P White:

https://www.itunity.com/article/rethinking-business-intelligence-sharepoint-sql-server-2016-2936

He also included a nice “How To” to get it back into Excel if that’s how you want to use it here:

http://whitepages.unlimitedviz.com/2015/07/power-view-excel-2016/

A few lessons learned from this (and, in a sense, re-learned): 1) every time you think you know what Microsoft is doing, you don't; 2) do at least one quick dry run of your presentation immediately before the real thing, or as close to it as possible; 3) Power View appears to be transitioning to a "legacy" product… I guess it was a good time to retire the session.