In this video blog post I covered the serving layer step of building your Modern Data Warehouse in Azure. There are certainly some decisions to be made around how you want to structure your schema as you get it ready for presentation with whatever business intelligence tool you choose; for this example I used Power BI. Here are some of the areas you should focus on:
What is your schema type? Snowflake or Star, or something else?
Where should you serve up the data? SQL Server, Synapse, ADLS, Databricks, or something else?
What are your service level agreements for the business? What are your data processing times?
Can you save cost by using an option that’s less compute heavy?
I’d like to tell you about some very cool intelligence features within
the Azure SQL Database. Azure SQL Database technologies deliver
intelligent capabilities through a range of built-in machine learning
and adaptive technologies that monitor and manage performance and
security for you.
Using telemetry from millions of databases running in Azure over the years, Microsoft has trained a truly intelligent and autonomous database that learns and adapts to your workload. This intelligent performance gives you deeper insight into database performance. Plus, it eliminates the hassle of making ongoing improvements, allowing you to focus more on driving your business and less on “chores”.
Features like Query Performance Insight and automatic tuning continuously monitor database usage, detect disruptive events, and then take steps to improve performance.
Three examples of intelligent performance features that can collectively optimize your memory usage and improve overall query performance are:
Row mode memory grant feedback – this expands on batch-mode memory grant feedback by adjusting memory grant sizes for both batch-mode and row-mode operators.
Approximate query processing – this is designed to provide aggregations across large datasets where responsiveness is more critical than absolute precision; it returns an approximate value with the focus on performance (see the sketch after this list).
Table variable deferred compilation – this improves plan quality and overall performance for queries referencing table variables by propagating cardinality estimates that are based on actual table variable row counts. In turn, this optimizes your downstream plan operations.
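To make the approximate query processing item concrete, here's a minimal sketch of comparing an exact and an approximate distinct count from Python with pyodbc; the server, database, credentials and table names are hypothetical placeholders.

```python
# Minimal sketch: exact vs. approximate distinct counts against Azure SQL Database.
# Server, database, credentials and table names are hypothetical placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver.database.windows.net;"    # hypothetical server
    "DATABASE=mydb;UID=myuser;PWD=mypassword"  # hypothetical credentials
)
cursor = conn.cursor()

# Exact distinct count: precise, but can be memory- and time-intensive on large tables.
cursor.execute("SELECT COUNT(DISTINCT CustomerId) FROM dbo.Sales;")
print("Exact:", cursor.fetchone()[0])

# Approximate distinct count: trades a small error margin for speed and lower memory use.
cursor.execute("SELECT APPROX_COUNT_DISTINCT(CustomerId) FROM dbo.Sales;")
print("Approximate:", cursor.fetchone()[0])
```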
Alongside those features, Azure SQL Database intelligent protection allows you to efficiently and productively meet your data’s security and compliance requirements by proactively monitoring for potential threats and vulnerabilities. You can flag things such as exposed PII or a cross-site scripting attack, and there are detection mechanisms in there that can help you avoid these.
Through features like information protection, vulnerability
assessment and threat detection, you can proactively discover and
protect sensitive data, as well as uncover potential vulnerabilities and detect anomalous activities that could indicate a threat to your data.
In short, Microsoft has built these intelligent features over years of machine learning and is applying them to all of their Platform as a Service offerings, as well as some of their on-premises offerings. These are really cool features, and we’ve gotten great feedback about them and how well they work.
I recommend you give these features a try, but remember, always try them out in your test or dev environments prior to bringing them into production.
If your current SQL Database service tier is not well suited to your needs, I’m excited to tell you about a newly created service tier in Azure called Hyperscale. Hyperscale is a highly scalable storage and compute performance tier that leverages the Azure architecture to scale out resources for Azure SQL Database beyond the current limitations of the general purpose and business critical service tiers.
The Hyperscale service tier provides the following capabilities:
Support for up to 100 terabytes of database size (and this will grow over time)
Faster large database backups which are based on file snapshots
Faster database restores (also based on file snapshots)
Higher overall performance due to higher log throughput and faster transaction commit time regardless of the data volumes
The ability to rapidly scale out. You can provision one or more read-only nodes for offloading your read workload or for use as hot standbys.
You can rapidly scale up your compute resources (in constant time) to accommodate heavy workloads, so you can scale compute up and down as needed, just like Azure Data Warehouse.
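To give a sense of what provisioning a Hyperscale database looks like, here's a rough sketch using the azure-mgmt-sql Python package. The subscription, resource group, server and database names are hypothetical placeholders, and the method and SKU names ("HS_Gen5_2" for a Gen5, 2-vCore Hyperscale database) reflect my assumptions about the SDK rather than anything from the original post.

```python
# A hedged sketch of provisioning (or moving a database to) the Hyperscale tier
# with the azure-mgmt-sql package. Resource names below are hypothetical placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.sql import SqlManagementClient
from azure.mgmt.sql.models import Database, Sku

subscription_id = "00000000-0000-0000-0000-000000000000"  # hypothetical
client = SqlManagementClient(DefaultAzureCredential(), subscription_id)

# Request a Hyperscale database: "HS" = Hyperscale, Gen5 hardware, 2 vCores.
poller = client.databases.begin_create_or_update(
    resource_group_name="my-rg",           # hypothetical
    server_name="my-sql-server",           # hypothetical
    database_name="my-hyperscale-db",      # hypothetical
    parameters=Database(
        location="eastus",
        sku=Sku(name="HS_Gen5_2", tier="Hyperscale"),
    ),
)
db = poller.result()  # long-running operation; blocks until provisioning completes
print(db.status, db.current_service_objective_name)
```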
Who should consider moving over to the Hyperscale tier? It is not an inexpensive tier, but it’s a great choice for companies that have large databases and have not been able to use Azure SQL Database in the past due to its 4-terabyte limit, as well as for customers who see performance and scalability limitations with the other two service tiers.
It is primarily designed for transactional or OLTP workloads. It does support hybrid and OLAP workloads, but that is something to keep in mind when designing your databases and services. It’s also important to note that elastic pools do not support the Hyperscale service tier.
How does it work?
Compute and storage are separated out into 4 types of nodes, similar to Azure Data Warehouse.
The compute node is where the relational engine lives and where query processing happens.
The page server node is where the scaled-out storage engine resides. Database pages are served out to the compute nodes on demand, and the page servers keep pages updated as transactions modify data, so these nodes are moving the data around for you.
The log service node is where log records are kept as they come in from the compute node; they are held in a durable cache and then forwarded along to the other compute nodes and caches to ensure consistency. Once the records have been consistently propagated across the compute nodes, they are stored in Azure storage for long-term retention of your logs.
Lastly, the Azure storage node is where all the data is pushed from the page servers. All the data that eventually lands in the database gets pushed over to Azure storage, which is also the storage used for backups, as well as where the replication between availability groups happens.
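Purely as a conceptual illustration (none of this is a real Azure API), the write path across those four node types can be summarized like this:

```python
# Conceptual only: a toy illustration of how a transaction flows across the four
# Hyperscale node types described above. None of these functions are real Azure APIs.

def compute_node(change: str) -> str:
    # The relational engine processes the query and emits a log record.
    return f"log record for: {change}"

def log_service(log_record: str) -> None:
    # Log records land in a durable cache, then get forwarded to other compute
    # nodes and caches to keep everything consistent.
    print("log service hardened:", log_record)

def page_servers(log_record: str) -> None:
    # Page servers replay the log to keep the scaled-out data pages up to date.
    print("page servers applied:", log_record)

def azure_storage(log_record: str) -> None:
    # Azure storage is the long-term home for data files, logs and backups.
    print("azure storage persisted:", log_record)

change = "UPDATE dbo.Orders SET Status = 'Shipped' WHERE OrderId = 1"
record = compute_node(change)
log_service(record)
page_servers(record)
azure_storage(record)
```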
This Hyperscale tier is an exciting opportunity for those customers whose requirements aren’t fulfilled by the prior service tiers. It’s another great Microsoft offering that’s worth checking out if you have had these service tier issues up to now. It also helps draw a line of distinction between Azure Data Warehouse and Azure SQL Database: you can now scale up and out and handle tons of data, but Hyperscale is still built for transactional processing, as opposed to Azure Data Warehouse, which is geared toward analytical, massively parallel processing.
Last week I posted about Azure Reserved VM Instances, where you can save some money in Azure. Another similar way to save is with Azure SQL Database Reserved Capacity. With this you can save 33% compared to license-included pricing by pre-buying SQL Database vCores for a 1- or 3-year term.
This can be applied to a single subscription or shared across your enrollments, so you can control how many subscriptions can use the benefit, as well as how the reservation is applied to the specific subscriptions you choose.
Scoping the reservation to a single subscription applies it to the SQL Database resource(s) within that selected subscription. A reservation with a shared scope can be shared across subscriptions in the enrollment, and there’s some flexibility involved, such as with Managed Instances, where you can scale up and down.
Some other points about SQL Database Reserved Capacity I’d like to share:
It provides vCores with the size flexibility you need. You can scale those up and down within a performance tier and region without impacting your reserved capacity pricing; do note that you must stay within the same performance tier and the same region.
You can temporarily move your hot databases between pools and single databases as part of your normal operations; again, within the same region and performance tier without losing the reserved capacity benefit.
You can keep an unapplied buffer in your reservation, so you can effectively manage performance spikes without exceeding your budget. Just another way to keep an eye on your Azure spend.
You can stack savings: with Azure Hybrid Benefit licensing, you can save up to an additional 55%, bringing you to a total savings of over 80% by stacking these benefits.
With the Hybrid Benefit for Enterprise Edition, customers using Software Assurance can use 4 cores in the cloud for every one core they’ve purchased.
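As a back-of-the-envelope illustration (the core counts and prices below are made-up placeholders, not quoted pricing), the core conversion and the reserved-capacity discount work out like this:

```python
# Back-of-the-envelope sketch of the Azure Hybrid Benefit core conversion and the
# reserved capacity discount described above. Counts and prices are placeholders.

on_prem_ee_cores_with_sa = 16          # Enterprise Edition cores covered by Software Assurance
cloud_vcores_per_on_prem_core = 4      # 4 cloud cores for every 1 core purchased

eligible_cloud_vcores = on_prem_ee_cores_with_sa * cloud_vcores_per_on_prem_core
print(f"{on_prem_ee_cores_with_sa} EE cores with SA -> {eligible_cloud_vcores} eligible vCores in the cloud")

# Reserved capacity is quoted as roughly 33% off license-included pricing;
# the monthly figure below is an illustrative placeholder, not a real quote.
list_price_per_month = 1000.0
reserved_capacity_price = list_price_per_month * (1 - 0.33)
print(f"Reserved capacity price: ${reserved_capacity_price:.2f}/month (vs ${list_price_per_month:.2f} list)")
```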
A couple things to note:
It cannot be applied to an MSDN subscription or a non-pay-as-you-go subscription; so basically, it applies to Enterprise and pay-as-you-go subscriptions.
Currently, it only applies to single databases and elastic pools. Managed Instances are still in preview; when they reach GA (by the end of 2018 as of this post), they will be covered by reserved capacity as well.
For questions about how this licensing works, contact your Microsoft rep.
If you’re using SQL Server 2008 or 2008 R2, you need to be aware that extended support for those versions ends on July 9, 2019. This means the end of regular security updates, which leads to more vulnerabilities, unpatched software, and out-of-compliance risks. As such, here are some new options for SQL Server 2008 end of support:
The best option would be a migration or an upgrade, but Microsoft has some options in place to help people out, as they understand this can be easier said than done when you have applications that need to be upgraded and you must figure out how best to handle that.
That being said, upgrading provides better performance, efficiency, security features and updates, as well as new technology features and capabilities within the whole stack of SQL products (SSIS, SSRS, SSAS).
Here are some options that Microsoft is offering to help with the end of support of 2008/2008R2:
First, they are extending security updates, available for free in Azure, for 2008/2008 R2 for 3 more years. So, if you simply move your workload to an IaaS VM in Azure, you’ll be supported without requiring application changes. You’ll have to pay for the virtual machine costs, but it’s still a good deal to get you started.
You can migrate your workloads to Managed Instances, which will be in GA by the end of 2018. This will be able to support all applications out there, so you can start the transition up into Azure.
You can take advantage of the Azure hybrid licensing model to migrate to save money on licensing. With this you can save up to 55% on some of your PaaS SQL Server costs, but only if you have Enterprise Edition and Software Assurance.
For on-premises servers that need more time to upgrade, you’ll be able to purchase extended security service plans to extend support 3 years past the July 2019 date. So, if you’re struggling to get an application upgraded and validated, they’ll extend that support for a fee. Again, this is for customers with Software Assurance or subscription licenses under an Enterprise Agreement. These can be purchased annually to cover only the servers that need updates.
Extended security updates will also be available for purchase as they get closer to the end of support, for both SQL Server and Windows Server.
Again, the first choice would be to upgrade or migrate those databases and move to Azure, but there are some challenges with doing so, and if none of those options work, there are some great options to extend your support.
Some of you may not know when or how to scale up your queries or scale out your processing. Today I’d like to help with understanding when and how using Azure Analysis Services. First, you need to decide which tier you should be using. You can do that by looking at the QPUs (Query Processing Units) of each tier on Azure. Here’s a quick breakdown:
Developer Tier – gives you up to 20 QPUs
Basic Tier – is a mid-scale tier, not meant for heavy loads
Standard Tier (currently the highest available) – allows you more capability and flexibility
Let’s start with when to scale up your queries. You need to scale up when your reports are slow; for example, you’re reporting out of Power BI and the throughput isn’t meeting your needs. What you’re doing with scaling up is adding more resources. QPUs are a combination of your CPU, memory and other factors like the number of users.
Memory checks are straightforward. You run the metrics in the Azure portal and you can see what your memory usage is and whether your memory limit or memory hard limit settings are being saturated. If so, you need to either upgrade your tier or adjust the level within your current tier.
CPU bottlenecks are a bit tougher to figure out. You can get an idea by watching your QPUs to see if you’re saturating them, using the metrics and logs within the Azure portal. Then you want to watch your processing pool job queue length and your processing pool busy non-I/O threads. This should give you an idea of how it’s performing.
For the most part, you’re going to want to scale up when the processing engine is taking too long to process the data to build your models.
Next up, scaling out. You’ll want to scale out if you’re having problems with responsiveness with reporting because the reporting requirements are saturating what you currently have available. Typically, in cases with a large number of users, you can fix this by scaling out and adding more nodes.
You can add up to 7 additional query replicas; these are read-only replicas that you can report off of. Processing is handled on the initial instance of Azure Analysis Services, and subsequent queries are handled by those query replicas. Hence, processing doesn’t affect the responsiveness of the reports.
Once you separate the model processing from the query engine, you can measure performance by watching Log Analytics and the query processing units to see how they’re performing. If you’re still saturating those, you’ll need to re-evaluate whether you need additional QPUs or to upgrade your tier.
Something to keep in mind is that once you’ve processed your data, you must resynchronize it across all of those query replicas. So, if you’re going to be processing data throughout the day, it’s a good idea not only to schedule that processing, but also to strategically synchronize the replicas as well.
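To pull the scale-up and scale-out signals together, here's a simplified decision helper; the thresholds are illustrative assumptions, not official guidance, so in practice you'd plug in the metrics you observe in the Azure portal or Log Analytics.

```python
# Simplified decision helper for Azure Analysis Services scaling. The thresholds
# are illustrative assumptions; feed in the metrics observed in the Azure portal.

def recommend_scaling(qpu_utilization_pct: float,
                      memory_saturated: bool,
                      processing_minutes: float,
                      concurrent_report_users: int) -> str:
    # Scale UP when the instance itself is the bottleneck: saturated memory or
    # QPUs, or model processing that takes too long to build your models.
    if memory_saturated or qpu_utilization_pct > 80 or processing_minutes > 60:
        return "scale up (more QPUs / higher tier)"
    # Scale OUT when a large number of users makes reports unresponsive: add
    # read-only query replicas (Standard tier only) and keep processing on the
    # primary instance.
    if concurrent_report_users > 200:
        return "scale out (add query replicas)"
    return "current capacity looks sufficient"

print(recommend_scaling(qpu_utilization_pct=92, memory_saturated=False,
                        processing_minutes=20, concurrent_report_users=50))
```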
Also important to know is that scale out requires the Standard tier; Basic and Developer will not work for this purpose. There are some interesting resources out there that allow you to auto-scale based on a schedule using a PowerShell runbook. It uses your Azure Automation account to schedule when it’s going to scale up or out based on the needs of the environment. For example, if you know that on Monday mornings you’re going to need additional processing power to run your queries efficiently, you can set up a schedule for that time and then scale it back afterward.
Another note: you can scale up to a higher tier, but you cannot automatically scale back down if you’re running a script. Still, this ability allows you to be prepared for additional requirements in that environment.
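The schedule-driven logic behind that runbook approach can be sketched out like this; the tiers, replica counts and the resize placeholder below are my assumptions, and the actual scaling call is left as a stub rather than guessing at a specific cmdlet or SDK method.

```python
# Sketch of schedule-driven scaling logic, similar in spirit to the PowerShell
# runbook approach mentioned above. Tiers and replica counts are assumptions,
# and resize() is a placeholder for whatever runbook or SDK call does the work.
from datetime import datetime

def target_capacity(now: datetime) -> dict:
    # Monday mornings need extra processing power, per the example above.
    if now.weekday() == 0 and 6 <= now.hour < 12:
        return {"tier": "S4", "replicas": 2}
    return {"tier": "S1", "replicas": 0}

def resize(capacity: dict) -> None:
    # Placeholder: in a runbook this is where the scale-up/scale-out call goes.
    print(f"Scaling Analysis Services to {capacity['tier']} with {capacity['replicas']} replica(s)")

resize(target_capacity(datetime.now()))
```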
I hope this helped with questions you have about scaling up and out.
What do you know about Azure Common Data Services? Today I’d like to talk about this product for apps, which was recently redone by Microsoft to expand upon the product’s vision. Common Data Services is an Azure-based business application platform that enables you to easily build and extend applications with your customers’ business data.
Common Data Services helps you bring together your data from across the Dynamics 365 suite (CRM, AX, Nav, GP) and use this common data service to more easily extract data rather than having to get into the core of those applications. It also allows you to focus on building and delivering the apps you want, along with the insights and process automation that will help you run more efficiently. Plus, it integrates nicely with PowerApps, Power BI and Microsoft Flow.
Some other key things:
If you want to build Power BI reports from your Dynamics 365 CRM data, there are pre-canned entities provided by Microsoft.
Data within the Common Data Services (CDS) is stored within a set of entities. An entity is just a set of records that’s used to store data, similar to how a table stores data within a database (see the illustrative sketch after this list).
CDS should be thought of as a managed database service. You don’t have to create indexes or do any kind of database tuning; you’re not managing a database server as you would with SQL Server or a data warehouse. It’s designed to be somewhat of a centralized data repository from which you can report or do further things with.
PowerApps is quickly becoming a good replacement for things like Microsoft Access, as it comes with comparable functionality and feature sets. A common use for PowerApps is extending that data rather than having to dig into the underlying platform.
This technology is easy to use, to share and to secure. You set up your user account as you would with Azure Services, giving specific permissions/access based on the user.
It gives you the metadata you need based on that data and you can specify what kind of field or column you’re working with within that entity.
It gives you the ability to add logic and validation; you can create business process rules and data flows from entity to entity, or vice versa, or from app to entity to PowerApps.
You can create workflows that automate business processes, such as data cleansing or record updating; these workflows can run in the background without you having to manage them manually.
It gets good connectivity with Excel, which makes it user friendly for people comfortable with that platform.
For power users, there’s an SDK available for developers, which allows you to extend the product and build some cool custom apps.
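Purely to illustrate the "entity = a set of records with metadata" idea from the list above (this is not the CDS API), here's a toy sketch:

```python
# Purely illustrative: modeling an entity as records plus field metadata, the way
# the list above describes CDS entities. This is not the Common Data Services API.
from dataclasses import dataclass, field

@dataclass
class Field:
    name: str
    data_type: str          # CDS lets you specify what kind of field/column you're working with
    required: bool = False

@dataclass
class Entity:
    name: str
    fields: list = field(default_factory=list)
    records: list = field(default_factory=list)   # an entity stores records, like rows in a table

account = Entity(
    name="Account",
    fields=[Field("accountid", "guid", required=True),
            Field("name", "text", required=True),
            Field("revenue", "currency")],
)
account.records.append({"accountid": "a-001", "name": "Contoso", "revenue": 125000})
print(account.name, "has", len(account.records), "record(s)")
```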
I don’t think of this as a replacement for Azure SQL DW or DB but it does give you the capability to have table-based data in the cloud that has some nice hooks into the Dynamics 365 space, as well as outputting to PowerApps and Power BI.
Are you familiar with Azure Cosmos DB? Cosmos DB is Microsoft’s globally distributed, multi-model database. With the click of a button, it allows you to elastically and independently scale throughput and storage across any number of Azure’s geographic regions, so you can put the data where your customers are.
Cosmos DB has custom-built APIs that allow you to work with a multitude of data models and sources, like SQL, MongoDB and Azure Tables, and it offers 5 consistency models. It offers comprehensive Service Level Agreements (SLAs) with money-back guarantees for availability (99.99% to be exact), latency, consistency and throughput; a big deal when you need to serve your customers at optimum performance.
Cosmos DB is a great option for many different use cases:
Companies doing IoT and telematics. Cosmos DB can ingest huge bursts of data, and process and analyze that data in near real-time. Then it will automatically archive all the data it ingests.
Retail and Marketing. Take an auto parts product catalog, for example, with tons of parts within the catalog, each with its own properties (some unique and some shared across parts). The next year, new vehicles or new parts model come out, with some similar and different properties. All that data adds up very quickly. Cosmos DB offers a very flexible schema in a hierarchical structure that can easily change the data around as things change.
Gaming Industry. Games like Halo 5 by Microsoft are built on a Cosmos DB platform, because they need performance that is quickly and dynamically scalable. You’ve got things like millisecond read-times, which avoids any lags in game play. You can index player related data and it has a social graph database that’s easily implemented with flexible schema for all social aspects of gaming.
Azure Cosmos DB ensures that your data gets there and gets there fast, with a wealth of features and benefits to make your life easier. And it’s easy to set up and manage.
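To back up the "easy to set up" claim, here's a minimal sketch using the azure-cosmos Python SDK; the account endpoint, key, database, container and partition key are hypothetical placeholders, and the item shapes are made up to illustrate the flexible schema.

```python
# Minimal sketch with the azure-cosmos Python SDK. The endpoint, key, database,
# container and partition key below are hypothetical placeholders.
from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient(
    url="https://my-account.documents.azure.com:443/",  # hypothetical account
    credential="<primary-key>",                          # hypothetical key
)

database = client.create_database_if_not_exists(id="telematics")
container = database.create_container_if_not_exists(
    id="readings",
    partition_key=PartitionKey(path="/deviceId"),  # partition by device for scale-out
)

# Flexible schema: documents in the same container don't need identical properties.
container.upsert_item({"id": "r-1", "deviceId": "truck-42", "speedKph": 88})
container.upsert_item({"id": "r-2", "deviceId": "truck-42", "engineTempC": 96, "alert": True})

for item in container.query_items(
    query="SELECT * FROM c WHERE c.deviceId = 'truck-42'",
    partition_key="truck-42",
):
    print(item["id"])
```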
I was a recent attendee at the Microsoft International Cloud Architect Boot Camp, where I had the opportunity to participate in hands-on sessions, working closely with Microsoft teams and specialists, as well as other Microsoft Partners. This boot camp contained exclusive content that Pragmatic Works gets access to as a partner and as a preferred service within the Microsoft stack.
Here, I’d like to share my top 5 takeaways from this event:
1. Commitment to Security – As a cloud solution architect, I’m asked many questions around security and Microsoft Azure. One thing that amazed me was the commitment Microsoft has made to security. They spend over a billion dollars each year on security to ensure they are secure from all threats. Microsoft is also the #1 attack surface in the world. They are truly committed to making sure that your data and services are secure.
2. Security Certifications – Microsoft has passed over 70 regulatory and government certifications when it comes to security and standardized processes. Their second-place competitor, AWS, has only completed 44 of these certifications. Getting these certifications and adhering to certain security and regulatory standards can be expensive, but there is a significant benefit for enterprise, government and small/medium-sized businesses.
3. Right-sizing Their Environment – This can be a challenge for many companies. Microsoft’s internal teams have gone completely to Azure and are managing their platforms within Azure for SQL databases, virtual machines and all the other services Azure offers. By doing some specific right-sizing and keeping watch on what’s offered, they lowered their workloads and kept their CPU utilization at the 95th percentile, and more importantly, they were able to cut down on spending for their internal needs – to the tune of over 3 million dollars a month!
4. Differentiators from AWS – AWS is currently the #1 cloud platform as far as revenue and volume. But Microsoft is quickly catching up, and they’ve identified several differentiators from AWS. Some key differentiators that have been slow to come, such as Azure Recovery Zones and other such services, will be released to general availability by the end of 2018. Microsoft does not see any other differentiators that would allow AWS to continue to hold that lead.
5. Connections/Partnerships – Having Office 365, Dynamics 365, and Skype and LinkedIn connections, as well as the commitments to partners and ISVs, gives Microsoft a competitive advantage over AWS in what their ecosystem looks like. A common complaint is that AWS doesn’t work well with, or cater to, partners, leaving them to figure it out themselves.