Azure Synapse Link lets you connect directly to your transactional system to run analytical and machine learning workloads, eliminating the need for ETL/ELT, batch processing, and reload wait times.
In this vLog, I explain how to enable Synapse Link in Azure Cosmos DB, and what’s happening under the covers to give you access to that analytical workload without impacting the performance of your transactional processing system.
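To make that concrete, here’s a minimal sketch of what reading the Cosmos DB analytical store looks like from a Synapse Spark notebook once Link is enabled. The linked service name and container are placeholders I made up for illustration:

```python
# Runs in a Synapse notebook, where the `spark` session is predefined.
# "CosmosDbLinked" and "Orders" are hypothetical names for this sketch.
df = (spark.read
      .format("cosmos.olap")  # reads the analytical (column) store, not the transactional store
      .option("spark.synapse.linkedService", "CosmosDbLinked")
      .option("spark.cosmos.container", "Orders")
      .load())

# The aggregation hits the analytical store only, so it consumes no
# request units from the transactional side.
df.groupBy("status").count().show()
```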
In my latest video blog I discuss getting started with the newly generally available Spark pools in Azure Synapse, another great option for data engineering/preparation, data exploration, and machine learning workloads.
Without going too deep into the history of Apache Spark, I’ll start with the basics. In the early days of Big Data workloads, the foundation for machine learning and deep learning for advanced analytics and AI, we would use a Hadoop cluster and move all these datasets across disks, and the disks were always the bottleneck in the process. So the creators of Spark said, hey, why don’t we do this in memory and remove that bottleneck? They developed Apache Spark as an in-memory data processing engine, a faster way to process these massive datasets.
When the Azure Synapse team set out to offer the best possible data solution for all different kinds of workloads, Spark was a natural option for customers already familiar with that environment, so they included it as part of the complete Azure Synapse Analytics offering.
Behind the scenes, the Synapse team manages many of the components you’d find in open-source Spark, such as:
Apache Hadoop YARN – for managing the clusters where the data is processed
Apache Livy – for job orchestration
Anaconda – a package manager, environment manager, and Python/R data science distribution with a collection of over 7,500 open-source packages that extend the capabilities of the Spark clusters
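If you haven’t used Spark before, here’s a tiny sketch of that in-memory idea, written as PySpark you could run in a Synapse notebook. The storage path and column names are hypothetical:

```python
# PySpark sketch of in-memory processing; in a Synapse notebook the
# `spark` session already exists. Path and columns are made up.
from pyspark.sql import functions as F

sales = spark.read.parquet("abfss://data@<storage>.dfs.core.windows.net/sales/")

# cache() keeps the dataset in executor memory, so repeated passes
# skip the disk reads that bottlenecked the old Hadoop approach.
sales.cache()

daily = sales.groupBy("order_date").agg(F.sum("amount").alias("total_sales"))
daily.show(10)
```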
I hope you enjoy the post. Let me know your thoughts or questions!
In my latest video blog I discuss and demonstrate some of the ways to connect to external data in Azure Synapse when there isn’t a need to import the data into the database, or when you want to do some ad hoc analysis. I also talk about using COPY and CTAS statements if the requirement is to import the data after all. Check it out here.
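As a hedged illustration of the difference: the first query below explores files in place with OPENROWSET on the serverless SQL endpoint, while COPY and CTAS import the data into a dedicated SQL pool. Every server name, path, and table here is a placeholder, and I’m driving it from Python with pyodbc just to keep the examples in one place:

```python
import pyodbc

DRIVER = "DRIVER={ODBC Driver 17 for SQL Server};UID=<user>;PWD=<password>;"

# Ad-hoc exploration without importing: OPENROWSET on the serverless endpoint
serverless = pyodbc.connect(
    DRIVER + "SERVER=<workspace>-ondemand.sql.azuresynapse.net;DATABASE=master;")
cur = serverless.cursor()
cur.execute("""
    SELECT TOP 10 *
    FROM OPENROWSET(
        BULK 'https://<storage>.dfs.core.windows.net/data/sales/*.parquet',
        FORMAT = 'PARQUET') AS rows;
""")
print(cur.fetchall())

# Importing after all: COPY and CTAS run against a dedicated SQL pool
dedicated = pyodbc.connect(
    DRIVER + "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=<pool>;")
cur = dedicated.cursor()
cur.execute("""
    COPY INTO dbo.Sales
    FROM 'https://<storage>.blob.core.windows.net/data/sales/*.parquet'
    WITH (FILE_TYPE = 'PARQUET');
""")
cur.execute("""
    CREATE TABLE dbo.SalesImported
    WITH (DISTRIBUTION = ROUND_ROBIN)
    AS SELECT * FROM dbo.Sales;
""")
dedicated.commit()
```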
In this vLog I give an overview of Azure Data Explorer (ADX) and the Kusto Query Language (KQL). Born from analyzing the logs behind Power BI, ADX is a great way to take large datasets, analyze them quickly, and get actionable insights from that data.
Find more details about Azure Data Explorer here: https://azure.microsoft.com/en-us/services/data-explorer/
And get started with these great tutorials: https://docs.microsoft.com/en-us/azure/data-explorer/create-cluster-database-portal
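If you’d like to try KQL from Python, here’s a small sketch using the azure-kusto-data package against Microsoft’s public help cluster and its Samples database (a freely queryable demo cluster); the authentication method may vary with your setup:

```python
# Sketch: running a KQL query with the azure-kusto-data package.
# The help cluster and StormEvents table are Microsoft's public samples.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(
    "https://help.kusto.windows.net")
client = KustoClient(kcsb)

# KQL pipes the dataset through each operator in turn
query = """
StormEvents
| summarize EventCount = count() by State
| top 5 by EventCount
"""
response = client.execute("Samples", query)
for row in response.primary_results[0]:
    print(row["State"], row["EventCount"])
```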
In this vLog, I cover the reasons why you might choose Azure Data Factory, a mature cloud service for the orchestration and processing of data, over the newly GA Azure Synapse Studio.
Synapse has all of the same features as Azure Data Factory, but if you have a large development team working on ELT operations, or just a simple data processing activity, it could make sense to go with the less-cluttered Azure Data Factory.
Take a look at the vLog here and let me know your thoughts on other scenarios that apply to you!
In this vLog I describe some of the leading Master Data Management (MDM) tools in Azure. MDM is a necessary part of any data warehouse deployment, but it’s often overlooked.
To be a solution architect, you’ve got to have great written and oral communication skills so you can leverage your technical experience in helping a team develop a solution. In this video I cover some items you’ll want to focus on if you’re looking for a career change where you can leverage your technical acumen.
In this video blog post I cover the serving layer step of building your Modern Data Warehouse in Azure. There are certainly some decisions to be made around how you want to structure your schema as you get it ready for presentation in your business intelligence tool of choice (I used Power BI for this example), so I discuss some of the areas you should focus on, with a quick sketch after the list:
What is your schema type? Snowflake or Star, or something else?
Where should you serve up the data? SQL Server, Synapse, ADLS, Databricks, or something else?
What are your service-level agreements for the business? What are your data processing times?
Can you save cost by using an option that’s less compute heavy?
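If you land on a star schema served from a Synapse dedicated SQL pool, the serving tables might look something like this sketch; the staging sources, table names, and distribution keys are all hypothetical:

```python
# Hypothetical sketch: serving a star schema from a Synapse dedicated
# SQL pool via pyodbc. Server, credentials, and tables are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace>.sql.azuresynapse.net;DATABASE=<pool>;"
    "UID=<user>;PWD=<password>"
)
cur = conn.cursor()

# Small dimension: REPLICATE keeps a copy on every compute node,
# so joins against it need no data movement.
cur.execute("""
    CREATE TABLE dbo.DimProduct
    WITH (DISTRIBUTION = REPLICATE)
    AS SELECT * FROM stg.DimProduct;
""")

# Large fact table: HASH-distribute on a common join key, with the
# (default) clustered columnstore index to keep compute costs down.
cur.execute("""
    CREATE TABLE dbo.FactSales
    WITH (DISTRIBUTION = HASH(ProductKey), CLUSTERED COLUMNSTORE INDEX)
    AS SELECT * FROM stg.FactSales;
""")
conn.commit()
```

Power BI can then point straight at the fact and dimension tables for presentation.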
If you’re like most, security is at the forefront of your mind for your organization. You need the right tools and the right team to keep up with a growing number of sophisticated threats while security teams are inundated with requests and alerts.
Today I’d like to tell you about Microsoft’s reimagined SIEM tool, Azure Sentinel.
Over the past 10 to 15 years, Security Information and Event Management (SIEM) has become extremely popular as an aggregation solution for the security events that happen in our networks.
There are also software tools, hardware appliances, and managed service providers that can help support your corporate needs to better understand your level of risk in real time and over a span of time. They do things such as log aggregation, event correlation, and forensic analysis, and offer features for alerting, dashboarding, and compliance checks.
These are great resources to help secure our environment, our users, and our devices. But unfortunately, the reality is that security teams are being inundated with requests and alerts. Combine this with the noteworthy shortage of security professionals in the world – an estimated 3.5 million unfilled security jobs by 2021 – and you have a major concern.
Microsoft decided to take a different approach with Azure Sentinel. Azure Sentinel provides intelligent security analytics at cloud scale for your entire enterprise. It makes it easy to collect data across your entire hybrid organization on any cloud, from devices to users to applications to servers. Azure Sentinel uses the power of AI to ensure you’re quickly identifying real threats.
With this tool:
You’ll eliminate the burden of traditional SIEMs, since there’s no longer a need to spend time setting up, maintaining, and scaling the infrastructure other SIEMs require.
Since it’s built on Azure, it offers virtually limitless cloud scale while addressing all your security needs.
Now let’s talk cost. Traditional SIEMs have proven to be expensive to own and operate, often requiring you to commit upfront and incur high costs for infrastructure maintenance and data ingestion. With Sentinel, you pay for what you use, with no up-front costs. Even better, because of Microsoft’s relationships with so many enterprise vendors (and more partners being added), it easily connects to popular solutions, including Palo Alto Networks, F5 Networks, Symantec, and Check Point offerings.
Azure Sentinel integrates with the Microsoft Graph Security API, enabling you to import your own threat intelligence feeds and customize threat detection and alert rules. There are also custom dashboards that give you a view optimized for your specific use case.
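Because Sentinel keeps its data in a Log Analytics workspace, you can also hunt over it programmatically with KQL. Here’s a hedged sketch using the azure-monitor-query package, with a made-up workspace ID and a simple failed sign-in count (the SigninLogs table assumes you’ve connected Azure AD sign-in logs):

```python
# Sketch: hunting over Sentinel's Log Analytics workspace from Python.
# The workspace ID is a placeholder; SigninLogs assumes the Azure AD
# data connector is enabled.
from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

# Count failed sign-ins per account over the last day
query = """
SigninLogs
| where ResultType != "0"
| summarize FailedAttempts = count() by UserPrincipalName
| top 10 by FailedAttempts
"""
response = client.query_workspace(
    workspace_id="<workspace-guid>",
    query=query,
    timespan=timedelta(days=1))
for row in response.tables[0].rows:
    print(row)
```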
Lastly, if you’d like to try this out for free, Microsoft is allowing you to connect your Office 365 tenant to do some testing and check it out in greater detail. This product is currently in preview, so there may be some kinks, but I’m looking forward to seeing how it develops into a true enterprise-class security solution for your environment, whether that’s in the cloud, on premises, in data centers, or across remote users and devices.