All posts by cseferlis

Overview of Azure Synapse Link featuring CosmosDB

Azure Synapse Link allows you to connect to your transactional system directly to run analytical and machine learning workloads while eliminating the need for ETL/ELT, batch processing and reload wait times.

In this vLog, I explain how to enable Synapse Link in CosmosDB, and what’s happening under the covers to give you access to that analytical workload without impacting the performance of your transactional processing system.

Check it out here and let me know what you think!

Getting started with Spark Pools in Azure Synapse

In my latest video blog, I discuss getting started with the newly generally available Spark Pools in Azure Synapse, another great option for data engineering/preparation, data exploration, and machine learning workloads.

Without going too deep into the history of Apache Spark, I’ll start with the basics. In the early days of Big Data workloads, which are a basis for machine learning and deep learning in advanced analytics and AI, we would use a Hadoop cluster and move all these datasets across disks, but the disks were always the bottleneck in the process. So the creators of Spark said, hey, why don’t we do this in memory and remove that bottleneck? They developed Apache Spark as an in-memory data processing engine, a faster way to process these massive datasets.

The Azure Synapse team wanted to make sure they were offering the best possible data solution for all different kinds of workloads. Spark gave them an option for customers who were already familiar with the Spark environment, so they included it as part of the complete Azure Synapse Analytics offering.

Behind the scenes, the Synapse team manages many of the components you’d find in open-source Spark, such as:

  • Apache Hadoop YARN – for the management of the clusters where the data is being processed
  • Apache Livy – for the job orchestration
  • Anaconda – a package manager, environment manager, Python/R data science distribution and a collection of over 7500 open source packages for increasing the capabilities of the Spark clusters

I hope you enjoy the post. Let me know your thoughts or questions!

Connecting to External Data with Azure Synapse

In my latest video blog, I discuss and demonstrate some of the ways to connect to external data in Azure Synapse when you don’t need to import the data into the database or you just want to do some ad-hoc analysis. I also talk about using COPY and CTAS statements if the requirement is to import the data after all. Check it out here.
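
For readers who have not seen the two import statements before, here is a minimal sketch of their general shape, rendered from Python so the pieces are easy to swap. This is not taken from the video: the table names (dbo.Sales, ext.SalesRaw, dbo.SalesByRegion), the storage URL placeholder and the helper functions are all hypothetical, and the options shown are only a small subset of what a dedicated SQL pool accepts.

```python
def copy_into_statement(table: str, source_url: str, first_row: int = 2) -> str:
    """Render a COPY INTO statement that loads CSV files into an existing table."""
    return (
        f"COPY INTO {table}\n"
        f"FROM '{source_url}'\n"
        "WITH (\n"
        "    FILE_TYPE = 'CSV',\n"
        "    FIELDTERMINATOR = ',',\n"
        f"    FIRSTROW = {first_row}\n"
        ");"
    )


def ctas_statement(new_table: str, select_sql: str,
                   distribution: str = "ROUND_ROBIN") -> str:
    """Render a CREATE TABLE AS SELECT (CTAS) statement."""
    return (
        f"CREATE TABLE {new_table}\n"
        f"WITH (DISTRIBUTION = {distribution}, CLUSTERED COLUMNSTORE INDEX)\n"
        f"AS\n{select_sql};"
    )


# Hypothetical names and a placeholder URL, for illustration only.
copy_sql = copy_into_statement(
    "dbo.Sales",
    "https://<account>.blob.core.windows.net/<container>/sales/*.csv",
)
ctas_sql = ctas_statement(
    "dbo.SalesByRegion",
    "SELECT Region, SUM(Amount) AS TotalAmount\n"
    "FROM ext.SalesRaw\n"
    "GROUP BY Region",
)

print(copy_sql)
print(ctas_sql)
```

COPY loads files into a table that already exists, while CTAS creates a new table from the result of a query, which can itself read from an external table like the hypothetical ext.SalesRaw above.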

Comparing Azure Synapse, Snowflake, and Databricks for common data workloads

In this vLog post I discuss how Azure Synapse, Databricks and Snowflake compare when it comes to common data workloads:

  • Data Science
  • Business Intelligence
  • Ad-Hoc data analysis
  • Data Warehousing

and more!

Where does Azure Data Explorer fit in the rest of the Data Platform?

In this vLog, I give an overview of Azure Data Explorer and the Kusto Query Language (KQL). Born from analyzing logs behind Power BI, ADX is a great way to quickly analyze large datasets and get actionable insights from that data.

Find more details about Azure Data Explorer here: https://azure.microsoft.com/en-us/services/data-explorer/

And get started with these great tutorials: https://docs.microsoft.com/en-us/azure/data-explorer/create-cluster-database-portal

Should I Choose Azure Data Factory or Synapse Studio?

In this vLog, I cover the reasons why you might consider using Azure Data Factory, a mature cloud service for orchestration and processing of data, over the newly GA Azure Synapse Studio.

Synapse has all of the same features as Azure Data Factory, but if you have a large development team working on ELT operations, or a simple data processing activity, it could make sense to use the less-cluttered Azure Data Factory.

Take a look at the vLog here and let me know your thoughts on other scenarios for you!

Tips on becoming a solution architect

To be a solution architect, you’ve got to have great written and oral communication skills to be able to leverage your technical experience in helping to develop a solution with a team. In this video, I cover some items you will want to focus on if you’re looking for a career change where you can leverage your technical acumen.

The Modern Data Warehouse in Azure Part 4: The Serving Layer

In this video blog post, I cover the serving layer step of building your Modern Data Warehouse in Azure. There are certainly some decisions to be made around how you want to structure your schema as you get it ready for presentation in your business intelligence tool of choice. For this example I used Power BI, and I discuss some of the areas you should focus on:

  • What is your schema type? Snowflake, star, or something else?
  • Where should you serve up the data? SQL Server, Synapse, ADLS, Databricks, or something else?
  • What are your service level agreements for the business? What are your data processing times?
  • Can you save cost by using an option that’s less compute heavy?
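
To make the schema question a bit more concrete, here is a small sketch of a star schema: one fact table joined directly to denormalized dimension tables. It is an illustration only, not the model from the video. SQLite stands in for whichever serving store you pick, and every table, column and value is made up.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the serving layer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT,
    category   TEXT          -- kept on the dimension itself (star schema)
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    year    INTEGER,
    month   INTEGER
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);
INSERT INTO dim_product VALUES
    (1, 'Bike', 'Outdoor'), (2, 'Tent', 'Outdoor'), (3, 'Lamp', 'Home');
INSERT INTO dim_date VALUES (20200101, 2020, 1), (20200201, 2020, 2);
INSERT INTO fact_sales VALUES
    (1, 20200101, 100.0), (2, 20200101, 50.0),
    (3, 20200201, 20.0),  (1, 20200201, 30.0);
""")


def sales_by_category(connection):
    """Total sales per product category: one join from fact to dimension."""
    return connection.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


print(sales_by_category(conn))
```

In a snowflake schema, category would move out of dim_product into its own table, so the same report would need one more join. That trade of less redundancy for more joins is part of what the schema question above is weighing.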

Microsoft Reimagines Traditional SIEMs with Azure Sentinel

If you’re like most, security is at the forefront of your mind for your organization. You need the right tools and the right team to keep up with the growing number of sophisticated threats while security teams are being inundated with requests and alerts.

Today I’d like to tell you about Microsoft’s reimagined SIEM tool, Azure Sentinel. Over the past 10 to 15 years, Security Information and Event Management (SIEM) has become extremely popular as a way to aggregate security information and events that happen in our network.

There are also software tools, hardware appliances and managed service providers that can help support your corporate needs to better understand the level of risks in real-time and over a span of time. They do things such as log aggregation, event correlation and forensic analysis and offer features for alerting, dashboarding and compliance checks.
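
As a rough illustration of what log aggregation and event correlation mean in practice, here is a toy sketch. It is not how Azure Sentinel or any particular SIEM works. The log sources, field names, threshold and time window are all assumptions made up for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def correlate_failed_logins(events, threshold=3, window=timedelta(minutes=5)):
    """Return source IPs with at least `threshold` failed logins inside `window`."""
    # Group failed-login timestamps by source IP, regardless of which log they came from.
    by_source = defaultdict(list)
    for event in events:
        if event["action"] == "login_failed":
            by_source[event["source_ip"]].append(event["time"])

    flagged = []
    for source, times in by_source.items():
        times.sort()
        start = 0
        # Slide a window over the sorted timestamps.
        for end in range(len(times)):
            while times[end] - times[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.append(source)
                break
    return sorted(flagged)


# Two hypothetical log sources with made-up events.
t0 = datetime(2020, 1, 1, 9, 0)
firewall_log = [
    {"time": t0, "source_ip": "10.0.0.5", "action": "login_failed"},
    {"time": t0 + timedelta(minutes=1), "source_ip": "10.0.0.5", "action": "login_failed"},
    {"time": t0 + timedelta(minutes=30), "source_ip": "10.0.0.9", "action": "login_failed"},
]
vpn_log = [
    {"time": t0 + timedelta(minutes=2), "source_ip": "10.0.0.5", "action": "login_failed"},
    {"time": t0 + timedelta(minutes=3), "source_ip": "10.0.0.9", "action": "login_ok"},
]

# Aggregation: merge the sources. Correlation: look for a pattern across them.
alerts = correlate_failed_logins(firewall_log + vpn_log)
print(alerts)
```

No single log here contains three failures for the same address. The pattern only shows up once the two sources are merged, which is the point of aggregating before correlating.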

These are great resources to help secure our environment, our users and devices. But unfortunately, the reality is security teams are being inundated with requests and alerts. Combine this with the noteworthy shortage of security professionals in the world – an estimated 3.5 million unfilled security jobs by 2021 – and it becomes a major concern.

Microsoft decided to take a different approach with Azure Sentinel. Azure Sentinel provides intelligent security analytics at cloud scale for your entire enterprise. It makes it easy to collect data across your entire hybrid organization on any cloud, from devices to users to applications to servers. Azure Sentinel uses the power of AI to ensure you’re quickly identifying real threats.

With this tool:

  • You’ll eliminate the burden of traditional SIEMs, since you no longer need to spend time setting up, maintaining and scaling the infrastructure that supports them.
  • Since it’s built on Azure, it offers virtually limitless cloud scale while addressing all your security needs.

Now let’s talk cost. Traditional SIEMs have proven to be expensive to own and operate, often requiring you to commit upfront and incur high costs for infrastructure maintenance and data ingestion. With Sentinel, you pay for what you use with no up-front costs. Even better, because of Microsoft’s relationships with so many enterprise vendors (and more partners being added), it easily connects to popular solutions, including Palo Alto Networks, F5 Networks, Symantec and Check Point offerings.

Azure Sentinel integrates with the Microsoft Graph Security API, enabling you to import your own threat intelligence feeds and to customize threat detection and alert rules. Custom dashboards give you a view that you can tailor to your specific use case.

Lastly, if you’d like to try this out for free, Microsoft is allowing you to connect it to your Office 365 tenant to do some testing and check it out in greater detail. This product is currently in preview, so there may be some kinks, but I’m looking forward to seeing how it develops into a true enterprise-class security solution for your environment, whether that’s in the cloud, on premises, in data centers, or across remote users and devices.