Improve Your Security Posture with Azure Secure Score

Security is a top priority for every business, and we can never have enough of it, right? But at what point does administering and prioritizing security threats become too much to handle? I’m excited to tell you about a newly announced offering called Azure Secure Score, which is part of the Azure Security Center.

If you’re unfamiliar, the Azure Security Center is a centralized place where you can get security recommendations based on the workloads you’ve deployed. In September at Ignite, Microsoft announced Secure Score, a security analytics tool that provides visibility into your organization’s security posture and helps you understand how secure your workloads are by assigning them a score.

The new Secure Score helps you prioritize and triage your response to security recommendations. It takes into consideration the severity and impact of each recommendation and, based on that information, assigns a numerical value showing how much fixing the recommendation will improve your security posture.

Once you implement a recommendation, that recommendation’s score and the overall Secure Score update.

The main goals of Secure Score are:

  • To provide the capabilities that let you visualize your security posture.
  • To help you quickly triage and suggest impactful actions that increase your security posture.
  • To measure your workload security over time.

So, how do Azure Security Center and Secure Score work?

  • Azure Security Center constantly reviews your active recommendations and calculates your Secure Score based on these.
  • A recommendation’s score is derived from its severity and from security best practices that will affect your workload security over time.
  • It looks at your security posture over a period of time rather than as a snapshot. The score isn’t an immediate result and won’t change right away, but it builds up as you implement recommendations, after which you can dismiss them.
  • The Secure Score is calculated based on the ratio between your healthy resources and your total resources; if the number of healthy resources equals your total resources, you get the highest score value (see the sketch after this list).
  • The overall score is an accumulation of all your recommendation scores. You can view your overall Secure Score across your subscriptions or management groups, depending on the scope you select. The score will also vary based on the subscriptions selected and the active recommendations on them.
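
To make the ratio idea concrete, here’s a minimal illustrative sketch in plain Python. The recommendation names, point values, and resource counts are made up; Security Center supplies the real weights and counts.

```python
# Illustrative only: recommendation names, max point values and resource counts
# are invented; Azure Security Center supplies the real weights and counts.
recommendations = [
    # (name, max_points, healthy_resources, total_resources)
    ("Apply disk encryption", 10, 6, 10),
    ("Close management ports", 8, 2, 4),
    ("Install endpoint protection", 6, 5, 5),
]

def recommendation_score(max_points, healthy, total):
    """A recommendation contributes its max points scaled by the healthy/total ratio."""
    if total == 0:
        return max_points  # nothing to remediate, so full credit
    return max_points * healthy / total

overall = sum(recommendation_score(p, h, t) for _, p, h, t in recommendations)
maximum = sum(p for _, p, _, _ in recommendations)
print(f"Secure Score: {overall:.1f} / {maximum}")  # prints: Secure Score: 16.0 / 24
```

Fixing a high-impact recommendation (one worth more points with more unhealthy resources) moves the overall score more than fixing a low-impact one, which is exactly how the prioritization guidance works.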

Remember, this is a marathon, not a sprint. It takes time to do the remediation, whether that’s patching machines, closing ports, or shutting off services. There are many remedies offered that will make you more secure down the road. With this offering, you get a ‘scorecard’ for yourself and a look at what’s most imperative to implement first.

Be sure to check out the Azure Security Center. There are a lot of free options there, as well as additional services you can add at a cost.

New Development Feature for Azure Stream Analytics

Gaining insights from our data, especially in real time, is an important part of any business. Today I’d like to talk about some new development options for Azure Stream Analytics. If you’re not clear on what Azure Stream Analytics is, it’s a fully managed cloud solution in Azure that allows you to rapidly develop and deploy low-cost solutions to gain real-time insights from devices, sensors, infrastructure, and applications.

Stream Analytics is part of the Azure IoT suite, which brings IoT to life by letting you easily connect your devices, analyze previously untapped data, and integrate business systems. The IoT space keeps expanding, as it offers so much capability and information for things like production floors, jet engines, and automobiles, just to name a few. I did another blog on some of the features here.

Today my focus is a new feature that allows you to test your query logic locally within Visual Studio against live data, without needing to run in the cloud. You can test your queries locally while using live data streams from sources such as Event Hubs, IoT Hub, or Blob Storage. You can also use Stream Analytics time policies to start and stop queries in a matter of seconds.
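
For example, if your job reads from an Event Hub, you can push a few sample events into it from a short script and watch your query pick them up as you test locally in Visual Studio. Here’s a minimal sketch using the azure-eventhub Python package; the connection string, hub name, and telemetry payloads are placeholders.

```python
import json
from azure.eventhub import EventHubProducerClient, EventData  # pip install azure-eventhub

# Placeholders: supply your own Event Hubs connection string and hub name.
CONN_STR = "<event-hubs-namespace-connection-string>"
HUB_NAME = "<event-hub-name>"

producer = EventHubProducerClient.from_connection_string(CONN_STR, eventhub_name=HUB_NAME)
try:
    batch = producer.create_batch()
    # A couple of fake telemetry readings for the locally running query to chew on.
    for reading in [{"deviceId": "sensor-1", "temperature": 21.5},
                    {"deviceId": "sensor-2", "temperature": 24.1}]:
        batch.add(EventData(json.dumps(reading)))
    producer.send_batch(batch)
finally:
    producer.close()
```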

This offers a big improvement in development productivity, as you can save a lot of time on the inner loop of query logic testing.

Some major benefits are:

  • Query behavior consistency, so you get the same experience whether you’re using Visual Studio or the cloud interface.
  • Much shorter test cycles. You can normally expect a lag in cloud development; testing queries directly in Visual Studio in your local environment shows you the shape of the incoming data, helps you easily adjust the query, and gives you immediate results.

A few caveats around deployment with this new feature:

  • The local testing feature should only be used for functional testing purposes. It doesn’t replace the performance or scalability tests that you would do inside the cloud.
  • It really should not be used for production purposes since it doesn’t guarantee any kind of SLA.
  • Also note that when you’re on your machine, you can only rely on local resources, but when you deploy to the cloud, you can scale out to multiple nodes, which lets you add more streams and additional resources to process them.
  • Cloud deployment ensures things like checkpointing, upgrades, and other features you need for production deployments, and it provides the infrastructure to run your jobs 24/7.

So, remember, this new enhancement is just for testing purposes, to help shorten the query and development cycle and avoid the lag of other testing and development approaches. But it’s a cool, time-saving feature to investigate, and Microsoft is adding more features to Azure Stream Analytics.

Azure Data Factory – Data Flow

I’m excited to announce that Azure Data Factory Data Flow is now in public preview and I’ll give you a look at it here. Data Flow is a new feature of Azure Data Factory (ADF) that allows you to develop graphical data transformation logic that can be executed as activities within ADF pipelines.

The intent of ADF Data Flows is to provide a fully visual experience with no coding required. Your Data Flows will execute on your own Azure Databricks cluster for scaled-out data processing using Spark. ADF handles all the code translation, Spark optimization, and execution of the transformations in your Data Flows; it can handle massive amounts of data in very rapid succession.

In the current public preview, the Data Flow activities available are listed below (with a rough Spark sketch of a few of them after the list):

  • Joins – join data from two streams based on a condition
  • Conditional Splits – route data to different streams based on conditions
  • Union – collect data from multiple streams
  • Lookups – look up data from another stream
  • Derived Columns – create new columns based on existing ones
  • Aggregates – calculate aggregations on the stream
  • Surrogate Keys – add a surrogate key column to the output stream, starting from a specific value
  • Exists – check whether data exists in another stream
  • Select – choose which columns flow into the next stream
  • Filter – filter streams based on a condition
  • Sort – order data in the stream based on columns

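Since ADF translates Data Flows into Spark behind the scenes, a rough PySpark analogy for a few of these activities (a Join, a Derived Column, a Filter, and an Aggregate over made-up sales and customer data) looks something like the sketch below; the visual designer generates the equivalent for you, so this is only to show the kind of work being done.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dataflow-analogy").getOrCreate()

# Made-up inputs standing in for two Data Flow source streams.
sales = spark.createDataFrame(
    [(1, 101, 250.0), (2, 102, 75.5), (3, 101, 30.0)],
    ["order_id", "customer_id", "amount"])
customers = spark.createDataFrame(
    [(101, "Contoso"), (102, "Fabrikam")],
    ["customer_id", "name"])

result = (sales
          .join(customers, on="customer_id", how="inner")        # Join
          .withColumn("amount_with_tax", F.col("amount") * 1.1)  # Derived Column
          .filter(F.col("amount") > 50)                          # Filter
          .groupBy("name")                                       # Aggregate
          .agg(F.sum("amount_with_tax").alias("total_with_tax")))

result.show()
```
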
Getting Started:

To get started with Data Flow, you’ll need to sign up for the preview by emailing adfdataflowext@microsoft.com with the ID of the subscription you want to do your development in. You’ll receive a reply when it’s been added, and then you’ll be able to go in and add new Data Flow activities.

At this point, when you go in and create a Data Factory, you’ll now have three options: Version 1, Version 2, and Version 2 with Data Flow.

Next, go to aka.ms/adfdataflowdocs, which gives you all the documentation you need for building your first Data Flows, as well as pre-built samples you can work and play around with. You can then create your own Data Flows, add a Data Flow activity to your pipeline, and execute and test your Data Flow in debug mode in the pipeline, or use Trigger Now in the pipeline to test your Data Flow from a pipeline activity.

Ultimately, you can operationalize your Data Flow by scheduling and monitoring your Data Factory pipeline that is executing the Data Flow activity.
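
If you’d rather kick off and monitor that pipeline from code instead of the portal, something along these lines with the azure-mgmt-datafactory Python package should do it. The subscription, resource group, factory, and pipeline names are placeholders, and the exact client setup can vary by SDK version.

```python
from azure.identity import DefaultAzureCredential               # pip install azure-identity
from azure.mgmt.datafactory import DataFactoryManagementClient  # pip install azure-mgmt-datafactory

# Placeholders: substitute your own subscription, resource group, factory, and pipeline names.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
FACTORY_NAME = "<data-factory-name>"
PIPELINE_NAME = "<pipeline-containing-the-data-flow-activity>"

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Kick off the pipeline that contains the Data Flow activity...
run = adf_client.pipelines.create_run(RESOURCE_GROUP, FACTORY_NAME, PIPELINE_NAME, parameters={})

# ...and then check on the run.
pipeline_run = adf_client.pipeline_runs.get(RESOURCE_GROUP, FACTORY_NAME, run.run_id)
print(pipeline_run.status)  # e.g. InProgress / Succeeded / Failed
```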

With Data Flow, we have the data orchestration and transformation piece we’ve been missing. It gives us a complete picture for the ETL/ELT scenarios we want to do in the cloud or in hybrid environments, whether on-premises to cloud or cloud to cloud.

With Data Flow, Azure Data Factory has become the true cloud replacement for SSIS, and this should be in GA by year’s end. It is well designed and has some neat features, especially how you build your expressions, which in my opinion works better than in SSIS.

When you get a chance, check out Azure Data Factory and its Data Flow features and let me know if you have any questions!

Intro to Azure Databricks Delta

If you know about or are already using Databricks, I’m excited to tell you about Databricks Delta. As most of you know, Apache Spark is the underlying technology for Databricks, so about 75-80% of all the code in Databricks is still Apache Spark. You get that super-fast, in-memory processing of both streaming and batch data, as Databricks was built by some of the founders of Spark.

Databricks Delta is one big difference between Spark and Databricks, aside from the workspaces and collaboration options that come natively with Databricks. Databricks Delta delivers a powerful transactional storage layer by harnessing the power of Spark and the Databricks File System (DBFS).

The core abstraction of Databricks Delta is an optimized Spark table that stores data as Parquet files in DBFS and maintains a transaction log that efficiently tracks changes to the table. So you can read and write data stored in the Delta format using the same Spark SQL batch and streaming APIs that you use to work with Hive tables and DBFS directories.
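
As a quick illustration, here’s roughly what writing and reading a Delta table looks like from a Databricks notebook (where spark is already available); the DBFS path and table name below are made up.

```python
# Assumes a Databricks cluster with Databricks Delta enabled; path and table name are made up.
events = spark.range(0, 1000).withColumnRenamed("id", "event_id")

# Write a DataFrame out in the Delta format to DBFS.
events.write.format("delta").mode("overwrite").save("/mnt/demo/events_delta")

# Read it back with the familiar batch API...
print(spark.read.format("delta").load("/mnt/demo/events_delta").count())

# ...or register it as a table and query it with Spark SQL.
spark.sql("CREATE TABLE IF NOT EXISTS events USING DELTA LOCATION '/mnt/demo/events_delta'")
spark.sql("SELECT COUNT(*) FROM events").show()
```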

With the addition of the transaction log, as well as other enhancements, Databricks Delta offers some significant benefits:

ACID Transactions – a big one for consistency. Multiple writers can simultaneously modify a dataset and see consistent views. Also, writers can modify a dataset without interfering with jobs reading the dataset.

Faster Read Access – automatic file management organizes data into large files that can be read efficiently. Plus, statistics enable speeding up reads by 10-100x, and data skipping avoids reading irrelevant information. This is not available in Apache Spark, only in Databricks.
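
On Databricks you can also nudge that file management along explicitly, for example by compacting and co-locating data with OPTIMIZE and ZORDER so that queries filtering on the chosen column can skip irrelevant files. The table and column names here are illustrative and continue the earlier sketch.

```python
# Compact small files and co-locate rows by a frequently filtered column so that
# statistics and data skipping let queries like the one below read far fewer files.
spark.sql("OPTIMIZE events ZORDER BY (event_id)")
spark.sql("SELECT COUNT(*) FROM events WHERE event_id BETWEEN 100 AND 200").show()
```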

Databricks Delta is another great feature of Azure Databricks that is not available in traditional Spark, further separating the capabilities of the two products and providing a great platform for your big data, data science, and data engineering needs.