Azure Database for MariaDB in Preview

October 30, 2018 cseferlis 1 Comment

Microsoft has recently announced another Platform as a Service (PaaS) offering with the release of MariaDB in Preview in Azure. I’d like to tell you more about that offering and what are some of its advantages.

First, a little history on MariaDB. MariaDB is a community developed fork of the MySQL. Essentially, when Oracle purchased MySQL from Sun, some of the developers from MySQL were concerned that the acquisition would make changes or lead down the road where it would no longer be open source.

So, they went ahead and forked off to MariaDB with the intent to maintain high compatibility with MySQL. Also, the contributors are required to share their copyright with MariaDB foundation rights, which in a nutshell means they want this foundation to always be open source.

Now take the open source technology of MariaDB, which is a proven and valuable method for many companies, and compile that with the fact that it’s in the Azure platform as a managed offering, consumers using this offering from Azure get to take advantage of some of the standard capabilities in the PaaS model, such as:

Built-in high availability with no extra cost
3 tiers (basic, general purpose and memory optimized) which you can choose depending on your workload, transactional or analytical processing.
99% availability SLAs
Capabilities for predictable performance done by built in monitoring and alerting, allowing the ability to quickly assess the effects of scaling V-Cores up or down based on current or projected performance needs – through automation or manually in seconds.
The secure protection of sensitive data at rest and in motion. Uses 256-bit encryption on secured disks in the Azure data centers and enforces an SSL connection for data in transit. Note: you can turn off the SSL requirement if your application doesn’t support it – I don’t recommend it, but it can be done.
Automatic backups so you can have point-in-time restore for up to 35 days

These are all the standard advantages when you turn on in Azure Database Platform as a Service offerings, like SQL DB or Mongo DB. Azure Database for MariaDB is just another option added by Microsoft to the portfolio for databases. And a great option to check out if you use MariaDB. This is now in Preview but I’m sure it will be made generally available pretty soon.

Azure, Big Data, Database

How to Gain Up to 9X Speed on Apache Spark Jobs

October 27, 2018 cseferlis 1 Comment

Are you looking to gain speed on your Apache Spark jobs? How does 9X performance speed sound? Today I’m excited to tell you about how engineers at Microsoft were able to gain that speed on HDInsight Apache Spark Clusters.

If you’re unfamiliar with HDInsight, it’s Microsoft’s premium managed offering for running open source workloads on Azure. You can run things like Spark, Hadoop, HIVE, and LLAP among others. You create clusters and spin them up and spin them down when you’re not using them.

The big news here is the recently released preview of HDInsight IO Cache, which is a new transparent data caching feature that provides customers with up to 9X performance improvement for Spark jobs, without an increase in costs.

There are many open source caching products that exist in the ecosystem: Alluxio, Ignite, and RubiX to name a few big ones. The IO Cache is also based on RubiX and what differentiates RubiX from other comparable caching products is its approach of using SSD and eliminating the need for explicit memory management. While other comparable caching products leverage the reservation of operating memory for caching the data.

Because the SSDs typically provide more than 1 gigabit/second of bandwidth, as well as leverage operating system in-memory file cache, this gives us enough bandwidth to load big data compute processing engines like Spark. This allows us to run Spark optimally and handle bigger memory workloads and overall better performance, by speeding up these jobs that read data from remote cloud storage, the dominant architecture pattern in the cloud.

In benchmark tests comparing a Spark cluster with and without the IO Cache running, they performed 99 SQL queries against a 1 terabyte dataset and got as much as 9X performance improvement with IO Cache turned on.

Let’s face it, data is growing all over and the requirement for processing that data is increasing more and more every day. And we want to get faster and closer to real time results. To do this, we need to think more creatively about how we can improve performance in other ways, without the age-old recipe of throwing hardware at it instead of tuning it or trying a new approach.

This is a great approach to leverage some existing hardware and help it run more efficiently. So, if you’re running HDInsight, try this out in a test environment. It’s as simple as a check box (that’s off by default); go in, spin up your cluster and hit the checkbox to include IO Cache and see what performance gains you can achieve with your HDInsight Spark clusters.

Azure, Compliance, Security

Using Azure to Drive Security in Banking Using Biometrics

October 23, 2018 cseferlis Leave a comment

In the digital world we live in today, it’s getting harder to verify identity in industries such as banking. We now do less and less transactions in person. No longer do we go into banks with passbook in hand and make deposits or withdrawals face to face with a bank teller. Many of us have moved from ATM transactions to digital banking.

With this move, banks have tried many approaches of 2-factor authentication, some better than others and obviously the need is there for secure forms of authentication for the users. Let me tell you how Azure is driving identity security in banking using biometric identification. By combining biometrics with artificial intelligence, banks are now able to take new approaches at verifying the digital identity of their customers and prospects.

If you don’t know, biometrics is the process of uniquely identifying a person’s physical and personal traits. These are then recorded into a database and those images or features are captured into an electronic device and are used as a unique form of identification. Some methods we use biometrics are fingerprint and facial recognition, hand geometry, iris or eye scan and even odor or scents.

Because of their uniqueness, these are much more reliable in confirming a person’s identity than a password or access card. So, how do you verify a person is who they say they are if they’re not in person? Microsoft partners are now leveraging some of the Azure platform offerings to do this—things such as Cognitive Service’s Vision API and Azure Machine Learning tools for performing multi-factor authentication in the banking industry.

The way this works is the user provides a government issued ID (a license or passport for example) and they validate it against standards provided by the ID issuer, so they’re building an algorithm for verification of that ID and putting that into a database. So, when someone submits an ID from a particular state, we know what that ID is supposed to look like and we look for all the distinguishing features of that ID.

To take this a step further, the second factor is they’re using facial recognition software on things like your phone or computer, like Face ID for the iPhone. It will take your photo, but it will also take a video of you and force you to move your head in certain motions in order validate that is it you – you’re not wearing a mask or something – and that you’re alive.

It takes a picture of your ID and matches it to your facial constructions and compares them side by side; this becomes your digital signature. This is considered extremely secure as now you have two forms of verification and you’re using biometrics. Crazy stuff when you think about it but in the digital world we live in, you must go to these lengths to verify someone’s identity when they are not right in front you.

This is still in the early phase of what we’ll see but it’s cool to see how it’s being used and will be interesting to see how it progresses in the future. We’ve got great consultants working with Cognitive Services and Machine Learning. Anything data or Azure related, we’re doing it.

Azure, SQL Server

Introducing Azure SQL Database Hyperscale Service Tier

October 16, 2018 cseferlis 1 Comment

If your current SQL Database service tier is not well suited to your needs, I’m excited to tell you about a newly created service tier in Azure called Hyperscale. Hyperscale is a highly scalable storage and compute performance tier that leverages the Azure architecture to scale out resources for Azure SQL Database beyond the current limitations of general purpose and business critical service tiers.

The Hyperscale service tier provides the following capabilities:

Support for up to 100 terabytes of database size (and this will grow over time)
Faster large database backups which are based on file snapshots
Faster database restores (also based on file snapshots)
Higher overall performance due to higher log throughput and faster transaction commit time regardless of the data volumes
The ability to rapidly scale out. You can provision one or more read only nodes for offloading your read workload for use as hot standbys.
You can rapidly scale up your compute resources (in constant time) to accommodate heavy workloads, so you can scale compute up and down as needed just like Azure Data Warehouse

Who should consider moving over to the Hyperscale tier? This is not an inexpensive tier, but it’s a great choice for companies who have large databases and have not been able to use Azure databases in the past due to its 4-terabyte limit, as well as for customers who see performance and scalability limitations with the other 2 service tiers.

It is primarily designed for transactional or OLTP workloads. However, it does support hybrid and OLAP workloads, but something to keep in mind when designing out your databases and services. It’s also important to note that elastic pools do not support the Hyperscale service tier.

How does it work?

You separate the compute and storage out into 4 separate nodes similar to Azure Data Warehouse.
The compute node is where the relational engine lives or where the querying process happens.
The page server node is where the scaled-out storage engine resides and where database pages are served out to the compute nodes on demand and keeps pages updated as transactions update data, so these nodes are moving the data around for you.
The log service node is where the log records are kept as they come in from the compute node and kept in a durable cache, then they’re forwarded along to additional compute nodes and caches to ensure consistency. When all this is spread out and everything is consistently spread across the compute nodes, it will get stored in Azure storage for long term storage of your logs.
Lastly, the Azure storage node is where all the data is pushed from the page servers. So, all the data that eventually lands in the database gets pushed over to Azure storage and this is also the storage that gets used for backups, as well as where the replication between availability groups happens.

This Hyperscale tier is an exciting opportunity for those customers that don’t have their requirements fulfilled with prior service tiers. It’s another great Microsoft offering that’s worth checking out if you have had these service tier issues up to now. And it helps to leave a line of distinction between Azure Data Warehouse and Azure Database because you now can scale out/up and tons of data, but it’s still built out for the transactional processing, as opposed to Azure Data Warehouse which is more of the analytical or massively parallel processing.

Azure, Infrastructure, Networking, Security, Strategy

What is Azure Automation?

October 8, 2018 cseferlis Leave a comment

So, what do you know about Azure Automation? In this post, I’ll fill you in on this cool, cloud-based automation service that provides you the ability to configure process automation, update management and system configuration, which is managed across your on-premises resources, as well as your Azure cloud-based resources.

Azure Automation provides complete control of deployment operation and decommissions of workloads and resources for your hybrid environment. So, we can have a single pane of glass for managing all our resources through automation.

Some features I’d like to point out are:

It allows you to automate those mundane, error-prone activities that you perform as part of your system configuration and maintenance.
You can create Runbooks in PowerShell or Python that help you reduce the chance for misconfiguration errors. And it will help lower operational costs for the maintenance of those systems, as you can script it out to do it when you need instead of manually.
The Runbooks can be developed for on-premises or Azure resources and they use Web Hooks that allow you to trigger automation from things such as ITSM, Dev Ops and monitoring systems. So, you can run these remotely and trigger them from wherever you need to.
On configuration management side, you can build these desired state configurations for your enterprise environment. This will help you to set a baseline for how your systems will operate and will identify when there’s a variance from the initial system configuration, alerting you of any anomalies that could be problematic.
It has a rich reporting back end and alerting interface for full visibility into what’s happening in your Windows and Linux systems – on-premises and in Azure.
Gives you update management aspects (in Windows and Linux) to help you define the aspects of how updates are applied, and it helps administrators to specify which updates will be deployed, as well as successful or unsuccessful deployments and the ability to specify which updates should not be deployed to systems, all done through PowerShell or Python scripts.
It can share capabilities, so when you’re using multiple resources or building those Runbooks for automation, it allows you to share the resources to simplify management. You can build multiple scripts but use the same resources over and over as references for things like role-based access control, variables, credentials, certificates, connections, schedules and access to source control and PowerShell modules. You can check these in and out of source control like any kind of code-based project.
Lastly, and one of the coolest features in my opinion, where these are templates you’re deploying out in your systems, everyone has some similar challenges. There’s a community gallery where you can go and download templates others have created or upload ones you’ve created to share. With a few basic configuration tweaks and review to make sure they’re secure, this is a great option for making the process faster by finding an existing script and cleaning it up and deploying it in your systems and environment.

So, there’s a lot you can do with this service and I think it’s worth checking out as it can make your maintenance and management much simpler.

Azure, Infrastructure, Networking, Review, Security

What is Azure Firewall?

October 5, 2018 cseferlis Leave a comment

I’d like to discuss the recently announced Azure Firewall service that is now just released in GA. Azure Firewall is a managed, cloud-based network security service that protects your Azure Virtual Network resources. It is a fully stateful PaaS firewall with built-in high availability and unrestricted cloud scalability.

It’s in the cloud and Azure ecosystem and it has some of that built-in capability. With Azure Firewall you can centrally create, enforce and log application and network connectivity policies across subscriptions and virtual networks, giving you a lot of flexibility.

It is also fully integrated with Azure Monitor for log analytics. That’s big because a lot of firewalls are not fully integrated with log analytics which means you can’t centralize these logs in OMS, for instance, which would give you a great platform in a single pane of glass for monitoring many of the technologies being used in Azure.

Some of the features within:

Built in high availability, so there’s no additional load balances that need to be built and nothing to configure.
Unrestricted cloud scalability. It can scale up as much as you need to accommodate changing network traffic flows – no need to budget for your peak traffic, it will accommodate any peaks or valleys automatically.
It has application FQDN filtering rules. You can limit outbound HTTP/S traffic to specified lists of fully qualified domain names including wildcards. And the feature does not require SSL termination.
There are network traffic filtering rules, so you can create, allow or deny network filtering rules by source and destination IP address, port and protocol. Those rules are enforced and logged across multiple subscriptions and virtual networks. This is another great example of having availability and elasticity to be able to manage many components at one time.
It has fully qualified domain name tagging. If you’re running Windows updates across multiple servers, you can tag that service as an allowed service to come through and then it becomes a set standard for all your services behind that firewall.
Outbound SNAT and inbound DNAT support, so you can identify and allow traffic originating from your virtual network to remote Internet destinations, as well as inbound network traffic to your firewall public IP address is translated (Destination Network Address Translation) and filtered to the private IP addresses on your virtual networks.
That integration with Azure Monitor that I mentioned in which all events are integrated with Azure Monitor, allowing you to archive logs to a storage account, stream events to your Event Hub, or send them to Log Analytics.

Another nice thing to note is when you set up an express route or a VPN from your on premises environment to Azure, you can use this as your single firewall for all those virtual networks and allow traffic in and out from there and monitor it all from that single place.

This was just released in GA so there are a few hiccups, but if none of the service challenges effect you, I suggest you give it a try. It will only continue to come along and get better as with all the Azure services. I think it’s going to be a great firewall service option for many.

Azure, Big Data, Strategy

Shell Chooses Azure Platform for AI

October 2, 2018 cseferlis Leave a comment

Artificial Intelligence (AI) is making its way into many industries today, helping to solve business problems and helping with efficiency. In this post, I’d like to share an interesting story about Shell choosing Azure for their AI platform. Shell Oil Company chose to use C3 IoT for their IoT device management and Azure for their predictive analytics.

Let’s look at how Shell is using this technology:

The operations that are required to fix a drill or piece of equipment in the field is much more significant when it’s unexpected. Shell can use AI to look at when maintenance is required on compressors, valves and other equipment that’s used for oil drilling. This will help to reduce unplanned downtime and repair efforts. If they can keep up with maintenance before equipment fails, they can plan downtime and do so at much less cost.
They’ll use AI to help steer the drill bits through shale deposits to find the best quality shale deposits.
Failures of equipment of great size, such as drilling equipment, can have a lot of related damage and danger. This technology will improve the safety of employees and customers by helping to reduce unexpected failures.
AI enabled drills will help chart a course for the well itself as it’s being drilled, as well as providing constant data from the drill bits on what type of material is being drilled through. The benefits here are 2-fold; they will get data on quality deposits and reduce the wear and tear on the drill. If the drill is using an IoT device to detect a harder material, they’ll have the knowledge to drill in a different area or to figure out the best path to reduce the wear and tear.
It will free up the geologists and engineers to be able to manage more drills at one time, making them more efficient, as well as reactive to deal with problems as they arise while drilling.

As with everything in Azure, this platform is a highly scalable platform that will allow Shell to grow with what is required, plus have the flexibility to take on new workloads. With IoT and AI, these workloads are very easily scaled using Azure as a platform and all the services available with it.

I wanted to share this interesting use case about Shell because it really displays the capabilities of the Azure Platform to solve the mundane and enable the unthinkable.

BizDataViz

Monthly Archives: October 2018

Azure Database for MariaDB in Preview

How to Gain Up to 9X Speed on Apache Spark Jobs

Using Azure to Drive Security in Banking Using Biometrics

Introducing Azure SQL Database Hyperscale Service Tier

The Hyperscale service tier provides the following capabilities:

How does it work?

What is Azure Automation?

Some features I’d like to point out are:

What is Azure Firewall?

Some of the features within:

Shell Chooses Azure Platform for AI

Let’s look at how Shell is using this technology:

Azure and SQL Data Blog

Share this:

Share this:

Share this:

The Hyperscale service tier provides the following capabilities:

How does it work?

Share this:

Some features I’d like to point out are:

Share this:

Some of the features within:

Share this:

Let’s look at how Shell is using this technology:

Share this:

Azure and SQL Data Blog