Category Archives: IoT

Accelerate Your AI with Machine Learning on Azure Data Box Edge

In some past blogs I’ve discussed Azure Data Box and how the Data Box family has expanded. Today I’ll talk about Azure Data Box Edge (in preview) and elaborate on the machine learning service it provides on your premises, with the power of Azure behind it.

If you don’t know, Azure Data Box Edge is a physical hardware device that sits in your environment and collects data from local sources, such as IoT devices, where you might take advantage of the AI features offered by the device. It then sends the data to Azure for more processing, storage or reporting.

Microsoft recently announced Azure Machine Learning hardware accelerated models, powered by Project Brainwave, on the Data Box Edge. Much of our data comes from real-world applications and is used at the edge of our networks, like images and videos collected from factories, retail stores or hospitals. That data can now be used for things such as manufacturing defect analysis, inventory out-of-stock detection and diagnostics.

Applying machine learning models to the data on Data Box Edge provides lower latency (and savings on bandwidth cost), as we don’t have to send all the data to Azure for analysis. It still offers that real-time insight and speed to action for critical business decisions.

You can enable data scientists to simplify and accelerate the building, training and deployment of machine learning models using the Azure Machine Learning Service, which is already generally available. They can access all these capabilities in their favorite Python environment, using the latest open source frameworks such as PyTorch, TensorFlow and scikit-learn.

These models can run on CPUs and GPUs, but this preview expands that out to field-programmable gate arrays (FPGAs), which is the processor type on the Data Box Edge.

The preview is currently a bit limited but, in this case, you’re able to use the Azure Machine Learning Service to train a TensorFlow model for image classification scenarios. You would then containerize that model in a Docker container and deploy it to the Data Box Edge through IoT Hub.

A good use case for this is if you’re using AI models for quality control purposes. Let’s say you know what a finished product should look like and what the quality specs are, and you build a model defining those parameters. Then you take an image of that product as it comes off the assembly line; now you can send those images to the Data Box Edge in your environment and more quickly capture defects.
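To make that flow concrete, here is a minimal Python sketch of the edge-side decision. Everything in it is an assumption for illustration: `score_image` stands in for the deployed image classification model, and the threshold, class name and image IDs are invented rather than taken from Data Box Edge or Azure Machine Learning.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Inspection:
    image_id: str
    defect_score: float  # 0.0 = looks fine, 1.0 = certainly defective


def find_defects(
    image_ids: Iterable[str],
    score_image: Callable[[str], float],  # hypothetical stand-in for the model
    threshold: float = 0.8,               # assumed cut-off, tune per product
) -> List[Inspection]:
    """Score every image locally and keep only the likely defects."""
    flagged = []
    for image_id in image_ids:
        score = score_image(image_id)
        if score >= threshold:
            flagged.append(Inspection(image_id, score))
    return flagged


# Fake scores so the sketch runs without a real model.
fake_scores = {"unit-001": 0.05, "unit-002": 0.93, "unit-003": 0.40}
defects = find_defects(fake_scores, fake_scores.get)
print([d.image_id for d in defects])
```

Only the flagged items would need immediate attention on the line; the rest of the images can follow to Azure later.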

Now you’re finding the root cause of defects more quickly and throwing away fewer defective products, and therefore saving money. I’m looking forward to seeing how enterprises are going to leverage this awesome technology.

Expanding the Azure Data Box Family

In a previous blog I introduced Azure Data Box. Today I’d like to talk about how Microsoft is expanding the Azure Data Box family by introducing you to the Azure Data Box Gateway and the Azure Data Box Edge devices.

Until now the Data Box family has been the Data Box Disk, the Data Box and the Data Box Heavy. Each has its own storage limits, but all are designed to improve the way you upload massive amounts of data into Azure, without having to wait for it to travel across the wire or saturate your bandwidth (consider that the offline method).

Microsoft learned from customers that they want a better way to sync their local storage directly with Azure storage for operations like archival and disaster recovery. Here’s where Azure Data Box Gateway comes in.

The Data Box Gateway is a cloud storage gateway device that resides on premises and sends your image, media and other data directly to Azure.

  • The Gateway is a virtual machine provisioned in your hypervisor (VMware or Hyper-V). You write data directly to this virtual device using the NFS or SMB protocols, and it then sends that data to Azure.
  • One use case for the Data Box Gateway is continuously ingesting massive amounts of data. If we have a local data source that produces large volumes of data, we can stream that data and sync it directly with our Azure storage.
  • Another use case is cloud archival of data in a secure and efficient way. Think about incremental data transfer over the network after the initial bulk transfer is done with the Data Box of your choice; the Gateway ties directly into the same Azure storage container that you’re using for your Data Box.

Azure Data Box Edge is a storage solution that allows you to process data and send it over the network to Azure.

  • Data Box Edge uses a physical device supplied by Microsoft to accelerate the secure data transfer.
  • The device resides on premises in your network stack and you write data to it (also using NFS or SMB).
  • It is additionally equipped with AI-enabled edge computing capabilities, which help to analyze, process or filter data as it moves to Azure block blob, page blob or Azure Files.
  • It has the appropriate chips to process intelligent learning (artificial intelligence, machine learning, deep learning and such).
  • Use cases include things like pre-processing data. We can analyze data from on-premises or IoT devices to get faster information about the data. That pre-processing allows us to do things like aggregating data before it gets sent to Azure, or modifying data, such as taking out PII.
  • You can also subset and transfer the data needed for deeper analytics in the cloud.
  • Additionally, you can analyze and react to IoT events. So, if you’re running IoT devices on prem and you want to respond more quickly when those events occur, this is a great way to handle that.
  • Another great use case is you can run machine learning models to get quick results that can be acted on before the data is sent to the cloud.
  • With these IoT use cases, you don’t have to wait for the data to be transmitted over the wire, have the munging happen up in Azure and then get results back. You can return those results on the fly in real time and react more quickly.
  • Eventually the full data set is transferred to help you retrain and improve any of your machine learning models. You can continually feed them data and have those models trained repeatedly, so they become more precise over time.
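As a rough illustration of the pre-processing use case in the list above, the following Python sketch removes PII fields and aggregates readings before anything would be uploaded. The field names (`operator_name`, `operator_email`, `temperature`) and the device IDs are hypothetical, and this is plain Python rather than any Data Box Edge API.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List

# Hypothetical field names; real payloads will differ.
PII_FIELDS = {"operator_name", "operator_email"}


def strip_pii(record: Dict) -> Dict:
    """Return a copy of the record without the fields listed as PII."""
    return {k: v for k, v in record.items() if k not in PII_FIELDS}


def average_by_device(records: Iterable[Dict]) -> List[Dict]:
    """Collapse many readings into one average temperature per device."""
    readings = defaultdict(list)
    for record in records:
        readings[record["device_id"]].append(record["temperature"])
    return [
        {"device_id": device_id, "avg_temperature": mean(values), "samples": len(values)}
        for device_id, values in sorted(readings.items())
    ]


raw = [
    {"device_id": "press-1", "temperature": 70.0, "operator_name": "A. Smith"},
    {"device_id": "press-1", "temperature": 74.0, "operator_name": "A. Smith"},
    {"device_id": "oven-2", "temperature": 180.0, "operator_email": "b@example.com"},
]
cleaned = [strip_pii(r) for r in raw]      # PII never leaves the premises
summary = average_by_device(cleaned)       # three readings become two rows
print(summary)
```

The summary is what you would react to locally and send first; the full (cleaned) data set can still follow later for retraining.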

The Data Box family is very cool technology, and these online versions extend its capabilities even further.

New Development Feature for Azure Stream Analytics

Gaining insights from our data, especially in real time, is an important part of any business. Today I’d like to talk about some new development options for Azure Stream Analytics. If you’re not clear on what Azure Stream Analytics is, it’s a fully managed cloud solution in Azure that allows you to rapidly develop and deploy low cost solutions to gain real-time insights from devices, sensors, infrastructure and applications.

Stream Analytics is part of the Azure IoT suite, which brings IoT to life and allows you to easily connect your devices, analyze previously untapped data and integrate business systems. The IoT space is expanding as it offers so much capability and information for things like production floors, jet engines and automobiles, just to name a few. I did another blog on some of the features here.

Today my focus is a new feature that allows you to do local testing within Visual Studio, so you can test query logic with live data without needing to run in the cloud. You can test your queries locally while using live data streams from sources such as Event Hub, IoT Hub or Blob Storage. Also, you can use the Stream Analytics time policies and start and stop queries in a matter of seconds.

This offers a big improvement in development productivity, as you can save a lot of time on the inner loop of query logic testing.

Some major benefits are:

  • Consistent query behavior, so you get the same experience whether you’re using Visual Studio or the cloud interface.
  • Much shorter test cycles. You can normally expect a lag in cloud development. Testing queries directly in Visual Studio in your local environment shows you the shape of the incoming data, which helps you easily adjust the query and see immediate results.

A couple of caveats with deployment in this new feature:

  • The local testing feature should only be used for functional testing purposes. It doesn’t replace the performance or scalability tests that you would do inside the cloud.
  • It really should not be used for production purposes since it doesn’t guarantee any kind of SLA.
  • Also note that on your machine you rely on local resources, but when you deploy to the cloud you can scale out to multiple nodes, which allows you to add more streams and the additional resources needed to process them.
  • Cloud deployment ensures things like check pointing, upgrades and other features that you need for production deployments, as well as provides the infrastructure to run your jobs 24/7.

So, remember, this new enhancement is just for testing purposes, to help shorten the query development cycle and avoid the lag in other testing and development tools. But it’s a cool, time-saving feature to investigate, and Microsoft is adding more features to Azure Stream Analytics.

Intro to Azure Databricks Delta

If you know about or are already using Databricks, I’m excited to tell you about Databricks Delta. As most of you know, Apache Spark is the underlying technology for Databricks, so about 75-80% of all the code in Databricks is still Apache Spark. You get that super-fast, in-memory processing of both streaming and batch data, as some of the founders of Spark built Databricks.

The ability to offer Databricks Delta is one big difference between Spark and Databricks, aside from the workspaces and the collaboration options that come native to Databricks. Databricks Delta delivers a powerful transactional storage layer by harnessing the power of Spark and Databricks DBFS.

The core abstraction of Databricks Delta is an optimized Spark table that stores data as Parquet files in DBFS and maintains a transaction log that efficiently tracks changes to the table. You can read and write data stored in the Delta format using the same Spark SQL batch and streaming APIs that you use to work with Hive tables and DBFS directories.

With the addition of the transaction log, as well as other enhancements, Databricks Delta offers some significant benefits:

ACID Transactions – a big one for consistency. Multiple writers can simultaneously modify a dataset and see consistent views. Also, writers can modify a dataset without interfering with jobs reading the dataset.

Faster Read Access – automatic file management organizes data into large files that can be read efficiently. Plus, there are statistics that enable speeding up reads by 10-100x and data skipping avoids reading irrelevant information. This is not available in Apache Spark, only in Databricks.
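To show why a transaction log gives readers a consistent view while writers are busy, here is a toy Python model. It is only a sketch of the idea: the class, its methods and the file names are made up for illustration, and it is not Delta’s actual log format or API.

```python
from typing import Dict, List, Optional, Sequence, Set


class ToyTableLog:
    """Append-only log of commits; each commit adds and/or removes data files."""

    def __init__(self) -> None:
        self._commits: List[Dict[str, List[str]]] = []

    def commit(self, add: Sequence[str] = (), remove: Sequence[str] = ()) -> int:
        """Record one atomic change and return its version number."""
        self._commits.append({"add": list(add), "remove": list(remove)})
        return len(self._commits) - 1

    def snapshot(self, version: Optional[int] = None) -> Set[str]:
        """Replay the log up to `version` to find the files a reader should see."""
        if version is None:
            version = len(self._commits) - 1
        files: Set[str] = set()
        for entry in self._commits[: version + 1]:
            files |= set(entry["add"])
            files -= set(entry["remove"])
        return files


log = ToyTableLog()
v0 = log.commit(add=["part-000.parquet", "part-001.parquet"])
reader_view = log.snapshot(v0)  # a reader pins version 0

# A writer compacts the two small files into one larger file.
v1 = log.commit(add=["part-002.parquet"],
                remove=["part-000.parquet", "part-001.parquet"])

print(sorted(reader_view))     # the pinned reader is unaffected by the rewrite
print(sorted(log.snapshot()))  # new readers see only the compacted file
```

The same mechanism covers both benefits above in miniature: a job that started reading keeps its version of the table, and small files can be rewritten into larger ones without anyone seeing a half-finished state.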

Databricks Delta is another great feature of Azure Databricks that is not available in traditional Spark, further separating the capabilities of the two products and providing a great platform for your big data, data science and data engineering needs.

Device Management with Azure IoT Hub

Yesterday’s post covered what Azure IoT Hub is and what it brings. Today I’m going a bit deeper and talking about how the devices you’re bringing to the table get managed. IoT Hub provides the features and extensibility that give devices, as well as the people who program those devices and design their architectures, a robust device management solution.

Devices are all over the place; they are sensors, microcontrollers and Raspberry Pi computers. They’re also the gateways that route the communications for groups of devices. They’re installed on a local network and can work in peer to peer networks or have a router that passes information back and forth.

Azure IoT Hub offers a flexible platform that supports many different uses, devices and industries, so you get that compatibility no matter the industry you’re in. Whatever you’re using the devices for, a significant part of the work lies in planning how the devices and gateways will work together in the IoT Hub.

Let’s look at some things to be aware of:

1. Device Management Principles – Here you’ve got your scale and automation. You need to have simple tools to automate routine tasks. And you need the ability to manage millions of devices simply, as well as remotely and in bulk, so you can make sweeping changes across a whole suite of devices.

In addition, you don’t need to be alerted for every change or notification, but you do need to be alerted when there’s a problem. There are many different devices, protocols and patterns. IoT Hub needs to accommodate all those changes; with the wide range of devices from single process chips to fully functional computers, we need to have the flexibility to accommodate those systems.

Other things you need to know are:

  • Context awareness to accommodate the SLA and maintenance windows for when there’s downtime.
  • The network and power states.
  • The in-use conditions – What are the expectations while the devices themselves are working?
  • Where the device is – Is it in a building or out in the field on a utility pole?

These devices serve many roles and must work within the IT operations of your group. They need to be easily managed from that group or an extension from that group, as well as be able to surface alerts when it’s required. Most importantly, this all needs to work within your internal IT ecosystem to keep that continuity and consistency inside the business.

2. Device Lifecycle – So, we start with a plan – how will we use the devices; how will they be managed; and what will the devices be used for in our specific case? Next, we need to provision them by adding them into the IoT Hub identity registry, so that by the next step they are acknowledged in the system. Our next step is to configure them. We want to maintain the health of the device, even when we’re doing these updates and configurations, and we can send and confirm updates securely.

Also, we need to monitor the device’s health to be aware if it’s beginning to fail. Many are small, simple devices that have a certain lifespan. We also monitor the status of the device and we need the ability to get alerts when the device begins to have issues. Then, ultimately, we need to remove old devices that are no longer effective so they’re not showing up or cluttering up the space of that IoT Hub interface.
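The lifecycle described above can be sketched as a small state machine. This is an illustrative model only: the stage names, the allowed transitions and the `Device` class are assumptions made for the sketch, not IoT Hub concepts or SDK types.

```python
from enum import Enum


class Stage(Enum):
    PLANNED = "planned"
    PROVISIONED = "provisioned"
    CONFIGURED = "configured"
    MONITORED = "monitored"
    RETIRED = "retired"


# Allowed moves between stages (an assumption for this sketch).
ALLOWED = {
    Stage.PLANNED: {Stage.PROVISIONED},
    Stage.PROVISIONED: {Stage.CONFIGURED, Stage.RETIRED},
    Stage.CONFIGURED: {Stage.MONITORED, Stage.RETIRED},
    Stage.MONITORED: {Stage.CONFIGURED, Stage.RETIRED},  # reconfigure or retire
    Stage.RETIRED: set(),
}


class Device:
    def __init__(self, device_id: str) -> None:
        self.device_id = device_id
        self.stage = Stage.PLANNED

    def advance(self, target: Stage) -> None:
        """Move to the target stage, rejecting moves the lifecycle does not allow."""
        if target not in ALLOWED[self.stage]:
            raise ValueError(
                f"{self.device_id}: cannot go from {self.stage.value} to {target.value}"
            )
        self.stage = target


sensor = Device("pole-sensor-17")
for step in (Stage.PROVISIONED, Stage.CONFIGURED, Stage.MONITORED, Stage.RETIRED):
    sensor.advance(step)
print(sensor.stage.value)
```

Making the transitions explicit is one way to catch mistakes such as configuring a device that was never provisioned, or touching one that has already been retired.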

3. Device Management Patterns – How are we interacting with devices after they’ve been deployed? So, if you’re going to reboot, factory reset or redeploy a device, you’ll need to reconfigure it so that it can be brought back up in the system. You’ll need to do simple configurations to change how the devices behave, and these need to be done in bulk.

To ensure you’re staying on top of bug fixes and new functionality and features for your devices, you’ll need to send firmware updates. Lastly, you need to report on the progress and status of the devices themselves. It’s important that you have visibility into how the devices are performing and know if there are any problems.

This has been a high-level overview of device management with Azure IoT Hub. I hope you found it informative and helpful.

How Does Azure IoT Hub Work?

Today I’d like to talk about Internet of Things (IoT) and the Azure IoT Hub. IoT devices are not your typical devices like mobile phones, tablets or laptops. IoT devices are designed to respond to the specific sensor activity they are deployed for, like a glass break sensor for instance.

These devices are meant to be used for specific communications, whereas the typical device acts more like a server waiting to receive information from everywhere. This can cause some security threats if they are deployed in that manner. We can use firewalls and software to protect our equipment, but the whole idea with IoT is that these low power, no frills devices are what’s being deployed, so you don’t have a lot of that capability.

Also, the traditional PKI trust model is inefficient and ineffective for the IoT model; the TTL (time to live) on certificates is too long, which doesn’t make sense for these devices. On top of that, promiscuous mode is turned on by default, which defeats the purpose of trying to have a secure environment.

Azure IoT Hub implements a service assisted communication methodology, which mediates interaction between backend systems and devices. With this you have bi-directional, trustworthy communication set up, and security is the number one priority of this configuration.

Devices will not accept unsolicited information; they must regularly check in for instructions, and authorization is based on per device identity. For devices in areas where there are network coverage or power issues, IoT Hub provides queues for the messages that are set up for communication with the devices. Essentially, it will hold the message and validate the device before anything is sent or received; it will send the necessary data after the device is validated.
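The check-in pattern above can be modeled in a few lines of Python. This is a deliberately simplified sketch: `ToyHub`, its methods and the shared-key check are invented for illustration, and real IoT Hub authentication and messaging are more involved than this.

```python
import hmac
from collections import defaultdict, deque
from typing import Deque, Dict, List


class ToyHub:
    """Holds messages per device until that device checks in and proves its identity."""

    def __init__(self) -> None:
        self._keys: Dict[str, str] = {}
        self._queues: Dict[str, Deque[str]] = defaultdict(deque)

    def register(self, device_id: str, key: str) -> None:
        self._keys[device_id] = key

    def send_to_device(self, device_id: str, message: str) -> None:
        # The backend never contacts the device directly; it only queues.
        if device_id not in self._keys:
            raise KeyError(f"unknown device {device_id}")
        self._queues[device_id].append(message)

    def check_in(self, device_id: str, key: str) -> List[str]:
        """Called by the device. Returns queued messages only if the key matches."""
        expected = self._keys.get(device_id)
        if expected is None or not hmac.compare_digest(expected.encode(), key.encode()):
            raise PermissionError("device identity could not be validated")
        queue = self._queues[device_id]
        delivered = list(queue)
        queue.clear()
        return delivered


hub = ToyHub()
hub.register("glass-break-01", "s3cret")
hub.send_to_device("glass-break-01", "set-sensitivity:high")
hub.send_to_device("glass-break-01", "report-battery")

inbox = hub.check_in("glass-break-01", "s3cret")  # device wakes up and polls
print(inbox)
```

Because the device always initiates the exchange, it never has to listen for inbound connections, and a device that is offline simply finds its messages waiting the next time it checks in.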

The application payload data is also secured separately, so any data that’s flowing through is protected in transit through the gateways. The data is wrapped prior to sending and receiving between devices. Devices can be configured to work peer to peer before they get to a gateway, which extends the range. That gateway is what communicates with your Azure IoT Hub.

All that traffic is designed to flow to and from the gateway and then communicate with the IoT Hub, which you can use to collect the data for big data uses, setting up Power BI reports or many other ways to use that data.

What is Azure Cosmos DB?

Are you familiar with Azure Cosmos DB? Cosmos DB is Microsoft’s globally distributed, multi-model database. With the click of a button, it allows you to elastically and independently scale throughput and storage across any number of Azure’s geographic regions, so you can put the data where your customers are.

Cosmos DB has custom built APIs that support a multitude of data models, like SQL, MongoDB and Azure Tables, and it offers 5 consistency models. It also offers comprehensive Service Level Agreements (SLAs) with money back guarantees for availability (99.99% to be exact), latency, consistency and throughput; a big deal when you need to serve your customers at optimum performance.

Cosmos DB is a great option for many different use cases:

  • Companies that are doing IoT and telematics. Cosmos DB can ingest huge bursts of data, and process and analyze that data in near real-time. Then it will automatically archive all the data it ingests.
  • Retail and Marketing. Take an auto parts product catalog, for example, with tons of parts within the catalog, each with its own properties (some unique and some shared across parts). The next year, new vehicles or new part models come out, with some similar and some different properties. All that data adds up very quickly. Cosmos DB offers a very flexible schema in a hierarchical structure that can easily adapt as things change.
  • Gaming Industry. Games like Halo 5 by Microsoft are built on a Cosmos DB platform, because they need performance that scales quickly and dynamically. You’ve got things like millisecond read times, which avoid any lag in game play. You can index player-related data, and a social graph database is easily implemented, with a flexible schema for all the social aspects of gaming.
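The flexible schema in the auto parts example can be pictured with a few JSON-style documents. The sketch below uses plain Python dictionaries rather than the Cosmos DB SDK, and the part IDs and property names are made up for illustration.

```python
import json

catalog = [
    {"id": "brake-pad-100", "category": "brakes", "fits": ["sedan-2017"], "material": "ceramic"},
    {"id": "wiper-220", "category": "wipers", "fits": ["sedan-2017", "suv-2018"], "length_mm": 550},
    # Next model year: a new part with a property no earlier item has.
    {"id": "sensor-305", "category": "sensors", "fits": ["suv-2018"], "protocol": "CAN"},
]


def parts_for(vehicle: str):
    """Find parts by a shared property without requiring every item to look alike."""
    return [item["id"] for item in catalog if vehicle in item.get("fits", [])]


print(parts_for("suv-2018"))
print(json.dumps(catalog[2]))
```

Each item carries only the properties that apply to it, so adding next year’s parts does not require altering a table definition or back-filling older items.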

Azure Cosmos DB ensures that your data gets there and gets there fast, with a wealth of features and benefits to make your life easier. And it’s easy to set up and manage.

 

What is Internet of Things (IoT) and Why It Matters to IT

The Internet of Things (IoT) has become a growing topic both inside and outside of the workplace. It has the ability to change how we live and how we work. Many are already on board with IoT, and global IoT revenues are projected to reach over a trillion dollars by 2020. If you’re not there yet, I’d like to talk today about what IoT is and how it’s being used.

Internet of Things, or IoT, refers to devices that used to be stand-alone but are now connected to the internet. Consumer based devices include Google Home, Alexa, smart watches, Fitbits and home thermostats. These products are already changing the way the owners of these devices live.

From a business standpoint, with the help of services like Azure IoT Hub, this gets much bigger, with a much larger impact on how people work. Large engines for trains and planes, for example, have millions of components that are being monitored all the time, showing real-time statistics about what is happening on those devices.

Chevron is using Microsoft Azure IoT Hub in the backend as they build out their IoT infrastructure for monitoring their oil and deep well drilling devices. John Deere is mounting IoT devices on their equipment that track things such as where, how far apart or how deep seeds are being planted, so farmers can get planting information right from these devices.

Yes, IoT is a big deal and it’s helping manufacturing and other companies, as well as consumers alike, by expanding the capabilities of things that are, or can be, connected to the internet.