All posts by cseferlis

The Easy Path to AI

The explosion of interest in AI driven by the recent success of ChatGPT, the state-of-the-art natural language generation model that can write anything from essays to poems to code, is no surprise. Now, however, we are starting to see the excitement wane as ChatGPT usage numbers drop. This could be due to competition, concerns about privacy and security, or the novelty wearing off as users struggle to find practical uses for the tool. Further, to use the API available from OpenAI, you need significant technical skills and resources to train, fine-tune, and deploy a model. You also need to be careful about the quality and safety of the generated text, which might contain errors, biases, or harmful content.

The good news? This is just one tool in a sea of many other AI tools that are refined and purpose-built for organizational needs. At the top of that list is Microsoft’s Azure Cognitive Services, a collection of cloud-based APIs that provide ready-made AI solutions for various scenarios. Anyone familiar with data science and machine learning knows that training an ML model to predict results requires troves of clean, trustworthy data. The beauty of Cognitive Services is that Microsoft has already built and trained these models across many categories, several of which have even reached human parity on industry benchmarks! Below are just a few examples of how Cognitive Services can help you with your AI needs:

• Speech Recognition: This service allows you to convert speech to text in real time or from audio files. You can use it for voice commands, transcription, dictation, captioning, and more. You can also customize it with your own vocabulary and acoustic machine learning models.
• Computer Vision: This service allows you to analyze and understand images and videos. You can use it for face detection, emotion recognition, object detection, optical character recognition, video indexing, and more. You can also create your own custom vision models using a simple interface. I recently created a video with an overview of the service here:
• Text Analytics: This service allows you to extract insights from text data. You can use it for sentiment analysis, key phrase extraction, entity recognition, language detection, and more. Another example would be to use it to analyze healthcare documents and extract clinical information.
• Many more: Cognitive Services offer a wide range of services for different domains and scenarios, such as natural language understanding, conversational AI, anomaly detection, spatial analysis, personalization, and more.
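Calling one of these services is mostly a matter of a key and a REST call. Below is a minimal Python sketch of a sentiment-analysis request to the Text Analytics API; the endpoint and key are placeholders you would replace with your own resource's values:

```python
import json

# Placeholders: substitute your own Cognitive Services resource and key.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
API_KEY = "<your-key>"

def build_sentiment_request(endpoint, key, texts):
    """Build the URL, headers, and JSON body for a Text Analytics
    v3.1 sentiment call."""
    url = f"{endpoint}/text/analytics/v3.1/sentiment"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "documents": [
            {"id": str(i), "language": "en", "text": t}
            for i, t in enumerate(texts, start=1)
        ]
    })
    return url, headers, body

if __name__ == "__main__":
    import urllib.request
    url, headers, body = build_sentiment_request(
        ENDPOINT, API_KEY, ["I love this service!"])
    req = urllib.request.Request(url, data=body.encode(), headers=headers)
    # Sending the request needs a real resource and key:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

The response comes back as JSON with a sentiment label and confidence scores per document, which you can consume directly in your application.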

You don’t need to worry about building or managing your own AI models, and where required, many of the services let you build custom models as well. Once you’ve chosen a service, you simply connect to the API and start using it in your applications. Even better, many of the services can be packaged as Docker containers, so you can deploy the models locally or in other clouds for even faster predictions. Finally, you also get the benefits of Microsoft’s expertise and innovation in AI, such as high accuracy, reliability, security, and compliance.

To get started, many of the services offer free tiers for low transaction volumes, and each service is billed based on consumption; as long as you keep an eye on those transaction counts, you don’t have to worry about cost overruns.

So, what are you waiting for? If you want to add AI capabilities to your applications without the hassle and complexity of ChatGPT and similar tools, Cognitive Services are the way to go! And if you really want to go deeper into all the capabilities, check out our recent book, “Practical Guide to Azure Cognitive Services,” from Packt or through other online book retailers:

If you think this was valuable, or could improve, please leave a comment below and share it with your friends. And don’t forget to subscribe to my video blog for more Data and AI insights and tips.

3 Key Differences in ChatGPT and Azure OpenAI

In this vLog I discuss some misconceptions around ChatGPT and Azure OpenAI, including:

  • Who owns ChatGPT, OpenAI, and how Microsoft got involved
  • Security and privacy concerns about Azure OpenAI and ChatGPT
  • How each of the services is consumed and billed

Take a look to find out more!

  • Ownership – ChatGPT: owned by OpenAI LP, the for-profit arm of the non-profit OpenAI, whose mission is to ensure AI benefits all of humanity. Azure OpenAI: part of the Azure AI offerings as APIs; Microsoft is an investor in OpenAI with exclusive rights to the technology it generates.
  • Security – ChatGPT: open to the public. The underlying LLM was trained on the GPT-3 dataset and currently only references data from 2021 and earlier; questions and interactions are captured and can be used for further training. Azure OpenAI: secured to an Azure tenant using GPT-4, GPT-3.5, Codex, and DALL-E; tenants must request access, which reduces the chance of the AI being used for malicious purposes; prompt data is not stored, and the model is not trained on the data you add.
  • Costs – ChatGPT: free during preview stages, or paid for better hardware availability. Azure OpenAI: billed on a consumption model like other Azure Cognitive Services; the biggest expense is re-training the model you’ve deployed on your data.
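The consumption difference is easy to see in code. Here is a minimal Python sketch of how the two request URLs and auth headers differ; the resource, deployment, and key names are placeholders, and the api-version value is an assumption of the version current at the time of writing:

```python
def openai_request(api_key):
    """Public ChatGPT/OpenAI API: one shared endpoint for everyone,
    bearer-token auth, billed to your OpenAI account."""
    url = "https://api.openai.com/v1/chat/completions"
    headers = {"Authorization": f"Bearer {api_key}"}
    return url, headers

def azure_openai_request(resource, deployment, api_key,
                         api_version="2023-05-15"):
    """Azure OpenAI: a per-tenant resource URL and per-deployment path,
    authenticated with an api-key header, billed through your Azure
    subscription on consumption."""
    url = (f"https://{resource}.openai.azure.com/openai/deployments/"
           f"{deployment}/chat/completions?api-version={api_version}")
    headers = {"api-key": api_key}
    return url, headers
```

The Azure URL is unique to your tenant and deployment, which is exactly what keeps your prompts and models isolated from the public service.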

Here comes your CoPilot

The new age of large language models (LLMs), with their ability to accelerate various forms of novel thought, is being cast upon us at a rapid pace. Just like an airplane copilot, an explosion of tools in various areas is emerging to help us do our everyday jobs, making us more productive and freeing up additional time to enhance our creativity… or play Candy Crush.

You have already seen a plethora of announcements from Microsoft about copilot tools: assistants added to the Office productivity suite for the common office worker; GitHub Copilot, which helps software developers write, analyze, and document code; and copilots in Power BI and Microsoft Fabric that simplify the often tedious analysis and report-building process for data analysts. And that is just the beginning from their standpoint. They’ve also announced the AI Copilot software development kit, which lets developers add a copilot to any number of business and consumer applications, assisting people with everyday tasks and simplifying how they develop and create new work.

The real question that comes to mind, however, is: who gets credit for the work that gets created? The question comes up frequently in the entertainment industry, where movie scripts are being drafted by these tools, thousands of songs were recently removed from Spotify because they were generated with AI, and images and videos are being developed and manipulated with AI. And this is just the beginning of what I anticipate will be a massive explosion of questions around who really should get the credit for what’s being created. If AI is helping individuals complete their work faster, and the broader community benefits as a result, does it really matter? If I read ten articles on copilots, retain a significant portion of what I read, then turn around, form an opinion, and write my own article, as I’m doing here about how I see things happening, is that still my work? Is it my work even though I am writing it based on material that others produced, summarized in a slightly different way? That is the process by which most research has been conducted for centuries, and by which many fiction and non-fiction works have been created. Is it really different when we look at the technology that underlies LLMs?

In the world of data science, there is tremendous opportunity to take advantage of existing machine learning models and the algorithms behind them, replicating their findings across any number of datasets and using them as a springboard for new algorithms and predictive models. Is this somehow “cheating”? Are data scientists who are working, hopefully, toward the greater good, who have the novel inspiration for what they want to build but use these tools to produce it more quickly, cheaters? I think these are the questions we need to be asking ourselves, rather than pointing fingers at the people using the tools to produce the work.

The other major concern coming from all of this is the privacy and security implications of training these LLMs with information that ultimately should not be shared. Microsoft is providing excellent options here by allowing customers to create their own instances of the various tools, such as the ChatGPT or DALL-E APIs, isolating the models trained specifically for them in their own Azure subscription; neither those individual models nor the data they collect is used to train any other models. With tools such as Google’s Bard or OpenAI’s ChatGPT interface, you do not have the same luxury: those models are retrained on all the data fed into them. This was made loudly public recently when engineers from Samsung fed some of their data into ChatGPT, unknowingly exposing corporate secrets. It is also prompting rash decisions by CxOs to broadly ban tools that are helping their employees be more productive. Clearly, more education is needed to inform these decisions and scenarios at every level: corporations, education, and individuals across the board.

As a writer myself, having recently published my first book, I am still haunted by writer’s block and the challenge of getting started on the various topics in my chapters. I see these tools as an opportunity to get past that and produce work that much more quickly. Truthfully, the answer here is subjective to each person’s opinion, much like how a person feels about a painting, song, or written piece. We already base so many of our works, whether applications, methodologies, algorithms, or more traditional artistic stylings, on the knowledge we’ve acquired through experience; the notion of “original thought” is now so uncommon, and even when introduced it is often rejected by greater society. How is this any different? I say we take advantage of the tools we are given and make the copilot as pervasive as possible to help gain efficiencies in every aspect of the modern world!

Azure Cognitive Services

AI solutions are exploding, and Azure has the most complete offering of any cloud provider! Watch this video to get started with the API-based Cognitive Services in Azure and a sample architecture for employing them with the Azure Bot Service. Azure Cognitive Services are cloud-based services with REST APIs and client library SDKs available to help you build cognitive intelligence into your applications.

You can add cognitive features to your applications without having artificial intelligence (AI) or data science skills. Azure Cognitive Services comprise various AI services that enable you to build cognitive solutions that can see, hear, speak, understand, and even make decisions. Azure Bot Service enables you to build intelligent, enterprise-grade bots with ownership and control of your data. Begin with a simple Q&A bot or build a sophisticated virtual assistant.

#Azure #AI #CognitiveServices #ArtificialIntelligence #Bots #ReferenceArchitecture #MachineLearning #API #Cloud #Data #DataScience

What is HTAP in Azure

Hybrid Transactional and Analytical Processing, or HTAP, is an advanced database capability that allows both transactional and analytical workloads to run on the same system without one impacting the performance of the other.

In this Video Blog, I cover some of the history of HTAP, some of the challenges and benefits of these systems, and where you can find them in Azure.
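To make the idea concrete, here is a toy Python sketch of the HTAP pattern (an illustration of the concept, not any particular engine’s implementation): writes land in a row store for fast point lookups, and are mirrored to a columnar copy that serves aggregates without touching the transactional side.

```python
from collections import defaultdict

class TinyHTAP:
    """Toy HTAP store: a row store for transactions plus a columnar
    replica for analytics, kept in sync on every write."""

    def __init__(self):
        self.rows = {}                    # row store: id -> record
        self.columns = defaultdict(list)  # column store: field -> values

    def insert(self, row_id, record):
        # Transactional write path
        self.rows[row_id] = record
        # Mirror to the analytical copy (real systems do this
        # asynchronously so writes stay fast)
        for field, value in record.items():
            self.columns[field].append(value)

    def lookup(self, row_id):
        # OLTP: point read from the row store
        return self.rows[row_id]

    def total(self, field):
        # OLAP: column scan over the analytical copy
        return sum(self.columns[field])
```

The separation is what lets an aggregate scan millions of values without locking or slowing the row-oriented transactional path, which is the core promise of HTAP systems.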

Overview of Azure Synapse Link featuring CosmosDB

Azure Synapse Link lets you connect directly to your transactional system to run analytical and machine learning workloads, eliminating the need for ETL/ELT, batch processing, and reload wait times.

In this vLog, I explain how to enable Synapse Link in Cosmos DB, and what’s happening under the covers to give access to that analytical workload without impacting the performance of your transactional processing system.

Check it out here and let me know what you think!
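Once Synapse Link and the analytical store are enabled, a Synapse Spark notebook can query the columnar copy directly. Here is a hedged sketch, assuming a linked service named MyCosmosDbLinkedService and a container named Orders (both placeholders for your own names):

```python
# Placeholder names: substitute your own linked service and container.
olap_options = {
    "spark.synapse.linkedService": "MyCosmosDbLinkedService",
    "spark.cosmos.container": "Orders",
}

# In a Synapse notebook, where `spark` is predefined, the analytical
# store is read without consuming transactional request units:
#
# df = (spark.read.format("cosmos.olap")
#           .options(**olap_options)
#           .load())
# df.groupBy("status").count().show()
```

The `cosmos.olap` format is what routes the read to the analytical store rather than the transactional one, which is why the OLTP side is unaffected.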

Getting started with Spark Pools in Azure Synapse

In my latest video blog I discuss getting started with the newly generally available Spark Pools in Azure Synapse, another great option for data engineering/preparation, data exploration, and machine learning workloads.

Without going too deep into the history of Apache Spark, I’ll start with the basics. In the early days of big data workloads, the basis for machine learning and deep learning in advanced analytics and AI, we would use a Hadoop cluster and move these massive datasets across disks, and the disks were always the bottleneck in the process. So the creators of Spark said, hey, why don’t we do this in memory and remove that bottleneck? They developed Apache Spark as an in-memory data processing engine, a faster way to process these massive datasets.

When the Azure Synapse team set out to offer the best possible data solution for all kinds of workloads, Spark gave them an option for customers already familiar with the Spark environment, so they included it as part of the complete Azure Synapse Analytics offering.

Behind the scenes, the Synapse team manages many of the components you’d find in open-source Spark, such as:

  • Apache Hadoop YARN – for management of the clusters where the data is processed
  • Apache Livy – for the job orchestration
  • Anaconda – a package manager, environment manager, and Python/R data science distribution with a collection of over 7,500 open-source packages for extending the capabilities of the Spark clusters
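The in-memory map/reduce model that Spark is built around can be sketched in plain Python; the commented lines show, as an assumption of the typical notebook setup, how the same word count would look on a Synapse Spark pool:

```python
from collections import Counter

lines = ["spark keeps data in memory",
         "memory beats disk for iterative work"]

# map step: break each line into (word, 1) pairs
pairs = [(word, 1) for line in lines for word in line.split()]

# reduce step: sum the counts per word
word_counts = Counter()
for word, n in pairs:
    word_counts[word] += n

# On a Synapse Spark pool the same job runs distributed, held in
# memory across the cluster (`spark` is predefined in a notebook):
#
# rdd = spark.sparkContext.parallelize(lines)
# counts = (rdd.flatMap(str.split)
#              .map(lambda w: (w, 1))
#              .reduceByKey(lambda a, b: a + b))
```

The difference is scale: Spark keeps the intermediate pairs in memory across many executors, which is exactly the disk bottleneck the engine was created to remove.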

I hope you enjoy the post. Let me know your thoughts or questions!

Connecting to External Data with Azure Synapse

In my latest video blog I discuss and demonstrate some of the ways to connect to external data in Azure Synapse when there isn’t a need to import the data into the database, or when you want to do some ad-hoc analysis. I also talk about using COPY and CTAS statements when the requirement is to import the data after all. Check it out here
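As a quick illustration of the import path, here is a small Python helper that composes a Synapse COPY statement; the table name and storage URL are placeholders you’d swap for your own:

```python
def copy_into_sql(table, source_url, file_type="PARQUET"):
    """Compose a Synapse dedicated SQL pool COPY statement for
    importing external files (table and URL are placeholders)."""
    return (f"COPY INTO {table}\n"
            f"FROM '{source_url}'\n"
            f"WITH (FILE_TYPE = '{file_type}')")

sql = copy_into_sql(
    "dbo.Sales",
    "https://myaccount.blob.core.windows.net/data/sales/")
```

COPY loads the files into the table for repeated querying, whereas the ad-hoc external approaches in the video leave the data in storage and read it in place.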

Comparing Azure Synapse, Snowflake, and Databricks for common data workloads

In this vLog post I discuss how Azure Synapse, Databricks, and Snowflake compare when it comes to common data workloads:

  • Data Science
  • Business Intelligence
  • Ad-hoc data analysis
  • Data Warehousing
  • and more!

Where does Azure Data Explorer fit in the rest of the Data Platform?

In this vLog I give an overview of Azure Data Explorer (ADX) and the Kusto Query Language (KQL). Born from analyzing the logs behind Power BI, ADX is a great way to take large sets of data, analyze them quickly, and get actionable insights from that data.
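For a sense of what a KQL call looks like outside the portal, here is a hedged Python sketch that builds a query request against ADX’s REST query endpoint; the cluster and database names, and the StormEvents sample table, are placeholders:

```python
import json

def kusto_query_request(cluster, database, kql):
    """Build the URL, headers, and JSON body for a query against an
    Azure Data Explorer cluster's REST endpoint (names are placeholders;
    a real call also needs a bearer token in the headers)."""
    url = f"https://{cluster}.kusto.windows.net/v1/rest/query"
    headers = {"Content-Type": "application/json"}
    body = json.dumps({"db": database, "csl": kql})
    return url, headers, body

# A sample KQL query: event counts per hour over the last day
sample_kql = """StormEvents
| where StartTime > ago(1d)
| summarize count() by bin(StartTime, 1h)"""
```

The `where`/`summarize`/`bin` pipeline is the KQL idiom that makes slicing huge log datasets by time window so quick in ADX.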

Find more details about Azure Data Explorer here:

And get started with these great tutorials: