Anomaly Detection

Definition of Anomaly

Anomalies, also referred to as outliers, novelties, noise, deviations, or exceptions, are data points that are associated with rare and exceptional events that significantly differ from the majority of available data. Anomalies are typically associated with some type of problem, such as cases of bank fraud, structural defects, medical issues, or errors in text.

Machine Learning and Anomalies

The search for anomalies is a typical application of Machine Learning. The use of AI in outlier detection is becoming increasingly common. Both supervised and unsupervised learning methods are used, as well as another technique called semi-supervised learning. Typical applications are intrusion detection, fraud detection, fault detection, system state monitoring, event detection in sensor networks, ecosystem disturbance detection, and detection of suspicious variations in images using artificial vision. Many anomaly detection methods are known, and performance depends on the quantity and quality of available data. Some of these techniques include density-based techniques (K-nearest neighbors, local outlier factor, isolation forest), support vector machines with one class, Bayesian networks, hidden Markov models, cluster analysis-based anomaly value detection, deviation from association rules, and fuzzy logic-based anomaly value detection.

Anomaly Detection Systems and IT Platforms

Major IT service providers offer platforms that facilitate the implementation of anomaly detection systems. We report information on what Amazon offers in its AWS Amazon Web Services platform and what Microsoft offers on the Azure platform.

AWS offers a wide portfolio of anomaly detection solutions, including AWS Panorama, Amazon CloudWatch, Amazon DevOps, and Amazon OpenSearch. We report below the functional block diagram of some of the Panorama and Kinesis architectures.

Panorama
Kinesis

Among the methods proposed by Amazon AWS, we mention Sagemaker, which is a cloud machine learning platform that can be used to generate predictions and track behaviors without writing code. Amazon Kinesis, on the other hand, is used for data acquisition and has a function that assigns scores to each detected anomaly. Kinesis is a managed tool that facilitates the identification of an anomaly and a real-time response.

Now let’s see what Microsoft proposes in its Azure cloud platform. Microsoft is very focused on the dissemination of these services and offers a lot of informative and educational material. Introductory guides are available with detailed instructions for making calls to the service and obtaining results in a short period of time. An interactive demo allows you to understand how Anomaly Detection works with simple operations. “How-to” guides contain instructions for using the service in more specific or customized ways. Tutorials are longer guides that illustrate how to use this service as a component in larger business solutions. In addition, code examples and conceptual articles are provided that provide in-depth explanations of the service’s features.

Azure offers a set of APIs that make it easy to monitor and detect anomalies in time series data without prior experience in Machine Learning, and the REST API allows the service to be easily integrated into applications. In Azure, it is possible to detect anomalies in a variable using Univariate Anomaly Detection or detect anomalies in multiple variables with Multivariate Anomaly Detection.