Real-world application of machine learning in networks


Illustration: © IoT for all

The growing demand for Internet connectivity has put pressure on operators to improve network infrastructure, performance, and other critical parameters. Network administrators invariably encounter different types of networks running multiple network applications, each with its own set of features and performance settings that can change dynamically. Given this diversity and complexity, it is difficult to design conventional algorithms or hard-coded techniques that cover every such network scenario.

Machine learning has proven beneficial in almost every industry, and networking is no exception. It can help solve long-standing, intractable network problems and enable new network applications that make networks easier to operate. Let’s discuss how this works, with some use cases, to better understand machine learning technology applied in networking.

Intelligent network traffic management

With the growing demand for Internet of Things (IoT) solutions, modern networks generate massive and heterogeneous traffic data. For such dynamic networks, traditional techniques for monitoring traffic and analyzing data, such as ping monitoring, log file monitoring, or even SNMP, are not enough: they generally lack precision and cannot process real-time data efficiently. In addition, traffic from cellular or mobile devices shows comparatively more complex behavior because of device mobility and the heterogeneity of the network.

Machine learning makes it possible to analyze big data systems and wide area networks and to recognize the complex patterns involved in managing them. To exploit these opportunities, networking researchers use deep learning models for traffic monitoring and analysis applications such as traffic classification and prediction, congestion control, and more.

In-band network telemetry

Network telemetry data provides basic metrics on network performance. Given the volume of data passing through the network, this information is generally difficult to interpret, yet it is of considerable value: used intelligently, it can greatly improve performance.

Emerging technologies such as in-band network telemetry help collect detailed telemetry data in real time. Applying machine learning to such datasets can help correlate phenomena across latency, paths, switches, routers, events, and so on. These correlations were difficult to pinpoint in huge amounts of real-time data using traditional methods.

Machine learning models are trained to recognize correlations and patterns in telemetry data. Having learned from historical data, they can then forecast future behavior, which helps operators anticipate and mitigate network outages.
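To make the idea of learning a correlation from telemetry concrete, here is a minimal sketch. It assumes a hypothetical dataset of queue depth and per-hop latency samples and uses ordinary least squares as a simple stand-in for the richer models used in practice; the sample values and function names are illustrative only.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical telemetry samples (assumed values, not real measurements):
# queue depth in packets at a switch, and the hop latency in microseconds.
queue_depth = [10, 20, 30, 40, 50]
latency_us = [25.0, 45.0, 65.0, 85.0, 105.0]

model = fit_line(queue_depth, latency_us)
forecast_us = predict(model, 80)  # estimated latency if the queue grows to 80
```

With the fitted relationship in hand, an operator could compare the forecast against a latency budget and act before the queue actually reaches that depth.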

Resource allocation and congestion control

Each network infrastructure has a predefined total capacity, which is further divided into several channels of predefined bandwidth. When the bandwidth available to each end user is statically predefined in this way, bottlenecks can form in the parts of the network that are heavily used.

To avoid such congestion, supervised machine learning models can be trained to analyze network traffic in real time and derive an appropriate amount of bandwidth per user so that the network experiences as few bottlenecks as possible.

These models can learn from network statistics such as the total number of active users per network node, historical usage data for each user, time-based usage patterns, and the movement of users across multiple access points.
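The allocation step can be sketched as follows. This is an illustration under stated assumptions, not a production scheme: the "model" is simply the average of each user's historical usage at the same hour of day (a stand-in for a trained supervised model), and the user names, usage figures, and link capacity are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def predict_demand(history, hour):
    """Predict per-user demand (Mbps) at `hour` from (user, hour, mbps) samples."""
    samples = defaultdict(list)
    for user, sample_hour, mbps in history:
        if sample_hour == hour:
            samples[user].append(mbps)
    return {user: mean(values) for user, values in samples.items()}


def allocate(capacity_mbps, demand):
    """Grant full demand if it fits; otherwise scale all users down proportionally."""
    total = sum(demand.values())
    if total <= capacity_mbps:
        return dict(demand)
    scale = capacity_mbps / total
    return {user: mbps * scale for user, mbps in demand.items()}


# Hypothetical usage history: (user, hour of day, observed Mbps).
history = [
    ("alice", 20, 30.0), ("alice", 20, 50.0),
    ("bob", 20, 10.0), ("bob", 9, 80.0),
    ("carol", 20, 40.0), ("carol", 20, 60.0),
]

demand = predict_demand(history, hour=20)         # alice 40, bob 10, carol 50
allocation = allocate(capacity_mbps=50.0, demand=demand)
```

Proportional scaling is only one possible policy; a real deployment might add per-user minimums or priorities on top of the predicted demand.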

Traffic classification

Every network carries different types of traffic, such as web traffic (HTTP), file transfers (FTP), secure browsing (HTTPS), HTTP live video streaming (HLS), and terminal services (SSH). Each behaves differently in its use of network bandwidth; for example, transferring a file via FTP uses a lot of bandwidth continuously for the duration of the transfer.

As another example, a video stream transfers data in chunks and relies on buffering. When these different types of traffic are allowed to use the network without any management, they create temporary congestion.

To avoid this, machine learning classifiers can be used to analyze and classify the type of traffic passing through the network. The resulting classes can then be used to set network parameters such as allocated bandwidth, data caps, etc.
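As a rough illustration of traffic classification, the sketch below trains a nearest-centroid classifier on two per-flow features. Both features (mean packet size as a fraction of a 1500-byte MTU, and the fraction of time the flow is idle), the class labels, and the training values are assumptions chosen for the example; real classifiers typically use many more features and a stronger model.

```python
from math import dist
from statistics import mean


def train(samples):
    """Compute one centroid per label from (features, label) pairs."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical labeled flows: (packet size / MTU, idle fraction).
training_flows = [
    ((0.95, 0.05), "bulk_transfer"),   # large packets, almost never idle
    ((0.90, 0.10), "bulk_transfer"),
    ((0.85, 0.50), "video_stream"),    # large packets, idle between chunks
    ((0.80, 0.60), "video_stream"),
    ((0.10, 0.90), "interactive"),     # small packets, mostly idle
    ((0.15, 0.80), "interactive"),
]

centroids = train(training_flows)
label = classify(centroids, (0.92, 0.08))  # → "bulk_transfer"
```

The predicted label could then drive a policy table, for example mapping each class to a bandwidth share.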

Internet security

The increase in the number of cyber attacks is forcing organizations to continuously monitor and correlate millions of external and internal data points across the entire network infrastructure and its users. Manually managing a large volume of data in real time becomes difficult. This is where machine learning comes in handy.

Machine learning can recognize patterns and anomalies in the network and predict threats in large data sets, all in real time. Automating this analysis makes it easier for network managers to detect threats and quickly isolate incidents with minimal human effort.

Identification and prevention of cyber attacks

Network behavior is an important parameter in machine learning systems for anomaly detection. Machine learning engines process huge amounts of data in real time to identify threats, unknown malware, and policy violations.

If the observed behavior of the network matches the expected baseline behavior, the network transaction is accepted; otherwise, an alert is triggered in the system. This can be used to detect and help prevent many types of attacks such as DoS, DDoS, and probing.
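The accept-or-alert decision described above can be sketched with a very simple statistical baseline. This example assumes a single hypothetical metric (requests per second) and a three-standard-deviation threshold; real systems learn far richer behavioral profiles, and both the threshold and the sample values are illustrative.

```python
from statistics import mean, stdev


def learn_baseline(normal_rates):
    """Summarize normal behavior as (mean, standard deviation)."""
    return mean(normal_rates), stdev(normal_rates)


def is_anomalous(baseline, rate, threshold=3.0):
    """Flag a rate that is more than `threshold` deviations from the mean."""
    mu, sigma = baseline
    return abs(rate - mu) > threshold * sigma


# Hypothetical requests-per-second samples observed during normal operation.
normal_rates = [100, 104, 98, 101, 97, 103, 99, 102]
baseline = learn_baseline(normal_rates)

accepted = not is_anomalous(baseline, 102)  # within the learned baseline
alert = is_anomalous(baseline, 950)         # flood-like spike triggers an alert
```

A sudden spike far outside the baseline, as in a DoS flood, is flagged, while ordinary fluctuation is accepted.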

Phishing prevention

It is quite easy to trick someone into clicking on a malicious link that looks legitimate; the attacker then attempts to break through a computer’s defense systems with the information gathered. Machine learning helps flag suspicious websites so that people do not sign in to malicious sites.

For example, a machine learning text classifier can analyze URLs and identify spoofed phishing URLs, creating a much safer browsing experience for end users.
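One way such a URL text classifier could look is sketched below: a small multinomial naive Bayes model over URL tokens with add-one smoothing. The class name, the tokenization rule, and the four training URLs (which use reserved example domains) are assumptions made for illustration; a usable model would need a large labeled corpus.

```python
import re
from collections import Counter
from math import log


def tokens(url):
    """Split a URL into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", url.lower())


class UrlNaiveBayes:
    """Hypothetical multinomial naive Bayes over URL tokens."""

    def fit(self, urls, labels):
        self.priors = Counter(labels)
        self.counts = {label: Counter() for label in self.priors}
        for url, label in zip(urls, labels):
            self.counts[label].update(tokens(url))
        self.vocab = set().union(*self.counts.values())
        return self

    def predict(self, url):
        n_docs = sum(self.priors.values())

        def score(label):
            counts = self.counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            total = log(self.priors[label] / n_docs)
            for tok in tokens(url):
                total += log((counts[tok] + 1) / denom)  # add-one smoothing
            return total

        return max(self.counts, key=score)


# Tiny, made-up training set using reserved example domains.
train_urls = [
    "https://www.example.com/login",
    "https://docs.example.org/guide",
    "http://example.com.verify-account.test/login",
    "http://secure-update.example.verify.test/confirm",
]
train_labels = ["legitimate", "legitimate", "phishing", "phishing"]

model = UrlNaiveBayes().fit(train_urls, train_labels)
verdict = model.predict("http://example.org.verify-login.test/update")
```

On this toy data the spoofed-looking URL is scored as phishing because tokens such as "verify" and "test" appear only in the phishing examples.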

Integrating machine learning into networking is not limited to the use cases mentioned above. By identifying opportunities and research directions from both a networking and a machine learning perspective, further solutions can be developed that apply ML to open problems in networking and network security.