Weaponizing Machine Learning Systems

The arrival of machine learning has been a major technological breakthrough. But its adoption also brings new and unexpected security vulnerabilities, many of them so subtle and intricate that detecting them is a serious challenge. The study of the threat landscape of machine learning systems is still fairly new, and researchers are actively investigating its flaws and weaknesses.

Categorizing Adversarial Attacks

As machine learning systems become more prevalent, it is logical to assume that adversaries will learn how to attack them. Depending on how much access the adversary has to the victim model, adversarial attacks against machine learning systems fall into these main categories:

  • White-Box Attacks– Under these attacks, the adversary has direct access to the model. Such access can be gained by acquiring the model’s soft assets, such as its parameters or the datasets it was trained on.
  • Black-Box Attacks– These attacks do not require direct access to the ML model. The adversary sends many queries to the victim model and observes its outputs, then uses the query results to train a copy (substitute) model. Attacks crafted against the substitute are then aimed at the targeted ML system.
  • Grey-Box Attacks– This method falls between white-box and black-box attacks. A grey-box attack assumes that the adversary has some information about the targeted ML model. Attackers can download a pre-trained model, or develop a new one similar to the victim model, and use it to craft attacks. The idea is that attacks crafted on a comparable model can work against the victim even if the two are architecturally different. Put differently, an adversary who can attack one model can often reuse those attacks against other models that approximate it.
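
The query-and-copy loop behind a black-box attack can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real attack tool: the victim here is a made-up linear classifier, and the names `victim_predict` and `train_substitute`, the secret weights, and the query budget are all hypothetical. The attacker only ever calls `victim_predict` and never reads the secret weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical victim: a linear classifier whose weights the attacker never sees.
_SECRET_W = np.array([1.5, -2.0, 0.5])
_SECRET_B = 0.25


def victim_predict(x):
    """Black-box API (assumed): returns only hard labels, 0 or 1."""
    return (x @ _SECRET_W + _SECRET_B > 0).astype(int)


def train_substitute(queries, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression substitute to the victim's answers
    using plain batch gradient descent."""
    w = np.zeros(queries.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(queries @ w + b)))  # predicted probabilities
        grad = p - labels                              # gradient of the log loss
        w -= lr * queries.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b


# Step 1: query the victim on inputs chosen by the attacker.
queries = rng.normal(size=(2000, 3))
labels = victim_predict(queries)

# Step 2: train a copy model on the (query, answer) pairs.
w_sub, b_sub = train_substitute(queries, labels)

# Step 3: check how closely the copy mimics the victim on unseen inputs.
fresh = rng.normal(size=(1000, 3))
substitute_labels = (fresh @ w_sub + b_sub > 0).astype(int)
agreement = (substitute_labels == victim_predict(fresh)).mean()
print(f"substitute agrees with victim on {agreement:.1%} of fresh inputs")
```

Once the substitute agrees closely with the victim, the adversary can study it offline with full white-box access, which is what makes this kind of attack practical against systems that expose only a prediction API.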

Types of ML Attacks

Machine learning-based systems have become widespread in recent years, and this trend is only set to accelerate.

[Image: Benefits of AI vary for defenders and attackers]

With the increase in deployment comes a growing reliance on these systems to make important decisions. Hence, we need to recognize their inherent flaws and the ways in which malicious actors can exploit state-of-the-art ML systems. Malicious attacks against ML systems can be broadly classified into four main categories based on the attacker's goal.

  • Confidentiality Attacks– These attacks expose the training data through careful observation of how a model’s predictions vary for different inputs. An adversary can repeatedly query a trained model to obtain users’ private information, thereby compromising the confidentiality of the system.
  • Integrity Attacks– Under this attack, the adversary alters the training data of the model, which eventually impairs the normal functioning of the ML system. By poisoning the training data, attackers can shift the boundary between what the classifier identifies as good input and what it identifies as bad input. For instance, integrity attacks can be used to bypass spam or malware classification, to inaccurately promote a product in an online recommendation system, or to evade network anomaly detection. This can reflect negatively on the company employing the victim ML model.
  • Availability Attacks– These attacks happen when an adversary effectively takes control of an ML system's output with a small but carefully chosen modification to the input. The change is imperceptible to the human eye, but to the model the input looks entirely different, so the model can be led to produce false output. For instance, subtle alterations to a traffic sign can confuse autonomous cars and potentially cause them to crash.

[Image: Small alterations to traffic signs can confuse ML systems]

  • Replication Attacks– Here an adversary reverse engineers the ML model to construct a copy model and then uses it to craft attacks on the original victim model. Replication attacks are also carried out to steal intellectual property. According to researchers, these “copy models” can be used to craft transferable attacks, i.e. attacks that also work against other models trained on similar data.
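
The "small but carefully chosen modification" described under availability attacks can be illustrated on a toy linear classifier. The article does not name a specific technique, so the sketch below uses a sign-of-gradient step (in the spirit of the fast gradient sign method) as one possible example. The 784-feature input, the random weights, and the bias that fixes the clean score at 5.0 are all assumptions chosen to make the demo reproducible.

```python
import numpy as np

rng = np.random.default_rng(1)
n_features = 784  # e.g. a flattened 28x28 image; an assumption for illustration

# Hypothetical linear classifier and a clean input with features in [0, 1].
w = rng.normal(size=n_features)
x = rng.uniform(0.0, 1.0, size=n_features)
b = 5.0 - x @ w  # bias picked so the clean score is +5.0 (reproducible demo)


def score(v):
    """Raw decision score; positive means class 1."""
    return v @ w + b


def predict(v):
    return int(score(v) > 0)


# For a linear model, the gradient of the score with respect to the input is w.
# Moving every feature by eps against sign(w) changes the score by eps * sum|w|,
# so a tiny per-feature step adds up across many features.
clean_score = score(x)
eps = 1.01 * abs(clean_score) / np.abs(w).sum()  # just enough to cross the boundary
x_adv = x - eps * np.sign(clean_score) * np.sign(w)

print("clean prediction:", predict(x))
print("adversarial prediction:", predict(x_adv))
print(f"largest change to any single feature: {np.abs(x_adv - x).max():.4f}")
```

No single feature moves by more than `eps`, yet the prediction flips because the small changes accumulate over hundreds of features. A real attack on image data would also clip `x_adv` back to the valid pixel range, which this sketch omits for brevity.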

What’s Next?

Given the extensive deployment of machine learning-powered systems, it is evident that the new security threats these systems face need to be addressed proactively. Hackers with malicious intent lurk in the hazy corners of the web, and researchers are working to find every possible method hackers could use to manipulate an ML model so that we can develop security compliance systems accordingly. Scanta is also dedicated to developing security solutions that protect machine learning systems so that companies can employ them safely and widely. Our featured product, VA Shield, aims to protect VA chatbots from attacks at the machine learning level.