
A Peek Into The Black Box

Shot of a street at night from above. There are cars connected with light blue lines and circles. © AlinStock/stock.adobe.com

It is one of the most amazing feats of human engineering: recreating the workings of the human brain. Artificial neural networks can learn independently – and regularly surpass the skills of their developers. We can hardly comprehend how artificial intelligence (AI) works and what results it produces. But we should, says Prof. Daniel Neider. Alongside a team spread across two continents, he is working on making AI trustworthy.

Dortmund, Kaiserslautern, Austin (Texas) and Tempe (Arizona) – these are the cities where the members of the international and interdisciplinary group led by Prof. Daniel Neider and his doctoral student Rajarshi Roy work. They have made it their mission to make artificial intelligence easier to understand and, as a result, more trustworthy. The researchers – five men and one woman – come from the USA and Germany, but also from India and France. How did this collaboration across thousands of kilometers come to be?

Daniel Neider, a theoretical computer scientist and, since late 2022, Professor of Verification and Formal Guarantees of Machine Learning at the newly founded “Research Center Trustworthy Data Science and Security” of the University Alliance Ruhr, previously did research in Kaiserslautern and, before that, as a postdoctoral researcher in the USA. There he met colleagues with whom he stayed in touch. “Over time, this has grown into a long-term collaboration with several pieces of work that all build on one another. We simply complement each other and work well together: We in Germany do the theoretical work, while our American colleagues do the experimental work,” says Neider.

Trust through understanding

Many decisions made by artificial intelligence are puzzling, incomprehensible to humans – and sometimes even highly controversial. In areas where we already rely on the supposed incorruptibility and objectivity of artificial intelligence, we find that it is not that objective at all – because the data on which the decisions are based often comes from humans. Racial prejudice is reflected, for example, when credit-scoring algorithms in the USA discriminate against African Americans. The global public also learned in 2015 and 2018 that AI sometimes makes catastrophic mistakes, when (partially) autonomous vehicles caused fatal accidents in the USA.

Fully self-driving cars are not currently permitted in Germany – because they are still not 100% safe and we do not trust them yet. “It's difficult to trust a system without knowing exactly how it works. That’s our working hypothesis,” says Neider. His mission statement is “trust through understanding”.

A graphic with a table, illustrations of drones and airplanes. © TU Dortmund
For example, computer scientists can use formal logic to explain the behavior of an autonomously flying drone. This mathematical language is aimed at technicians who use artificial intelligence when developing their vehicles and aircraft, but do not understand it in detail.

Now the average passenger or typical car owner knows very little about why exactly an airplane stays in the air and how their own car works. But those who build vehicles or airplanes know a great deal. They designed them and programmed the systems within them – the output depends on their input into the system. If there is an error in the system, it can usually be identified and the system reprogrammed. “It’s different with AI. Instead of engineers programming the software, they run an algorithm over a mass of data and look for patterns. The problem with this is that, at some point, no one knows exactly what’s happening internally. The algorithm recognized a pattern, but the rest is a black box,” says Neider. “It is usually impossible for humans to predict the behavior of an AI system through simple observation,” adds Rajarshi Roy. “So our goal is to generate models that people can interpret.” In other words, the team is trying to shed some light on the black box.

On the one hand, this is necessary for establishing trust. On the other hand, legal requirements such as the European Union’s Artificial Intelligence Act demand a certain degree of transparency from AI. In the future, it must be possible to explain why someone does not get a loan – simply deferring to the superior judgment of data-driven artificial intelligence is not enough. “Our research is not aimed at the end customer, but primarily at those who have to work with AI – the technicians,” says Daniel Neider. “We want to describe how an artificial intelligence system works in their own language, using the methods they already know and are already using successfully to make classic hardware and software secure.”

“It is impossible for humans to predict the behavior of an AI system through simple observation.” Rajarshi Roy

Retranslating the behavior of artificial intelligence into a mathematical language, a programming language, so that engineers can understand it – that is roughly the goal. It helps to think of an elevator, an example that Neider often uses with his students. An elevator can be programmed to always return to the ground floor after it has finished its job. But is that also the best way of keeping travel and waiting times as short as possible, or of saving electricity? An AI system could figure this out by collecting data, such as waiting and travel times. On top of that, the elevator users could rate each trip. The more data the AI collects, the better. “In the end, the elevator will probably work optimally based on the available data – but we have no idea how exactly,” says Neider. “So we observe what the new controller is doing and describe it in the language of the engineers.”
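
How such a data-driven controller can emerge from logged data alone can be illustrated with a deliberately simplified sketch in Python. Everything in it – the log format, the travel-time model and the names – is an assumption made for this illustration, not the team's actual system: it merely picks the floor on which an idle elevator should wait so that the average way to past callers is shortest.

    # Toy illustration only: choose a parking floor for an idle elevator from
    # logged call data. Log format, travel-time model and names are assumptions.

    call_log = [0, 0, 3, 5, 0, 2, 0, 5, 5, 1, 0, 3, 5, 0]  # hypothetical: floor of each past call
    SECONDS_PER_FLOOR = 2.0  # assumed constant travel time between adjacent floors

    def expected_wait(parking_floor, calls):
        """Average time to reach a caller if the elevator waits at parking_floor."""
        return sum(abs(parking_floor - c) * SECONDS_PER_FLOOR for c in calls) / len(calls)

    floors = range(6)
    best = min(floors, key=lambda f: expected_wait(f, call_log))
    print(f"Learned parking floor: {best}")
    print(f"Expected wait there:   {expected_wait(best, call_log):.1f} s")

In this miniature version the learned rule can still be read off directly; a real learned controller is far more opaque, which is exactly why the team afterwards describes its observed behavior in a language engineers already know.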

A fact sheet for AI

The elevator manufacturer may still want to change the optimal behavior calculated by the AI – for instance, because there are special fire protection specifications not taken into account by the AI. In this case, the developers would have to ensure that the system learns new things from other data. However, an analysis of this kind is only possible if there is an understanding of how the system works. Daniel Neider calls this reverse engineering: “We break the complexity back down into individual steps and provide the engineers with a kind of fact sheet for the AI.” It contains formulas with temporal operators and if-then scenarios that describe what a system never does and what it always does, what it does next if x happens, and what it does if y happens at some point in the future. “We don’t just look at behavior in the moment, but at behavior over time,” says doctoral student Roy, and Neider adds: “The temporal aspect is our niche. Transferring classic computer science methods to AI is one of the trends that my group is successfully driving forward. It is gradually becoming clear that researchers with traditional computer science backgrounds who work on software reliability can make valuable contributions to making AI secure and understandable.”
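
One formalism with exactly these temporal operators is linear temporal logic (LTL). As a rough, hypothetical sketch of what entries on such a fact sheet might look like – the propositions moving, doors_open, request, serve, idle and ground_floor are invented for this illustration, not the team's actual output – the formulas could read:

    \begin{align*}
    &\mathbf{G}\,\lnot(\mathit{moving} \land \mathit{doors\_open})
      && \text{the doors are never open while the cabin is moving}\\
    &\mathbf{G}\,(\mathit{request} \rightarrow \mathbf{F}\,\mathit{serve})
      && \text{whenever a call is made, it is eventually served}\\
    &\mathbf{G}\,(\mathit{idle} \rightarrow \mathbf{X}\,\mathit{ground\_floor})
      && \text{if the elevator becomes idle, it returns to the ground floor in the next step}
    \end{align*}

Here G stands for “always”, F for “at some point in the future” and X for “in the next step” – formulas of this kind capture what a system never does, what it always guarantees, and what it does next when a particular event occurs.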

A worm's-eye view of an elevator shaft. © Luca Lorenzelli/stock.adobe.com
With artificial intelligence, elevators, for example, can be optimized.
A hand presses the down button in front of an elevator. © pornchai/stock.adobe.com
For the manufacturers of such systems, Prof. Neider's team describes exactly how the AI behaves. The “package insert” includes formulas with temporal operators and if-then scenarios.

A prototype has already been developed that was able to convincingly generate descriptions of the behavior of artificial intelligence that experts could understand – at least in the research setting. “It’s fundamental research,” says Daniel Neider, “and we first selected small, not overly complicated systems such as an autonomously flying drone in order to understand and describe them. The next step is then to try it out in the relevant application area, for example alongside engineers.”

Cristiano Ronaldo and AI

Research on trustworthiness is still a new topic in AI, but it is becoming increasingly important – because AI is being used in more and more areas. Against this background, two traditionally separate, almost contradictory approaches are being brought together again – inductive and deductive reasoning. The classic deductive method consists of deriving new facts using logical formulas and on the basis of axioms, i.e., assumptions that are deemed irrefutable. The inductive method involves deriving generalizations from data, just as neural networks do. “However, we have now recognized the added value that can come from combining the two methods,” says Daniel Neider. His team aims to use the best of both worlds: “It's about connecting rules to the data you have.”

A photo of Cristiano Ronaldo with other players on the soccer field. © picture alliance/ASSOCIATED PRESS/Jon Super
Which players are on the field? This question is still easier for the viewing public to answer than for an AI system, which has to work hard to find out that Cristiano Ronaldo cannot be on the field twice. Computer scientists hope to provide information like this to AI in the future to make it more efficient.

Daniel Neider gives a clear example: “AI should use the images of a soccer broadcast to find out which players are on the field. To us humans, it would be obvious that Cristiano Ronaldo, for example, can’t be on the field twice – that is an axiom. A neural network first has to work hard to find that out.” After all, the system never “sees” Ronaldo in the picture twice or more; it only ever recognizes him once or not at all. So, at some point, it reaches the conclusion: There is only one Ronaldo. Information like this – the laws of physics, for example – could be given to the AI from the start. Neider: “This is a very exciting development. It makes the processes more data-efficient, faster, and probably less error-prone.”
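
What handing such an axiom to the system could look like can be sketched in a few lines of Python – purely as an illustration under assumed names and data, not as the team's actual method: a hypothetical detector returns player names with confidence scores for one video frame, and the rule “each player is on the field at most once” is imposed on its output instead of being learned from scratch.

    # Toy sketch: combine a hard rule (axiom) with the output of a learned model.
    # The detector output below is invented example data.

    detections = [
        ("Ronaldo", 0.91), ("Ronaldo", 0.64),   # the axiom says one of these must be wrong
        ("Modric", 0.88), ("Kroos", 0.79),
    ]

    def apply_uniqueness_axiom(detections):
        """Keep only the most confident detection for each player name."""
        best = {}
        for name, confidence in detections:
            if name not in best or confidence > best[name]:
                best[name] = confidence
        return sorted(best.items(), key=lambda item: -item[1])

    print(apply_uniqueness_axiom(detections))
    # [('Ronaldo', 0.91), ('Modric', 0.88), ('Kroos', 0.79)]

The rule discards the contradictory detection immediately, without the network having to infer from thousands of frames that there is only one Ronaldo.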

Text: Katrin Pinetzki


About:

A photo of Prof. Daniel Neider, a man in a light blue shirt. © Oliver Dietze

Prof. Dr. Daniel Neider has held the professorship in “Verification and Formal Guarantees of Machine Learning” since the end of 2022. The professorship is the first to be filled at the new “Research Center Trustworthy Data Science and Security” of the University Alliance Ruhr and is based at the Department of Computer Science at TU Dortmund University. Daniel Neider studied computer science and business informatics at RWTH Aachen, where he also received his doctorate. He then spent two years as a postdoctoral researcher at the University of Illinois at Urbana-Champaign (USA) and the University of California, Los Angeles (USA). He worked at the Max Planck Institute for Software Systems in Kaiserslautern for five years, until he submitted his habilitation thesis in theoretical computer science at TU Kaiserslautern in 2022. After a year as a professor of security and explainability of learning systems at the Carl von Ossietzky University of Oldenburg, he joined TU Dortmund University.

Rajarshi Roy is a doctoral candidate at the Max Planck Institute for Software Systems in Kaiserslautern. Prof. Daniel Neider is supervising his doctoral thesis in computer science. He studied mathematics and computer science in Chennai (India).

A photo of doctoral student Rajarshi Roy in front of a whiteboard. A smiling man in a sweater. © Khushraj Madnani

This is an article from mundo, the research magazine of TU Dortmund University.
