In many areas, from medical engineering to logistics, perception models based on artificial intelligence (AI) have become indispensable. However, these models struggle to assess the reliability of their own predictions. Neural networks in particular, which are now established in many fields, have major difficulties with this kind of self-assessment. For many applications this is not a problem, because the predictions pose no risk. In safety-critical applications, however, it can have disastrous consequences. For example, a doctor may place too much trust in an AI-based visual diagnosis system if the model fails to detect a disease pattern yet reports high confidence.
In such cases, it would be better if the system indicated that it was uncertain about its prediction. The situation in autonomous driving is similar: whether an object is identified with high or low confidence makes a big difference for the overall system, because this plays a key role in planning driving maneuvers. Moreover, an undetected object in the vehicle's environment is not the only problem. An object that is falsely identified with a high confidence value can also trigger unnecessary emergency braking and thus create a risky situation.
AI predictions are being scrutinized
Most research and development for such AI-supported models focuses on maximizing classic performance metrics like accuracy, which indicates what share of the relevant data has been assigned correctly in, for example, a classification problem. While these metrics also play a major role in AI-based, safety-critical autonomous systems, they ignore other extremely important aspects: even with identical accuracy, a model that assigns a high uncertainty score to its wrong predictions is preferable to one that does not.
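The idea behind such uncertainty-aware evaluation can be illustrated with expected calibration error (ECE), a commonly used measure of how well a model's confidence scores match its actual accuracy. This is a generic sketch of the standard binned ECE, not one of the specific metrics used by the Fraunhofer IKS tool:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Average gap between mean confidence and accuracy per confidence
    bin, weighted by the fraction of samples falling in each bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece

# A well-calibrated model scores near 0; an overconfident model
# (high confidence on wrong answers) scores much worse, even if
# both have the same accuracy on some other dataset.
```

Two models with identical accuracy can thus differ sharply on ECE: a model that is 80% accurate while reporting 80% confidence is well calibrated, while one that is wrong with 99% confidence is not.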
This is where the online tool developed by Fraunhofer IKS comes in. It gives users deep insights into the overall reliability of a model's predictions. To this end, the results are assessed and explained using safety-specific metrics, developed and published by Fraunhofer IKS, that are tailored to safety-critical applications. In addition, the tool offers extensive visualization features for quickly identifying weaknesses in the model, along with detailed interpretations and suggestions for improvement that, in many cases, can help make the system under review more robust.
The tool also suggests methods for the respective problems that are usually easy to implement. These can improve robustness significantly and are based on prior experiments with large, public datasets at Fraunhofer IKS. The tool is easy to use and does not require uploading models or datasets, which may be subject to copyright protection.
To start the analysis, users simply upload a simply formatted JSON or XLS file containing the predictions and results of their AI model to the website. Once the evaluation is complete, they can also download a detailed analysis as a standardized PDF document.
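To give a sense of what such an upload might contain: the article does not specify the file schema, but a plausible minimal JSON file would hold one record per sample with the predicted label, the ground-truth label, and the model's confidence. The field names below are purely illustrative assumptions:

```python
import json

# Hypothetical record layout for the upload file -- the tool's actual
# schema is not specified in the article and may differ.
predictions = [
    {"sample_id": "img_0001", "true_label": "pedestrian",
     "predicted_label": "pedestrian", "confidence": 0.97},
    {"sample_id": "img_0002", "true_label": "pedestrian",
     "predicted_label": "cyclist", "confidence": 0.88},
]

# Serialize to JSON; this string could be saved as a file for upload.
payload = json.dumps(predictions, indent=2)
```

Only model outputs and ground truth are exported here, which matches the point above: neither the model itself nor the raw dataset needs to leave the user's premises.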
The online tool is slated to become generally available starting this fall. Interested parties can take part in user tests before the tool is published for the general public. Please sign up here and we’ll be happy to schedule an appointment.
This work was funded by the Bavarian Ministry for Economic Affairs, Regional Development and Energy as part of a project to support the thematic development of the Institute for Cognitive Systems.