Enhancing the Quality of Medical Data Annotation by Including Humans in the Machine Learning Loop



Abstract: Currently, the vast majority of Artificial Intelligence (AI) systems require human involvement for their creation, maintenance, tuning, and development. Machine Learning (ML) systems in particular benefit greatly from human expertise and experience. This explains the growing interest in how humans interact with these systems to get the best results, both for the AI systems and for the people involved. A variety of approaches have been researched and proposed in the literature, and they can be grouped under the umbrella term Human-in-the-Loop Machine Learning. Applying these techniques to medical informatics systems could have a great impact on diagnosis and prognosis, helping to improve health care for cancer-related illnesses.

 

Introduction 

Most Machine Learning (ML) systems require humans to participate in various stages of the AI pipeline. In light of this need, new kinds of interaction between machines and humans are being developed, which can be grouped under the name Human-in-the-Loop Machine Learning (HITL-ML) [11]. The objective is to make machine learning models more accurate, to reach the required accuracy sooner, and to make humans more efficient when training or using a machine learning model. In health care (and in other areas), datasets are scarce, so conventional ML models have too few training examples. Techniques such as those described in this article could help improve both the training process and the performance of the user.

 

Materials and Methods

Hybrid Intelligence Systems include several strategies aimed at increasing the capabilities of the machine, the human, or both. A taxonomy has been proposed based on characteristics of the task, the learning algorithm, the AI-Human interaction, and the Human-AI interaction. Here we discuss a variety of techniques that can be grouped under the umbrella of Human-in-the-Loop Machine Learning (HITL-ML) and that could be applied, alongside other methods, to cancer prognosis and diagnosis, where training samples are limited and domain expertise is costly.

 
1. Human-in-the-Loop Techniques

Human-in-the-Loop ML aims to make a machine learning model more accurate, to reach the desired performance sooner, to combine machine and human intelligence to improve accuracy, and to assist human tasks with machine learning in order to boost efficiency. The most important tasks mentioned are:

 

  • Annotating unlabeled data to generate training, validation, and evaluation data.
  • Rectifying the most important unlabeled data items.
  • Incorporating Human-Computer Interaction techniques into the annotation.

 

Depending on who controls the learning process, we identify different methods: Active Learning, Interactive Machine Learning, and Machine Teaching.

 

2. Active Learning

One of the most basic techniques is Active Learning (AL) [4]. The system controls the learning process and treats humans as oracles that label relevant unlabeled data. It is particularly beneficial when labeling examples is costly or time-consuming, and when instances are scarce (e.g. cancer). AL obtains training data through an interactive, iterative process, unlike passive or classical learning, where the data is provided in advance. The learner queries the oracle for labels, choosing which examples to ask about according to one of several query strategies.
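The query loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the setup used in the experiments: it assumes a one-dimensional pool, a simple threshold classifier as the learner, and uncertainty sampling (query the point closest to the current decision boundary) as the query strategy. The names `fit_threshold` and `active_learning`, and the expert's boundary of 0.62, are hypothetical.

```python
def fit_threshold(labeled):
    """Fit a 1-D rule 'predict 1 when x >= threshold' with the fewest errors."""
    xs = sorted(x for x, _ in labeled)
    # Candidate thresholds: below all points, between neighbours, above all points.
    candidates = [xs[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    candidates += [xs[-1] + 1.0]

    def errors(t):
        return sum(int(x >= t) != y for x, y in labeled)

    return min(candidates, key=errors)


def active_learning(pool, seed_points, oracle, n_queries):
    """Pool-based active learning with uncertainty sampling.

    `oracle` stands in for the human expert: it takes an example and
    returns its label (0 or 1). `seed_points` must be members of `pool`.
    """
    pool = list(pool)  # work on a copy so the caller's pool is untouched
    labeled = []
    for x in seed_points:
        pool.remove(x)
        labeled.append((x, oracle(x)))
    threshold = fit_threshold(labeled)

    for _ in range(n_queries):
        if not pool:
            break
        # The most uncertain example is the one nearest the decision boundary.
        query = min(pool, key=lambda x: abs(x - threshold))
        pool.remove(query)
        labeled.append((query, oracle(query)))  # ask the human for a label
        threshold = fit_threshold(labeled)      # retrain on the enlarged set
    return threshold, labeled


if __name__ == "__main__":
    pool = [i / 200 for i in range(201)]
    expert = lambda x: int(x >= 0.62)  # hypothetical ground truth
    threshold, labeled = active_learning(pool, [0.0, 1.0], expert, n_queries=10)
    print(f"threshold={threshold:.4f} after {len(labeled)} labels")
```

In this toy setting the learner homes in on the boundary with a dozen labels out of a pool of 201, which is the kind of saving that matters when each label costs an expert's time.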

 

3. Interactive Machine Learning 

Another approach is Interactive Machine Learning (IML), in which users and the learning system interact more closely, and people supply information in a more focused, frequent, and incremental way than in traditional machine learning. Here, control of the learning process is shared between the learning system and the users, who work closely together to their mutual benefit.
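One way to picture this incremental, shared-control loop is an online learner that proposes a label for each incoming example and updates immediately whenever the user corrects it. The sketch below assumes a plain perceptron as the learner; `interactive_session` and `ask_human` are hypothetical names, and a real IML tool would put a user interface where `ask_human` is called.

```python
def predict(weights, bias, features):
    """Linear classifier: label 1 when the score is non-negative."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if score >= 0 else 0


def interactive_session(stream, ask_human, n_features, learning_rate=1.0):
    """Process examples one at a time, learning from each human correction.

    `ask_human(features, guess)` stands in for the user: it returns the
    correct label, either confirming or overriding the model's guess.
    """
    weights = [0.0] * n_features
    bias = 0.0
    corrections = 0
    for features in stream:
        guess = predict(weights, bias, features)
        label = ask_human(features, guess)
        if label != guess:
            # Perceptron update: move the boundary towards the corrected label.
            step = learning_rate * (label - guess)
            weights = [w + step * f for w, f in zip(weights, features)]
            bias += step
            corrections += 1
    return weights, bias, corrections


if __name__ == "__main__":
    cases = [(2, 0), (0, 2), (3, 1), (1, 3)]
    user = lambda features, guess: int(features[0] > features[1])
    weights, bias, corrections = interactive_session(cases * 2, user, n_features=2)
    print(weights, bias, corrections)
```

The model changes after every correction rather than after a full retraining cycle, so the user sees the effect of their feedback on the very next example.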

 

4. Machine Teaching

Finally, Machine Teaching (MT) focuses on the teaching role that humans can play, in order to generate useful data from the information available. Developing new models currently requires experts with a deep understanding of machine learning; this approach separates knowledge of machine learning techniques from the teaching process. The human acts as a teacher who guides the learning process. A specific form of MT is Iterative Machine Teaching (iMT), which seeks the best training set given a machine learning algorithm and a target model. The idea is to learn a concept in only a few iterations, using the smallest possible dataset.
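To make the iMT idea concrete, here is a small sketch of a greedy teacher that knows the target model and, at every iteration, hands the learner the example that would move it closest to that target. It assumes a linear learner trained by squared-loss gradient steps and a teacher with full access to the learner's weights, in the spirit of the "omniscient teacher" described in the iMT literature; it is not necessarily the teacher used in the experiments reported here. All function names are hypothetical.

```python
import random


def sgd_step(w, x, y, lr):
    """One squared-loss gradient step for a linear model y ~ w . x."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [wi - lr * residual * xi for wi, xi in zip(w, x)]


def distance_sq(a, b):
    """Squared Euclidean distance between two weight vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def make_omniscient_teacher(w_target, lr):
    """Teacher that picks the example whose update lands nearest the target."""
    def choose(pool, w):
        return min(
            pool,
            key=lambda ex: distance_sq(sgd_step(w, ex[0], ex[1], lr), w_target),
        )
    return choose


def make_random_teacher(seed=0):
    """Baseline: examples are drawn at random, as in ordinary training."""
    rng = random.Random(seed)
    return lambda pool, w: rng.choice(pool)


def teach(pool, w_start, lr, n_iters, choose):
    """Run the teaching loop: the teacher selects, the learner updates."""
    w = list(w_start)
    for _ in range(n_iters):
        x, y = choose(pool, w)
        w = sgd_step(w, x, y, lr)
    return w


if __name__ == "__main__":
    w_target = [2.0, -1.0]  # the model the teacher wants the learner to reach
    pool = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
    for name, teacher in [
        ("omniscient", make_omniscient_teacher(w_target, lr=0.5)),
        ("random", make_random_teacher()),
    ]:
        w = teach(pool, [0.0, 0.0], lr=0.5, n_iters=20, choose=teacher)
        print(name, distance_sq(w, w_target))
```

The sequence of examples the teacher selects is the teaching set; the fewer iterations needed to get close to the target, the smaller that set is.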

 
5. Implementing and Interpreting the Results

After the model has been deployed and is in use in a production environment, we could apply Explainable AI (XAI) [11] to make the outcomes generated by AI systems easier for humans to understand. There are particular domains in which the above methods can help meet the goals of the model. In particular, these ML-based approaches could be especially useful in Health Informatics, where large AI training datasets are lacking, the data is complex or the events are rare, and traditional algorithms are hampered by the lack of training samples.

 


Results

To date, we have examined two of the methods identified above: Iterative Machine Teaching (iMT) and Active Learning (AL). We have looked at ways to integrate them into the learning process using common datasets such as Gaussian, MNIST, and Vehicles. Our proposal for integrating iMT with AL in the machine learning loop is to use iMT to obtain what is known as the "Minimum Viable Data (MVD)" for training a learning model: a set of data that speeds up and simplifies the learning process by allowing early prototypes to be built.

 

The results of applying iMT and AL to known datasets can be found in [12]. In the iMT study, the results show, both on synthetic problems and on real-world problems, that algorithms trained by any of the proposed teachers perform better than those trained on randomly selected examples. In our AL experiment, we find that the main benefit of the method lies in the continual improvement of the model, which increases its resilience and reduces obsolescence.

 

Conclusions 

The strategies presented (combined or separately) can be applied to a particular area (cancer diagnosis and prognosis), making Machine Learning (ML) methods accessible to domain specialists, improving the efficiency of both the system and the humans (i.e. HITL-ML), and yielding understandable, semantic ML models (i.e. explainable AI).

 

What GTS Offers

At Global Technology Solutions, we create training datasets to provide the needed support for medical data sets. GTS offers a wide variety of services, including annotation, a tuberculosis X-ray dataset, data collection, and secure transcription, to support your AI healthcare datasets for machine learning.

