Improving Machine Learning for Underrepresented Groups
Machine-learning models often struggle to make accurate predictions for individuals who are underrepresented in the data they were trained on. For example, a model designed to recommend treatments for chronic diseases might be trained predominantly on data from male patients. This could lead to inaccurate predictions for female patients when used in a medical setting.
To address this issue, engineers can balance the training data by removing data points until every subgroup is equally represented. While this approach can help, it often requires discarding a large amount of data, which can hurt the model’s overall performance.
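For illustration, this kind of balancing is often done by downsampling every subgroup to the size of the smallest one. A minimal sketch, assuming a pandas DataFrame with a hypothetical subgroup column (for example, patient sex):

```python
import pandas as pd

def balance_by_subsampling(df: pd.DataFrame, group_col: str, seed: int = 0) -> pd.DataFrame:
    """Downsample every subgroup to the size of the smallest one."""
    smallest = df[group_col].value_counts().min()
    # Keep an equal-sized random sample from each subgroup; everything else is discarded.
    return df.groupby(group_col, group_keys=False).sample(n=smallest, random_state=seed)
```

The drawback is easy to see: if one subgroup supplies 9,000 of 10,000 records and another supplies 1,000, reaching parity means throwing away 8,000 training examples.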
MIT researchers have developed a new approach that identifies and removes the specific points in a training dataset that contribute most to a model’s failures on minority subgroups. By removing far fewer data points than other approaches do, this technique preserves the model’s overall accuracy while improving its performance for underrepresented groups.
The technique can also uncover hidden sources of bias in a training dataset that lacks labels, and unlabeled data is far more common than labeled data in many fields. The method could eventually be combined with other strategies to make machine-learning models fairer in high-stakes settings such as health care, where it could help keep biased AI models from misdiagnosing underrepresented patients.
“Many algorithms that try to tackle this issue assume that every data point is equally important. Our research shows that’s not the case. Specific data points contribute to bias, and by identifying and removing them, we can improve performance,” explains Kimia Hamidieh, an electrical engineering and computer science (EECS) graduate student at MIT and co-lead author of a paper on this method.
Hamidieh collaborated with co-lead authors Saachi Jain PhD ’24 and fellow EECS graduate student Kristian Georgiev; Andrew Ilyas MEng ’18, PhD ’23, a Stein Fellow at Stanford University; and senior authors Marzyeh Ghassemi, an associate professor in EECS, and Aleksander Madry, the Cadence Design Systems Professor at MIT. Their research will be presented at the Conference on Neural Information Processing Systems.
Identifying and Removing Problematic Data Points
Machine-learning models are often trained on massive datasets collected from various online sources. These datasets are too large to be manually curated, so they may include flawed examples that degrade model performance.
Researchers know that some data points affect a model’s performance on certain tasks more than others. The MIT team built on this insight in an approach that identifies and removes those problematic data points. Their goal is to reduce worst-group error, which occurs when a model underperforms on minority subgroups in a training set.
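In this context, worst-group accuracy is simply the model’s accuracy on whichever subgroup it handles worst. As a minimal sketch (not code from the paper), assuming NumPy arrays of predictions, true labels, and subgroup identifiers:

```python
import numpy as np

def worst_group_accuracy(preds: np.ndarray, labels: np.ndarray, groups: np.ndarray) -> float:
    """Return the lowest per-subgroup accuracy -- the quantity the method aims to raise."""
    per_group = [(preds[groups == g] == labels[groups == g]).mean() for g in np.unique(groups)]
    return float(min(per_group))
```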
Their new technique builds on previous work where they developed a method called TRAK, which identifies the most significant training examples for a particular model output.
For the new approach, they use TRAK to pinpoint training examples that contribute most to incorrect predictions for minority subgroups. By aggregating this information across incorrect predictions, they can identify the specific parts of the training data that lower the worst-group accuracy.
Removing these specific samples and retraining the model helps maintain its overall accuracy while boosting its performance for minority subgroups.
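In outline, that pipeline can be sketched as follows. This is an illustration rather than the authors’ exact algorithm; it assumes a hypothetical attribution matrix with one row per misclassified minority-group example and one influence score per training point (the kind of output a TRAK-style attribution method could provide), with higher scores taken to mean a stronger contribution to the mistake:

```python
import numpy as np

def flag_harmful_examples(attributions: np.ndarray, num_to_remove: int) -> np.ndarray:
    """
    attributions: array of shape (num_mistakes, num_train), where row i holds
    the influence of every training point on the i-th misclassified
    minority-group example. Returns indices of training points to drop.
    """
    # Aggregate influence across all mistakes made on the worst-performing group.
    total_influence = attributions.sum(axis=0)
    # The highest-scoring training points are the ones flagged for removal.
    return np.argsort(total_influence)[-num_to_remove:]

# Workflow sketch: compute scores, drop the flagged points, retrain on the rest.
# keep = np.ones(attributions.shape[1], dtype=bool)
# keep[flag_harmful_examples(attributions, k)] = False
# retrained_model = train(train_set[keep])
```

How many points to drop is a tuning choice, which could be set by checking worst-group accuracy on held-out data.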
A More Accessible Solution
The MIT technique outperformed several methods across three machine-learning datasets, improving worst-group accuracy while removing significantly fewer training samples than conventional data balancing methods. It also achieved higher accuracy compared to methods that involve altering a model’s internal mechanisms.
Since the MIT method involves modifying the dataset rather than the model itself, it is more accessible for practitioners and applicable to various model types.
Moreover, it can be used even when the bias is unknown because subgroups in a training dataset are unlabeled. By pinpointing the most influential data points, practitioners can also gain insight into the features the model uses to make predictions.
“This tool is available to anyone training a machine-learning model. They can examine the data points to see if they align with the model’s intended capabilities,” says Hamidieh.
Although using the technique to detect unknown subgroup biases requires intuition about which groups to target, the researchers aim to validate and explore it further through future human studies.
They also seek to enhance the technique’s performance and reliability and ensure it is accessible and user-friendly for practitioners who might use it in real-world applications.
“Tools like this help you critically evaluate data, identifying which data points might lead to bias or other issues, offering a first step toward developing fairer, more reliable models,” says Ilyas.
This research is supported in part by the National Science Foundation and the U.S. Defense Advanced Research Projects Agency.