
The State of Competitive Machine Learning 2023 is a great read! Every working data scientist can learn from what wins challenges, so I've picked out some key insights below.
Challenges are like prototype environments: participants want to get to the best answer quickly, efficiently, and with minimal code. If you want quick results and improved accuracy in your own work, you'll learn something here.
-
We know Python, deep learning, and PyTorch are a common stack for winners, but the extent of their dominance is surprising: it's basically a clean sweep. This has definitely changed in the last couple of years. In my opinion, TensorFlow and Keras are still worth considering, especially for edge, mobile, and browser deployment, but to keep up with the latest you've got to know PyTorch.
-
Obviously, deep learning won most NLP and CV challenges, and a unified API for pretrained models is essential, particularly timm and Hugging Face's transformers. Both let you try a different architecture or pretraining regime with very little code change. They turn "no free lunch" into a buffet!
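To make the "very little code change" point concrete, here's a minimal sketch using Hugging Face's transformers. The checkpoint names are just illustrative examples, not models called out in the report; swapping the pretrained backbone is a one-string change.

```python
# Minimal sketch: trying two different pretrained checkpoints with the same code.
# The checkpoint names below are illustrative examples, not report recommendations.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

for checkpoint in ["distilbert-base-uncased", "roberta-base"]:
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
    inputs = tokenizer("Competitions reward fast iteration.", return_tensors="pt")
    logits = model(**inputs).logits  # classification head is freshly initialized, ready to fine-tune
    print(checkpoint, logits.shape)
```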
-
Some interesting things that don't happen in challenges: training from scratch, custom pretraining, and novel architectures. These likely don't make sense for challenges because of time and computation constraints, but we often explore them in our own work. Competitions help demonstrate that for many problems, the work-to-return ratio of starting from first principles just isn't worth it.
-
Interesting to see EfficientNet as the most popular pretrained architecture; we've found it to be sensitive to hyperparameters when fine-tuning. Competitions saw some ConvNeXt activity, and I think more is coming, so give it a shot.
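On the vision side, a quick timm sketch in the same spirit (the architecture names are just examples from timm's model zoo, not code from any winning solution): swapping EfficientNet for ConvNeXt is a one-line change.

```python
# Sketch: EfficientNet vs. ConvNeXt behind the same timm API.
import timm
import torch

for arch in ["efficientnet_b0", "convnext_tiny"]:
    model = timm.create_model(arch, pretrained=True, num_classes=10)  # new 10-class head for fine-tuning
    x = torch.randn(1, 3, 224, 224)  # dummy image batch
    print(arch, model(x).shape)  # both produce (1, 10) logits
```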
-
LightGBM and XGBoost continue to be what you should reach for first for anything that isn't computer vision or NLP. Both perform well for tabular data and model ensembling, and both appear across ML competition solutions. In my mind, neither is the clear winner, and I still recommend trying both and settling the question empirically.
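"Settling the question empirically" can be as simple as the sketch below, with synthetic data standing in for your own features and target:

```python
# Sketch: fit LightGBM and XGBoost on the same split and compare a validation metric.
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from lightgbm import LGBMClassifier
from xgboost import XGBClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

for name, model in [("LightGBM", LGBMClassifier()), ("XGBoost", XGBClassifier())]:
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
    print(f"{name}: validation AUC = {auc:.3f}")
```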
-
Augmentation is common for computer vision (albumentations), and test-time augmentation sometimes appears in winning solutions. I think both training-time and test-time augmentation are under-utilized in production ML workflows at the moment. It is also clear that NLP is missing both the methods and the libraries for effective augmentation, but I think that will change in the near future.
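For reference, here's a hedged sketch of what that looks like in practice: a generic albumentations training pipeline plus a crude form of test-time augmentation (averaging predictions over the original image and a horizontal flip). The `model` argument is assumed to be any image classifier returning logits, and the transforms are illustrative rather than a recipe from the report.

```python
# Sketch of train-time augmentation (albumentations) and simple test-time augmentation.
import albumentations as A
from albumentations.pytorch import ToTensorV2
import numpy as np
import torch

# Train-time pipeline: apply as train_tfms(image=img)["image"] inside your Dataset.
train_tfms = A.Compose([
    A.Resize(256, 256),
    A.RandomCrop(224, 224),
    A.HorizontalFlip(p=0.5),
    A.ColorJitter(p=0.3),
    A.Normalize(),
    ToTensorV2(),
])

def predict_with_tta(model, image: np.ndarray) -> torch.Tensor:
    """Average logits over the original image and a horizontally flipped copy."""
    base = A.Compose([A.Resize(224, 224), A.Normalize(), ToTensorV2()])
    views = [image, np.ascontiguousarray(image[:, ::-1, :])]  # original + horizontal flip (HWC)
    batch = torch.stack([base(image=v)["image"] for v in views])
    with torch.no_grad():
        return model(batch).mean(dim=0)  # average logits across augmented views
```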
A couple bonus observations:
- 😍 More challenges in areas the DrivenData community cares about, particularly conservation, energy, and medicine
- 😍 Novel competition structures for things like reinforcement learning and weak supervision
- 😍 Cross-validation gets a big shoutout! Do it. We still see it neglected (a quick sketch follows this list)
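Since cross-validation is one of the cheapest upgrades available, here's a minimal scikit-learn sketch (toy data and model, purely illustrative); the spread across folds is as informative as the mean.

```python
# Sketch: 5-fold stratified cross-validation; report the mean and the spread, not just one split.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(GradientBoostingClassifier(), X, y, cv=cv, scoring="roc_auc")
print(f"AUC: {scores.mean():.3f} +/- {scores.std():.3f}")
```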
I'm proud of DrivenData's representation in the report, especially the callout to our openness.
HUGE thanks to ML Contests for all their hard work! 🙏