A meaningful evaluation framework is crucial to the success of any Machine Learning (ML) application, yet structural biases often creep in unnoticed. When this happens, the results may not be as reliable as they initially appear.

This talk will identify common pitfalls and illustrate them with real-world examples from nearly two decades of experience in the data science field. We will explore the hidden story behind the performance metrics, moving beyond a single F-measure or accuracy score to delve into the intricacies of the dataset and its domain. We'll discuss how to identify artificial biases in your data and offer strategies for preventing them through rigorous design of your data collection and annotation processes.

Ultimately, this talk will provide a list of practical recommendations for building ML projects on solid foundations. Amid the current AI boom and hype, we urgently need high-quality datasets, meaningful evaluations, and robust algorithms to ensure we are not just building elaborate sandcastles with GPUs.
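To make the point about single scores concrete, here is a minimal sketch (not from the talk itself) of how an aggregate accuracy number can mask a structural weakness. It assumes a scikit-learn-style workflow on synthetic, imbalanced data; the specific model and data generation are illustrative choices only.

```python
# Minimal sketch: an aggregate metric can hide a per-class weakness.
# Synthetic imbalanced data and a simple classifier, for illustration only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Imbalanced toy data: roughly 90% of samples belong to class 0.
X, y = make_classification(
    n_samples=2000, n_features=20, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)

# The single accuracy number looks comfortable...
print(f"accuracy: {accuracy_score(y_test, y_pred):.2f}")

# ...but the per-class breakdown tells the real story:
# recall on the minority class is typically far lower.
print(classification_report(y_test, y_pred, digits=2))
```

A majority-class baseline already scores around 0.90 accuracy here, which is why per-class (or per-slice) reporting is the more honest lens.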
I am a machine learning and NLP engineer who firmly believes in the power of data to transform decision making in industry. I have a Master's in Computer Science (software engineering), a PhD in Sciences (Bioinformatics), and nearly two decades of experience in Natural Language Processing and Machine Learning, including in the pharmaceutical and food industries.

I am passionate about open source, and have worked on the maintenance of various popular Python packages, including spaCy, Typer and FastAPI. I also run a one-woman consulting company called OxyKodit, developing tailored NLP solutions for a variety of businesses and domains. Throughout my code and projects, I care deeply about quality assurance and testing, introducing proper levels of abstraction, and ensuring code robustness and modularity.