We want to thank Lauren Erdman, Oren Kraus, Eleni Triantafillou and Marc Law from U of T’s Machine Learning Group in the Department of Computer Science for giving us great talks on their latest research progress last Tuesday and Thursday!
Oren Kraus talked about Microscopy in Biology. He presented a few innovative ideas and cutting-edge techniques that he has been trying out in his research. For example, in Yeast Proteome Dynamics, he uses methods such as computer vision, multiple instance learning, and fully-connected CNNs to extract features from yeast cells and to perform cell recognition and classification. These techniques can be applied to tissue identification and cancer diagnosis.
Lauren Erdman emphasised the importance of understanding the training dataset. She gave three examples in which a misused dataset led to unexpected outcomes. The first example concerned ethical decision-making in health care: degenerate ER risk assessments may be biased by non-random treatment and misdiagnose the condition. The second example showed that a machine learning model can learn the wrong information from its training data when there is, for instance, structural bias in image data. Moreover, a model may also learn bias from historical data. Thus, it is important for us to know where a dataset comes from and what it represents, so that we can mitigate its bias.
Eleni Triantafillou’s research is on few-shot learning, which aims to learn new concepts from only a few examples. To make the most of the small amount of data, each data point is viewed as a “query” that ranks the other points by their predicted relevance to it. This structured prediction framework defines a model that optimizes Mean Average Precision and performs just as well as other algorithms.
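To make the ranking idea concrete, here is a minimal sketch of how Mean Average Precision could be computed when every point acts as a query. It is not the model from the talk: the function names, the use of cosine similarity as the relevance score, and the rule that a point is "relevant" when it shares the query's label are all assumptions made for illustration.

```python
import numpy as np

def average_precision(relevant_sorted):
    """Average precision for one query.

    relevant_sorted is a 0/1 sequence in ranked order (best match first).
    """
    relevant_sorted = np.asarray(relevant_sorted, dtype=float)
    n_relevant = relevant_sorted.sum()
    if n_relevant == 0:
        return 0.0
    hits = np.cumsum(relevant_sorted)               # relevant items seen so far
    ranks = np.arange(1, len(relevant_sorted) + 1)  # 1-based rank positions
    precision_at_k = hits / ranks
    # Average the precision values at the positions of the relevant items.
    return float((precision_at_k * relevant_sorted).sum() / n_relevant)

def mean_average_precision(embeddings, labels):
    """Treat each point as a query that ranks all other points.

    Assumption: relevance is scored by cosine similarity between embeddings,
    and a ranked point counts as relevant if it shares the query's label.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    scores = []
    for q in range(len(labels)):
        others = np.array([i for i in range(len(labels)) if i != q])
        order = others[np.argsort(-sims[q, others], kind="stable")]
        relevant = (labels[order] == labels[q]).astype(float)
        scores.append(average_precision(relevant))
    return float(np.mean(scores))
```

A learned model would produce the embeddings; here they are simply passed in, so the sketch only covers the evaluation side of the framework.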
Marc Law taught us about clustering, which groups examples so that similar ones fall into the same cluster and dissimilar ones into different clusters. Two problem-dependent factors affect the quality of clustering: the chosen similarity metric and the data representation. Supervised clustering approaches try to find the metric that optimizes clustering performance.
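The effect of the metric can be seen in a small sketch. The code below runs a plain k-means (an assumed stand-in, not the method from the talk) on four toy points, first with ordinary Euclidean distance and then after rescaling the second feature. The hand-picked `scale` vector is hypothetical and plays the role that a learned metric would play in supervised clustering.

```python
import numpy as np

def kmeans(points, k, n_iter=20):
    """Plain k-means with a deterministic farthest-point initialisation."""
    points = np.asarray(points, dtype=float)
    centers = [points[0]]
    for _ in range(1, k):
        # Distance from each point to its nearest already-chosen center.
        nearest = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0
        )
        centers.append(points[np.argmax(nearest)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy data: the first feature carries the grouping we care about,
# while the second feature has a much larger numeric range.
raw = np.array([[0.0, 0.0], [0.0, 100.0], [1.0, 0.0], [1.0, 100.0]])

# Hypothetical metric: shrink the second feature so the first one dominates.
scale = np.array([1.0, 0.001])

labels_raw = kmeans(raw, k=2)             # groups points by the second feature
labels_scaled = kmeans(raw * scale, k=2)  # groups points by the first feature
```

The same algorithm on the same points returns two different partitions, which is why choosing (or learning) the metric matters as much as the clustering procedure itself.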
Thank you to everyone who came to our events and asked great questions! We look forward to having you join us again to get a broader perspective on machine learning!