Title: Statistical Modeling to Better Understand CS Students
Author: Mehran Sahami

Bio: Mehran Sahami is a Professor and Associate Chair for Education in the Computer Science department at Stanford University. He is also the Robert and Ruth Halperin University Fellow in Undergraduate Education at Stanford. In 2014, he received the ACM Presidential Award for his work on the CS2013 curricular guidelines in computer science. He also co-founded the ACM Conference on Learning at Scale, which has become an annual meeting focused on interdisciplinary research at the intersection of the learning sciences and computer science. In addition to his work in CS education, Mehran has published over 50 technical papers and has over 20 patent filings on a variety of topics including machine learning, web search, recommendation engines in social networks, and email spam filtering that have been deployed in several commercial applications.


While educational data mining has often focused on modeling behavior at the level of individual students, we consider developing statistical models to give us insight into the dynamics of student populations.  In this talk, we consider two case studies in this vein.  The first involves analyzing the evolution of gender balance in a college computer science program, showing that focusing on percentages of underrepresented groups in the overall population may not always provide an accurate portrayal of the impact of various program changes.  We propose a new statistical model based on Fisher’s Noncentral Hypergeometric Distribution that better captures how program changes are impacting the dynamics of gender balance in a population, especially in the case where the overall population is rapidly increasing (as has been the case in CS in recent years).
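To make the distribution named above concrete, here is a minimal sketch of the probability mass function of Fisher's Noncentral Hypergeometric Distribution. It is not the model from the talk: the function names, the parameter names (m1, m2, n, odds) and the numbers in the usage note are all hypothetical and chosen only for illustration. The odds parameter expresses how strongly members of one group are favored over the other when n people are drawn from a pool; odds = 1 recovers the ordinary hypergeometric distribution.

```python
from math import comb


def fisher_nchypergeom_pmf(x, m1, m2, n, odds):
    """P(X = x), where X counts group-1 members among n selected people.

    m1, m2 : sizes of group 1 and group 2 in the pool (hypothetical names)
    n      : total number of people selected
    odds   : odds ratio favoring group 1 (1.0 means no preference)
    """
    lo = max(0, n - m2)   # smallest feasible count from group 1
    hi = min(n, m1)       # largest feasible count from group 1
    if x < lo or x > hi:
        return 0.0

    def weight(k):
        # Unnormalized probability of selecting exactly k from group 1.
        return comb(m1, k) * comb(m2, n - k) * odds ** k

    total = sum(weight(k) for k in range(lo, hi + 1))
    return weight(x) / total


def expected_count(m1, m2, n, odds):
    """Mean number of group-1 members among the n selected."""
    lo, hi = max(0, n - m2), min(n, m1)
    return sum(k * fisher_nchypergeom_pmf(k, m1, m2, n, odds)
               for k in range(lo, hi + 1))
```

As a usage illustration with made-up numbers, expected_count(30, 70, 20, 1.0) gives the expected count when 20 people are drawn from a pool of 30 and 70 with no preference, and raising odds above 1 increases it. Estimating odds from observed counts, rather than comparing raw percentages, is one way a parameter of this kind could be read as a measure of program impact that does not depend on the overall population size.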

Our second study looks at the performance of student populations in an introductory college programming course during the past eight years to better understand the evolving mix of students' abilities given the rapid growth in the number of students taking CS courses.  Often accompanying such growth is a concern from faculty that the additional students choosing to pursue computing may not have the same aptitude for the subject as was seen in prior student populations.  To directly address this question, we present a statistical analysis of students’ performance using mixture modeling.  Importantly, in this setting many variables that would normally confound such a study are directly controlled for.  We find that the distribution of student performance during this period, as reflected in their programming assignment scores, remains remarkably stable despite the large growth in course enrollments.  The results of this analysis also show how conflicting perceptions of students’ abilities among faculty can be consistently explained.
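For readers unfamiliar with mixture modeling, the following is a minimal sketch of fitting a two-component Gaussian mixture to scores with the EM algorithm. The abstract does not state which mixture family or how many components the study used, so the Gaussian form, the two components, the function names and the synthetic scores are all assumptions made for illustration; none of the numbers are results from the study.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def fit_two_gaussians(scores, iters=200):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative only)."""
    data = sorted(scores)
    n = len(data)
    mean_all = sum(data) / n
    sd_all = math.sqrt(sum((x - mean_all) ** 2 for x in data) / n)
    # Initialize component means at the lower and upper quartiles.
    mu = [data[n // 4], data[3 * n // 4]]
    sigma = [sd_all, sd_all]
    weight = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in data:
            p = [weight[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            s = (p[0] + p[1]) or 1e-300  # guard against underflow to zero
            resp.append((p[0] / s, p[1] / s))
        # M-step: re-estimate weights, means and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weight[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            sigma[k] = max(math.sqrt(var), 1e-3)
    return weight, mu, sigma


# Synthetic scores (NOT real course data): two hypothetical subpopulations.
random.seed(0)
synthetic = ([random.gauss(60, 8) for _ in range(300)] +
             [random.gauss(90, 4) for _ in range(700)])
weights, means, stds = fit_two_gaussians(synthetic)
```

Fitting such a model separately to each year's scores and comparing the fitted weights and means across years is one way a question like "has the mix of student performance shifted as enrollment grew?" could be examined.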

The presentation includes work done jointly with Sarah Evans, Chris Piech, and Katie Redmond.



Title: Professional Competencies for Real? A Question about Identity!
Author: Mats Daniels


Bio: Mats Daniels is Associate Professor and director of undergraduate studies at the Department of Information Technology, Uppsala University, Sweden. Mats is also director of the national centre for pedagogical development in technology education in a societal and student-oriented context (CeTUSS) and future site coordinator for the ACM ITiCSE conference. He is a founder and member of the Uppsala Computing Education Research Group (UpCERG). He has published over 100 journal and conference papers. His ambition in education is to find new formats, especially ones in which students experience a holistic learning environment, e.g. in Open Ended Group Projects.


How students develop professional competencies has interested me for decades. I have addressed several aspects of this issue, e.g. what professional competencies are [Bernáld et al., 2012], how their development can be supported in educational settings [Daniels et al., 2010], what motivates a student to put effort into developing a competency [Cajander et al., 2012], how they can be assessed [Daniels, 2011], how progression of professional competencies can be handled in an educational curriculum [Lárusdóttir et al., 2015], and how development of professional competencies can be specified in a course description [Jónsson et al., 2016]. These are among the more prominent issues that have been on my mind. In this work I have noticed a huge “gap” between how professional competencies are expressed as important learning outcomes of degree programs and the almost nonexistent link to how this development should be carried out at the course instance level. This “gap” is frustrating for me and a source of thoughts on how to bridge it.

Work in our research group UpCERG (Uppsala Computing Education Research Group) has lately included studying issues related to identity, initially mostly the identity of different student cohorts [Peters et al., 2015], but now also that of teachers and education leaders [Pears et al., 2016]. This research provides valuable insights into the causes of the “gap”. That is, the slow closing of the “gap” can be understood by placing it in the context of the identity of the teachers (especially) and the students. It is how professional competencies are valued in relation to “pure” subject knowledge among these identities that creates severe obstacles to including the development of professional competencies in a meaningful way at the course instance level. This is despite much previous work on developing professional competencies in educational settings, such as that mentioned above.

I will address how I see the identities of teachers and students interfering with the integration of professional competency development in degree programs. I will give examples from our research and my experience that illustrate the difficulties, and outline some potential interventions that might lead to change. My hope is that this talk will result in many fruitful discussions, both at the talk and afterwards.