2 Student Perceptions of Learning Experience (SPLE)
The name change recommendation in this chapter was approved unanimously by the committee.
2.1 Rationale for the name change
The current names — Student Evaluation of Instruction and Student Evaluation of Faculty — mischaracterize what the instrument does and should do. The word “evaluation” implies that students are rendering a verdict on the quality of instruction or on the instructor. They are not. As detailed below, the proposed instrument asks students to report on their own experiences in the classroom: whether they felt treated with regard, held to consistent standards, able to access help, able to see how course elements connected, and comfortable participating, and whether the learning environment was responsive to them. These are experiential reports, not evaluative judgments.
This distinction is not merely semantic. The peer-reviewed literature on student evaluations of teaching (SET) establishes that items framed as evaluations of teaching effectiveness, course effectiveness, or instructor competence are particularly susceptible to bias — including bias linked to the instructor’s gender, race, and accent — and are therefore misleading as measures of teaching quality (Boring, Ottoboni, and Stark, 2016; Stark, 2016; Stark, 2026). By contrast, items that ask students to report on their own experience are less susceptible to these biases, precisely because they do not ask students to make judgments they are not qualified to make. The name of the instrument should reflect what it actually measures.
2.1.1 Sources of evidence about the validity and bias of SET
Some earlier research has argued that student evaluations are valid and reliable measures of teaching effectiveness (e.g., Marsh, 1987; Abrami, 2002; Berk, 2005). This committee examined this claim in light of the more recent experimental and quasi-experimental evidence summarized below.
Studies that claim SET are fair and valid rely on data that cannot answer the relevant question. Some studies compare average SET for male and female faculty and conclude there is no bias because these averages are similar. That conclusion is unwarranted because “one cannot assess gender bias in SET merely by comparing how women and men are rated by students: that comparison does not control for actual differences in teaching effectiveness, subject matter, class size, format, etc., resulting in confounding (Boring et al., 2016; Wagner et al., 2016). The appropriate question is not ‘do men and women get similar ratings?’ but rather ‘would a given instructor teaching a given course have received different ratings if their gender had been different but nothing about their teaching were different?’” (Stark, 2026, p. 7).
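The confounding point can be made concrete with a toy simulation. The numbers below are invented for illustration only (they do not come from any study cited here): suppose female instructors are, on average, more effective, but students apply a fixed rating penalty to them. The group averages then coincide even though, by construction, every female instructor would have been rated higher had her gender been different and her teaching unchanged.

```python
import random

random.seed(0)

N = 100_000   # simulated instructors per group (arbitrary)
BIAS = 0.5    # assumed rating penalty applied to female instructors

def rating(effectiveness, female):
    """Observed SET score: true effectiveness, minus a gender penalty,
    plus idiosyncratic student noise."""
    noise = random.gauss(0, 1)
    return effectiveness - (BIAS if female else 0.0) + noise

# In this toy setup, the women's higher effectiveness exactly offsets
# the penalty, so the confound hides the bias.
men   = [rating(effectiveness=3.0,        female=False) for _ in range(N)]
women = [rating(effectiveness=3.0 + BIAS, female=True)  for _ in range(N)]

avg_m = sum(men) / N
avg_w = sum(women) / N
print(f"mean rating, men:   {avg_m:.2f}")
print(f"mean rating, women: {avg_w:.2f}")
# The two averages are nearly identical, yet every simulated female
# instructor is rated 0.5 points below what an equally effective male
# instructor receives: similar group averages do not rule out bias.
```

Comparing the two averages answers only the first of Stark’s two questions; answering the second requires holding teaching fixed while varying gender, which is exactly what the randomized experiments summarized below do.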
Randomized experiments and natural experiments — where nature assigns subjects to treatments as if at random — in real class settings provide the strongest evidence about whether SET measure teaching quality or something else. Such research has found:
- SET have weak or negative association with objective measures of learning (Carrell and West, 2010; Braga et al., 2014; Boring et al., 2016)
- SET have substantial bias from gender: female instructors sometimes get lower ratings than objectively less effective male instructors (Boring et al., 2016); gender affects ratings of “objective” items like promptness (MacNell et al., 2015; Boring et al., 2016); bias varies across disciplines (Boring et al., 2016; Mengel et al., 2018); the bias of male and female students towards male and female faculty differs (Boring et al., 2016)
- SET have bias from ethnicity and gender (Chisadza et al., 2019)
- SET have stronger association with grade expectations than with learning (Boring et al., 2016)
- Students reward grades — not learning — by giving high SET scores (Cho et al., 2015; Carrell and West, 2010; Braga et al., 2014; Stroebe, 2020)
- Providing cookies during class increases ratings of instructors and course materials (Hessler et al., 2018)
- The number of points on Likert scales affects gender differences in SET scores (Rivera and Tilcsik, 2019)
- Student perceptions of their learning do not match objectively measured learning (Deslauriers et al., 2019; Dunning et al., 2004; Hartwig and Dunlosky, 2017; Knof et al., 2024; Kruger and Dunning, 1999; Lake, 2001; Lindsey and Nagel, 2015; Wooliscroft et al., 1993; Xu et al., 2024)
Source: Stark, 2026, pp. 2–3
Moreover, such research has found that “bias may be large in some situations and small in others… Indeed, the main reason it is impossible to adjust SET for bias is that there are many sources of bias that may interact in complex ways. SET cannot be presumed to be valid, reliable, or fair in any given course, department, or university, absent affirmative evidence of reliability, validity, and unbiasedness in that time and place.” (Stark, 2026, p. 8).
It is on the strength of the experimental and quasi-experimental evidence — which can control for these confounds — that this proposal reframes the instrument around experiential reports rather than evaluative judgments, since evaluative items are the ones this literature finds most susceptible to bias.
2.2 The proposed name
Each word in the proposed name — Student Perceptions of Learning Experience — is chosen deliberately:
- Student: the respondent.
- Perceptions of: what the data represent. The word “perceptions” acknowledges that the instrument captures how students experience the learning environment from their own vantage point. The Collective Bargaining Agreement already characterizes these data as perceptions (CBA §15.15). Students occupy a position in the classroom that no other observer shares: they are the only ones who can report on whether the instructor engaged with them as individuals, whether they could see how the course fit together, or whether they felt comfortable participating. “Perceptions” names this unique epistemic contribution directly: the data are the students’ own account of their experience, grounded in what they are distinctively positioned to observe.
- Learning Experience: what is being reported on. “Learning experience” scopes the instrument to the educational context without making the teaching or the instructor the object of assessment. It signals that the data concern the student’s experience of learning — the process, not the outcome — rather than a judgment of instructional quality.