Abstract

The purpose of student evaluations of teaching (SET) is to help instructors enhance the teaching and learning experience in their courses; however, student feedback is often more unconstructive than useful because students are usually asked to evaluate instruction with little or no formal training. As a result, SET become missed opportunities for students to effectively communicate their learning needs and for instructors to collect actionable information about how the course is perceived. This project aims to improve the quality of student responses to the open-ended questions that instructors receive by partnering with undergraduates who demonstrate to their peers the importance of SET and how to compose potent answers for instructors. A peer-led presentation was delivered midway through the semester and at the end of the semester, immediately before students completed end-of-semester SET. A total of 521 SET responses were gathered from 29 writing classes taught by four participating instructors. Class division was a strong predictor of feedback quality, with upper division students providing more useful feedback than lower division students. Faculty reported receiving an increase in actionable feedback on SET, and students found the peer-led presentation helpful, recommended it to others, and reported improved skills in providing feedback. This project provides a rubric and an asynchronous video as resources that can be easily transferred to other courses and institutions to support teaching, learning, and SET.

Keywords: assessment, student-faculty partnerships, student evaluations of teaching, course evaluation


Soon after it was launched in 2005, the University of California, Merced (UCM) was designated a Hispanic-Serving Institution. The campus currently enrolls almost 9,000 students, with enrollment projected to reach 10,000 by 2021. For a research university, UCM has a relatively high percentage of Pell grant recipients (63.8%), and well over half of undergraduates (73%) were first-generation college students during AY 2019–2020. The campus has been recognized by numerous news outlets and organizations as a leader in diversity, community engagement, and scholarships and financial support for students.

The Center for Engaged Teaching and Learning was established on campus in 2008, and soon afterward it began to sponsor the Students Assessing Teaching and Learning (SATAL) program, a faculty-student partnership program that engages undergraduates in learning, teaching, and assessment at the program and classroom levels. Since 2009, SATAL interns have supported instructors by collecting, analyzing, and reporting student feedback so that faculty, in partnership with SATAL interns, can make data-informed decisions to adjust their teaching. Among the assessment support provided by the program, SATAL offers mid-course feedback in various forms, such as Clark and Redmond’s (1982) Small Group Instructional Diagnosis (SGID), Smith et al.’s (2013) Classroom Observation Protocol for Undergraduate STEM (COPUS), video recording, and focus groups. One key approach to gathering actionable student feedback is the How to Provide Valuable Feedback workshop, a peer-led presentation that explains the importance of actionable feedback, models quality feedback responses, and includes practice activities (Signorini, 2014). For SATAL interns, collecting feedback when peers are asked to reflect on their progress as learners as well as on an instructor’s teaching effectiveness is of paramount importance to completing their work successfully.

At UCM, faculty, including Merritt Writing Program (MWP) lecturers, are dedicated instructors who rely on student evaluations to improve their teaching effectiveness. These evaluations are essential for letting instructors know whether they are achieving course and program learning outcomes, and they provide key insights into what actions instructors can take in their classrooms to promote learning. Interpreting student evaluation data can sometimes be puzzling (e.g., identifying areas that instructors need to improve), which makes it very difficult for faculty to objectively determine the quality of their teaching, work systematically to improve it, and document that improvement. Additionally, for many instructors, reading student evaluations can be a stressful and daunting task because student feedback is often more unconstructive than useful. Even the kindest of subjective student comments, such as “Professor X is a nice teacher,” are of limited help. Can Professor X identify exactly why this student thought he was “nice” so he can continue doing or being “nice” in future classes? Did being a “nice” teacher help this student learn how to draft a thesis statement? While this may be an authentic compliment from the student, it gives the instructor no actionable feedback that addresses student learning needs. Receiving such feedback not only limits instructors’ ability to see what they are doing well and what they need to improve but is also discouraging: many instructors devote a great deal of class time to teaching students how to effectively and critically respond to the work of their peers, so why don’t these skills seem to transfer to course evaluations?

Since course evaluations are about the course and the instructor, having the instructor teach students how to interpret and respond to student evaluations of teaching (SET) presents an ethical dilemma. Therefore, we recommend that the best way to obtain more honest, unbiased, and actionable feedback from students on course evaluations is to partner with other students, as in the SATAL program. If students are the agents who prepare other students to complete course evaluations, instructors should be able to expect more practical responses that they can apply to make significant course adjustments while also nurturing students’ abilities in providing feedback. To test this hypothesis, a project called Students Helping Students Provide Valuable Feedback on Course Evaluations was designed to engage undergraduates in the SET process. The main goal of the project was to enhance the effectiveness of course evaluations by partnering with undergraduate interns who can demonstrate to peers the importance of SET and how to compose potent answers for instructors.

In 2016, the Students Helping Students project began after receiving a POD Network grant. While the MWP investigator enlisted the participation of MWP instructors to implement the peer-led presentation in their courses, the SATAL coordinator supervised student interns in developing a 30-minute presentation that they delivered in participating classrooms. SATAL interns also collected and analyzed the data from various parts of this project and recorded a video as an asynchronous alternative for faculty to implement in their classes.

In order to find out whether this effort was successful, the following questions guided this study:

  1. Does the usefulness of student feedback that students provide to their instructors on course evaluations or SET improve after participating in a peer-led presentation about SET?
  2. Do students find the peer-led presentation and the feedback rubric useful in composing potent responses for their instructors?

Literature Review

A Measure of Teaching Quality and Effectiveness

Since their inception in the 1920s, SET have been the most common approach to assessing teaching effectiveness in higher education, typically through surveys administered directly to enrolled students near the end of the academic term, before final grades are assigned. Recent SET research (e.g., Linse, 2017; Spooren and Christiaens, 2017) has noted that SET instruments can be found at almost every institution of higher education throughout the world and that these instruments have a twofold purpose. First, SET are thought to be essential tools for improving teaching quality by providing insight into how instructors might modify their classroom techniques or what new pedagogies they might adopt in order to improve student learning (Berk, 2013; Clayson, 2009; Linse, 2017). Second, SET are also the most widely used and analyzed sources of information in promotion and tenure decisions, based on the belief that students learn more from highly rated professors (Berk, 2013; Fraile and Bosch-Morell, 2015; Linse, 2017; Uttl et al., 2017). In relation to both purposes, when student feedback does not indicate what students need and how the instructor can meet those needs, all parties that engage with SET tools (students, instructors, and administrators) suffer. Furthermore, when the sole focus of SET is the unidirectional evaluation of teaching quality, SET procedures become a missed opportunity for communication between students and instructors and an invalid instrument for improving the teaching and learning experience.

SET Validity and Student Feedback

SET tools have a long history of questionable validity and effectiveness, and one of the main reasons for this is the quality of student feedback. Student feedback is often a judgment about performance rather than insight into what the instructor can do to improve the student experience (Clayson and Haley, 2011, p. 108). SET research has shown major differences between the ways in which students and instructors perceive effective teaching, and it has shown that ratings are influenced by factors unrelated to good teaching, including how students perceive their instructor’s characteristics and personality traits. Research has demonstrated student bias in particular against faculty who are female and of color (Boring et al., 2016; Huston, 2006; MacNell et al., 2015; Mitchell and Martin, 2018). As a result, SET may be able to teach us about students’ opinions, perceptions, and attitudes toward instructors but not about the instructor’s actual performance (Clayson and Haley, 2011; Kornell and Hausman, 2016; Stark and Freishtat, 2014; Uttl et al., 2017). That said, most leading SET researchers are convinced of the validity of SET data because they have found these factors to be of little or no influence; nevertheless, such bias studies continue to play a central role in the literature (Linse, 2017, p. 97; Spooren et al., 2013; Spooren and Christiaens, 2017, p. 44).

The assumption of bias and the limited usefulness of student feedback for providing diagnostic information that can lead to actionable changes may affect the way teachers perceive and value SET as a tool to improve their teaching. “After all, their [SET data] usefulness for the improvement of teaching depends upon the extent to which teachers respond to SET and use them” (Spooren and Christiaens, 2017, p. 44). The authors cite two studies (Burden, 2008, 2010) conducted with over a hundred instructors in tertiary education, which found that instructors considered SET of little value, treating them as mere tips and hints from students for improving their teaching, of which they made little or no use.

Students are also ambivalent about the relative utility of the SET process. Spooren and Christiaens (2017) pointed out that students’ attitudes toward the goals of SET are also important in the collection of SET data. “If students see no connection between their efforts in completing SET questionnaires and the outcomes of these evaluations (e.g., improvements in teaching or course organization), such evaluations may become yet another routine task, thus leading to mindless evaluation behavior” (Dunegan and Hrivnak as cited in Spooren and Christiaens, 2017, p. 44). According to students, the most attractive outcome of a teaching-evaluation system is that the feedback provided improves teaching efforts. Therefore, when instructors disregard SET results, they are only confirming the negative assumptions that students have about SET and excluding them from the teaching and learning process. This is unfortunate because highly engaged students, who perceive themselves as being important stakeholders in the academic community, will be more likely to complete SET and give higher ratings to all activities (including teaching practices) that take place in that academic community (Spooren and Christiaens, 2017, p. 44).

Another factor that impacts the validity of SET is that, according to Bloom’s cognitive taxonomy (1956), the ability to evaluate is a higher-order skill, and based on the literature on learning, one of the best ways to learn a skill (and giving feedback is a skill) is to observe a model (Bandura as cited in Svinicki, 2001). However, we often ask students to evaluate instruction without teaching them how or explaining why it is important.

Some studies have shown that students are not the best judges of their own learning, which results in feedback that does not clearly address the questions on evaluations (Lauer, 2017; Wieman, 2015). Students do not always “hold a realistic evaluation of their own learning” (Clayson, 2009, p. 27), and students may not understand or make distinctions between questions, so their feedback provides “little information on specific aspects of the teaching that might be improved” (Wieman, 2015, p. 10). Relatedly, another basic limitation of SET arises when students are asked how much they learned in the course. Wieman (2015, p. 9) pointed out that “it is difficult to know what you do not know,” so the accuracy of this evaluation relies on the expertise of the respondent. Key evidence cited in support of this statement is a meta-analysis of faculty’s teaching effectiveness revealing no significant correlation between SET ratings and student learning (Uttl et al., 2017).

Other issues that plague SET validity include the way SET are distributed. In recent years, many SET have been delivered digitally, which has resulted in lower response rates (Berk, 2012; Felton et al., 2004). Additionally, the quantitative results of SET are often discussed, researched, and utilized disproportionately to the qualitative responses, which are underused or disregarded entirely (Fraile and Bosch-Morell, 2015).

A New Approach to Addressing SET Validity

The question of the validity of SET procedures and practices has led centers for teaching and learning (e.g., T-Eval), higher education associations (e.g., American Association of University Professors, 2015), and many researchers to search for and design new tools to replace them since the mid-to-late 1990s (Berk, 2005, 2006; Clayson and Sheffet, 2006; Rhem, 2020; Wieman, 2015); however, it is clear that without a unified and strategic effort to achieve this, SET are here to stay. Fortunately, student-assisted teaching approaches that focus on how and why students learn (Barr and Tagg, 1995) are becoming increasingly common across the globe. Students as partners, change agents, evaluators, and co-creators are now familiar terms for anyone interested in the importance of student voice in teaching and learning (Cook-Sather et al., 2019; Healey et al., 2014). Students have been included in pedagogical planning as co-creators of teaching approaches, course design, and curricula as well as pedagogical consultants (Bovill et al., 2011; Cook-Sather et al., 2014, 2019) and student ambassadors (Peseta et al., 2016), and more and more staff and faculty have been engaging students as partners, which differs from merely collecting the student perspective on pedagogical practices.

Partnership is a “reciprocal process through which all participants have the opportunity to contribute equally, although not necessarily in the same ways, to curricular or pedagogical conceptualization, decision-making, implementation, investigation, or analysis” (Cook-Sather et al., 2014, pp. 6–7; Healey et al., 2014; Signorini and Pohan, 2019). Such partnership seeks to “engage students as co-learners, co-researchers, co-inquirers, co-developers, and co-designers” (Healey et al., 2016, p. 2) with faculty, administrators, and other students. Engaging in such partnership results in a variety of beneficial outcomes for students, staff, and faculty (Bryson, 2016; Mercer-Mapstone et al., 2017), including in assessment (Cook-Sather et al., under review; Deeley and Bovill, 2017; Deeley and Brown, 2014; Wittman and Abuan, 2015, p. 66).

Research by Clayson (2009) and Price et al. (2010) has suggested that instructors need to specifically teach students about the feedback process, why it is meaningful, and how it relates to course evaluations. This is important for not only improving response rates but also receiving more thorough answers to the open-ended questions on SET. On the one hand, if instructors do this, it might very well result in more actionable feedback from students as well as higher response rates; on the other hand, this also presents a clear ethical dilemma. However, if student partnerships between staff and faculty can be implemented to further subject-based research inquiry; scholarship of teaching and learning (SoTL); curriculum design and pedagogic consultancy; and learning, teaching, and assessment (Healey et al., 2014), then why not partner with students to help improve the quality of feedback that instructors receive on SET? If students are the agents who prepare other students for completing the SET, instructors should be able to expect more actionable responses that can be applied to significant course adjustments while also nurturing student abilities in assessment. This is a key component of the approach to gathering feedback from enrolled students through the Students as Learners and Teachers (SaLT) program at Bryn Mawr and Haverford Colleges (Cook-Sather, 2009).

Student evaluations have been shown to be poor at correlating with learning outcomes, at addressing specific aspects of the course or teaching, and at providing objective assessments of the instructor (Wieman, 2015). Since peers are the most potent source of influence on students’ development during their college years (Astin, 1993), they could be significantly more effective at helping fellow students understand the purpose of SET, why it is essential to complete them, and how to compose thoughtful answers to open-ended questions, thereby potentially improving the validity of SET. Additionally, watching peers explain and discuss this widely used assessment tool might yield further benefits for students, such as gaining new disciplinary knowledge and developing what Hutchings (2005) called “pedagogical intelligence” (as cited in Mihans et al., 2008, p. 7), improving their abilities in complex judgment tasks like self-assessment, and realizing that when they engage in assessment, they become stakeholders in their own learning and contribute to the entire campus community.

Methodology

This Institutional Review Board-approved, experimental study used a mixed-methods approach in which quantitative and qualitative data were collected, analyzed separately, and reported in the results (Creswell and Creswell, 2018). The emphasis was on the quantitative data, with qualitative data providing additional context.

A total of 521 SET responses were gathered from 29 writing classes taught by four volunteering instructors during the academic years (AYs) 2013–2014 through 2016–2017. These evaluations consisted of the three open-ended questions from the official SET currently used by the writing program:

  • Q1. How would you describe your writing ability now compared to the beginning of the semester?
  • Q2. Identify and evaluate aspects of this course that have been especially helpful to you.
  • Q3. Describe aspects of this course that you would change if you had the opportunity.

SET collected from 18 classes prior to AY 2016–2017 were used as controls (the “Pre” group). Participating faculty defined helpful feedback and designed a rubric accordingly (Appendix B). They completed a norming session prior to rating the quality of student responses to these questions in their own SET as “H” (highly useful), “S” (somewhat useful), or “N” (not useful). SET that did not include answers to all three questions were omitted from analysis.

The SATAL program developed a 30-minute live presentation on the importance of the SET instrument and on how to leave detailed and useful feedback for instructors per the feedback rubric (Appendix B). This peer-led presentation was delivered in 11 classes in AY 2016–2017: midway through the Fall 2016 and Spring 2017 semesters and at the end of the Fall 2016 semester, immediately before students completed end-of-semester SET (see Table 1). The mid-semester SET were used for formative assessment only and comprised the “Mid” group. The “Final” group was composed of official SET completed online at the end of the Fall 2016 semester. Students in these classes completed a Mid and a Final SET immediately after the peer-led presentation while referring to the Student Rubric as a guide (Appendix C). Mid SET were completed on paper, whereas Final SET were completed online. Courses were matched across groups; each instructor submitted rated SET for at least two sections of a given course in the Pre group and another two in the Mid and/or Final groups. Each section included up to 20 students.

Table 1. Data Collection Timeline
Group                 | Pre                   | Mid                       | Final
Semester(s)           | Fall 2013–Spring 2015 | Fall 2016–Spring 2017     | Fall 2016
SET responses         | 205                   | 157                       | 159
Peer-led presentation | None administered     | Administered mid-semester | Administered before final evaluations
SET timing            | Final, online         | Mid-semester, paper       | Final, online

Data were analyzed for group differences in the quality of feedback provided, in aggregate as well as by question and course division (upper or lower). Pearson’s chi-square test or Fisher’s exact test was used, as appropriate, to assess significance. A bias-corrected form of Cramér’s V was used to calculate effect size from the frequencies.
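For illustration only, the sketch below shows how such a frequency comparison might be computed in Python with NumPy and SciPy; it is not the authors’ code. The H/S/N counts are hypothetical placeholders rather than the study data, and the bias correction shown is one standard small-sample correction for Cramér’s V (the paper does not specify which correction was used). SciPy’s fisher_exact handles only 2 × 2 tables, so only the chi-square path is shown.

```python
# Minimal sketch of the frequency analysis described above; not the authors' code.
# Rows are conditions (e.g., Pre vs. Mid); columns are rating categories (H, S, N).
# All counts below are hypothetical.
import numpy as np
from scipy.stats import chi2_contingency


def bias_corrected_cramers_v(table):
    """Pearson's chi-square plus a bias-corrected Cramér's V for an r x c table."""
    table = np.asarray(table, dtype=float)
    chi2, p, dof, _ = chi2_contingency(table)
    n = table.sum()
    r, c = table.shape
    phi2 = chi2 / n
    # Small-sample bias correction (one standard form; assumed, not confirmed by the paper).
    phi2_corr = max(0.0, phi2 - (r - 1) * (c - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    c_corr = c - (c - 1) ** 2 / (n - 1)
    v = np.sqrt(phi2_corr / min(r_corr - 1, c_corr - 1))
    return chi2, p, v


counts = [[40, 90, 75],   # Pre: H, S, N (hypothetical)
          [65, 70, 22]]   # Mid: H, S, N (hypothetical)

chi2, p, v = bias_corrected_cramers_v(counts)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}, bias-corrected Cramér's V = {v:.3f}")
```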

To assess net change in feedback quality, scored sums were calculated by weighting responses: highly useful responses were treated as 70%–100% useful, somewhat useful responses as 1%–69% useful, and not useful responses as 0% useful. Accordingly, H responses were weighted at 0.85, S responses at 0.35, and N responses at -0.2. Findings were robust to a wide range of weights, of which the weights above represent the means. The negative weight for N responses represents the negative utility of reading and sorting through SET that provide no actionable feedback. The percentage change in scores was used to measure effect size.
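A minimal sketch of this scoring scheme follows, again for illustration only. The weights (0.85, 0.35, -0.2) come from the text above; the counts and the per-response normalization (so that the maximum possible score is 85%, as noted in Appendix A) are assumptions made for the example.

```python
# Minimal sketch of the scored-sum calculation described above; counts are hypothetical.
WEIGHTS = {"H": 0.85, "S": 0.35, "N": -0.20}


def scored_sum(counts):
    """Weighted score expressed as a percentage of responses (maximum possible: 85%)."""
    total = sum(counts.values())
    score = sum(WEIGHTS[category] * n for category, n in counts.items())
    return 100 * score / total


pre = {"H": 40, "S": 90, "N": 75}   # hypothetical Pre counts
mid = {"H": 65, "S": 70, "N": 22}   # hypothetical Mid counts

pre_score, mid_score = scored_sum(pre), scored_sum(mid)
pct_change = 100 * (mid_score - pre_score) / pre_score  # effect size as percentage change
print(f"Pre = {pre_score:.1f}%, Mid = {mid_score:.1f}%, change = {pct_change:+.1f}%")
```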

Qualitative data from students and faculty were collected through a feedback survey (Appendix D). At the end of the peer-led presentation, students responded to a survey about the presentation materials, their feedback skills, and whether they would recommend the presentation to others.

The four participating faculty completed a two-question survey after analyzing their individual SET results, indicating (a) whether the responses generated any actionable data and whether they could implement changes based on the feedback received, and (b) whether they would recommend the peer-led presentation to other instructors and why.

Results

SET improved on all questions from Pre to Mid based on aggregated data (Table 2). However, controlling for class division revealed that this improvement was not uniform. Class division was a highly significant and moderately strong predictor of feedback quality, with upper division (UD) students providing more useful feedback (p < .001, V = .315). Additionally, UD students’ SET improved significantly from Pre to Final (p < .001, V = .429) but not from Pre to Mid, except on Q3, where the distribution of responses differed moderately (p < .027, V = .235). Lower division (LD) students improved slightly from Pre to Mid on Q1 and Q2 (Table 3) but not from Pre to Final on any of the questions or in aggregate. Their responses did not improve on Q3 in either the Mid or Final condition.

Table 2. Differences in Response Quality by Condition and Question
Question | Pre vs. Mid (p <, V) | Mid vs. Final (p <, V) | Pre vs. Final (p <, V)
Q1       | .001, .219           | 0.66, 0.00             | --
Q2       | .007, .154           | 0.60, 0.00             | --
Q3       | .019, .140           | .005, .173             | .02, .128
Note. Pre vs. Final calculations were not computed when Mid and Final did not differ significantly.
Table 3. Differences in Response Quality by Condition, Question, and Division
Division | Question      | Pre vs. Mid (Increase, V) | Pre vs. Final (Increase, V)
UD       | All Questions | 11.9%, 0.064              | 41.6%, 0.429***
UD       | Q1            | 21.6%, 0.224              | 53.8%, 0.504***
UD       | Q2            | 17.8%, 0.218              | 28.7%, 0.347***
UD       | Q3            | -3.7%, 0.235*             | 42.3%, 0.312**
LD       | All Questions | 25.4%, 0.144***           | 0.1%, 0.000
LD       | Q1            | 17.8%, 0.203**            | 3.0%, 0.000
LD       | Q2            | 15.7%, 0.152*             | 5.1%, 0.056
LD       | Q3            | 29.7%, 0.079              | -7.6%, 0.000
*p < .05, **p < .01, ***p < .001. Significance and Cramér’s V were based on raw data while percentage increases were calculated from scored sums. UD Q3 Pre vs. Mid differed significantly because feedback quality became more polarized (more responses scored as H or N).

Analysis of scored sums supports these findings. UD students provided more useful responses to all questions in all conditions (Table 4), and they derived sizable benefit from the training, whereas LD students did not (Table 3). Students found the feedback presentation helpful and felt that their ability to give feedback had improved considerably (Appendix D, Question 1). Upper and lower division students rated their skill development equally; however, only students in UD courses performed better on their SET responses. Nearly all students found the provided feedback rubric useful (92%, n = 291), and most (83%, n = 262) responded that they would recommend the feedback presentation be delivered in other classes.

Table 4. Scored Sums of Response Quality by Condition, Question, and Division
Division | Question      | Pre   | Mid   | Final
UD       | All Questions | 54.5% | 61.1% | 41.6%
UD       | Q1            | 50.8% | 61.8% | 78.1%
UD       | Q2            | 60.6% | 71.4% | 78.0%
UD       | Q3            | 51.9% | 50.0% | 73.9%
LD       | All Questions | 43.1% | 53.6% | 43.7%
LD       | Q1            | 43.0% | 56.2% | 44.3%
LD       | Q2            | 54.4% | 62.9% | 57.1%
LD       | Q3            | 32.1% | 41.6% | 29.6%

Based on the faculty survey results, participating faculty recommended the presentation to other instructors, and upon analyzing students’ comments, faculty identified concrete ways to enhance the content and instruction of their courses. Three of the four participating faculty stated the following:

Students were able to articulate their ideas in concrete terms. I learned both the what and why of their feedback. With it, I was able to clearly identify aspects of my class that were working, and precisely those that didn’t. In addition, students going through the peer-to-peer presentation wrote much longer comments than other classes I’ve had in the past. Their comments had meaning, and I liked this aspect of the presentation: it showed students how to become more engaged with their classes in terms of assessment, and it showed them how to meet the level of responsibility of their feedback. After the peer-to-peer session, any student comment that was glibly enthusiastic or damning could be easily dismissed as invaluable since the comment lacks ethos. (UD Instructor)

I feel I received more feedback focused on items of importance in regards to content and instruction. I came away with a specific change in regards to my Writing 10 course. I decided to allow/change my late policy to be more flexible. (LD Instructor)

Overall, I did notice that the feedback received after the peer-presentation was more in depth and appeared to be less hesitantly given as honest feedback. There were fewer “no comments” or statements like “The course should stay the same.” Generally, the students appeared less hesitant about giving responses and as a result, took their time when writing their responses and seemed to know the importance of specific feedback, so the instructor could create more student-centered courses and class activities. I used the peer to peer presentations before both midterm and for final evaluations in lower division writing classes. I was able to see that for the most part, my learning activities were successful. However, the students did indicate that some of the instructions for some of the activities were unclear. Thus, for the following semester, I rewrote the activities in a more concise way and orderly step-by-step organization, not making any assumptions about prior knowledge the students possessed about the topic under discussion. I received about 10% more responses than before. (LD Instructor)

Students’ specificity in their feedback is cited repeatedly as being key to usefulness, and students provided more specific feedback after the peer-led presentation. Students also appeared less hesitant to be straightforward and may have felt more responsible for providing actionable feedback and for the potential effects of their feedback or lack thereof.

Research Limitations

Pre SET were gathered in years prior to the Mid and Final SET, and thus we could not control for longitudinal effects such as changes in instructor skill over time. Because evaluations are anonymous, we could not control for the possibility that multiple SET from a single student were included in the analysis, and some students may have participated in the peer-led presentation twice as they moved to the next course. Furthermore, no blinding procedure was implemented; faculty knew in advance which group the data would belong to, and their expectations could have influenced the ratings they gave to their evaluations. SET in the Mid group were completed within a limited amount of class time and after students had completed only the first half of the course. Finally, question order was not randomized and thus might play a role in the difference in response usefulness across questions, as students may have left too little time to answer later questions.

Discussion

The study attempted to answer two questions:

  • 1. Does the usefulness of student feedback that students provide to their instructors on course evaluations or student evaluations of teaching (SET) improve after participating in a peer-led presentation about SET?

Findings indicate that the peer-led presentation is a very effective resource as-is when administered in UD courses just before final course evaluations. This significant result could be attributed to the following:

  • UD students have been exposed to more college teaching and thus to a greater variety of course designs and activities. This could explain the exceptionally large difference in feedback quality between UD and LD students on Q3, which is the only question that requires students to draw upon their experiences. However, the lack of improvement in UD students’ feedback from Pre to Mid stands in stark contrast to their sizable improvement from Pre to Final. This may reflect a more serious attitude toward formal SET than toward the informal mid-semester evaluations administered by the SATAL interns.
  • The difference in quality of feedback across divisions may be the result of greater experience in college courses or simply of survivorship bias; students in UD courses are far more likely to have opted to take the course.
  • The difference in treatment response between divisions may be because the presentation gives students a mental model for integrating skills and knowledge that UD students already possess. LD students may not possess the same foundation and thus would not show the same improvement. Tailoring the presentation to LD students might elicit more useful feedback from them, as suggested below.
  • Multiple exposures to the presentation content may improve feedback from LD students. This peer-led presentation was turned into 7-, 5-, and 3-minute videos as an asynchronous resource available for instructors willing to share them in their classes. The concept behind offering different length videos is supported by the understanding that real learning does not occur in one-time events but needs spacing and repetitions to imprint content into long-term memory (Thalheimer, 2006).
  • 2. Do students find the peer-led presentation and the feedback rubric useful in composing potent responses for their instructors?

Participating faculty and students noted that the peer-led presentation material contributed to the significant findings, providing face validity to the presentation and rubric, which consisted of outcome definitions, examples, an analogy, and practice opportunities for students to try, fail, and try again. Since one of the best ways to learn a skill is to observe a model, this intervention gives students the opportunity to interact with their peers as they present a variety of examples. This project provides a resource for students and faculty and implements an original rubric and presentation that can be easily transferred to other programs and institutions to support teaching, learning, and course development.

Further Intervention and Research

A pilot of the video presentations was completed during AY 2018–2019 (Signorini and Abuan, 2019). Future research on SET will be conducted using the asynchronous video and will control for the limitations of the present study. This next phase of the project will investigate whether reiterating the material at mid-course and during final course evaluations results in more actionable feedback in LD courses. Additionally, it will examine whether the presentation delivery mode (live or video) has an impact on feedback quality or response rates. Qualitative analysis of SET may also yield insight into why UD SET were more useful than LD SET, especially on Q3. Finally, because the presentation was administered shortly before students completed their SET, a remaining question is how long the improvements in feedback skills acquired through the presentation persist.

Conclusion

Partnering with undergraduate interns to demonstrate the importance of SET to peers and how to compose potent answers for instructors produced more actionable SET responses that allowed faculty to understand how they could respond to their students’ needs. Overall, students addressed the open-ended questions more fully, and the responses were more aligned to the course outcomes than on previous SET, which enabled faculty to close the assessment loop.

The usefulness and quality of the feedback that students provided to their instructors on SET improved significantly after participating in the peer-led presentation in the UD courses, and it could provide similar results in LD courses if the material is reiterated at different times during the course through direct instruction on how to leave detailed and descriptive feedback.

This project not only benefited the faculty who gathered actionable feedback to adjust their courses but also the students in their classes who received direct instruction on how to provide valuable course feedback. Presentation feedback surveys and course evaluation responses show that this presentation was, overall, highly successful in explaining to both UD and LD students the importance of course evaluations and in demonstrating how to compose useful, quality feedback. This intervention indicated that responding to SET is a skill that needs to be modeled and explicitly taught, ideally by peers rather than instructors, through examples, practice, and repetition.

The peer-led presentation, including its components—rubrics, outcomes, and practice—was turned into video form, which became available for all undergraduate courses at UCM after the pilot process was completed. Other institutions are welcome to implement this free asynchronous resource (https://cetl.ucmerced.edu/SATAL_Video).

Funding

This research was supported by a POD Network grant and received the 2019 Menges Award for Outstanding Research in Educational Development and the 2018 Innovation Award.

Acknowledgments

The authors would like to extend their appreciation to the SATAL interns involved in the Students Helping Students project through time: Leslie Bautista, Guadalupe Covarrubias-Oregel, Tamy Do, Brian Hoang, Jesus Lopez, Arby Mariano, Valezka Murillo, Justen Palmer, Sara Patino, Andrew Perez, Tea Pusey, Jose Sandoval, Harmanjit Singh, and Monica Urbina. Our appreciation also goes to MWP faculty, including Marisol Alonso, Kara Ayik, Cheryl Finley, Pam Gingold, Fiona Memmott, Derek Merrill, Stan Porter, Iris Ruiz, and Jane Wilson for their participation in the many phases of the study. We thank CETL and MWP leadership, James Zimmerman, Anne Zanzucchi, and Samantha Ocena for their support and Dr. Alison Cook-Sather and Stephan Bera for graciously sharing their expertise on partnerships and SET.

Biographies

Adriana Signorini coordinates the Students Assessing Teaching and Learning (SATAL) program at the Center for Engaged Teaching and Learning at the University of California, Merced. She facilitates the student and faculty partnerships, and her research interests focus on collaborative approaches to teaching and learning in higher education.

Mariana Abuan is a Lecturer with continuing appointment in the Merritt Writing Program at the University of California, Merced, whose research interest includes assessment of teaching and learning in higher education.

Gautam Panakkal is a former SATAL intern and University of California, Merced Psychology graduate. His current job involves testing self-driving vehicles and studying artificial intelligence.

Sandy Dorantes is an undergraduate student double majoring in Psychology and Spanish at the University of California, Merced. As a SATAL intern, she has trained new hires based on a professional development apprenticeship model and assessed qualitative and quantitative data. In addition, she formed part of the cohort that won the 2019 Robert J. Menges Award for Outstanding Research in Educational Development for her contribution to the data collection and analysis.

References

  • American Association of University Professors. (2015). Policy documents and reports (11th ed.). Johns Hopkins University Press.
  • Astin, A. W. (1993). What matters in college. Liberal Education, 79(4), 4–15.
  • Barr, R. and Tagg, J. (1995). From teaching to learning—A new paradigm for undergraduate education. Change, 27(6), 12.
  • Berk, R. A. (2005). Survey of 12 strategies to measure teaching effectiveness. International Journal of Teaching and Learning in Higher Education, 17(1), 48–62.
  • Berk, R. A. (2006). Thirteen strategies to measure college teaching: A consumer’s guide to rating scale construction, assessment, and decision making for faculty, administrators, and clinicians. Stylus Publishing.
  • Berk, R. A. (2012). Top 20 strategies to increase the online response rates of student rating scales. International Journal of Technology in Teaching and Learning, 8(2), 98–107.
  • Berk, R. A. (2013). Top 10 flashpoints in student ratings and the evaluation of teaching: What faculty and administrators must know to protect themselves in employment decisions. Stylus Publishing.
  • Bloom, B. S. (1956). Taxonomy of educational objectives, handbook I: The cognitive domain. David McKay Co.
  • Boring, A., Ottoboni, K., and Stark, P. B. (2016). Student evaluations of teaching (mostly) do not measure teaching effectiveness. ScienceOpen Research. https://doi.org/10.14293/S2199-1006.1.SOR-EDU.AETBZC.v1
  • Bovill, C., Cook-Sather, A., and Felten, P. (2011). Students as co-creators of teaching approaches, course design, and curricula: Implications for academic developers. International Journal for Academic Development, 16(2), 133–145. https://doi.org/10.1080/1360144X.2011.568690
  • Bryson, C. (2016). Engagement through partnership: Students as partners in learning and teaching in higher education. International Journal for Academic Development, 21(1), 84–86. https://doi.org/10.1080/1360144X.2016.1124966
  • Burden, P. (2008). Does the end of semester evaluation forms represent teacher’s views of teaching in a tertiary education context in Japan? Teaching and Teacher Education, 24, 1463–1475. https://doi.org/10.1016/j.tate.2007.11.012
  • Burden, P. (2010). Creating confusion or creative evaluation? The use of student evaluation of teaching surveys in Japanese tertiary education. Educational Assessment, Evaluation and Accountability, 22, 97–117. https://doi.org/10.1007/s11092-010-9093-z
  • Clark, D. J., and Redmond, M. V. (1982). Small group instruction diagnosis: Final report (ERIC Document Reproduction Service no. ED 217954).
  • Clayson, D. E. (2009). Student evaluations of teaching: Are they related to what students learn? Journal of Marketing Education, 31(1), 16–30. https://doi.org/10.1177/0273475308324086
  • Clayson, D. E., and Haley, D. A. (2011). Are students telling us the truth? A critical look at the student evaluation of teaching. Marketing Education Review, 21(2), 101–112. https://doi.org/10.2753/MER1052-8008210201
  • Clayson, D. E., and Sheffet, M. J. (2006). Personality and the student evaluation of teaching. Journal of Marketing Education, 28(2), 149–160. https://doi.org/10.1177/0273475306288402
  • Cook-Sather, A. (2009). From traditional accountability to shared responsibility: The benefits and challenges of student consultants gathering midcourse feedback in college classrooms. Assessment and Evaluation in Higher Education, 34(2), 231–241.
  • Cook-Sather, A., Bahti, M., and Ntem, A. (2019). Pedagogical partnerships: A how-to guide for faculty, students, and academic developers in higher education. Elon University Center for Engaged Learning’s Open Access Book Series. https://www.centerforengagedlearning.org/books/pedagogical-partnerships/
  • Cook-Sather, A., Bovill, C., and Felten, P. (2014). Engaging students as partners in learning and teaching: A guide for faculty. Jossey-Bass.
  • Cook-Sather, A., Signorini, A., Dorantes, S., Abuan, M., Covarrubias-Oregel, G., and Cribb, P. (under review). “I never realized…”: Shared outcomes of different student-faculty partnership approaches to assessing student learning experiences and evaluating faculty teaching.
  • Creswell, J. W., and Creswell, J. D. (2018). Research design: Qualitative, quantitative, and mixed methods approaches (5th ed.). Sage.
  • Deeley, S. J., and Bovill, C. (2017). Staff student partnership in assessment: Enhancing assessment literacy through democratic practices. Assessment and Evaluation in Higher Education, 42(3), 463–477. https://doi.org/10.1080/02602938.2015.1126551
  • Deeley, S. J., and Brown, R. A. (2014). Learning through partnership in assessment. Teaching and Learning Together in Higher Education, 13. http://repository.brynmawr.edu/tlthe/vol1/iss13/3
  • Felton, J. M., Mitchell, J., and Stinson, M. (2004). Web-based student evaluations of professors: The relations between perceived quality, easiness and sexiness. Assessment and Evaluation in Higher Education, 29(1), 91–108. https://doi.org/10.1080/0260293032000158180
  • Fraile, R., and Bosch-Morell, F. (2015). Considering teaching history and calculating confidence intervals in student evaluations of teaching quality: An approach based on Bayesian inference. Higher Education, 70(1), 55–72.
  • Healey, M., Flint, A., and Harrington, K. (2014). Engagement through partnership: Students as partners in learning and teaching in higher education. Higher Education Academy. https://www.advance-he.ac.uk/knowledge-hub/engagement-through-partnership-students-partners-learning-and-teaching-higher
  • Healey, M., Flint, A., and Harrington, K. (2016). Students as partners: Reflections on a conceptual model. Teaching and Learning Inquiry, 4(2), 8–20. https://doi.org/10.20343/teachlearninqu.4.2.3
  • Huston, T. A. (2006). “Race and gender bias in higher education: Could faculty course evaluations impede further progress toward parity?” Seattle Journal for Social Justice, 4(2). http://digitalcommons.law.seattleu.edu/sjsj/vol4/iss2/34
  • Hutchings, P. (2005, January). Building pedagogical intelligence. Carnegie perspectives. The Carnegie Foundation for the Advancement of Teaching.
  • Kornell, N., and Hausman, H. (2016). Do the best teachers get the best ratings? Frontiers in Psychology, 7, 570. https://doi.org/10.3389/fpsyg.2016.00570
  • Lauer, C. (2017). A comparison of faculty and student perspectives on course evaluation terminology. To Improve the Academy, 31(1).
  • Linse, A. R. (2017). Interpreting and using student ratings data: Guidance for faculty serving as administrators and on evaluation committees. Studies in Educational Evaluation, 54(9), 94–106. https://doi.org/10.1016/j.stueduc.2016.12.004
  • MacNell, L. D., Driscoll, A., and Hunt, A. N. (2015). What’s in a name: Exposing gender bias in student ratings of teaching. Innovative Higher Education, 40(4), 291–303. https://doi.org/10.1007/s10755-014-9313-4
  • Mercer-Mapstone, L., Dvorakova, S. L., Matthews, K. E., Abbot, S., Cheng, B., Felten, P., Knorr, K., Marquis, E., Shammas, R., and Swaim, K. (2017). A systematic literature review of students as partners in higher education. International Journal for Students as Partners, 1(1), 1–23.
  • Mihans, R. J., II, Long, D. T., and Felten, P. (2008). Power and expertise: Student-faculty collaboration in course design and the scholarship of teaching and learning. International Journal for the Scholarship of Teaching and Learning, 2(2), 2.
  • Mitchell, K. M. W., and Martin, J. (2018). Gender bias in student evaluations. PS: Political Science & Politics, 51(3), 648–652. https://doi.org/10.1017/S104909651800001X
  • Peseta, T., Bell, A., Clifford, A., English, A., Janarthana, J., Jones, C., Teal, M., and Zhang, J. (2016). Students as ambassadors and researchers of assessment renewal: Puzzling over the practices of university and academic life. International Journal for Academic Development, 21(1), 54–66. https://doi.org/10.1080/1360144X.2015.1115406
  • Price, M., Handley, K., Millar, J., and O’Donovan, B. (2010). Feedback: All that effort, but what is the effect? Assessment and Evaluation in Higher Education, 35 (3), 277–289.
  • Quinlan, K. M. (2016). How emotion matters in four key relationships in teaching and learning in higher education. College Teaching, 64(3), 101–111. https://doi.org/10.1080/87567555.2015.1088818
  • Rhem, J. (2020, February 7). Re: Alternative teaching evaluations. POD Network Open Discussion Group. https://groups.google.com/a/podnetwork.org/g/discussion/c/W8R3Op0ht7E/m/w_J1H94hGAAJ?pli=1
  • Signorini, A. (2014). Involving undergraduates in assessment: Assisting peers to provide constructive feedback. Assessment Update, 26(6), 3–13. https://doi.org/10.1002/au.30002
  • Signorini, A., and Abuan, M. (2019, February 21). Students helping students provide valuable feedback on course evaluations (eNewsletter post no. 1700). Tomorrow’s Professor. https://tomprof.stanford.edu/posting/1700
  • Signorini, A., and Pohan, C. A. (2019). Exploring the impact of the Students Assessing Teaching and Learning Program. International Journal for Students as Partners, 3(2), 139–148. https://doi.org/10.15173/ijsap.v3i2.3683
  • Smith, M. K., Jones, F. H. M., Gilbert, S. L., and Wieman, C. E. (2013). The classroom observation protocol for undergraduate STEM (COPUS): A new instrument to characterize university STEM classroom practices. CBE—Life Sciences Education, 12(4), 618–627. A protocol sheet in Excel format is available at www.cwsei.ubc.ca/resources/COPUS.htm. https://doi.org/10.1187/cbe.13-08-0154
  • Spooren, P. B., Brockx, B., and Mortelmans, D. (2013). On the validity of student evaluation of teaching: The state of the art. Review of Educational Research, 83(4), 598–642. https://doi.org/10.3102/0034654313496870
  • Spooren, P., and Christiaens, W. (2017). I liked your course because I believe in (the power of) student evaluations of teaching (SET). Students’ perceptions of a teaching evaluation process and their relationship with SET scores. Studies in Educational Evaluation, 57(9), 43–49.
  • Stark, P. B., and Freishtat, R. (2014). An evaluation of course evaluations. ScienceOpen Research. https://doi.org/10.14293/S2199-1006.1.sor-edu.aofrqa.v1
  • Svinicki, M. D. (2001). Encouraging your students to give feedback. New Directions for Teaching and Learning, 87, 17–24.
  • Thalheimer, W. (2006, February). Spacing learning events over time: What the research says. https://www.worklearning.com/wp-content/uploads/2017/10/Spacing_Learning_Over_Time__March2009v1_.pdf
  • University of California, Merced, Center for Institutional Effectiveness. https://cie.ucmerced.edu/analytics-hub
  • Uttl, B., White, C. A., and Wong Gonzalez, D. (2017). Meta-analysis of faculty’s teaching effectiveness: Student evaluation of teaching ratings and student learning are not related. Studies in Educational Evaluation, 54, 22–42. https://doi.org/10.1016/j.stueduc.2016.08.007
  • Wieman, C. (2015). A better way to evaluate undergraduate teaching. Change: The Magazine of Higher Learning, 47(1), 6–15. https://doi.org/10.1080/00091383.2015.996077
  • Wittman, J., and Abuan, M. (2015). Socializing future professionals: Exploring the matrix of assessment. Pedagogy, 15(1), 59–70. https://doi.org/10.1215/15314200-2799180

Appendix A: Question Response Quality

Question 1 response quality: “How would you describe your writing ability now compared to the beginning of the semester?” The difference in feedback quality from Pre to Mid was significant (p < 0.001, V = 0.22). The feedback received in the Final condition was not significantly different from that in the Mid condition.

Question 2 response quality: “Identify and evaluate aspects of this course that have been especially helpful to you.” The distribution of feedback across the three categories of usefulness changed significantly from Pre to Mid (p < 0.007, V = 0.15). Feedback did not differ significantly between the Mid and Final groups. 

Question 3 response quality: “Describe aspects of this course that you would change if you had the opportunity.” Faculty received a higher proportion of helpful feedback immediately following the live presentation (p < 0.018, V = 0.14). Notably, the results for Mid and Final SET were significantly different from each other (p < 0.004, V = 0.17), unlike in the previous two questions.

Scored sums of responses by question. Feedback was scored by quality using a range of point values for H, S, and N and summed to create an aggregate score. Results were fairly similar across the entire range of reasonable values; moderate values of 0.85 for H, 0.35 for S, and -0.2 for N were used here and for subsequent analyses. As such, the maximum score is 85%. 

Appendix B: Instructor Rubric

Instructors collected and rated the usefulness of student comments according to the following criteria:

Highly Useful: I clearly understand the experience the student is having, what I am doing well, or what I could do better. I know what I should continue doing in this class and exactly what I can do to improve my course and/or instruction. Any improvements that need to be made are plausible and are within my control.
Somewhat Useful: I have a general or vague idea of what is going well or what I should change to improve my course, but it is not completely clear. I can make a change to my course or instruction, but I may not get the result this student is looking for. I may not have the ability to completely make this change.
Not Useful: I don’t know what I can do to improve my course at all based on this answer. It tells me nothing about my class or pedagogy. I can’t tell if the student is having a positive learning experience or negative experience and/or exactly why. I have no control over making this change.
0: No response.

Appendix C: Student Rubric

Students receiving the peer-led presentation were given the following instructions and rubric:

You are welcome to address any aspect of the course you wish, but I would particularly appreciate your feedback about the following:

  • Giving and attending to feedback
  • Analyzing readings
  • Developing a topic
  • Composing an argument and integrating evidence
  • Crafting an essay
How To Provide Valuable Feedback on Course Evaluations

1. Offer commentary on attributes of the learning environment.
   Highly useful: “I find the instructor very caring and that motivates me to try harder in this class”
   Somewhat useful: “The instructor cares about my learning.”
   Not useful: “My instructor’s hair is cool.”
2. Answer all parts of the question focusing on description rather than judgment.
   Highly useful: “My writing ability now is better than at the beginning because now I am more confident in my work and writing based on the feedback I received from instructor and peers.”
   Somewhat useful: “It improved a lot. I noticed that my critical thinking ability has improved a lot.”
   Not useful: “Hard class.”
3. Attribute positive or constructive feedback to specific aspects of the course. Use examples that support your answer to the question.
   Highly useful: “Before this class I was every unsure on how to do a research paper, now that I have taken the class I am more confident in my writing skills. I understand how to format a research paper correctly and how to follow MLA.”
   Somewhat useful: “Instructor sometimes describes things unclearly, but I always ask questions if I am confused about anything.”
   Not useful: “Research projects are stressful”
4. Focus on the course and the quality of instruction given regarding the course learning outcomes.
   Highly useful: “I loved the projects, in particular group discussions were very important to understand the readings.”
   Somewhat useful: “Peer review, presenting, and office hours helped me with learning.”
   Not useful: “I wish that Cat Courses told us when assignments are due”
5. Offer suggestions that are relevant and plausible to the course or instruction and why you think they would help your learning.
   Highly useful: “If I had the opportunity, I would include more journal writings or just open-ended writing assignments so students could grow more.”
   Somewhat useful: “I wouldn’t change anything.”
   Not useful: “This class is too early.”

Appendix D: Student Input on the Peer-Led Feedback Presentation

Students’ Rating of Their Skill Development (Question 1: On a scale of 1–5 (5 being the highest), how would you rate your skills at providing valuable feedback on course evaluations after this presentation compared to before?)

Rubric Usefulness (Question 2: Do you find the rubric useful?)

Students’ Recommendation of the Presentation to Other Courses (Question 3: Would you recommend that this presentation be facilitated in other courses?)

Appendix E: Post Surveys

Table E1. Peer-Led Presentation: Post Survey (for students)

About the peer-led presentation …

  1. On a scale of 1–5 (with 5 being the highest), how would you rate your skills at providing valuable feedback on course evaluation after this peer-led presentation compared to before the presentation?
    • 1- 2- 3- 4- 5
    • Please explain why:
  2. Do you find the rubric useful?
    • Yes _____ No _____
    • If so, which criterion was the most useful?
    • If not, why not?
  3. Would you recommend any changes to the rubric? Please explain.
  4. How effective do you find the peer-led presentation? What suggestions do you have to improve this presentation?
  5. Would you recommend that this peer-led presentation should be facilitated in other courses?
    • Yes _____ No _____
    • Please explain
  6. If the peer-led presentation was offered by faculty instead of students, what would have been different? Why?

Table E2. Post Survey for Faculty

  1. Do you recommend this “peer to peer” presentation to other faculty prior to administering their mid or final course evaluations? Please explain.
  2. For this question, please indicate whether you are referring to a lower division or upper division course: Did you come away with ideas for concrete changes to make after reflecting on the student feedback you received in your final course evaluations? If so, what changes did you implement? Please give at least one specific example.
  3. Overall, did you notice any difference in the feedback you received from students after they participated in this project? If not, in which way was it the same? If yes, in which way was it different?