Are you interested in what the research says about student course evaluations, aka student evaluation of teaching (SET)? Jordan Troisi was, so he asked members of PsychTeacher, the listserv operated by the Society for the Teaching of Psychology, for articles on SET. He received plenty of responses. Here are the references sent to him, along with a few others I found along the way. I included links to the original articles when available. This is by no means a comprehensive list, but it should be enough to get you started.
Ambady, N., & Rosenthal, R. (1993). Half a minute: Predicting teacher evaluations from thin slices of nonverbal behavior and physical attractiveness. Journal of Personality and Social Psychology, 64(3), 431-441. doi:10.1037//0022-3514.64.3.431
Bennett, S. K. (1982). Student perceptions of and expectations for male and female instructors: Evidence relating to the question of gender bias in teaching evaluation. Journal of Educational Psychology, 74(2), 170-179. doi:10.1037//0022-0663.74.2.170
Benton, S. L., & Cashin, W. E. (n.d.). Student ratings of teaching: A summary of research and literature. Retrieved February 4, 2017, from http://www.ideaedu.org/research-and-papers/idea-papers/50-student-ratings-teaching-summary-research-and-literature
Boring, A. (2017). Gender biases in student evaluations of teaching. Journal of Public Economics, 145, 27-41. doi:10.1016/j.jpubeco.2016.11.006
Boring, A., Ottoboni, K., & Stark, P. B. (2016). Student evaluations of teaching (mostly) do not measure teaching effectiveness. ScienceOpen Research. doi:10.14293/s2199-1006.1.sor-edu.aetbzc.v1
A January 25, 2016 NPR story (“Why female professors get lower ratings”) features this study.
Boysen, G. A. (2016). Using student evaluations to improve teaching: Evidence-based recommendations. Scholarship of Teaching and Learning in Psychology, 2(4), 273-284. doi:10.1037/stl0000069
Boysen, G. A., Richmond, A. S., & Gurung, R. A. (2015). Model teaching criteria for psychology: Initial documentation of teachers’ self-reported competency. Scholarship of Teaching and Learning in Psychology, 1(1), 48-59. doi:10.1037/stl0000023
Boysen, G. A., Kelly, T. J., Raesly, H. N., & Casner, R. W. (2013). The (mis)interpretation of teaching evaluations by college faculty and administrators. Assessment & Evaluation in Higher Education, 39(6), 641-656. doi:10.1080/02602938.2013.860950
Carrell, S., & West, J. (2010). Does professor quality matter? Evidence from random assignment of students to professors. Journal of Political Economy, 118(3), 409-432. doi:10.3386/w14081
DeWitt, P. (2015, January 02). 10 seconds: The time it takes a student to size you up. Retrieved from http://blogs.edweek.org/edweek/finding_common_ground/2015/01/10_seconds_the_time_it_takes_a_student_to_size_you_up.html
Eiszler, C. F. (2002). College students' evaluations of teaching and grade inflation. Research in Higher Education, 43(4), 483-501.
Gamliel, E., & Davidovitz, L. (2005). Online versus traditional teaching evaluation: Mode can matter. Assessment & Evaluation in Higher Education, 30(6), 581-592. doi:10.1080/02602930500260647
Kite, M. E. (2012). Effective evaluation of teaching: A guide for faculty and administrators. Retrieved from the Society for the Teaching of Psychology web site: http://teachpsych.org/ebooks/evals2012/index.php
MacNell, L., Driscoll, A., & Hunt, A. N. (2014). What's in a name: Exposing gender bias in student ratings of teaching. Innovative Higher Education, 40(4), 291-303. doi:10.1007/s10755-014-9313-4
A February 23, 2015 Inside Higher Ed blog (“Gender bias in student evaluations”) features this study.
Onwuegbuzie, A. J., Witcher, A. E., Collins, K. M., Filer, J. D., Wiedmaier, C. D., & Moore, C. W. (2007). Students' perceptions of characteristics of effective college teachers: A validity study of a teaching evaluation form using a mixed-methods analysis. American Educational Research Journal, 44(1), 113-160. doi:10.3102/0002831206298169
Ory, J. C. (2000). Teaching evaluation: Past, present, and future. In Evaluating Teaching in Higher Education: A Vision for the Future. San Francisco, CA: Jossey-Bass.
Pusateri, T. (2016, December 16). Student feedback on teaching: Why mean ratings may not tell the full story. Retrieved from http://cetl.kennesaw.edu/article/student-feedback-teaching-why-mean-ratings-may-not-tell-full-story
Richmond, A. S., Boysen, G. A., Gurung, R. A., Tazeau, Y. N., Meyers, S. A., & Sciutto, M. J. (2014). Aspirational model teaching criteria for psychology. Teaching of Psychology, 41(4), 281-295. doi:10.1177/0098628314549699
Sidanius, J., & Crane, M. (1989). Job evaluation and gender: The case of university faculty. Journal of Applied Social Psychology, 19(2), 174-197. doi:10.1111/j.1559-1816.1989.tb00051.x
Spooren, P., Brockx, B., & Mortelmans, D. (2013). On the validity of student evaluation of teaching: The state of the art. Review of Educational Research, 83(4), 598-642.
Sproule, R. (2000). Student evaluation of teaching: Methodological critique. Education Policy Analysis Archives, 8(50). doi:10.14507/epaa.v8n50.2000
Sproule, R. (2002). The underdetermination of instructor performance by data from the student evaluation of teaching. Economics of Education Review, 21(3), 287-294. doi:10.1016/s0272-7757(01)00025-5
Sproule, R., & Valsan, C. (2009). The student evaluation of teaching: Its failure as a research program, and as an administrative guide. Economic Interferences, 11(25), 125-150. Retrieved from http://www.amfiteatrueconomic.ro/temp/Article_641.pdf
Stark, P. (2013, October 18). Do student evaluations measure teaching effectiveness? Retrieved from http://blogs.berkeley.edu/2013/10/14/do-student-evaluations-measure-teaching-effectiveness
Stark, P. B., & Freishtat, R. (2014). An evaluation of course evaluations. ScienceOpen Research. doi:10.14293/s2199-1006.1.sor-edu.aofrqa.v1
A September 26, 2014 NPR story (“Student course evaluations get an ‘F’”) features this study.
Stroebe, W. (2016, July 17). Student evaluations of teaching: No measure for the TEF. Retrieved from https://www.timeshighereducation.com/comment/student-evaluations-teaching-no-measure-tef
Stroebe, W. (2016). Why good teaching evaluations may reward bad teaching: On grade inflation and other unintended consequences of student evaluations. Perspectives on Psychological Science, 11(6), 800-816. doi:10.1177/1745691616650284
Wolfer, T. A., & McNown Johnson, M. (2003). Re-evaluating student evaluation of teaching: The teaching evaluation form. Journal of Social Work Education, 39(1), 111-121.