Syllabus

Course Name: Study on Test Construction and Scale Development

Type of Credit: Elective

Credit(s): 3.0

Number of Students: 40

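Course Details

Course Description

This course introduces the principles and methods for constructing tests, questionnaires, and scales, along with several commonly used data analysis software programs. By reading related research reports, students develop the ability to construct a measurement instrument independently and to apply analysis software.

*Background knowledge for this course

    Students who wish to take this course must have the following prerequisite knowledge:

1. Undergraduate coursework such as "Educational Statistics," "Educational Testing and Assessment," "Learning Assessment," "Psychological and Educational Testing," "Psychological Testing and Assessment," or other related courses (required).

2. Master's-level coursework such as "Advanced Educational Statistics" and "Educational Research Methods" (required).

3. Ability to operate the SPSS statistical package (required).

4. Coursework offered jointly to master's and doctoral students, such as "Test Theory," "Multivariate Analysis," and "Latent Variable Models" (recommended).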

Core Competence Analysis Chart

Description of competence items


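    Course Objectives & Learning Outcomes

    1. Understand the principles of test construction and scale development.

    2. Understand how test analysis software is applied in instrument construction.

    3. Develop skills in operating and applying test analysis software.

    4. Develop the ability to construct tests and scales independently.

    5. Develop the ability to conduct independent research with self-constructed instruments.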

    Course Schedule & Requirements

    Course Week / Flexible Supplemental Instruction Week / Flexible Supplemental Instruction Type

    V. Outline and Teaching Schedule

     

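    Week  Date  Content and progress
    1     2/22  Course introduction; measurement constructs and latent variable concepts
    2     2/29  Educational (achievement) test construction
    3     3/7   Classical test theory: item analysis, S-P chart analysis
    4     3/14  Classical test theory: reliability/validity analysis
    5     3/21  Modern test theory: introduction to IRT 1
    6     3/28  Modern test theory: introduction to IRT 2
    7     4/4   Spring break (Children's Day); no class
    8     4/11  Psychological scale development; introduction to the Rasch model
    9     4/18  Exploratory factor analysis (EFA)
    10    4/25  Confirmatory factor analysis (CFA)
    11    5/2   Rasch measurement models and analysis 1: dichotomous scoring
    12    5/9   Rasch measurement models and analysis 2: RSM
    13    5/16  Rasch measurement models and analysis 3: PCM
    14    5/23  Rasch measurement models and analysis 4: MFM
    15    5/30  Rasch measurement models and analysis 5: MRCMLM
    16    6/6   Establishing norms and scales
    17    6/13  Differential item functioning (DIF) analysis
    18    6/20  Final report due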

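    Notes: 1. This is a 3-credit course. Class meets every Thursday, 9:10~12:00, in the computer classroom on the second floor of 井塘樓.

            2. Supplementary materials will be posted on the university's Moodle platform; click this course's name to access them.

            3. Instructor's office hours: every Friday afternoon, 2:10~4:00 (by appointment only), in office 020406 on the fourth floor of 井塘樓.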

    Teaching Approach

    Lecture: 40%

    Discussion: 20%

    Group activity: 0%

    E-learning: 40%

    Others: 0%

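    Evaluation Criteria

    1. Completion of assigned readings, attendance, and participation in class activities account for 40% of the semester grade.

    2. Each student chooses a project topic and, applying what was learned in class, constructs a new scale or revises an existing one. The student then collects valid large-sample data (N>1000), analyzes the data, and, drawing on a review of the literature, writes a final report of academic-paper quality (no more than 15,000 characters). The report should aim for a standard suitable for submission to a peer-reviewed academic journal or for publication in a collection of papers, and students are encouraged to submit it for publication rather than simply hand it in. The report accounts for 60% of the semester grade.

    3. Scheduled due date for the final report: June 20, 2024 (ROC year 113), 12:00. Please e-mail the electronic file of the final report (include your name and student ID in the title) to tomnyu@nccu.edu.tw.

    4. The course grade is the sum of the scores for items 1 and 2 above.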

    Textbook & References

    * indicates the required textbook

    *余民寧 (2020)。量表編製與發展:Rasch測量模型的應用。臺北市:心理。

    References

    余民寧 (2006)。潛在變項模式:SIMPLIS的應用。臺北市:高等教育。

    余民寧 (2009)。試題反應理論(IRT)導論與應用。臺北市:心理。

    余民寧 (2012)。心理與教育統計學(第三版)。臺北市:三民。

    余民寧 (2013)。縱貫性資料分析:LGM的應用。臺北市:心理。

    余民寧 (2022)。教育測驗與評量:成就測驗與教學評量(第四版)。臺北市:心理。

    李茂能 (2006)。結構方程模式軟體Amos之簡介及其在測驗編製上之應用。臺北市:心理。

    魏勇剛、龍長權、宋武(譯)(2010)。量表編製:理論與應用。臺北市:五南。(Robert F. DeVellis原著,Scale development: Theory and applications.)

    邱皓政 (2011)。量化研究法():測驗原理與量表發展技術。臺北市:雙葉書廊。

    陳新豐 (2018)。R語言:量表編製、統計分析與試題反應理論。臺北市:五南。

    涂金堂 (2023)。量表編製與SPSS (2)。臺北市:五南。

    Adams, R. J., Wu, M. L., Cloney, D., Berezner, A., & Wilson, M. R. (2020). ACER ConQuest 5: Generalized item response modeling software [Computer software]. Camberwell, Victoria: Australian Council for Educational Research.

    Alagumalai, S., Curtis, D. D., & Hungi, N. (Eds.) (2005). Applied Rasch measurement: A book of exemplars. New York: Springer-Verlag.

    Allen, M. J., & Yen, W. M. (2001). Introduction to measurement theory (2nd ed.). Monterey, CA: Brooks/Cole.

    Andersen, E. B. (1973). Conditional inference and models for measuring. Copenhagen: Mentalhygiejnisk Forlag.

    Andrich, D., & Marais, I. (2019). A course in Rasch measurement theory: Measuring in the educational, social and health sciences. New York: Springer.

    Bandalos, D. L. (2018). Measurement theory and applications for the social sciences. Guilford.

    Bartram, D., & Hambleton, R. (Eds.) (2006). Computer-based testing and the internet: Issues and advances. Hoboken, NJ: John Wiley & Sons.

    Berk, R. A. (Ed.) (1982). Handbook of methods for detecting test bias. Baltimore, MD: Johns Hopkins University Press.

    De Boeck, P., & Wilson, M. (Eds.) (2004). Explanatory item response models: A generalized linear and nonlinear approach. New York: Springer-Verlag.

    Bond, T. G., & Fox, C. M. (2015). Applying the Rasch model: Fundamental measurement in the human sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

    Brennan, R. L. (2001). Generalizability theory. New York: Springer-Verlag.

    Brennan, R. L. (Ed.) (2006). Educational measurement (4th ed.). Washington, DC: National Council on Measurement in Education.

    Cardinet, J., Johnson, S., & Pini, G. (2009). Applying generalizability theory using EduG. Routledge/Psychpress.

    Camilli, G., & Shepard, L. A. (1994). Methods for identifying biased test items. Thousand Oaks, CA: Sage.

    Cizek, G. J. (Ed.) (2001). Setting performance standards: Concepts, methods, and perspectives. Mahwah, NJ: Lawrence Erlbaum Associates.

    Cizek, G. J., & Bunch, M. B. (2006). Standard setting: A guide to establishing and evaluating performance standards on tests. Thousand Oaks, CA: Sage.

    Coaley, K. (2010). An introduction to psychological assessment and psychometrics. Thousand Oaks, CA: Sage.

    Cohen, R., & Swerdlik, M. (2009). Psychological testing and assessment: An introduction to tests and measurement (7th ed.). McGraw-Hill.

    Crocker, L., & Algina, J. (2006). Introduction to classical and modern test theory (2nd ed.). New York: Wadsworth.

    DeVellis, R. F. (2017). Scale development: Theory and applications (4th ed.). Thousand Oaks, CA: Sage.

    Dorans, N. J., Pommerich, M., & Holland, P. W. (Eds.) (2007). Linking and aligning scores and scales. New York: Springer-Verlag.

    Downing, S. M., & Haladyna, T. M. (Eds.) (2006). Handbook of test development. Mahwah, NJ: Lawrence Erlbaum Associates.

    Du Toit, M. (Ed.) (2003). IRT from SSI: BILOG-MG, MULTILOG, PARSCALE, TESTFACT. Chicago, IL: Scientific Software International, Inc.

    Embretson, S. E. (Ed.) (1985). Test design: Developments in psychology and psychometrics. Orlando, FL: Academic Press.

    Engelhard, G. Jr. (2012). Invariant measurement. Taylor & Francis.

    Engelhard, G. Jr., & Wang, J. (2021). Rasch models for solving measurement problems: Invariant measurement in the social sciences. Thousand Oaks, CA: Sage.

    Fischer, G. H., & Molenaar, I. W. (Eds.) (1995). Rasch models: Foundations, recent developments, and applications. New York: Springer-Verlag.

    Furr, R. M. (2011). Scale construction and psychometrics for social and personality psychology. Thousand Oaks, CA: Sage.

    Haladyna, T. M. (1996). Writing test items to evaluate higher order thinking. Allyn & Bacon.

    Haladyna, T. M. (2004). Developing and validating multiple-choice test items. Mahwah, NJ: Lawrence Erlbaum Associates.

    Hand, D. J. (2004). Measurement theory and practice. Hodder Arnold.

    Hektner, J. M., Schmidt, J. A., & Csikszentmihalyi, M. (2006). Experience sampling method: Measuring the quality of everyday life. Thousand Oaks, CA: Sage.

    Henshaw, J. M. (2006). Does measurement measure up? How numbers reveal and conceal the truth. Baltimore, MD: The Johns Hopkins University Press.

    Holland, P. W., & Rubin, D. B. (1982). Test equating. New York: Academic Press.

    Holland, P. W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ: Lawrence Erlbaum Associates.

    Irwing, P., Booth, T., & Hughes, D. J. (2018). The Wiley handbook of psychometric testing: A multidisciplinary reference on survey, scale and test development. Blackwell.

    Irvine, S. H., & Kyllonen, P. C. (Eds.) (2002). Item generation for test development. Mahwah, NJ: Lawrence Erlbaum Associates.

    Jalali, S. (2011). Item response theory and its applications to differential item functioning, item banking, and computer adaptive testing in TEFL. Iran: LAP LAMBERT Academic Publishing.

    Keeves, J. P., & Alagumalai, S. (Eds.) (2005). Applied Rasch measurement: A book of exemplars: Papers in honour of John P. Keeves. Dordrecht; Norwell, MA: Springer.

    Khine, M. S. (Ed.) (2020). Rasch measurement: Applications in quantitative educational research. New York: Springer-Verlag.

    Kline, T. J. B. (2005). Psychological testing: A practical approach to design and evaluation. Thousand Oaks, CA: Sage.

    Kline, P. (2016). A handbook of test construction: Introduction to psychometric design. New York, NY: Routledge.

    Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices (2nd ed.). New York: Springer-Verlag.

    Kurpius, S. E. R., & Stafford, M. E. (2005). Testing and measurement: A user-friendly guide. Thousand Oaks, CA: Sage.

    McCoach, D. B., Gable, R. K., & Madura, J. P. (2013). Instrument development in the affective domain: School and corporate applications (3rd ed.). New York, NY: Springer.

    Millsap, R. E. (2011). Statistical approaches to measurement invariance. Routledge Academic.

    Nering, M. L., & Ostini, R. (2010). Handbook of polytomous item response theory models: Developments and applications. Taylor and Francis.

    Nering, M. L., & Ostini, R. (2006). Polytomous item response theory models. Thousand Oaks, CA: Sage.

    Netemeyer, R. G., Bearden, W. O., & Sharma, S. (2003). Scaling procedures: Issues and applications. Thousand Oaks, CA: Sage.

    Osterlind, S. J. (1998). Constructing test items: Multiple-choice, constructed-response, performance and other formats. Norwell, MA: Kluwer Academic.

    Osterlind, S. J., & Everson, H. T. (2009). Differential item functioning. Thousand Oaks, CA: Sage.

    Ostini, R., & Nering, M. L. (2005). Polytomous item response theory models. Thousand Oaks, CA: Sage.

    Price, L. R. (2016). Psychometric methods: Theory into practice. Guilford.

    Rabinovich, S. (2005). Measurement errors and uncertainties: Theory and practice (3rd ed.). New York: Springer-Verlag.

    Rasch, G. (1980). Probability models for some intelligence and attainment tests. Chicago: The University of Chicago Press. (Original edition was published in 1960 by The Danish Institute for Educational Research, Copenhagen)

    Raykov, T., & Marcoulides, G. A. (2011). Introduction to psychometric theory. New York, NY: Routledge.

    Raykov, T. (2015). Scale construction and development. Lecture Notes. Measurement and quantitative methods. East Lansing, MI: Michigan State University.

    Reckase, M. D. (2009). Multidimensional item response theory. New York: Springer-Verlag.

    Reise, S. P., & Revicki, D. A. (2015). Handbook of item response theory modeling: Applications to typical performance assessment. New York: Routledge.

    Rust, J., & Golombok, S. (2009). Modern psychometrics: The science of psychological assessment (3rd ed.). London: Routledge.

    Salkind, N. J. (2005). Tests & measurement for people who (think they) hate tests & measurement. Thousand Oaks, CA: Sage.

    Saris, W. E., & Gallhofer, I. N. (2007). Design, evaluation, and analysis of questionnaires for survey research. Hoboken, NJ: Wiley.

    van Schuur, W. H. (2011). Ordinal item response theory: Mokken scale analysis. Thousand Oaks, CA: Sage.

    Shavelson, R. J., & Webb, N. M. (1991). Generalizability theory: A primer. Newbury Park, CA: Sage.

    Shrock, S. A., & Coscarelli, W. C. (2007). Criterion-referenced test development: Technical and legal guidelines for corporate training (3rd ed.). New York: John Wiley & Sons.

    Shiel, G., & Cartwright, F. (2015). Analyzing data from a national assessment of educational achievement. Washington, DC: World Bank.

    Shultz, K. S., & Whitney, D. J. (2004). Measurement theory in action: Case studies and exercises. Thousand Oaks, CA: Sage.

    Sijtsma, K., & Molenaar, I. W. (2002). Introduction to nonparametric item response theory. Thousand Oaks, CA: Sage.

    Smith, E. V., & Smith, R. M. (2004). Introduction to Rasch measurement: Theory, models and applications. JAM Press.

    Streiner, D. L., Norman, G. R., & Cairney, J. (2015). Health measurement scales: A practical guide to their development and use (5th ed.). Oxford, UK: Oxford University Press.

    Tang, K. L. (1996). Polytomous item response theory (IRT) models and their applications in large-scale testing programs: Review of literature. ETS.

    Haladyna, T. M. (1999). Developing and validating multiple-choice test items.

    Van der Linden, W. J. (2005). Linear models for optimal test design. New York: Springer.

    Van der Linden, W. J., & Hambleton, R. K. (Eds.) (2016). Handbook of item response theory: Volume one: Models. Chapman & Hall.

    Van der Linden, W. J. (Ed.) (2016). Handbook of item response theory: Volume two: Statistical tools. Chapman & Hall.

    Van der Linden, W. J., & Hambleton, R. K. (Eds.) (2018). Handbook of item response theory: Volume three: Applications. Chapman & Hall.

    Viswanathan, M. (2005). Measurement error and research design. Thousand Oaks, CA: Sage.

    Von Davier, M., & Carstensen, C. H. (2009). Multivariate and mixture distribution Rasch models: Extensions and applications. New York: Springer-Verlag.

    Von Davier, A. A., Holland, P. W., & Thayer, D. T. (2004). The kernel method of test equating. New York: Springer-Verlag.

    Wainer, H., Bradlow, E. T., & Wang, X. (2007). Testlet response theory and its applications. Cambridge, UK: Cambridge University Press.

    Wainer, H., & Braun, H. I. (Eds.) (1988). Test validity. Hillsdale, NJ: Lawrence Erlbaum Associates.

    Waugh, R. (2010). Applications of Rasch measurement in education. Nova Science Pub Inc.

    Waugh, R. (2010). Specialized Rasch measures applied at the forefront of education. Nova Science Pub Inc.

    Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, NJ: Lawrence Erlbaum Associates.

    Wind, S. A. (2022). Exploring rating scale functioning for survey research. Thousand Oaks, CA: Sage.

    Wind, S. A., & Hua, C. (2022). Rasch measurement theory analysis in R. Chapman & Hall.

    Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. Chicago: MESA Press.

    Wright, B. D., & Stone, M. H. (1979). Best test design. Chicago: MESA Press.

    Wright, R. J. (2007). Educational assessment: Tests and measurements in the age of accountability. Thousand Oaks, CA: Sage.

    Wu, M. L., Tam, H. P., & Jen, T. H. (2016). Educational measurement for applied researchers. Singapore: Springer Nature Singapore.

    Yen, W. M. (1992). Item response theory. In M. Alkin (Ed.), Encyclopedia of educational research (6th ed., pp. 657-667). New York: Macmillan.

    Yen, W. M., & Fitzpatrick, A. R. (2006). Item response theory. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 111-154). Westport, CT: Praeger.

    Yu, C. (2002). Evaluating cutoff criteria of model fit indices for latent variable models with binary and continuous outcomes. Los Angeles, CA: University of California, Los Angeles.

    Zoanetti, N. (2011). Applications of item response theory to explore rater data quality: Analysing fit statistics and plausible values to identify and correct for suspect rater data. Iran: LAP LAMBERT Academic Publishing.

    Papers assigned for class reading

    Respect intellectual property rights: always use legitimate editions of books.

    Course Related Links

    Course Attachments

    Use of smart devices (smartphones, tablets, and other portable devices) during class: requires the instructor's approval.
