Type of Credit: Elective
Credit(s)
Number of Students
This course guides graduate students through the fundamental quantitative methods frequently used in linguistic research. Students are expected to have basic programming skills, especially in R, because the course will not introduce basic R commands from scratch. The first two hours of each class will be mainly lecture, and the last hour will be a lab session in which students get their hands "dirty".
Students will learn why quantitative methods are important, how to analyze large datasets quantitatively, and how to explore the underlying linguistic cues systematically. The course covers not only essential quantitative methods but also basic machine learning approaches to big data and how to evaluate models. It will present and demonstrate how to implement these approaches in R. Finally, students must submit a final project at the end of the semester that applies the skills learned in the course.
Furthermore, this course collaborates with the company Trend Micro to tackle cybersecurity-related issues (e.g., the detection of false messages). Students will be provided with real-time datasets and are encouraged to develop projects relevant to, but not limited to, these topics. They are expected to contribute linguistic insights to this interdisciplinary work.
Description of Competency Items
TBA
Week | Topic | Main References | Other References | Assignments |
1 | Course Introduction & Why Quantitative Methods? | | | |
2 | Descriptive Statistics & Probability Distributions | Baayen (2008) Ch2-3; Gries (2021) Ch3; paper readings; instructor's handouts | | Weekly assignment |
3 | Descriptive Statistics & Probability Distributions | Baayen (2008) Ch2-3; Gries (2021) Ch3; paper readings; instructor's handouts | | Weekly assignment |
4 | Test Statistics, Effect Size, Standardization, and Regularization | Baayen (2008) Ch4; Gries (2021) Ch4; paper readings; instructor's handouts | | Weekly assignment |
5 | Inter-annotator Agreement | Paper readings; instructor's handouts | | Weekly assignment |
6 | Linear Regression and Correlation | Baayen (2008) Ch6; Gries (2021) Ch5; paper readings; instructor's handouts | | Weekly assignment |
7 | Linear Mixed Models | Baayen (2008) Ch6-7; Gries (2021) Ch5; paper readings; instructor's handouts | | Weekly assignment |
8 | (Mixed-Effects) Logistic Regression | Baayen (2008) Ch7; Gries (2021) Ch6; paper readings; instructor's handouts | | Weekly assignment |
9 | Mini Hackathon | | | |
10 | Mini Hackathon: Presentation | | | |
11 | Principal Component Analysis and Factor Analysis | Baayen (2008) Ch5; paper readings; instructor's handouts | | Weekly assignment |
12 | Multidimensional Scaling and Hierarchical Cluster Analysis | Baayen (2008) Ch5; paper readings; instructor's handouts | | Weekly assignment |
13 | Discriminative Analysis and Nonlinearities | Paper readings; instructor's handouts | | Final project preparation |
14 | Decision Trees and Random Forests | Gries (2021) Ch7; paper readings; instructor's handouts | | |
15 | Embeddings and Latent Semantic Analysis | Paper readings; instructor's handouts | | |
16 | Industry Guest Talk | | | Trend Micro is invited to share case studies in AI processing and issues related to false messages. |
17 | Final Project Presentation | | | Final project showcase |
18 | Term Paper Due | | | Final project paper submission |
TBA
TBA