Type of Credit: Elective
Credit(s)
Number of Students
This course is for students who want to process datasets automatically with a programming language. The goal is to teach students to write scripts with published packages so that they can analyze data linguistically, building on a basic knowledge of data processing, statistics, and models. We will handle text data with R, one of today's widely used programming languages. Research articles on current linguistic issues and applied approaches will be assigned and discussed in depth in class. At the end of the course, students will give a final presentation and submit a term paper.
*Bring your own laptop to every class.
Competency Description
By the end of the course, students are expected to be able to write their own scripts to process text data in their own research projects. They will also learn the basic procedures of machine learning applications.
*Subject to change (course content is tentative; please bring your own laptop to class every week)
Week | Topic | Note |
---|---|---|
1 | Course Introduction | |
2 | Data preprocessing | |
3 | Basic Knowledge of Stats | Word segmentation and stopwords |
4 | Descriptive Stats | Function and Data Structure: matrix, dataframe, and list |
5 | Analytical Stats I | Loop and if/else Statement |
6 | Analytical Stats II | Data Visualization |
7 | Linear Model | Feature extraction: bag of words and tf-idf |
8 | Logistic Regression | Machine learning: decision tree and random forest |
9 | Mixed Model and HCA | Machine learning: support vector machine |
10 | Opinion mining: crawler | |
11 | Collostructional Analysis | Rmarkdown and Shiny |
12 | R & API | |
13 | Proposal Discussion | |
14 | Final Presentation | |
15 | Final Presentation | |
16 | Term Paper Due | |
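(Tentative)
Class participation 20%: attendance, in-class discussion, and participation in group discussion
Homework 20%: completeness of homework assignments
In-class presentation 20%: completeness of the overall work and individual participation
Final oral presentation 20%
Final written paper 20%: completeness of the individual final written work
*Plagiarism is strictly prohibited.
*No more than three unexcused absences are allowed.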
Cotton, Richard. (2013). Learning R. O’Reilly.
Teetor, Paul. (2011). R Cookbook. O’Reilly.
Gries, Stefan Th. (2013). Statistics for Linguistics with R: A Practical Introduction (2nd ed.).
Stefanowitsch, Anatol, & Gries, Stefan Th. (2005). Covarying collexemes.
Learning R programming. https://www.tutorialspoint.com/r/r_variables.htm
R Tutorial for Beginners: Learning R Programming. https://www.guru99.com/r-tutorial.html
Learn R Programming. https://www.datamentor.io/r-programming/
R Tutorial – Outstanding Introduction to R Programming for Data Science! https://data-flair.training/blogs/r-tutorial/
DataCamp. https://www.datacamp.com/