Type of Credit: Elective
Credit(s)
Number of Students
This course is for students who are interested in processing datasets automatically with a programming language. The goal is for students to learn to write scripts using released packages so that they can analyze data linguistically, drawing on basic knowledge of data processing and machine learning models. In this course, we will handle text data with R, a widely used programming language. Research articles on current linguistic issues and applied approaches will be assigned and discussed in depth in class. One mini-hackathon will be held after midterm week to help students integrate the skills they have learned during the course. At the end of the course, students will give a final presentation and submit a final term paper.
Competency Description
By the end of the course, students are expected to be able to write their own scripts to process text data in their own research projects. They will also have learned the basic procedures of machine learning applications.
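As a rough idea of the kind of script meant here, the sketch below covers the preprocessing topics listed in the schedule (search and replace, word segmentation, stopwords) using base R only. The sentences and the stopword list are made up for illustration and are not course materials; splitting on whitespace is a simplification that only suits space-delimited languages such as English.

```r
# Toy input; these sentences are invented for illustration.
docs <- c("The cat sat on the mat!", "A dog, a cat, and THE bird.")

# Search and replace with a regular expression:
# lowercase everything, then delete punctuation characters.
clean <- gsub("[[:punct:]]", "", tolower(docs))

# Naive word segmentation: split each document on whitespace.
tokens <- strsplit(clean, "\\s+")

# Remove stopwords (a tiny hand-made list, not a standard one).
stopwords <- c("the", "a", "on", "and")
content_tokens <- lapply(tokens, function(x) x[!x %in% stopwords])

print(content_tokens)
```

Each element of `content_tokens` is a character vector holding the remaining words of one document, which is the form the later feature-extraction steps usually start from.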
*Subject to change
Week | Topic | Note
1 | Course Introduction |
2 | Data preprocessing: advanced search and replace |
3 | Word segmentation and stopwords | reading
4 | Functions and Data Structures: matrix, dataframe, and list |
5 | Loops and if/else Statements |
6 | Data Visualization |
7 | Feature extraction: bag of words and tf-idf | reading
8 | Machine learning: decision tree and random forest | reading
9 | Machine learning: support vector machine | reading
10 | mini-Hackathon |
11 | mini-Hackathon Presentation |
12 | Machine learning: logistic regression | reading
13 | Opinion mining: crawler | reading
14 | R Markdown and Shiny | reading
15 | Proposal Discussion |
16 | R & API |
17 | Final Presentation |
18 | Term Paper Due |
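As a preview of the Week 7 topic, the following is a minimal base-R sketch of bag-of-words counts and one common tf-idf weighting (term frequency as a within-document proportion, inverse document frequency as log(N / df)). The three tiny documents are invented for illustration, and other tf-idf variants exist; the packages and formulas used in class may differ.

```r
# Toy tokenized documents; invented for illustration.
docs <- list(
  d1 = c("cat", "sat", "mat"),
  d2 = c("dog", "cat", "bird"),
  d3 = c("bird", "bird", "song")
)

# Vocabulary: every distinct word across all documents.
vocab <- sort(unique(unlist(docs)))

# Bag of words: one row per document, one column per word, cells are counts.
bow <- t(vapply(
  docs,
  function(x) tabulate(match(x, vocab), nbins = length(vocab)),
  integer(length(vocab))
))
colnames(bow) <- vocab

# Term frequency: each count divided by the length of its document.
tf <- bow / rowSums(bow)

# Inverse document frequency: log(number of documents / documents containing the word).
df <- colSums(bow > 0)
idf <- log(nrow(bow) / df)

# tf-idf: multiply each column of tf by the idf of that word.
tfidf <- sweep(tf, 2, idf, `*`)

print(round(tfidf, 3))
```

Words that appear in fewer documents, such as "song" here, receive a higher idf than words shared across documents, such as "cat".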
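(Tentative)
Class participation 30%: attendance, in-class discussion, and participation in group discussion
Homework 30%: completeness of homework assignments
Mini-hackathon 20%: completeness of the overall result and individual participation
Individual presentation 20%: completeness of the individual final presentation
*Plagiarism is strictly prohibited.
*No more than three unexcused absences are allowed.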
Cotton, Richard. (2013). Learning R. O’Reilly.
Teetor, Paul. (2011). R Cookbook. O’Reilly.
Learning R programming. https://www.tutorialspoint.com/r/r_variables.htm
R Tutorial for Beginners: Learning R Programming. https://www.guru99.com/r-tutorial.html
Learn R Programming. https://www.datamentor.io/r-programming/
R Tutorial – Outstanding Introduction to R Programming for Data Science! https://data-flair.training/blogs/r-tutorial/
DataCamp. https://www.datacamp.com/