Syllabus

ST 563: Introduction to Statistical Learning (Online)
Fall, 2021
3 credit hours

 

Instructor: Dr. Arnab Maity
Email: amaity@ncsu.edu
Office hours (via Zoom): Wednesday 4:30 – 5:30 PM EST, or by appointment

 

Teaching Assistant (TA): Hunter Jiang
Email: hjiang24@ncsu.edu
Office hours (via Zoom): Monday/Thursday 4:30 – 6:00 PM, or by appointment

 

Course Description: This course covers supervised and unsupervised statistical learning techniques, such as regression, classification, clustering, and dimension reduction. Specific topics include linear and generalized linear models, regularized regression, bootstrap and cross-validation, discriminant analysis, nearest neighbor methods, classification and regression trees (random forests, bagging, boosting), support vector machines, neural networks, principal component analysis, and hierarchical clustering. We will use the R software for demonstration, but students can choose any language they like to complete the assignments.

Prerequisites: Knowledge of basic statistics at the level of ST 512 or ST 517 or equivalent, and basic knowledge of matrix algebra.

Required Text: The textbook for this course is “An Introduction to Statistical Learning”, second edition, by Gareth James et al. You do not need to buy the book; it is available for free at https://www.statlearning.com/ or through the NCSU library. Materials will also be taken from a variety of other books and sources, none of which you need to purchase. Some references are given below.

  1. “Hands-on machine learning with R” by Bradley Boehmke & Brandon Greenwell
  2. “Machine learning with R, the tidyverse, and mlr” by Hefin I. Rhys
  3. “Deep learning with R” by Francois Chollet with J. J. Allaire

Calculator: Students will need a basic calculator that can do addition, subtraction, multiplication, division, square roots, exponentials, and logarithms.

Scanner: Students will need to submit their homework in PDF or Word format. It is acceptable to write the homework by hand and scan it into a PDF, in which case you will need access to a scanner or a phone scanning application. Alternatively, students may type their homework assignments.

Software: Students in this course will use the R statistical software. This software is open-source and free to anyone. It is widely used in statistics and is especially well suited to visualizations and custom analyses. You may also want to download RStudio.

Communication: Students are expected to check their NCSU email regularly to receive course announcements. Students who do not use their NCSU email should arrange to have this email forwarded to an account they do use. Due to university regulations, the instructor can send course announcements only to NCSU email addresses.

Support mechanisms: Since this course is online, the instructor and TA will hold virtual office hours via Zoom. Additionally, a general discussion board on the course website will allow students to ask questions of each other. Students should take advantage of this discussion board and use it as the first place to ask questions.

Course Content: Students in this course do not attend a typical class period. Instead, students will watch videos and complete activities. Students should set aside sufficient time in their schedules to complete these materials.

Grades: It is the student’s responsibility to be aware of their grades in the course and the appropriate level of work required. Your final grade in this course will depend on the following:

  1. Midterm Exam I: 20% of grade
  2. Midterm Exam II: 20% of grade
  3. Online Discussion Posting: 10% of grade
  4. Homework: 30% of grade
  5. Final Project: 20% of grade

The course uses the following grading scale:

A+ >= 98 > A >= 93 > A- >= 90 >
B+ >= 88 > B >= 83 > B- >= 80 >
C+ >= 78 > C >= 73 > C- >= 70 >
D+ >= 68 > D >= 63 > D- >= 60 > F
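
As an illustration of how the weights and grading scale above combine, here is a minimal sketch in Python. It assumes each component is scored on a 0 to 100 scale and that no rounding is applied before the letter cutoffs are checked; the syllabus does not state a rounding rule, so treat that as an assumption. The function names and example scores are hypothetical.

```python
# Component weights from the list above (percent of final grade).
WEIGHTS = {
    "midterm_1": 20,
    "midterm_2": 20,
    "discussion": 10,
    "homework": 30,
    "project": 20,
}

# Lower cutoffs from the grading scale, highest first.
CUTOFFS = [
    (98, "A+"), (93, "A"), (90, "A-"),
    (88, "B+"), (83, "B"), (80, "B-"),
    (78, "C+"), (73, "C"), (70, "C-"),
    (68, "D+"), (63, "D"), (60, "D-"),
]


def final_score(scores):
    """Weighted average of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Return the first letter whose lower cutoff the score meets (assumes no rounding)."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


# Hypothetical component scores, for illustration only.
example = {"midterm_1": 85, "midterm_2": 90, "discussion": 100,
           "homework": 92, "project": 88}
score = final_score(example)
print(score, letter_grade(score))  # 90.2 A-
```

With these example scores, the weighted total is 90.2, which falls in the A- band.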

 

Requirements for Credit-Only (S/U) Grading: To receive a grade of S, students are required to take all exams and quizzes, complete all assignments, and earn a grade of C- or better. Conversion from letter grading to credit-only (S/U) grading is subject to university deadlines. Refer to the Registration and Records calendar for deadlines related to grading. For more details refer to http://policies.ncsu.edu/regulation/reg-02-20-15.

 

Requirements for Auditors (AU): Information about and requirements for auditing a course can be found at http://policies.ncsu.edu/regulation/reg-02-20-04. Auditors are expected to attend class regularly and submit homework on the same schedule as the other students. The final grade for auditors (AU or NR) will be based on their final homework average (final homework grade will be calculated by dropping the two lowest grades). A final homework score of at least 70% and participation in the online discussion are required for an AU.
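
The auditor homework rule can be sketched as follows. This is an illustrative Python snippet only; it assumes every homework is graded on a 0 to 100 scale and weighted equally, which the syllabus does not specify. The function name and the grades are hypothetical.

```python
def auditor_homework_average(grades, n_drop=2):
    """Mean of homework grades after dropping the n_drop lowest (equal weights assumed)."""
    if len(grades) <= n_drop:
        raise ValueError("need more grades than the number dropped")
    kept = sorted(grades)[n_drop:]  # discard the n_drop lowest grades
    return sum(kept) / len(kept)


# Hypothetical homework grades, for illustration only.
grades = [50, 0, 80, 90, 70, 100]
average = auditor_homework_average(grades)
print(average, average >= 70)  # 85.0 True
```

Here the grades 0 and 50 are dropped, leaving an average of 85, which meets the 70% homework threshold; participation in the online discussion is still required separately.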

 

Policies on Incomplete Grades: If an extended deadline is not authorized by the Graduate School, an unfinished incomplete grade will automatically change to an F after either (a) the end of the next regular semester in which the student is enrolled (not including summer sessions), or (b) the end of 12 months if the student is not enrolled, whichever is shorter. Incompletes that change to F will count as an attempted course on transcripts. The burden of fulfilling an incomplete grade is the responsibility of the student. The university policy on incomplete grades is located at http://policies.ncsu.edu/regulation/reg-02-50-03

Additional information relative to incomplete grades for graduate students can be found in the Graduate Administrative Handbook https://grad.ncsu.edu/students/rules-and-regulations/handbook/

 

Homework: There will be homework assignments each week. Many of the assignments will include a programming portion. No late assignments are accepted. Homework must be submitted in either Word or PDF format. You may need a scanner or a picture-to-PDF application. Alternatively, students may type their homework assignments.

 

Project: Since this course introduces a variety of methods to analyze data, there will be a project, done in a group, applying the techniques to real data. Details of the project will be posted on the course page.

 

Discussion Board Postings: Students in this course will be broken into small groups. Each week these groups will answer questions and discuss course content using the online discussion board. These discussion questions will be keyed to the specific week’s material and will have specific due dates. Due dates are firm, and your fellow group members will be counting on your contributions to be submitted by the due date. Discussion postings will be graded based on the quality and timeliness of responses. Students are expected to treat each other with respect on the boards.

 

Exams: All exams (except the final project) will be conducted online using Moodle Quiz. Any choice of calculator (such as a TI-83) may be used on all exams. Students who are unable to attend an exam for a legitimate unavoidable reason may take a make-up exam only if the student provides suitable documentation of the delay and takes the make-up in a timely manner. Students may take the exam on either of the two days listed in the course outline. The midterm exam must be completed within 2.0 hours of starting the attempt. Only one attempt is allowed.

 

Electronically-Hosted Course Components: Students may be required to disclose personally identifiable information to other students in the course, via electronic tools like email or web-postings, relevant to the course. Examples include online discussions of class topics and posting of student coursework. All students are expected to respect the privacy of each other by not sharing or using such information outside the course.

 

Students with Disabilities: Reasonable accommodations will be made for students with verifiable disabilities. In order to take advantage of available accommodations, students must register with the Disability Services Office at Suite 2221, Student Health Center, Campus Box 7509, 919-515-7653. For more information on NC State’s policy on working with students with disabilities, please see the Academic Accommodations for Students with Disabilities Regulation (REG 02.20.01) at https://policies.ncsu.edu/regulation/reg-02-20-01/

 

Academic Misconduct: Cheating, plagiarism, and other forms of academic dishonesty will not be tolerated. To create a fair and equitable environment, the instructor aggressively enforces the university policies on academic misconduct. All exams are to be completed individually. Although working together on written assignments to overcome obstacles is encouraged, each student must compose and write their own analysis and reports. All cases of academic misconduct will be handled as set out in university policies. For additional information see: http://policies.ncsu.edu/policy/pol-11-35-01

 

Non-Discrimination Policy: NC State University provides equality of opportunity in education and employment for all students and employees. Accordingly, NC State affirms its commitment to maintain a work environment for all employees and an academic environment for all students that is free from all forms of discrimination. Discrimination based on race, color, religion, creed, sex, national origin, age, disability, veteran status, or sexual orientation is a violation of state and federal law and/or NC State University policy and will not be tolerated. Harassment of any person (either in the form of quid pro quo or creation of a hostile environment) based on race, color, religion, creed, sex, national origin, age, disability, veteran status, or sexual orientation also is a violation of state and federal law and/or NC State University policy and will not be tolerated. Retaliation against any person who complains about discrimination is also prohibited. NC State’s policies and regulations covering discrimination, harassment, and retaliation may be accessed at http://policies.ncsu.edu/policy/pol-04-25-05 or http://www.ncsu.edu/equal_op/. Any person who feels that he or she has been the subject of prohibited discrimination, harassment, or retaliation should contact the Office for Equal Opportunity (OEO) at 919-515-3148.

 

N.C. State University Policies, Regulations, and Rules (PRR): Students are responsible for reviewing the PRRs which pertain to their course rights and responsibilities. These include http://policies.ncsu.edu/policy/pol-04-25-05 (Equal Opportunity and Non-Discrimination Policy Statement), https://diversity.ncsu.edu/ (Office for Institutional Equity and Diversity), http://policies.ncsu.edu/policy/pol-11-35-01 (Code of Student Conduct), and http://policies.ncsu.edu/regulation/reg-02-50-03 (Grades and Grade Point Average).

 

Course Topics: Specific topics include linear and generalized linear models, regularized regression, bootstrap and cross-validation, discriminant analysis, nearest neighbor methods, classification and regression trees (random forests, bagging, boosting), support vector machines, neural networks, principal component analysis, and hierarchical clustering.