Schedule
- Fall 2024 - Professor Luis Amaral
- Dates: September 10-18, 2024
- Times: 9:30 a.m. - 12:00 p.m. & 1:30 - 4:30 p.m.
- Location: Clark B01 (555 Clark Street)
- Prerequisites: None
Overview
Our digital, connected, sensor-rich world is generating extraordinary amounts of data (“Big Data”) that are being used for purposes as diverse as teaching a computer to win at Jeopardy or offering taxi alternatives. The skills needed to go from data to knowledge and application, known collectively as Data Science, are in high demand in industry, government, and academia. This course provides an introduction to the foundational skills needed by data scientists. Prior knowledge of programming is not needed.
Restrictions
Intended primarily for undergraduate and graduate students. Postdoctoral students and staff must contact NICO before registering. Students will need an up-to-date laptop running Linux, OS X, or Windows 7 or higher. Chromebooks will not be permitted. Prior to the start of the course, students must install several packages and verify that they run properly on their machines. Lecture materials are available online.
Requirements
There will be about six homework assignments, each requiring students to write Python code that solves a specific problem. Students’ solutions will be uploaded to a server where they will be unit tested. All students will be expected to attend lectures and complete in-class assignments.
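For students who have not seen unit testing before, the sketch below shows the general idea: a solution is a function, and each test checks that the function returns the expected result for a known input. It is not taken from the course materials. The function `count_words`, the test class, and the helper `run_tests` are all hypothetical names chosen for illustration, and the actual assignments and the server's test setup may look quite different.

```python
import unittest


def count_words(text):
    """Return a dict mapping each lowercase word in text to its count.

    Hypothetical homework-style function, used only for illustration.
    """
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


class TestCountWords(unittest.TestCase):
    """Each method checks one known input against its expected output."""

    def test_repeated_words(self):
        expected = {"to": 2, "be": 2, "or": 1, "not": 1}
        self.assertEqual(count_words("To be or not to be"), expected)

    def test_empty_string(self):
        self.assertEqual(count_words(""), {})


def run_tests():
    """Run the tests above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestCountWords)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Running the file locally before uploading gives a quick check that a solution handles both an ordinary case and an edge case such as an empty string.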
Topics
- Examples of problems amenable to computation
- Overview of computer hardware & different filesystems
- The Zen of Python: Code style & commenting
- Using IPython notebook
- Basic Python data types: Integers, floats, strings, & lists
- Flow control: Loops, conditionals, exceptions
- Input & output
- Functions & code modularity
- The Python standard library: string, math, sys, & so on
- Sophisticated data types: tuples, sets, & dictionaries
- Data visualization using matplotlib
- Numerical computing using numpy & scipy
- Example: Image processing using numpy
- Retrieving data from the web using requests & splinter
- Text analysis & intro to regular expressions
- Example: Computing with Shakespeare
- Computing with dates & times
- Analyzing tabular data using pandas
- Example: Time series analysis of stock prices
- Numerical precision & algorithm scaling
- Statistical analysis with statsmodels
- Finding other resources