Foundations of Data Science
Fall 2015

Principal Instructors:
Ani Adhikari
John DeNero
Michael I. Jordan
Tapan Parikh
David Wagner


Piazza is a platform where we (instructors and students) can post announcements, have discussions, and ask questions. To add the course on Piazza, go to "Students Get Started" on the Piazza site, find UC Berkeley, and then find Statistics 94. Create a student account with your Berkeley email.

Feel free to ask questions by email, but staff will check Piazza more frequently, and often other students will answer more quickly than we can. Plus, you'll help out other students who have the same question.


bCourses is where we will post information that can't go on the public Internet. Currently it only contains your grades. If you are enrolled in the class, you should be signed up for the bCourses page automatically.

Tips on using Jupyter notebooks

Labs and course projects will be done in Jupyter notebooks (also called IPython notebooks). A notebook is a mix of text and computer code that you can edit. (Editing is disabled for parts of some notebooks, but this is just to make sure you don't accidentally mess them up.)

You can create and edit notebooks yourself by installing software on any computer. But it's easier (and in the case of labs and projects, necessary) to edit notebooks using the course webpage.

The first lab

For example, here are the steps to get started on the first lab:

  1. Open a web browser on any computer. Currently, the only supported web browser is Chrome. (There's a problem with using the site in Safari, and possibly other browsers. We're working on it.)
  2. Go to .
  3. You should be prompted to log in to your Cal Google account. (For example, .)
  4. Click the green “My Server” button.
  5. Navigate to labs/lab01 and click “Lab 1 - Python and Jupyter.ipynb”. (Future labs will be under labs/labXY, and projects will be under projects/projectXY.)
  6. The lab notebook has instructions for completing it. Lab 1 has more detailed tips for editing notebooks.

Testing and being done

Near the top-right corner of the Jupyter interface is a button that says "run ok tests". This runs an "autograder" program, which is the main mechanism for both you and the course instructors to check whether you've completed the labs correctly. When you click the button, each question in the lab will be checked for correctness. If a question is correct, you'll see something like this below the output for that cell:

Assignment: Lab 1
OK, version v1.4.1

Running tests

Test summary
    1 test cases passed! No cases failed.

Checking for software updates...
OK is up to date

If the autograder thinks you haven't given the right answer, you'll see something like this instead:

Assignment: Lab 1
OK, version v1.4.1

Running tests

Question 1 > Suite 1 > Case 1

from submission import *
>>> data-science

# Error: expected
#     8
# but got
#     7

Test summary
    0 test cases passed before encountering first failed test case

Checking for software updates...
OK is up to date

The autograder message is somewhat verbose, but it is telling you that, in question 1, the correct output was 8, but your cell's output was 7. (At this point, you could just copy the expected output 8 into the end of your cell, but then you wouldn't learn anything. The labs aren't graded for correctness anyway.)
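The comparison the autograder reports can be pictured with a minimal sketch. This is not how the OK autograder is actually implemented; the function name check_case is hypothetical, and the sketch only mimics the wording of the messages shown above.

```python
def check_case(expected, actual):
    """Compare a cell's output with the expected value and return a report.

    Hypothetical helper for illustration only; the real autograder
    does considerably more than this.
    """
    if actual == expected:
        return "1 test cases passed! No cases failed."
    # Mirror the layout of the failure message shown above.
    return "\n".join([
        "# Error: expected",
        f"#     {expected!r}",
        "# but got",
        f"#     {actual!r}",
    ])


# A cell that produced 7 when the correct output was 8.
print(check_case(8, 7))

# The same question after the cell has been fixed.
print(check_case(8, 8))
```

The first call prints the "expected 8 but got 7" report, and the second prints the success line.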

Once all of the autograder tests pass, you've finished the lab. Congratulations! Typically you can leave the lab section when you're done, unless there's something else on the agenda.

Errors and getting an account

When you go to the course website and sign in for the first time, you may see the error message "403 Access Forbidden". That just means we haven't created an account for you yet on that part of the website. If you've signed up recently for the course, this is likely to happen to you.

If you see this, just send one of the course GSIs an email telling us your email address, and we'll add you as soon as we can. (If you're in lab, you can ask your lab GSI.)

If you have other difficulties accessing course material, post about it on Piazza (publicly, if you're comfortable) or send a GSI an email.

If something goes wrong

A notebook is really just a file (named, for example, "Lab 1 - Python and Jupyter.ipynb") that lives in a folder that's been set up for you on the course servers. You are accessing and modifying notebook files when you open them on the course website.

Initially, everyone has the same copy of each assignment notebook, but as you fill yours in with your code and text, your copy diverges from the original. It's possible, with a little ingenuity, to mess up your copy. If this happens, you can ask course staff to give you a new copy of the original assignment notebook, without any of your changes. (Unfortunately, you might lose whatever progress you've made on the assignment, unless it can be recovered.)

Editing text cells

In the first lab, you're only asked to edit and run the cells with code in them. You may notice that double-clicking on the text of the lab opens a similar editing interface. In fact, we've put the lab text in cells just like the code, but text cells are marked as such, so the notebook knows to treat them differently.
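To see how cells are "marked", here is a small sketch that assumes the standard notebook file layout (JSON, notebook format version 4). The notebook content below is made up by hand for illustration and is not taken from the actual lab file.

```python
import json

# A hand-written, minimal notebook in the version 4 JSON layout.
# The cell contents are illustrative, not copied from the real lab.
notebook_json = """
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {},
     "source": ["Some lab instructions written as text."]},
    {"cell_type": "code", "metadata": {}, "execution_count": null,
     "outputs": [], "source": ["3 + 4"]}
  ]
}
"""

notebook = json.loads(notebook_json)

# Each cell records its own type; this is how text and code are told apart.
cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)
```

Each cell carries a "cell_type" field, which is what lets the notebook display a text cell as formatted text and run a code cell as Python.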

Actually, the text cells are not just ordinary text; they're in a language called Markdown, which is similar to HTML but much more readable. You can edit them if you like, though you might destroy the lab instructions if you do!

In any case, if you're done playing with the Markdown editing interface for a text cell (or you accidentally double-clicked some text), just click the Run button near the top left (which will also run code cells). Running a Markdown cell converts it back to nicer-looking text.