Foundations of Data Science combines three perspectives: inferential thinking, computational thinking, and real-world relevance. Given data arising from some real-world phenomenon, how does one analyze that data so as to understand that phenomenon? The course teaches critical concepts and skills in computer programming and statistical inference, in conjunction with hands-on analysis of real-world datasets, including economic data, document collections, geographical data, and social networks. It also delves into social issues surrounding data analysis such as privacy and study design.
This course does not have any prerequisites beyond high-school algebra. The curriculum and format are designed specifically for students who have not previously taken statistics or computer science courses. Students with some prior experience in either statistics or computing are welcome to enroll and will find much of interest due to the innovative nature of the course. Students who have taken several statistics or computer science courses should instead take a more advanced course.
Our primary text is an online book called Computational and Inferential Thinking: The Foundations of Data Science. This text was written for the course by the course instructors.
The computing platform for the course is hosted at data8.datahub.berkeley.edu. Students find it convenient to use their own computer for the course. If you do not have adequate access to a personal computer, we have machines available for you; please contact your lab TA.
You are not alone in this course; the staff and instructors are here to support you as you learn the material. It's expected that some aspects of the course will take time to master, and the best way to master challenging material is to ask questions. For questions, use Ed. We will also hold in-person and virtual office hours for real-time discussions.
Your lab TA will be your main point of contact for all course-related questions and grade clarifications. The TAs are here to support you, so please lean on your lab TA if you need more support in the class or have any questions or concerns.
Small-group tutoring sessions will be available for students in need of additional support to develop confidence with core concepts. In past semesters, students who attended have found these sessions to be a great use of their time. Details about sign-ups will be available later in the term.
The rest of this page details the policies that will be enforced in the Summer 2022 offering of this course. These policies are subject to change until the beginning of the semester and throughout the remainder of the course, at the judgement of the course staff.
All times listed below are in Pacific Time (PT).
If you are on the waitlist, you must still do all coursework and complete labs and homework by the deadlines. We will not offer extensions if you are admitted into the course later, so it is your responsibility to stay up to date on the assignments.
Unfortunately, doing all the coursework is not a guarantee of enrollment. You will only be enrolled if there is space in lecture. Enrollment for lecture will proceed by CalCentral.
Lectures will be held on Monday through Thursday from 10am-11am, and Friday from 10am-12pm, in Dwinelle 155. They will be used to highlight and review vital concepts of the course. Accompanying notebooks with examples will typically be provided to students. Recordings of these sessions will be provided shortly afterwards, though students are highly encouraged to attend in real time.
This course has two 2-hour labs per week. All students are required to attend the first lab section, on 6/22. If you are unable to attend the first lab section, please reach out to your assigned TA once you receive your section assignments.
The lab sections have two components: a worksheet about recent material, and a lab assignment that develops skills with computational and inferential concepts. These lab assignments are a required part of the course and will be released before the lab section, though you do not need to work on them in advance or submit them anywhere.
Lab worksheets will be posted electronically as well as handed out on paper, so we recommend bringing either a tablet or writing utensils to lab. You might find the following resources helpful:
Note that lab sections are not webcast.
You can get credit for each lab assignment in one of two ways described below:
Twice-weekly homework assignments are a required part of the course. You must complete and submit your homework independently, but you are allowed to discuss problems with other students and course staff. See the "Learning Cooperatively" section below.
Homeworks will be released on Tuesday and Friday after lecture and are due the following Friday and Tuesday respectively. If you submit a homework or project 24 hours before the deadline or earlier, you will receive 5 bonus points on that assignment.
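The early-submission rule above can be sketched in Python. This is only an illustration of the stated rule, not the staff's grading code; the names `early_bonus`, `EARLY_WINDOW`, and `EARLY_BONUS_POINTS` are hypothetical, and treating a submission made exactly 24 hours before the deadline as early follows the "24 hours before the deadline or earlier" wording.

```python
from datetime import datetime, timedelta

EARLY_WINDOW = timedelta(hours=24)   # how far ahead of the deadline counts as early
EARLY_BONUS_POINTS = 5               # bonus stated in the policy above


def early_bonus(deadline: datetime, submitted: datetime) -> int:
    """Return the bonus points earned for a homework or project submission."""
    if submitted <= deadline - EARLY_WINDOW:
        return EARLY_BONUS_POINTS
    return 0
```

Both arguments are assumed to be in the same time zone (Pacific Time for this course).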
Data science is about analyzing real-world data sets, and so you will also complete two projects involving real data. On each project, you may work with a single partner. Both partners will receive the same score.
The midterm exam will be held on Friday, July 15 from 10am-12pm PT. Please note the date and time carefully. There will be one alternate exam for the midterm, for those in alternate time zones or with conflicting exams. You should expect that the alternate time will be as close as possible to the regularly scheduled time.
Although the midterm exam will be held in person, students may request a remote midterm exam due to extenuating circumstances. An online exam will be fully proctored. For the entirety of the exam period, these students will need to work in a completely quiet room with a camera recording their computer screen and hands, unless they have accommodations on file with the DSP office. Refer to the Student Technology Equity Program for any technology needs.
The in-person final exam is required for a passing grade, and will be held on Friday, August 12 from 10am-1pm. Please double check your schedule to make sure that you have no conflicting finals and that you are able to take the exam in person in Berkeley. No exceptions will be made to this rule. If you have exam accommodations on file with the DSP office, they will be taken into account for both the midterm and final exams.
Grades will be assigned using the following weighted components. Every assignment is weighted equally in its category. For example, there are 2 projects, so each project is worth (20/2)%=10% of your grade.
In past semesters of Data 8, more than 40% of the students received grades in the A+/A/A- range and more than 35% received grades in the B+/B/B- range.
Instructors and TAs will not release grade bins during the semester. The instructors will create these bins at the end of the semester, after all grades come in. These grade bins vary from semester to semester and are subject to change, but they will be no harsher than the following bounds:
| 90% | Guarantees at least an A- |
| 80% | Guarantees at least a B- |
| 60% | Guarantees at least a C- (PASS) |
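The guaranteed bounds above can be read as a simple lookup. The following is a minimal sketch in Python, assuming the thresholds are inclusive; the names `GUARANTEED_BOUNDS` and `guaranteed_minimum_grade` are hypothetical. It returns only the guaranteed floor, since the actual bins are set at the end of the semester and may be more generous.

```python
# (minimum overall percentage, guaranteed letter grade), highest bound first.
GUARANTEED_BOUNDS = [(90.0, "A-"), (80.0, "B-"), (60.0, "C-")]


def guaranteed_minimum_grade(overall_percent):
    """Return the lowest grade guaranteed for a score, or None if no bound applies."""
    for threshold, grade in GUARANTEED_BOUNDS:
        if overall_percent >= threshold:  # assumption: thresholds are inclusive
            return grade
    return None
```

For example, `guaranteed_minimum_grade(85)` returns `"B-"`, meaning the final grade will be a B- or better.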
Grades for Homeworks, Projects, and Labs will be posted about 1 week after the assignment's due date. Solutions to the assignment and common mistakes will also be posted on Ed. It is up to you to check the solutions and submit a regrade request before the regrade deadline (typically 5 days after grade release). Regrade requests can be made on Gradescope. Any regrade request past the deadline will not be looked at; this is to enforce the same deadline across all students, so please do not delay in reviewing your work.
For the midterm exam, there will be a regrade request submission window. Please review the solutions and common mistakes before submitting a regrade request. Requests where a rubric item was incorrectly selected or not selected will be reviewed, but any regrade requests that ask to change the rubric or for partial credit will be ignored.
Similar to other CS classes.
Participation points were created to encourage people to be good academic citizens, in a way that traditional grades could not capture. This can help boost you over a grade boundary if you’re close to one. Scoring is confidential, and your score will not be shared - please do not ask staff members any questions about this.
You can accumulate participation points in 3 distinct categories, up to a maximum total of 20 points in Data 8. This is meant to give all students the maximum flexibility to participate in a way they feel comfortable with.
Lecture attendance/participation (max 20): Lecture attendance is tracked using a form that is shared at a random time during each lecture. Actively asking / answering questions during lecture will help you accumulate extra points in this category. Lastly, students will also have the opportunity to interact with Ellen and Kevin during OH.
Ed-gagement (max 20): At the end of the semester, an Ed-gagement score will be assigned to all students, where we will roughly grade everyone on a scale that rewards thought-provoking questions or insightful answers on our online forum, Ed. Staff will assign grades manually based on these rough features, so there is no formula that exactly maps the number of contributions to the number of points you will receive.
Section participation (max 20): During the semester, there are several ways to engage with the course staff: lab sections, tutoring sections, and office hours. At the end of the semester, your participation in these section activities will be scored according to attendance and engagement. The best way to score full points in this section is to attend your lab section, sign up for tutoring, and make sure your TA knows your name!
So in summary, participation points can be earned from any combination of the categories. Each participation point is worth 0.05% of your grade, capped at 20 points (equivalent to 1%).
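The arithmetic in the summary above can be sketched as follows. This is an illustration only; the function and constant names are hypothetical, and since staff assign the per-category points manually, the inputs here are simply assumed to be known.

```python
POINT_VALUE_PERCENT = 0.05  # each participation point is worth 0.05% of the grade
MAX_POINTS = 20             # cap on each category and on the combined total


def participation_bonus_percent(lecture, ed, section):
    """Return the percentage added to the overall grade by participation."""
    # Clamp each category to its own maximum of 20 points.
    capped = [min(points, MAX_POINTS) for points in (lecture, ed, section)]
    # Points from any combination of categories count, up to 20 in total.
    total = min(sum(capped), MAX_POINTS)
    return total * POINT_VALUE_PERCENT
```

Under these rules, 20 points from lecture alone and 10 points from each of two categories both yield the same 1% maximum.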
All assignments (homework, labs, and projects) will be submitted on Gradescope. Please refer to the following tutorials:
Late submissions of assignments will not be accepted, unless you have received an extension for the assignment by contacting your Lab TA in advance or you have relevant DSP accommodations.
Your lowest lab score and lowest homework score will be dropped in the calculation of your overall grade. These drops are intended to cover emergency situations that prevent you from finishing an assignment. If you encounter further difficulties, or need additional time to complete an assignment, please reach out to your assigned Lab TA via email.
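The effect of a drop can be sketched in Python. This assumes every score in a category is on the same scale and weighted equally, as stated in the grading section; the function name is hypothetical, and keeping the only score when a category has a single assignment is an assumption made just so the sketch is well defined.

```python
def average_with_lowest_dropped(scores):
    """Average a category's scores after dropping the single lowest one."""
    if not scores:
        return 0.0
    if len(scores) == 1:
        return float(scores[0])  # assumption: nothing to drop with one score
    kept = sorted(scores)[1:]    # discard the lowest score
    return sum(kept) / len(kept)
```

For example, lab scores of 10, 8, 0, and 9 average to 9.0 once the missed lab is dropped, rather than 6.75.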
Projects will be accepted up to 2 days (48 hours) late. Projects submitted fewer than 24 hours after the deadline will receive 2/3 credit, and projects submitted between 24 and 48 hours after the deadline will receive 1/3 credit. Projects submitted 48 hours or more after the deadline will receive no credit.
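The late policy for projects can be sketched as a small function. This is a reading of the rule above rather than the official grading code; the name `project_credit_fraction` is hypothetical, and exact fractions are used so that 2/3 and 1/3 are not rounded.

```python
from datetime import datetime, timedelta
from fractions import Fraction


def project_credit_fraction(deadline: datetime, submitted: datetime) -> Fraction:
    """Return the fraction of credit a project submission receives."""
    lateness = submitted - deadline
    if lateness <= timedelta(0):
        return Fraction(1)       # on time: full credit
    if lateness < timedelta(hours=24):
        return Fraction(2, 3)    # fewer than 24 hours late
    if lateness < timedelta(hours=48):
        return Fraction(1, 3)    # between 24 and 48 hours late
    return Fraction(0)           # 48 hours or more late
```

Multiplying the returned fraction by the project's raw score gives the credited score, so a project that would have earned 90 points receives 60 if it is submitted a few hours late.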
We encourage you to discuss course content with your friends and classmates as you are working on your assignments. No matter your academic background, you will learn more if you work alongside others than if you work alone. Ask questions, answer questions, and share ideas liberally.
If some emergency takes you away from the course for an extended period, or if you decide to drop the course for any reason, please don't just disappear silently! You should inform your lab TA and your project partner (if you have one) immediately, so that nobody is expecting you to do something you can't finish.
You must write your answers in your own words, and you must not share your completed work. The exception to this rule is that you can share everything related to a project with your project partner (if you have one) and turn in one project between the two of you, and if you are attending a lab session and have a lab partner you can share everything related to that lab with your lab partner.
Make a serious attempt at every assignment yourself. If you get stuck, read the textbook and go over the lectures and lab discussion. After that, go ahead and discuss any remaining doubts with others, especially the course staff. That way you will get the most out of the discussion.
It is important to keep in mind the limits to collaboration. As noted above, you and your friends are encouraged to discuss course content and approaches to problem solving. But you are not allowed to share your code or answers with other students. Doing so is considered academic misconduct, and it doesn't help them either. It sets them up for trouble on upcoming assignments and on the exams.
In addition, posting course content such as homeworks, projects, and exams on any third-party websites or submitting your own answers on outside sites or forums is considered academic misconduct.
You are also not permitted to turn in answers or code that you have obtained from others. Not only does such copying count as academic misconduct, it circumvents the pedagogical goals of an assignment. You must solve problems with the resources made available in the course. You should never look at or have in your possession solutions from another student or another semester.
Please read Berkeley's Code of Conduct carefully. Penalties for academic misconduct in Data 8 are severe and include reporting to the Center for Student Conduct. They might also include an F in the course or even dismissal from the university. It's just not worth it!
When you need help, reach out to the course staff using Ed, in office hours, and/or during labs. You are not alone in Data 8! Instructors and staff are here to help you succeed. We expect that you will work with integrity and with respect for other members of the class, just as the course staff will work with integrity and with respect for you.
Finally, know that it's normal to struggle. Berkeley has high standards, which is one of the reasons its degrees are valued. Everyone struggles even though many try not to show it. Even if you don't learn everything that's being covered, you'll be able to build on what you do learn, whereas if you cheat you'll have nothing to build on. You aren't expected to be perfect; it's ok not to get an A.
The main goal of the course is that you should learn, and have a fantastic experience doing so. Please keep that goal in mind throughout the semester. Welcome to Data 8!