from datascience import *
import numpy as np
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')
In this lecture, I will use interactive plots (they look better), so I am importing the plotly.express library. We won't test you on this, but it's good to know.
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go
Recall that, long ago in lecture 10, we built a function to predict child heights. We started with Galton's height dataset, which contains the full-grown height of each child and the heights of both parents. We then computed the average height of the parents of each child.
The following is a simplified version of the data containing just the parents' average height and the child's height.
# Note: Child heights are the **adult** heights of children in a family
families = Table.read_table('family_heights.csv')
parent_avgs = (families.column('father') + families.column('mother'))/2
heights = Table().with_columns(
'Parent Average', parent_avgs,
'Child', families.column('child'),
)
heights
Parent Average | Child |
---|---|
72.75 | 73.2 |
72.75 | 69.2 |
72.75 | 69 |
72.75 | 69 |
71 | 73.5 |
71 | 72.5 |
71 | 65.5 |
71 | 65.5 |
69.5 | 71 |
69.5 | 68 |
... (924 rows omitted)
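As a reminder of what a height predictor can look like, here is a minimal sketch that uses only the ten rows displayed above. It assumes a "nearby average" rule: predict the mean child height among families whose parent average is within half an inch of the target. The helper name `predict_child` and the `window` parameter are hypothetical, and this is not necessarily the exact function from lecture 10.

```python
import numpy as np

# The ten rows displayed above (not the full dataset).
parent_avg = np.array([72.75, 72.75, 72.75, 72.75, 71, 71, 71, 71, 69.5, 69.5])
child = np.array([73.2, 69.2, 69, 69, 73.5, 72.5, 65.5, 65.5, 71, 68])

def predict_child(target, window=0.5):
    """Hypothetical helper: mean child height among rows whose parent
    average is within `window` inches of `target`."""
    nearby = np.abs(parent_avg - target) <= window
    if not nearby.any():
        # No families close enough to the target, so no prediction.
        return float('nan')
    return float(child[nearby].mean())

prediction = predict_child(71)
print(prediction)
```

With so few rows, each prediction is just the mean of one small group of siblings, so treat the numbers as an illustration of the mechanics rather than a real estimate.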
What was the relationship between the full-grown height of a child and the average height of the parents?
heights.iscatter('Parent Average', 'Child')
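One rough way to put a number on the relationship in the scatter plot is the correlation coefficient. The sketch below is an assumption about how you might check this, not part of the original analysis, and it uses only the ten displayed rows, so the value it prints will not match the full dataset.

```python
import numpy as np

# The ten rows displayed in the table above (not the full dataset).
parent_avg = np.array([72.75, 72.75, 72.75, 72.75, 71, 71, 71, 71, 69.5, 69.5])
child = np.array([73.2, 69.2, 69, 69, 73.5, 72.5, 65.5, 65.5, 71, 68])

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal
# entry is the correlation between the two arrays.
r = np.corrcoef(parent_avg, child)[0, 1]
print(round(r, 3))
```

To run this on the full table, you could pass `heights.column('Parent Average')` and `heights.column('Child')` in place of the two arrays.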