{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tutorial 2: Regression with kNN and Linear Regression\n", "[](https://github.com/amonroym99/uva-applied-ml/blob/main/docs/notebooks/2_reg_knn_linreg.ipynb)\n", "\n", "**Author:** Alejandro Monroy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook we will cover two of the most basic regression models: kNN and Linear Regression. Furthermore, we will see some metrics to evaluate regression models." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Loading and preparing the data\n", "We will use the `diabetes` dataset from Sklearn as we did in the previous tutorial. This time, we will set `scaled=True` to skip the normalization step:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|     | age | sex | bmi | bp | s1 | s2 | s3 | s4 | s5 | s6 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.038076 | 0.050680 | 0.061696 | 0.021872 | -0.044223 | -0.034821 | -0.043401 | -0.002592 | 0.019907 | -0.017646 |
| 1 | -0.001882 | -0.044642 | -0.051474 | -0.026328 | -0.008449 | -0.019163 | 0.074412 | -0.039493 | -0.068332 | -0.092204 |
| 2 | 0.085299 | 0.050680 | 0.044451 | -0.005670 | -0.045599 | -0.034194 | -0.032356 | -0.002592 | 0.002861 | -0.025930 |
| 3 | -0.089063 | -0.044642 | -0.011595 | -0.036656 | 0.012191 | 0.024991 | -0.036038 | 0.034309 | 0.022688 | -0.009362 |
| 4 | 0.005383 | -0.044642 | -0.036385 | 0.021872 | 0.003935 | 0.015596 | 0.008142 | -0.002592 | -0.031988 | -0.046641 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 437 | 0.041708 | 0.050680 | 0.019662 | 0.059744 | -0.005697 | -0.002566 | -0.028674 | -0.002592 | 0.031193 | 0.007207 |
| 438 | -0.005515 | 0.050680 | -0.015906 | -0.067642 | 0.049341 | 0.079165 | -0.028674 | 0.034309 | -0.018114 | 0.044485 |
| 439 | 0.041708 | 0.050680 | -0.015906 | 0.017293 | -0.037344 | -0.013840 | -0.024993 | -0.011080 | -0.046883 | 0.015491 |
| 440 | -0.045472 | -0.044642 | 0.039062 | 0.001215 | 0.016318 | 0.015283 | -0.028674 | 0.026560 | 0.044529 | -0.025930 |
| 441 | -0.045472 | -0.044642 | -0.073030 | -0.081413 | 0.083740 | 0.027809 | 0.173816 | -0.039493 | -0.004222 | 0.003064 |
442 rows × 10 columns
\n", "
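To make concrete what "skipping the normalization step" saves us, here is a minimal sketch of the kind of scaling that the pre-scaled features in the table above appear to follow: each column is mean-centered and divided by its standard deviation times the square root of the number of samples, so every column ends up with a sum of squares of 1. The helper name `scale_like_diabetes` and the random input are assumptions made for illustration only; they are not part of scikit-learn or of the original notebook.

```python
import numpy as np


def scale_like_diabetes(raw):
    """Center each column and scale it to unit sum of squares.

    Hypothetical helper for illustration; not a scikit-learn function.
    """
    raw = np.asarray(raw, dtype=float)
    centered = raw - raw.mean(axis=0)
    # std * sqrt(n_samples) is the Euclidean norm of the centered column
    norms = np.sqrt((centered ** 2).sum(axis=0))
    return centered / norms


# Made-up raw features (442 samples, 3 columns), NOT the real diabetes data
rng = np.random.default_rng(0)
raw = rng.normal(loc=50.0, scale=10.0, size=(442, 3))

scaled = scale_like_diabetes(raw)
col_means = scaled.mean(axis=0)           # each entry should be close to 0
col_sq_sums = (scaled ** 2).sum(axis=0)   # each entry should be close to 1
print(col_means, col_sq_sums)
```

With 442 samples, values scaled this way typically fall within a few hundredths of zero, which matches the magnitudes shown in the table.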