Example with the decision tree:
When we divide the houses among many leaves, we also have fewer houses in each leaf. Leaves with very few houses will make predictions that are quite close to those homes' actual values, but they may make very unreliable predictions for new data, because each prediction is based on only a few houses. This phenomenon is called overfitting: the model matches the training data almost perfectly but does poorly in validation and on other new data.

On the flip side, if we make our tree very shallow, it doesn't divide the houses into very distinct groups. At an extreme, if a tree divides the houses into only 2 or 4 groups, each group still contains a wide variety of houses, and the resulting predictions may be far off for most houses, even in the training data (and they will be bad in validation too, for the same reason). When a model fails to capture important distinctions and patterns in the data, so that it performs poorly even on the training data, that is called underfitting.
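A quick back-of-the-envelope sketch of this tradeoff (the house counts and depths below are made-up numbers, not from any dataset): a binary tree of depth d has up to 2^d leaves, so each extra level roughly halves the number of houses per leaf.

// Up to 2^depth leaves in a binary tree,
// so roughly nHouses / 2^depth houses end up in each leaf.
function avgHousesPerLeaf(nHouses: number, depth: number): number {
  return nHouses / 2 ** depth;
}

// Hypothetical example with 5000 houses:
avgHousesPerLeaf(5000, 2);  // 1250 houses per leaf: coarse groups, risk of underfitting
avgHousesPerLeaf(5000, 10); // ~4.9 houses per leaf: tiny groups, risk of overfitting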
[Interactive linear regression plot: adjust the training data, the weight (w), and the bias (b); the plot and the cost (MSE) update in real time.]
function computeCost(
  x: number[], // dataset of sq. meters
  y: number[], // dataset of prices
  w: number,   // weight
  b: number    // bias
): number {
  const m = x.length; // number of training examples
  let cost = 0;
  for (let i = 0; i < m; i++) {
    const f_wb = w * x[i] + b;  // prediction for the i-th house
    cost += (f_wb - y[i]) ** 2; // squared error
  }
  return cost / (2 * m); // mean squared error, halved by convention
}
const f_wb = w * x[i] + b;

This is the house-price prediction function: for the i-th house, the model predicts the price f_wb from its size x[i].

x and y are arrays of square meters and prices, respectively.

w and b are the parameters of the linear regression model: w is the weight (the slope, i.e. how much the predicted price grows per square meter) and b is the bias (the intercept, the baseline price when x = 0).


The prediction value f_wb is the model's price estimate for a given house.
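As a minimal sketch of the prediction on its own (the function name and the numbers below are hypothetical, not from the widget):

function predictPrice(sqMeters: number, w: number, b: number): number {
  // w: price added per square meter (slope); b: base price (intercept)
  return w * sqMeters + b;
}

predictPrice(100, 3000, 50_000); // 350_000: a 100 m² house at w = 3000, b = 50_000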
COST FUNCTION
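The cost the widget reports is the (halved) mean squared error that computeCost implements. Written out for m houses, mirroring the code:

J(w, b) = (1 / (2m)) * Σ (w * x[i] + b - y[i])²,  summing i from 0 to m - 1

A small usage sketch (the three houses below are made-up numbers):

const sqMeters = [50, 80, 120];
const prices = [150_000, 240_000, 360_000];

computeCost(sqMeters, prices, 3000, 0); // 0: w = 3000, b = 0 fits these points exactly
computeCost(sqMeters, prices, 2500, 0); // ≈ 9.7e8: a worse fit gives a larger cost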