Clustering with K-Means
In this exercise you’ll explore our first unsupervised learning technique for creating features.
In this exercise you’ll start developing the features.
Mutual information describes relationships in terms of uncertainty.
You will create and submit predictions for a Kaggle competition.
Random Forest is going to be an easy win.
Optimize the size of the tree to make better predictions.
You will test how good your model is.
Select the target variable, which corresponds to the sales price.
Filter the data, build a model, and improve the model.
Create histograms and density plots
Leverage the coordinate plane to explore relationships between variables
Visualize trends over time
Your first introduction to coding for data visualization
You will learn what data leakage is and how to prevent it.
You will learn how to build and optimize models with gradient boosting.
You will learn how to use cross-validation for better measures of model performance.
This notebook is an exercise in the Intermediate Machine Learning course. You can reference the tutorial at this link.
Notice that the dataset contains both numerical and categorical variables.
You’ll obtain a more comprehensive understanding of the missing values in the dataset.
Now it’s time to go through the modeling process and make predictions.
Run the following cell to load your data and some utility functions.
In these exercises we’ll apply groupwise analysis to our dataset.
Now you are ready to get a deeper understanding of your data.
In this set of exercises we will work with the Wine Reviews dataset.
The first step in most data analytics projects is reading the data file.
Pipelines are a simple way to keep your data preprocessing and modeling code organized.
The steps to building and using a model are: Define: What type of model will it be? A decision tree? Some other type of model? Some other parameters...
Tips from Kaggle instructor Ryan Holbrook.
An introduction to useful pandas snippets.
Use color or length to compare categories in a dataset
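As a rough illustration of the cross-validation idea mentioned in the list above, here is a minimal sketch using only the Python standard library. The helper names (`k_fold_indices`, `cross_val_mae`), the mean-predicting baseline "model", and the price values are all assumptions made up for this example; the course notebooks themselves most likely rely on scikit-learn instead.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    # The first (n_samples % k) folds get one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_mae(y, k=5):
    """Average mean absolute error of a mean-predicting baseline over k folds."""
    scores = []
    for held_out in k_fold_indices(len(y), k):
        held = set(held_out)
        train = [y[i] for i in range(len(y)) if i not in held]
        # "Fit": this hypothetical baseline just memorizes the training mean.
        prediction = mean(train)
        # "Evaluate": score only on the fold the model never saw.
        scores.append(mean(abs(y[i] - prediction) for i in held_out))
    return mean(scores)


# Made-up sale prices, for illustration only.
prices = [200, 210, 190, 205, 195, 220, 180, 215, 185, 200]
score = cross_val_mae(prices, k=5)
```

Each sample is held out exactly once, so the averaged score uses all of the data for validation, which is why it tends to be a steadier estimate than a single train/validation split.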