Project Overview
In this project, we clean the Nashville Housing dataset. The following tasks were performed:
- Standardize the date format using a date function.
- Populate missing property address data using joins.
- Parse long-form addresses into individual columns (Address, City, State) using substring functions.
- Standardize the “Sold as Vacant” field (from Y/N to Yes/No) using CASE statements.
- Remove duplicates using window functions.
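As a rough illustration of the join-based address fill and the window-function de-duplication listed above, here is a minimal sketch that runs the SQL through Python's built-in sqlite3 module. The table name `housing`, the columns `UniqueID`, `SaleDate` and `SalePrice`, the sample rows, and the choice of columns that define a duplicate are all assumptions made for the example, not the project's actual schema or queries. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample rows, for illustration only.
conn.execute("""
    CREATE TABLE housing (
        UniqueID INTEGER PRIMARY KEY,
        ParcelID TEXT,
        PropertyAddress TEXT,
        SaleDate TEXT,
        SalePrice INTEGER
    )
""")
conn.executemany(
    "INSERT INTO housing VALUES (?, ?, ?, ?, ?)",
    [
        (1, "P-001", "100 MAIN ST, NASHVILLE", "2016-04-09", 250000),
        (2, "P-001", None, "2016-05-01", 260000),  # missing address
        (3, "P-002", "7 OAK AVE, GOODLETTSVILLE", "2015-01-15", 180000),
        (4, "P-002", "7 OAK AVE, GOODLETTSVILLE", "2015-01-15", 180000),  # duplicate of row 3
    ],
)

# Self-join on ParcelID: for each row without an address, find another
# row with the same parcel that does have one.
fills = conn.execute("""
    SELECT DISTINCT b.PropertyAddress, a.UniqueID
    FROM housing AS a
    JOIN housing AS b
      ON a.ParcelID = b.ParcelID
     AND a.UniqueID <> b.UniqueID
    WHERE a.PropertyAddress IS NULL
      AND b.PropertyAddress IS NOT NULL
""").fetchall()
conn.executemany(
    "UPDATE housing SET PropertyAddress = ? WHERE UniqueID = ?", fills
)

# ROW_NUMBER() numbers the rows inside each group of identical records;
# every row numbered above 1 is treated as a duplicate and deleted.
conn.execute("""
    DELETE FROM housing
    WHERE UniqueID IN (
        SELECT UniqueID
        FROM (
            SELECT UniqueID,
                   ROW_NUMBER() OVER (
                       PARTITION BY ParcelID, PropertyAddress, SaleDate, SalePrice
                       ORDER BY UniqueID
                   ) AS rn
            FROM housing
        )
        WHERE rn > 1
    )
""")

remaining = conn.execute(
    "SELECT UniqueID, PropertyAddress FROM housing ORDER BY UniqueID"
).fetchall()
for row in remaining:
    print(row)
```

The address fill is done before the de-duplication so that rows differing only by a missing address are compared on their completed values.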
Project Outcome
- Standardized dates across the dataset, so the data can be used for further analysis in Tableau, Power BI, etc.
- Removed inconsistencies by populating missing addresses with the help of ParcelID, so the data is easier to analyse.
- Split the property address into state, city and street, so that the data can be analysed by each of them.
- Replaced every Y with “Yes” and every N with “No” in the “Sold as Vacant” column, so that filters can be applied effectively.
- Removed duplicate records from the dataset for better performance.
- Removed the unused columns in the database for better performance.
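
The date, address and “Sold as Vacant” outcomes could be reproduced along the following lines. This is only a sketch using Python's built-in sqlite3 module: the table name `housing`, the column names, the sample rows, and the assumption that an address has the form "street, city, state" are illustrative, and SQLite's `date`, `substr` and `instr` stand in for whatever date and substring functions the project's SQL dialect provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample rows, for illustration only.
conn.execute("""
    CREATE TABLE housing (
        UniqueID INTEGER PRIMARY KEY,
        PropertyAddress TEXT,
        SaleDate TEXT,
        SoldAsVacant TEXT
    )
""")
conn.executemany(
    "INSERT INTO housing VALUES (?, ?, ?, ?)",
    [
        (1, "100 MAIN ST, NASHVILLE, TN", "2016-04-09 00:00:00", "Y"),
        (2, "7 OAK AVE, GOODLETTSVILLE, TN", "2015-01-15 00:00:00", "No"),
    ],
)

cleaned = conn.execute("""
    WITH parts AS (
        SELECT UniqueID,
               SaleDate,
               SoldAsVacant,
               -- everything before the first comma is the street
               trim(substr(PropertyAddress, 1,
                           instr(PropertyAddress, ',') - 1)) AS Street,
               -- everything after the first comma still holds city and state
               substr(PropertyAddress,
                      instr(PropertyAddress, ',') + 1) AS Rest
        FROM housing
    )
    SELECT UniqueID,
           Street,
           trim(substr(Rest, 1, instr(Rest, ',') - 1)) AS City,
           trim(substr(Rest, instr(Rest, ',') + 1))    AS State,
           date(SaleDate)                              AS SaleDate,
           CASE SoldAsVacant
               WHEN 'Y' THEN 'Yes'
               WHEN 'N' THEN 'No'
               ELSE SoldAsVacant  -- already 'Yes' or 'No'
           END AS SoldAsVacant
    FROM parts
    ORDER BY UniqueID
""").fetchall()

for row in cleaned:
    print(row)
```

Splitting in two stages (street first, then city and state from the remainder) keeps each `substr` call simple, at the cost of assuming every address contains exactly two commas.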