👨🏻💻
- Author: @juansevargasc
- Dataset Source: 2022 U.S. Domestic Flights Departures, Kaggle
- Topics: Data Engineering, ETL, Data Analysis, Data Warehouse
conda create --name <env> --file requirements.txt
# or
pip install -r requirements.txt
python src/main.py
This project explores 2022 U.S. flight departures by analyzing weather conditions, cancellations, dates, locations, and carriers, among other features. First, however, an ETL pipeline preprocesses the different data sources and loads them into an OLAP database for BI consumption.
Objectives
NaN (empty) values should be dropped.
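As an illustration of the objective above, here is a minimal, dependency-free sketch of dropping rows that contain NaN or empty values. The sample data, the column names (`carrier`, `origin`, `dep_delay`), and the helpers `is_missing` and `drop_missing_rows` are all hypothetical; the project's actual pipeline may handle this differently, for example with a dataframe library.

```python
import csv
import io
import math

# Hypothetical sample; the column names are illustrative, not the dataset's real schema.
SAMPLE_CSV = """carrier,origin,dep_delay
AA,JFK,12
DL,,5
UA,ORD,NaN
WN,DAL,0
"""


def is_missing(value):
    """Return True for None, blank strings, or NaN in text or float form."""
    if value is None:
        return True
    if isinstance(value, float):
        return math.isnan(value)
    text = str(value).strip()
    return text == "" or text.lower() == "nan"


def drop_missing_rows(rows):
    """Keep only the rows in which every field has a value."""
    return [row for row in rows if not any(is_missing(v) for v in row.values())]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
clean_rows = drop_missing_rows(rows)
print(len(rows), len(clean_rows))
```

Dropping a whole row when any field is missing is the simplest policy; whether the project drops rows, drops columns, or only checks selected fields is not stated here.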
Introduction
The project analyzes the files provided in this Kaggle dataset: 2022 U.S. Domestic Flights Departures.
Prework
The prework takes some of the original files and exports them to a SQL database and a JSON file, simulating a project that draws on several different data sources. See more in Prework.
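The idea behind the prework can be sketched as follows, assuming an in-memory SQLite database stands in for the SQL database. The record contents, the `flights` table, the `carriers.json` file name, and the helper functions are hypothetical and only illustrate splitting data across two kinds of sources; the real prework scripts may use a different database and schema.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical records standing in for rows of the original files.
flights = [
    {"flight_id": 1, "carrier": "AA", "origin": "JFK"},
    {"flight_id": 2, "carrier": "DL", "origin": "ATL"},
]
carriers = [
    {"code": "AA", "name": "American Airlines"},
    {"code": "DL", "name": "Delta Air Lines"},
]


def export_to_sql(conn, records):
    """Write records to a SQL table (assumed schema, for illustration only)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS flights "
        "(flight_id INTEGER PRIMARY KEY, carrier TEXT, origin TEXT)"
    )
    conn.executemany(
        "INSERT INTO flights (flight_id, carrier, origin) "
        "VALUES (:flight_id, :carrier, :origin)",
        records,
    )
    conn.commit()


def export_to_json(path, records):
    """Write records to a JSON file, simulating a second data source."""
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")


conn = sqlite3.connect(":memory:")
export_to_sql(conn, flights)

json_path = Path(tempfile.mkdtemp()) / "carriers.json"
export_to_json(json_path, carriers)

# Read both sources back to confirm the exports landed where expected.
sql_count = conn.execute("SELECT COUNT(*) FROM flights").fetchone()[0]
json_count = len(json.loads(json_path.read_text(encoding="utf-8")))
print(sql_count, json_count)
```

An ETL pipeline can then read from both sources as if they came from independent systems, which is the situation the prework is meant to simulate.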
Documentation of Stages