Project Proposal
Exploratory Analysis & Prediction on PM2.5 dataset
Liu, Jianke - liu.jiank@husky.neu.edu
Krishnamoorthy, Yeshwin - krishnamoorthy.y@husky.neu.edu
Ramakrishnan, Ganapathy Subramaniam - ramakrishnan.ga@husky.neu.edu
Objective
The objective is to analyze how meteorological factors such as temperature, pressure and wind direction affect the aggregation, diffusion and spread of PM2.5 in Beijing, China, and to predict future PM2.5 levels.
Approach
Firstly, using data recorded by the US Embassy in Beijing for the years 2010-2014, we understand and define the problem by examining the input parameters (temperature, pressure, dew point, wind direction, timestamp of each record, etc.) and the output parameter (PM2.5 concentration).
Secondly, we analyze and prepare the data by preprocessing (data cleansing, formatting, and sampling) and transformation (scaling and aggregation).
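The preprocessing and transformation steps above could look roughly like the following pandas sketch. It is a minimal illustration under stated assumptions, not the project's final pipeline: the column names (`pm2.5`, `TEMP`, `PRES`, `cbwd` and the `year`/`month`/`day`/`hour` fields) are assumed to match the downloaded CSV, the four rows are made-up stand-ins for real records, and `preprocess` and `daily_mean` are hypothetical helper names.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for the real data; column names are assumptions.
raw = pd.DataFrame({
    "year": [2010, 2010, 2010, 2010],
    "month": [1, 1, 1, 1],
    "day": [1, 1, 2, 2],
    "hour": [0, 1, 0, 1],
    "pm2.5": [np.nan, 129.0, 148.0, 159.0],
    "TEMP": [-11.0, -12.0, -4.0, -5.0],
    "PRES": [1021.0, 1020.0, 1019.0, 1018.0],
    "cbwd": ["NW", "NW", "SE", "cv"],
})

def preprocess(df):
    """Clean, format and scale the hourly records (illustrative only)."""
    # Cleansing: drop records whose target value is missing.
    clean = df.dropna(subset=["pm2.5"]).copy()
    # Formatting: combine the date parts into a single timestamp index.
    clean["timestamp"] = pd.to_datetime(clean[["year", "month", "day", "hour"]])
    clean = clean.drop(columns=["year", "month", "day", "hour"])
    clean = clean.set_index("timestamp")
    # Scaling: min-max scale the numeric inputs to the range [0, 1].
    for col in ["TEMP", "PRES"]:
        lo, hi = clean[col].min(), clean[col].max()
        span = hi - lo
        clean[col] = (clean[col] - lo) / span if span else 0.0
    # Encode the categorical wind direction as indicator columns.
    return pd.get_dummies(clean, columns=["cbwd"], prefix="wind")

def daily_mean(df):
    """Aggregation: average the hourly PM2.5 readings per calendar day."""
    return df["pm2.5"].resample("D").mean()

processed = preprocess(raw)
print(processed)
print(daily_mean(processed))
```

Dropping rows with a missing target is only one possible cleansing choice; interpolating short gaps would be an alternative worth comparing on the real data.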
Thirdly, we choose the type of machine learning algorithm to use. Since the dataset contains both the inputs and the corresponding outputs, we plan to use supervised machine learning algorithms such as naive Bayes, linear regression and neural networks.
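As an illustration of the simplest of these candidates, the sketch below fits a linear regression by ordinary least squares using NumPy. It is a baseline sketch only: the three random feature columns are synthetic stand-ins for scaled meteorological inputs, the coefficients in `true_coef` are arbitrary values chosen for the demonstration, and `fit_linear_regression` and `predict` are hypothetical helper names.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Return [intercept, w1, ..., wk] minimizing the squared error."""
    # Prepend a column of ones so the first coefficient acts as the intercept.
    X_aug = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply the fitted intercept and weights to new inputs."""
    return coef[0] + X @ coef[1:]

# Synthetic, noise-free data: stand-ins for three scaled input features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_coef = np.array([50.0, 2.0, -1.0, 0.5])  # arbitrary demo values
y = true_coef[0] + X @ true_coef[1:]

coef = fit_linear_regression(X, y)
print(coef)
```

Because the synthetic target has no noise, the fitted coefficients should match `true_coef` up to floating-point error; on real PM2.5 data the fit would only be approximate.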
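Next, we partition the data into three subsets: training, validation and test sets. We first split the data into a training set and a test set, typically in proportions of 80 and 20 percent respectively. We then split the training set again and hold out 20 percent of it as a validation set, so the final proportions are 64 percent training, 16 percent validation and 20 percent test.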
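The two-stage split described above can be sketched with NumPy index arrays. This is a minimal example assuming a random shuffle with a fixed seed; `three_way_split` is a hypothetical helper name, and the sample count of 1000 is arbitrary.

```python
import numpy as np

def three_way_split(n_samples, test_frac=0.2, val_frac=0.2, seed=0):
    """Return (train, validation, test) index arrays.

    test_frac is taken from all samples; val_frac is taken from the
    samples that remain after the test set has been removed.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_frac))
    test_idx, rest = idx[:n_test], idx[n_test:]
    n_val = int(round(len(rest) * val_frac))
    val_idx, train_idx = rest[:n_val], rest[n_val:]
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = three_way_split(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 640 160 200
```

Since the records are hourly and ordered in time, splitting chronologically (earliest records for training, latest for testing) is an alternative that avoids mixing future observations into the training set.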
Finally, we select the best-performing model based on its results on the test data and use it to predict PM2.5 levels. To evaluate the predictions, we may compare them with actual measured values available online.
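One way to quantify how close predictions are to measured values is with standard regression error metrics. The proposal does not name specific metrics, so the root-mean-square error (RMSE) and mean absolute error (MAE) below are assumptions, and the three value pairs are toy numbers rather than real measurements or results.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error: penalizes large errors more heavily."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error: average size of the error in the target's units."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))

# Toy values only, in the same units as the PM2.5 concentration.
y_true = [100.0, 150.0, 200.0]
y_pred = [110.0, 140.0, 200.0]
print(rmse(y_true, y_pred), mae(y_true, y_pred))
```

For both metrics lower is better, so candidate models can be ranked by computing the same metric on the same held-out data.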
Data Acquisition
We obtain the dataset mainly from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Beijing+PM2.5+Data.
Coding
Language: Python
Libraries: Pandas, NumPy, TensorFlow, Keras
Timeline
Oct. 19 – Oct. 26: Initial analysis and data preprocessing
Oct. 27 – Nov. 12: Model selection and training
Nov. 13 – Nov. 28: Evaluation and prediction
Nov. 29 – Dec. 2: Final preparation of report and presentation
Team-member roles
XXX: Algorithm implementation and coding
XXX: Coding and testing data
XXX: Analysis, testing and documentation