Date: 2024-08-27 04:13

COSC2960 – Foundations of Artificial Intelligence for STEM

Assessment 2B

Data Analysis and Modelling Assignment

1.0 Statement of Problem/Introduction

In the era of data abundance, the significance of data lies not merely in its volume, but in our ability to derive meaningful insights from it. This task invites the application of class-acquired techniques to dissect and interpret data.

Our focus rests on scrutinizing the pivotal role of sleep in overall health and well-being, encompassing cognitive, emotional, and physical realms. The analysis of the Sleep Health and Lifestyle Dataset, collected via wearable devices, is central to this work. Tasked as data scientists, our mission is to enhance sleep tracking accuracy and elucidate the impact of lifestyle factors on sleep quality.

Through meticulous analysis, we seek to uncover trends and correlations, illuminating pathways for enhanced sleep hygiene and the mitigation of sleep-related disorders. This endeavour extends to the identification of predictors for sleep duration and quality, clustering individuals based on sleep behaviour and lifestyle, and discerning patterns indicative of sleep disorders.

Our pursuit is to contribute to the scholarly discourse on sleep health, offering actionable recommendations for fostering improved sleep habits and overall well-being.

2.0 Data Summary

The dataset contains a wide variety of variables relevant to our study. To be exact, there are 14 categories, which outline the key aspects of sleep health and our knowledge of it. We segment them into Numerical and Nominal variables:

2.1 Numeric Categories

- Age: The age of the person in years.

- Sleep Duration (hours): The number of hours the person sleeps per day.

- Physical Activity Level (minutes/day): The number of minutes the person engages in physical activity per day.

- Systolic Blood Pressure: The systolic (upper) blood pressure reading of the person.

- Diastolic Blood Pressure: The diastolic (lower) blood pressure reading of the person.

- Heart Rate (bpm): The resting heart rate of the person in beats per minute.

- Daily Steps: The number of steps the person takes per day.

- Stress Level: A subjective rating of the stress level experienced by the person.

- Quality of Sleep: A subjective rating of the quality of the person's sleep.

2.2 Nominal Categories

- Gender: The gender of the person.

- Occupation: The occupation or profession of the person.

- BMI Category: The BMI category of the person.

- Sleep Disorder: The presence or absence of a sleep disorder in the person.

3.0 Data Cleaning and Processing

3.1 Values Missing:

Category	Values Missing
Sleep Duration	2 (1%)
Quality of Sleep	4 (1%)
Physical Activity Level	1 (<1%)
Stress Level	1 (<1%)
BMI Category	7 (2%)
Systolic BP	1 (<1%)
Diastolic BP	3 (1%)
Heart Rate	4 (1%)
Daily Steps	7 (2%)
TOTAL	30

Table 1. Missing Values in Sleep Data Set

Using the WEKA function NumericCleaner to display all missing values, we can clearly identify every missing value in our dataset.

Fig 1. WEKA Attribute Missing Values Display (Sleep Duration)

Fig 2. WEKA Attribute Missing Value Example (ID 15, Sleep Duration)

As displayed, a total of 30 values were missing across the 374 people surveyed for the study. Missing values in a large dataset can occur for various reasons, such as data entry errors, equipment malfunction, incomplete surveys, participant non-response, or simply because certain variables were not applicable or not recorded. These missing values may be sporadic and random, or they may follow a pattern influenced by specific factors within the data collection process.
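As a quick check of the figures in Table 1, the sketch below recomputes the total and the share of the 374 instances that each count represents. The counts are copied from Table 1; the function name `missing_share` is our own and is not part of WEKA.

```python
# Missing-value counts copied from Table 1; 374 is the number of people surveyed.
TOTAL_INSTANCES = 374
missing_counts = {
    "Sleep Duration": 2,
    "Quality of Sleep": 4,
    "Physical Activity Level": 1,
    "Stress Level": 1,
    "BMI Category": 7,
    "Systolic BP": 1,
    "Diastolic BP": 3,
    "Heart Rate": 4,
    "Daily Steps": 7,
}


def missing_share(count, total=TOTAL_INSTANCES):
    """Return the share of instances with a missing value, as a percentage."""
    return 100.0 * count / total


total_missing = sum(missing_counts.values())

for name, count in missing_counts.items():
    print(f"{name:<25} {count:>2}  {missing_share(count):.2f}%")
print(f"{'TOTAL':<25} {total_missing:>2}")
```

A category with a single missing value accounts for roughly 0.27% of the instances, which is below 1%.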

3.2 Dealing with Values Missing:

Having identified all missing values, we still need to decide how to deal with them. Three options are viable for preserving the homogeneity of the data:

1. Removing all instances with missing values altogether.

2. Replacing the missing values with the mean/mode of their respective category.

3. Replacing all missing values with an arbitrary constant value.

For the following reasons, replacing the missing values with the mean/mode of their respective categories, using the WEKA function ReplaceMissingValues, was the clearer option:

Preservation of Data Integrity: By filling in missing values with the mean of their respective categories, you retain more data points, thus preserving the integrity and completeness of your dataset. This ensures that you can still analyse trends and patterns across all variables without significant loss of information.

Maintaining Sample Size: Removing missing values altogether could result in a reduction in sample size, which might affect the statistical power of your analysis. By imputing missing values with the mean, you avoid this loss of sample size, allowing for more robust statistical analysis and potentially more accurate results.

Fig 3. WEKA Attribute Information Post Replacement (Sleep Duration)

Mitigating Bias: Removing missing values can introduce bias, especially when the missing values aren’t random. Replacing missing values with the average helps to reduce this bias by keeping the centre (mean or mode) of each category unchanged.

Simplicity and Ease of Interpretation: The average is a straightforward value to fill in missing values with. It is simple to implement and interpret, avoiding the overhead of more sophisticated imputation techniques.

Category	Mean/Mode Value
Sleep Duration	7.135
Quality of Sleep	7.308
Physical Activity Level	59.088
Stress Level	5.389
BMI Category	Normal
Systolic BP	128.523
Diastolic BP	84.668
Heart Rate	68.457
Daily Steps	6843.869

Table 2. Mean/Mode Values of Categories with Missing Values

Fig 4. WEKA Attribute Replaced Value Example (ID 15, Sleep Duration)
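The mean/mode replacement described in this section can be sketched in a few lines. This is a minimal illustration of the idea, not WEKA's implementation: the function name `impute_column` and the toy columns are hypothetical, and missing entries are assumed to be represented as `None`.

```python
from statistics import mean, mode


def impute_column(values, nominal=False):
    """Replace None entries with the mode (nominal) or mean (numeric) of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mode(observed) if nominal else mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical toy columns, not values taken from the real dataset.
sleep_duration = [6.0, 7.5, None, 8.0, 6.5]
bmi_category = ["Normal", "Overweight", None, "Normal", "Obese"]

filled_duration = impute_column(sleep_duration)
filled_bmi = impute_column(bmi_category, nominal=True)

print(filled_duration)
print(filled_bmi)
```

Because the fill value is computed from the observed entries only, the mean of a numeric column is the same before and after imputation.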

3.3 Outliers

Using the WEKA InterquartileRange function, we are able to identify all the outliers in the dataset. A total of 14 outlier instances out of 374 were identified in the Sleep Data Set. Any instance labelled ‘yes’ is considered an outlier instance, whether one or several of its values mark it as such.

Fig 5. WEKA Attribute Outliers Information

Fig 6. WEKA Outliers Example on Data Set Table
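To make the interquartile-range idea concrete, the sketch below labels each value as normal, outlier, or extreme according to how far it lies outside the quartiles. It is an illustration under stated assumptions rather than a reproduction of the WEKA filter: the factors 3.0 and 6.0 are assumed to match WEKA's defaults, the quartiles come from Python's `statistics.quantiles` (which may differ slightly from WEKA's own quartile calculation), and the heart-rate values are hypothetical.

```python
from statistics import quantiles


def flag_outliers(values, outlier_factor=3.0, extreme_factor=6.0):
    """Label each value 'normal', 'outlier' or 'extreme' using interquartile-range fences."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    labels = []
    for v in values:
        # Distance outside the quartile box; negative when v lies between Q1 and Q3.
        distance = max(q1 - v, v - q3)
        if distance > extreme_factor * iqr:
            labels.append("extreme")
        elif distance > outlier_factor * iqr:
            labels.append("outlier")
        else:
            labels.append("normal")
    return labels


# Hypothetical resting heart rates (bpm), not values from the real dataset.
heart_rate = [65, 66, 67, 68, 68, 69, 70, 70, 71, 72, 85, 120]
labels = flag_outliers(heart_rate)

for value, label in zip(heart_rate, labels):
    print(value, label)
```

In this toy column, 85 falls beyond the outlier fence but inside the extreme fence, while 120 lies beyond both.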

3.4 Dealing with Outliers and Scaling Method

To deal with the outliers in the dataset, we decided to remove every instance flagged as an outlier. Strictly, this is outlier removal rather than a scaling method such as standardisation: the InterquartileRange function flags a value according to its distance from the quartiles, measured in multiples of the interquartile range, and this ‘range’ makes it easier to identify which instances need removing from the dataset. Using the WEKA RemoveWithValues function, all outliers are removed from the dataset. We repeat the process with ‘Extreme Values’, which are also considered outliers, only further from the quartiles than ordinary outliers.

Fig 7. WEKA Attribute Outliers Information (Post Removal)

Fig 8. WEKA Outliers Removal Example on Data Set Table

Removing outliers from a dataset can be crucial for several reasons:

Maintaining Data Integrity: By definition, outliers are data points that differ substantially from most of the dataset. Although they may represent valid observations, they may also be due to errors, noise, or anomalies. By removing outliers, you ensure that your dataset accurately reflects the underlying phenomena being studied and thereby preserve its integrity.

Enhancing Statistical Analysis: Outliers can distort statistical measures such as the mean and standard deviation (the median is affected far less), leading to misleading interpretations. Removing outliers stabilises these measures, making statistical analysis more reliable. This ensures that descriptive statistics give a clearer view of the central tendency and variability of the data.

Improving Model Performance: Outliers can adversely affect the performance of predictive models by adding noise and bias. Models trained on datasets with outliers can produce less accurate predictions and generalise more poorly to new data. By removing outliers, you improve the quality of the data used to train the model, leading to more accurate and reliable predictions.

Facilitating Visualization: Outliers can distort data visualisations, making it difficult to identify significant patterns and relationships. Removing outliers can improve the clarity and interpretability of visualisations, enabling a better understanding of the underlying structure of the data.

Ensuring Assumptions Hold: Many statistical techniques and machine learning algorithms assume that the data is normally distributed or has other specific properties. Outliers can violate these assumptions and lead to potentially incorrect conclusions. Removing outliers helps ensure that the assumptions underlying the analysis hold, thereby improving the validity of the results.
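The effect of a single outlier on summary statistics can be seen with a small experiment. The readings below are hypothetical sleep durations in hours, with 15.0 playing the role of the outlier; they are not taken from the dataset.

```python
from statistics import mean, median, stdev

# Hypothetical sleep durations (hours); the last value plays the outlier.
readings = [6.8, 7.0, 7.1, 7.2, 7.4, 15.0]
trimmed = readings[:-1]

for label, data in (("with outlier", readings), ("without outlier", trimmed)):
    print(
        f"{label:<16} mean={mean(data):.2f} "
        f"median={median(data):.2f} stdev={stdev(data):.2f}"
    )
```

The mean and standard deviation shift noticeably once the outlier is removed, whereas the median barely moves, which is why the choice of summary statistic matters when outliers are present.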

4.0 Data Exploration and Analysis

4.1 Unique Values within each Category of the Dataset

Category	Unique Values
Age	1 (<1%)
Occupation	1 (<1%)
Physical Activity Level	2 (1%)
Stress Level	1 (<1%)
Systolic BP	3 (1%)
Diastolic BP	1 (<1%)
Heart Rate	2 (1%)
Daily Steps	2 (1%)
TOTAL	13

Table 3. Unique Values in Sleep Data Set

4.2 Most Common Occupations

In our deep dive into the sleep data, we are not just crunching numbers; we are peeling back the layers of different jobs and how they shape sleep habits.

In this large body of data, each job tells its own story, giving us a view of how people sleep. Whether they work night shifts or regular daytime hours, we look across this patchwork of jobs to shed light on how work and sleep interact.

Fig 9. Count of Occupation Graph

Table 4. Count of Occupation

4.3 Sleep Duration Dependency on Occupation Type

In our investigation, we examine the impact of various professions on sleep duration. We are interested in comparing how specific occupations (scientist, salesperson, teacher, software engineer, manager, doctor, nurse, accountant, lawyer, and engineer) affect both the length and quality of sleep.

Through rigorous analysis, we aim to uncover the connections between these professional roles and the crucial aspect of achieving adequate rest. We explore the relationship between occupation and sleep duration, shedding light on the diverse factors influencing nightly rest.

Fig 10. Occupation vs Sleep Duration

Table 5. Mean/Modal/Median Sleep Duration vs Occupation

Fig 11. Occupation Mean/Modal/Median Sleep Duration
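A per-occupation summary like the one in Table 5 can be computed by grouping the rows by occupation and then taking the mean, median, and mode of each group. The sketch below is one way to do this with the Python standard library; the function name `summarise_by_group` and the rows are hypothetical and do not reproduce the values in Table 5.

```python
from collections import defaultdict
from statistics import mean, median, multimode


def summarise_by_group(rows, group_key, value_key):
    """Return mean, median and mode(s) of value_key for each distinct group_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {
        group: {
            "mean": mean(values),
            "median": median(values),
            "modes": multimode(values),  # every most-frequent value, in first-seen order
        }
        for group, values in groups.items()
    }


# Hypothetical rows, not values taken from the real dataset.
rows = [
    {"Occupation": "Nurse", "Sleep Duration": 6.0},
    {"Occupation": "Nurse", "Sleep Duration": 6.5},
    {"Occupation": "Nurse", "Sleep Duration": 6.5},
    {"Occupation": "Engineer", "Sleep Duration": 8.0},
    {"Occupation": "Engineer", "Sleep Duration": 7.5},
]
summary = summarise_by_group(rows, "Occupation", "Sleep Duration")

for occupation, stats in summary.items():
    print(occupation, stats)
```

`multimode` is used instead of `mode` so that ties are reported rather than silently resolved, which matters for occupations with only a few instances.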

4.4 Most Common BMI Categories

We turn our attention to the prevalence of Body Mass Index (BMI) categories. Specifically, we aim to dissect the distribution of individuals across the most common BMI classifications: Normal, Normal Weight, Obese, and Overweight.

By scrutinizing this data, we seek to unveil patterns and trends regarding the prevalence of each BMI category within our sample population.

Through meticulous analysis, we endeavour to gain insights into the frequency and proportions of individuals falling into each category, providing valuable information on the distribution of BMI across our dataset.

Fig 12. BMI Categories Count Graph

Table 6. BMI Category Count
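Counting the instances in each BMI category is a simple frequency count. The sketch below shows it with `collections.Counter` on hypothetical labels. It also shows how the "Normal" and "Normal Weight" labels could be merged; whether the two labels mean the same thing is an assumption on our part, so the raw counts are kept alongside the merged ones.

```python
from collections import Counter

# Hypothetical labels, not the real column from the dataset.
bmi = ["Normal", "Overweight", "Normal Weight", "Obese", "Normal", "Overweight"]

raw_counts = Counter(bmi)

# Assumption: "Normal Weight" is an alternative spelling of "Normal".
ALIASES = {"Normal Weight": "Normal"}
merged_counts = Counter(ALIASES.get(label, label) for label in bmi)

print(raw_counts.most_common())
print(merged_counts.most_common())
```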

4.5 Age Groups and Their Respective BMI Categories

Through rigorous analysis, we aim to explore the demographic composition of individuals across various BMI ranges by examining both mean and median ages. Our goal is to identify any discernible patterns or trends in age distribution within each BMI category. By uncovering these insights, we hope to shed light on the relationship between BMI and age within our dataset. Join us as we delve into this analytical journey to unravel the complexities of BMI categories and age distribution, gaining valuable insights into their interplay.

Fig 13. BMI Categories vs Age Graph

Table 7. BMI Categories and Mean/Median Age

4.6 Blood Pressure Dependency on BMI Category

Our aim is to uncover any observable patterns or trends in blood pressure levels across different BMI categories, offering valuable insights into how BMI may influence cardiovascular health within our dataset. We examine the relationship between BMI categories and blood pressure to gain a deeper understanding of cardiovascular health metrics.

Fig 14. BMI Categories vs Blood Pressures Graph

Table 8. Systolic/Diastolic Blood Pressure Mean Values

4.7 Sleep Quality against Increasing Stress Levels

We focus on understanding the potential impact of increased stress levels on the quality of sleep within a diverse and extensive dataset. By scrutinizing a range of variables related to stress and sleep quality, we aim to discern any correlations or trends that may exist.

Through rigorous analysis, we shed light on the relationship between stress levels and sleep quality, offering valuable insights into the factors that may influence individuals' ability to attain restful sleep amidst varying stress levels.

Fig 15. Stress Level vs Sleep Quality Graph

Table 9. Stress Level vs Sleep Quality Mean/Modal Values
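One way to quantify the relationship between stress level and sleep quality is the Pearson correlation coefficient, which ranges from -1 (perfect negative linear relationship) to +1 (perfect positive linear relationship). The sketch below implements it directly; the function name `pearson` and the paired ratings are hypothetical and are not results from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical paired ratings, not values taken from the real dataset.
stress = [3, 4, 5, 6, 7, 8]
quality = [9, 8, 8, 7, 6, 5]

r = pearson(stress, quality)
print(f"r = {r:.3f}")
```

A value of r close to -1 in the real data would support the reading that higher stress goes together with lower sleep quality, although correlation alone does not establish which factor drives the other.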

4.8 Male vs Female Sleeping Disorder Count

We examine the occurrence of sleeping disorders across genders in our dataset, aiming to identify differences between males and females. By carefully documenting the instances of sleeping disorders in each gender, we aim to quantify and contrast their frequency. Through comprehensive analysis, our goal is to uncover any noticeable patterns or trends that might indicate which gender has a higher prevalence of sleeping disorders.

Fig 16. Female Sleep Disorder Count Graph

Table 10. Female Sleep Disorder Count

Fig 17. Male Sleep Disorder Count Graph

Table 11. Male Sleep Disorder Count
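The per-gender counts in Tables 10 and 11 amount to a cross-tabulation of Gender against Sleep Disorder. The sketch below builds one with `collections.Counter`. The rows and the disorder labels ("None", "Insomnia", "Sleep Apnea") are placeholders chosen for illustration, and the function name `prevalence` is hypothetical.

```python
from collections import Counter

# Hypothetical (gender, sleep disorder) pairs, not rows from the real dataset.
people = [
    ("Female", "Insomnia"),
    ("Female", "None"),
    ("Female", "Sleep Apnea"),
    ("Male", "None"),
    ("Male", "None"),
    ("Male", "Insomnia"),
]

table = Counter(people)  # counts per (gender, disorder) cell


def prevalence(gender, absent_label="None"):
    """Return the share of people of one gender with any recorded sleep disorder."""
    total = sum(count for (g, _), count in table.items() if g == gender)
    affected = sum(
        count for (g, d), count in table.items() if g == gender and d != absent_label
    )
    return affected / total


for gender in ("Female", "Male"):
    print(gender, f"{prevalence(gender):.0%}")
```

Comparing shares rather than raw counts avoids a misleading result when one gender has more instances in the dataset than the other.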

4.9 How Daily Steps Affect Quality of Sleep

In this study, we explore how increasing daily step count might affect sleep quality. We analyse data that tracks individuals' daily steps alongside metrics of sleep quality to investigate the connection between physical activity and sleep patterns. Our aim is to gain a deeper understanding of how changes in daily step count could impact factors like sleep duration, efficiency, and disturbances, and to uncover the potential advantages of higher physical activity levels for sleep quality. Ultimately, our goal is to enhance our understanding of lifestyle factors that influence overall sleep health.

Fig 18. Daily Steps vs Sleep Quality Graph
