Assessment Title: Predictive Model Creation and Evaluation
Assessable Item:
· One (1) written report of no more than 10 pages, submitted with the signed Assignment Cover Sheet.

The submitted report should answer all questions listed in the assignment task section in sequence.
1. Follow the instructions above to split the source data into training and test sets. Answer the following questions after splitting the data.
1) Paste a clear screenshot of the whole workflow of assignment 1 in the report.
2) How many tuples are included in the training set?
3) How many species are included in the test set?
4) Do species “Whitefish” and “Smelt” have the same number of tuples included in the test set?
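The counting questions above can be checked outside the workflow tool. The sketch below is only an illustration: the rows are invented toy values rather than the assignment data, and the names `test_rows` and `species_counts` are hypothetical.

```python
from collections import Counter

# Hypothetical test-set rows as (species, weight in grams).
# These are toy values for illustration, NOT the assignment data.
test_rows = [
    ("Perch", 120.0),
    ("Smelt", 9.8),
    ("Whitefish", 540.0),
    ("Smelt", 12.2),
    ("Pike", 430.0),
    ("Whitefish", 800.0),
]


def species_counts(rows):
    """Count how many tuples (rows) each species contributes."""
    return Counter(species for species, _ in rows)


counts = species_counts(test_rows)
n_tuples = len(test_rows)    # number of tuples in the set
n_species = len(counts)      # number of distinct species in the set
same_count = counts["Whitefish"] == counts["Smelt"]

print(n_tuples, n_species, same_count)
```

The same three values (row count, distinct species, per-species counts) answer questions 2) to 4) once the real training and test tables are substituted for the toy list.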

2. Build a Linear Regression Model using all available attributes to predict the value of “Weight_of_Fish_in_Gram”. Answer the following questions after completing the model training and testing.
1) What is the R² value of your test result?
2) Give the screenshot of the scatter plot result of your test output using “Weight_of_Fish_in_Gram” on the x-axis and the prediction value on the y-axis. Assign different colours to the data points based on the “species.”
3) Which species has the heaviest predicted weight in your test result?
4) How many prediction results are infeasible in your test result?
5) Looking at your source data before splitting it, which two species can be easily separated from the others based on the “Height_in_cm” and “Diagonal_Width_in_cm” attributes? Include the visualisation that supports your observation in the report.
6) Draw a pie chart of the original input data before splitting it into training and test sets. Use different colours for each species and show the percentage of data in the pie chart.
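As a rough cross-check for questions 1) and 4) above, the sketch below computes the coefficient of determination, R² = 1 − SS_res / SS_tot, and counts infeasible predictions. It assumes that "infeasible" means a predicted weight that is zero or negative, which the brief does not state explicitly; the numbers are invented and the function names are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def count_infeasible(predicted):
    """Count predictions that cannot be real weights.

    Assumption: a weight of zero grams or less is treated as infeasible.
    """
    return sum(1 for p in predicted if p <= 0)


# Toy values for illustration only, NOT the assignment data.
actual_weight = [100.0, 200.0, 300.0, 10.0]
predicted_weight = [110.0, 190.0, 310.0, -20.0]

r2 = r_squared(actual_weight, predicted_weight)
n_infeasible = count_infeasible(predicted_weight)
print(round(r2, 4), n_infeasible)
```

A linear model has no constraint keeping its output positive, so very light fish are the usual place to look for such predictions.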

3. Build a Logistic Regression Model with all attributes and use “Smelt” as the reference category. The maximal number of epochs and epsilon should be set to 10,000 and 0.0001, respectively. Use 3122 as the seed in the logistic regression node. Answer the following questions after completing the model training and test.
1) Which species has no “True Positive (TP)” case in the prediction result?
2) For the species with no TP case, into which species are its cases misplaced?
3) What is the overall accuracy of the prediction result?
4) List all species names that have 100% correctly classified test results.
5) Which species has a 50% chance of being misplaced into another species in the test result?
6) In the test result, what percentage of the species “Pike” is misplaced into others?
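The questions above all read off a confusion matrix. The following sketch shows one way to derive the same quantities from paired actual and predicted labels; the labels are invented for illustration and say nothing about the real results, and all function names are hypothetical.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Map each (actual, predicted) label pair to its frequency."""
    return Counter(zip(actual, predicted))


def overall_accuracy(actual, predicted):
    """Fraction of cases whose predicted label equals the actual label."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def misplaced_fraction(actual, predicted, species):
    """Fraction of one species' cases that were predicted as another species."""
    total = sum(1 for a in actual if a == species)
    wrong = sum(1 for a, p in zip(actual, predicted) if a == species and p != species)
    return wrong / total


# Toy labels for illustration only, NOT the assignment results.
actual = ["Perch", "Perch", "Pike", "Pike", "Smelt", "Whitefish"]
predicted = ["Perch", "Perch", "Pike", "Perch", "Smelt", "Perch"]

matrix = confusion_counts(actual, predicted)
accuracy = overall_accuracy(actual, predicted)

# Species with no true positive: the (s, s) cell of the matrix is zero.
no_tp = sorted(s for s in set(actual) if matrix[(s, s)] == 0)

# Species classified 100% correctly: none of its cases are misplaced.
perfect = sorted(s for s in set(actual) if misplaced_fraction(actual, predicted, s) == 0)

print(accuracy, no_tp, perfect, misplaced_fraction(actual, predicted, "Pike"))
```

In the toy data, the off-diagonal cell `matrix[("Whitefish", "Perch")]` shows where the species without a true positive ends up, which is the kind of lookup question 2) asks for.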

4. Build a new linear regression model, different from the one built for question 2. This time, focus on the species “Perch” only. You are limited to three input attributes to predict “Weight_of_Fish_in_Gram.” Use a “Scatter Matrix (local)” node to observe your data and decide which attributes are suitable to include. The linear regression model should be the same as the one used in question 2 except for the input attributes. Build, train, and test the model, and then answer the questions below.
1) For each eliminated attribute, give the reason it was not selected as an input.
2) List the R² of your test result and compare it with the one in question 2. Report both R² values, the one obtained in question 2 and the one obtained in question 4.
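A scatter matrix is a visual tool, and one possible numerical companion to it is the Pearson correlation between pairs of attributes. The sketch below is an assumption about how attribute redundancy might be argued, not a method prescribed by the brief: two inputs that are almost perfectly correlated with each other carry nearly the same information, so one of them is a candidate for elimination. The column values are invented, and `Length_A_in_cm` and `Length_B_in_cm` are hypothetical attribute names.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy columns for illustration only, NOT the assignment data.
# "Length_A_in_cm" and "Length_B_in_cm" are hypothetical attribute names.
columns = {
    "Length_A_in_cm": [20.0, 22.0, 25.0, 30.0, 34.0],
    "Length_B_in_cm": [22.0, 24.1, 27.4, 32.9, 37.2],
    "Height_in_cm": [5.5, 6.4, 6.9, 8.8, 9.1],
    "Weight_of_Fish_in_Gram": [110.0, 150.0, 220.0, 400.0, 560.0],
}

names = list(columns)
pairwise = {
    (a, b): pearson(columns[a], columns[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}

for pair, r in pairwise.items():
    print(pair, round(r, 3))
```

Correlation only captures linear association, so it supplements the scatter matrix rather than replacing it.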
