-
How many unique vessels are available in the dataset?
For the following exercise, first download the AIS Dynamic Data available from OA 7.4. This data is collected by a Naval Academy receiver and is available from “Heterogeneous integrated dataset for maritime intelligence, surveillance, and reconnaissance.” Using the dataset, answer the following questions: a. How many unique vessels are available in the dataset? b. List…
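For part (a), a minimal sketch in Python could count the distinct vessel identifiers in the file. The column name `sourcemmsi` is an assumption about the file layout, and the sample rows are made up for illustration; they are not taken from the dataset.

```python
import csv
import io

# Made-up rows for illustration only; NOT taken from the real dataset.
# The column name "sourcemmsi" is an assumption about the file layout.
SAMPLE_CSV = """sourcemmsi,speedoverground,lon,lat
245257000,0.1,-4.465,48.382
227705102,0.0,-4.496,48.382
245257000,0.2,-4.466,48.383
228131600,8.5,-4.644,48.092
"""


def count_unique_vessels(csv_file, id_column="sourcemmsi"):
    """Return the number of distinct vessel identifiers in a CSV stream."""
    reader = csv.DictReader(csv_file)
    # A set keeps each identifier once, however many rows mention it.
    return len({row[id_column] for row in reader})


unique_vessels = count_unique_vessels(io.StringIO(SAMPLE_CSV))
print(unique_vessels)
```

With the downloaded file, the same function can be called on an open file handle in place of the in-memory sample, after checking the actual header for the identifier column.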
-
Predict how the predictor attributes impact the overall rating of the restaurant.
A popular restaurant review website has released the dataset you can download from OA 8.3. Here each row represents an average rating of a restaurant’s different aspects as provided by previous customers. The dataset contains records for the restaurants using the following attributes: ambience, food, service, and overall rating. The first three attributes are predictor variables…
-
Build another regression model to predict the total assets of an airline from the customers served by the airline.
For the next exercise, you are going to use the Airline Costs dataset available to download from OA 8.4. The dataset has the following attributes, among others: i. Airline name ii. Length of flight in miles iii. Speed of plane in miles per hour iv. Daily flight time per plane in hours v. Customers served…
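A single-predictor regression of this kind can be sketched with ordinary least squares in plain Python. The variable names `customers_served` and `total_assets` and the numbers below are hypothetical placeholders, not values from the Airline Costs dataset.

```python
def fit_simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical placeholder values, NOT taken from the dataset.
customers_served = [1.0, 2.0, 3.0, 4.0, 5.0]
total_assets = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_simple_linear_regression(customers_served, total_assets)
r2 = r_squared(customers_served, total_assets, slope, intercept)
print(slope, intercept, r2)
```

The slope estimates the change in total assets per additional unit of customers served, and R² gives a rough sense of how much of the variation the single predictor explains.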
-
Create a linear model and check whether perm has a linear relationship with the remaining three attributes.
Download data from OA 8.5, which was obtained from BP Research (image analysis by Ronit Katz, University of Oxford). This dataset contains measurements on 48 rock samples from a petroleum reservoir. Here 12 core samples from petroleum reservoirs were sampled in four cross-sections. Each core sample was measured for permeability, and each cross-section has total…
-
What can you tell us about the rating of a movie from its budget and aggregated number of followers in social media channels?
For this exercise, you are going to work again with a movie review dataset. In this dataset, Conventional and Social Media Movies, the ratings, budgets, and other information about popular movies released in 2014 and 2015 were collected from social media websites such as YouTube, Twitter, and IMDB; the aggregated dataset can be downloaded…
-
Create a logistic regression model to predict the class label from the first eight attributes of the question set.
An automated answer-rating site marks each post in a community forum website as “good” or “bad” based on the quality of the post. The CSV file, which you can download from OA 9.14, contains the various types of quality as measured by the tool. The dataset contains the following types of quality: i.…
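To show the mechanics of a binary logistic regression, here is a minimal sketch trained with batch gradient descent in plain Python. It uses two made-up features instead of the eight quality attributes, and the toy data, learning rate, and epoch count are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(X, y, lr=0.5, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n_features = len(X[0])
    n = len(X)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Label a row 1 when its predicted probability is at least 0.5."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
        for xi in X
    ]


# Made-up toy data: two features per post, label 1 = "good", 0 = "bad".
X = [[0.2, 0.1], [0.4, 0.3], [0.3, 0.2], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic_regression(X, y)
predictions = predict(X, w, b)
print(predictions)
```

For the actual exercise, each row of `X` would hold the first eight attributes, and the model should be evaluated on held-out rows and not on the training data as done here.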
-
Create a prediction model using softmax regression for the species of iris flower.
The Iris flower dataset or Fisher’s Iris dataset (built into R or downloadable from OA 9.16) is a multivariate dataset introduced by the British statistician and biologist Ronald Fisher in his 1936 paper “The use of multiple measurements in taxonomic problems” as an example of linear discriminant analysis. The dataset consists of 50 samples from…
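Softmax regression extends logistic regression to more than two classes by giving each class its own weight vector and normalizing the class scores into probabilities. The sketch below is a plain-Python illustration on made-up two-feature points with three classes; it does not use the Iris measurements, and the learning rate and epoch count are assumptions.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(x, W, b):
    """Linear score of each class for one feature vector."""
    return [sum(wj * xj for wj, xj in zip(W[k], x)) + b[k] for k in range(len(W))]


def train_softmax_regression(X, y, n_classes, lr=1.0, epochs=10000):
    """Batch gradient descent on the mean cross-entropy loss."""
    n_features = len(X[0])
    n = len(X)
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        grad_W = [[0.0] * n_features for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for xi, yi in zip(X, y):
            probs = softmax(class_scores(xi, W, b))
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == yi else 0.0)
                grad_b[k] += err
                for j in range(n_features):
                    grad_W[k][j] += err * xi[j]
        for k in range(n_classes):
            b[k] -= lr * grad_b[k] / n
            for j in range(n_features):
                W[k][j] -= lr * grad_W[k][j] / n
    return W, b


def predict_class(x, W, b):
    """Return the index of the most probable class."""
    probs = softmax(class_scores(x, W, b))
    return max(range(len(probs)), key=probs.__getitem__)


# Made-up, well-separated toy points (NOT Iris measurements).
X = [[0.1, 0.1], [0.2, 0.1], [0.1, 0.2],
     [0.9, 0.1], [0.8, 0.2], [0.9, 0.2],
     [0.1, 0.9], [0.2, 0.8], [0.2, 0.9]]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]

W, b = train_softmax_regression(X, y, n_classes=3)
predicted = [predict_class(xi, W, b) for xi in X]
print(predicted)
```

For the exercise itself, the rows of `X` would be the flower measurements and `y` the species encoded as integers, with a separate test split for evaluation.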
-
How does it affect the evaluation when you include the region while training the model?
In this exercise you will work with the Blues Guitarists Hand Posture and Thumbing Style by Region and Birth Period data, which you can download from OA 9.24. This dataset has 93 entries of various blues guitarists born between 1874 and 1940. Apart from the name of the guitarists, the dataset contains the following four…
-
Use the SVM to see if we can use Height, Weight, Age, and Sex (0 = male or 1 = female) to determine the Race (0 = white or 1 = other) of the child.
For this exercise, we have collected a sample of 198 cases from NIST’s AnthroKids dataset, which is available for download from OA 9.27. The dataset comes from a 1977 anthropometric study of body measurements for children: Foster, T. A., Voors, A. W., Webber, L. S., Frerichs, R. R., & Berenson, G. S. (1977). Anthropometric…
-
Create a report with the description of (1) your data collection method, (2) your data analysis method, and (3) your findings.
For this exercise, we will focus even more on the problem and less on the amount and nature of data. On top of that, we will do cross-data and cross-platform analysis. Imagine you are working as an aide or advisor for a candidate for an upcoming election. You are preparing the candidate for an open…