HMBA/MABS902 – Assignment 1:
Hollywood Movie
Description and goal
In this assignment you will provide data analysis within the context of a business application. The specifications below indicate what you need to produce, but not how to produce it.
MoCo is an organisation that invests in movies. Using a multitude of factors, its managers want to:
1. Gain a better understanding and visualisation of the data through an interactive dashboard.
2. Evaluate the potential success or revenues of future movie projects.
Use your business analytics skills and SAS Visual Analytics to address these two objectives.
Data – The HOLLYWOODMOVIEDATASET_IM data set.
The data set, available in SAS Viya, was compiled from several movie databases using both automated and manual means. Some values have very likely been captured or entered incorrectly, so the accuracy of the data set cannot be guaranteed. The data set can be used for both descriptive and predictive modelling.
The data dictionary is provided below.
Variable | Definition | Possible values
MovieID | A unique identifier for the movie. | An integer number
StarValue_Director; StarValue_Producer; StartValue_Cast | Signifies the star value of the director, the producer and the cast (as per recent past box-office success). | An ordinal category from 1 (lowest) to 5 (highest)
OriginalScreePlay | The movie is based on an original screenplay. | Yes/No
Genre_CAT | Specifies the content category CAT the movie belongs to. A movie can be classified in more than one content category (e.g. action and comedy). Therefore, each content category is represented with a separate binary variable. | CAT: Action, Adventure, Animation, Biography, Comedy, Crime, Drama, Family, Fantasy, History, Mystery, etc. Each is a binary variable (1 = Yes, 0 = No)
Competition | Indicates the level at which each movie competes for the same pool of entertainment dollars against movies released at the same time. | High, Medium, Low
MPAA Rating | The rating assigned by the Motion Picture Association of America. | G, PG, PG13, R, NR
MaxScreenCount | Indicates the number of screens the movie is expected to be shown on at its debut. | An integer number
BoxOfficeClass | Box-office success category. | An integer from 1 (flop) to 9 (blockbuster); see table below
GrossBoxOffice | Box-office gross revenue in theatres. | An integer number
ShortStoryLine | A short textual description of the script/story. | A few sentences
EstimatedBudget | Estimated movie budget (this field has values for only a subset of the movies). | An integer number
MovieLength | The number of minutes the movie runs (this field has values for only a subset of the movies). | An integer number
YEAR | Year the movie was released, coded as a measure (numerical) value. | An integer number
YearEnd | Year the movie was released, coded as a categorical value. | A timestamp
SpecialEffect | Specifies the amount of special effects in the movie. | An ordinal category ranging from 1 (lowest) to 5 (highest)
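The Genre_CAT entry in the data dictionary describes one binary variable per content category, so a movie can belong to several categories at once. A minimal Python sketch of that encoding is shown here for illustration only. The column names (for example Genre_Action) are an assumption derived from the Genre_CAT pattern, the category list is a shortened subset of the examples in the dictionary, and `genre_flags` is a hypothetical helper, not part of the data set or of SAS Visual Analytics.

```python
# ASSUMPTION: a shortened subset of the categories named in the data dictionary.
GENRES = ["Action", "Adventure", "Animation", "Comedy", "Drama"]


def genre_flags(movie_genres):
    """Return one 0/1 flag per category, keyed as Genre_<Category>.

    ASSUMPTION: the Genre_<Category> naming follows the Genre_CAT pattern;
    the real column names in the data set may differ.
    """
    wanted = set(movie_genres)
    return {f"Genre_{genre}": int(genre in wanted) for genre in GENRES}


# Hypothetical movie classified as both action and comedy.
flags = genre_flags(["Action", "Comedy"])
print(flags)
```

Because every category has its own flag, counting the movies in a category is a simple sum over that column, and a movie with two genres contributes to both counts.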
The following table shows the breakpoints/bins used to convert the gross box-office revenues to one of nine success categories:
Class no | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Range (millions $) | 1 | 10 | 20 | 40 | 65 | 100 | 150 | 200 | 200
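To illustrate how breakpoints like these map a revenue figure to a class, here is a minimal Python sketch. It rests on several assumptions that the table does not state: that each listed value is the upper limit of classes 1 to 8, that class 9 covers everything above 200 million, that a revenue exactly on a breakpoint falls in the lower class, and that GrossBoxOffice is recorded in dollars. The function name `box_office_class` is hypothetical. Check these assumptions against the data before relying on them.

```python
from bisect import bisect_left

# ASSUMPTION: upper limits (in millions of dollars) for classes 1 to 8;
# anything above the last limit is treated as class 9.
UPPER_LIMITS_MILLIONS = [1, 10, 20, 40, 65, 100, 150, 200]


def box_office_class(gross_dollars):
    """Map a gross revenue in dollars to a success class from 1 to 9.

    ASSUMPTION: a value exactly on a breakpoint belongs to the lower class.
    """
    gross_millions = gross_dollars / 1_000_000
    # bisect_left counts how many limits are strictly below the value.
    return bisect_left(UPPER_LIMITS_MILLIONS, gross_millions) + 1


# Hypothetical revenues, not taken from the data set.
for revenue in (500_000, 75_000_000, 250_000_000):
    print(revenue, box_office_class(revenue))
```

Comparing a class derived this way with the recorded BoxOfficeClass is one possible way to spot rows where either value may have been entered incorrectly.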
Additional information
• Regressions can be useful in this context (but you can use other models)
• Submit your report (Word format) to the Moodle site before the due date.
• Attach an export of your interactive (SAS) dashboard to your report.
• Due date: 30th of September, before 11:30pm
• Maximum 2,500 words (excluding illustrations and appendices). Keep it simple (no need for references and executive summary).
• Include your name on the first page.
• Individual assessment