SET11123 Scripting for Data Science Assignment
Assessment Brief Proforma
1. Module number | SET11123 |
2. Module title | Scripting for Data Science |
3. Module leader | Md Zia Ullah |
4. Tutor with responsibility for this assessment (student’s first point of contact) | Md Zia Ullah, Kehinde Babaagba |
5. Assessment | Practical coursework |
6. Weighting | 50% of module assessment |
7. Size and/or time limits for assessment | You should be able to complete this assessment within approximately 20 hours (if you have kept up with lecture and practical materials). |
8. Deadline of submission | Friday 28th of April at 3pm. Your attention is drawn to the penalties for late submissions. Only your module leader can authorise extensions. |
9. Arrangements for submission | Via Moodle – See coursework document. |
10. Assessment Regulations | All assessments are subject to the University Regulations. Plagiarised work will be dealt with according to the university’s guidelines (please read these, especially if this is your first time at a UK university): Academic Integrity – Edinburgh Napier Students’ Association |
11. The requirements for the assessment | See coursework document. |
12. Special instructions | If you use pieces of code that are not your own work (e.g., copied from an example you found online), you must specify their source (e.g., URL). If you fail to do so, your coursework may be deemed plagiarised. |
13. Return of work and feedback | You should keep a copy of your submitted work. Written feedback with indicative marks will be provided via Moodle within 3 (working) weeks from the date of the submission (see above). Individual oral feedback will be provided in the form of a one-to-one meeting (upon request from the student). Please note that all marks are subject to internal moderation and verification at the assessment board. If you have any doubts about your feedback and/or would like to discuss it, please contact the module leader. |
14. Assessment criteria | See coursework document. |
Overview
The aim of this coursework is to consolidate your knowledge of all the fundamentals of Python. You should be able to solve all the tasks detailed below with all you have learned in the module (data types, conditions, loops, functions, classes, regex, data I/O, numpy arrays, data preprocessing, exceptions).
For this coursework, you will be analysing average temperature data of different cities across the world.
Dataset
You can download the dataset in CSV format (Average Temperature of Cities.csv) from Moodle (originally downloaded from https://www.kaggle.com/swapnilbhange/average-temperature-of-cities on 17th January 2022). The dataset contains the following columns: Country, City, the average temperature for each month from Jan to Dec, the yearly average, and the continent. The temperature values are expressed in degrees Celsius.
Task
Please read your tasks CAREFULLY.
Desired features
You should create the necessary functions and/or classes to provide the following functionality:
- Read the dataset.
- Given a string as a parameter indicating the country (case insensitive), return the cities where temperature information was recorded. For example, United Kingdom includes data for London and Edinburgh. Hint: a city could be represented as an object including temperature data.
- Given a string as a parameter indicating a continent, return all the countries in the dataset that are associated with the specified continent.
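One possible shape for these features is sketched below. It is only an illustration, not a model answer: the header names (`Country`, `City`, `Jan` to `Dec`, `Year`, `Continent`), the `City` class, and the helper function names are all assumptions, and the four sample rows are made up rather than taken from the real file. Check the actual CSV header before reusing any of it.

```python
import csv
import io
from dataclasses import dataclass

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Made-up sample rows; the header names are assumptions about the real file.
SAMPLE_CSV = """Country,City,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec,Year,Continent
United Kingdom,London,5,5,7,10,13,16,18,18,15,12,8,6,11.1,Europe
United Kingdom,Edinburgh,4,4,6,8,11,13,15,15,13,10,6,4,9.1,Europe
France,Paris,5,6,9,12,16,19,21,21,17,13,8,5,12.7,Europe
Kenya,Nairobi,19,20,20,19,18,17,16,16,18,19,18,18,18.2,Africa
"""


@dataclass
class City:
    """Hypothetical container for one row of the dataset."""
    name: str
    country: str
    continent: str
    monthly: dict   # month name -> temperature in Celsius
    yearly: float   # yearly average as reported in the file


def read_cities(handle):
    """Parse an open CSV file (or any text stream) into City objects."""
    cities = []
    for row in csv.DictReader(handle):
        monthly = {month: float(row[month]) for month in MONTHS}
        cities.append(City(row["City"].strip(), row["Country"].strip(),
                           row["Continent"].strip(), monthly,
                           float(row["Year"])))
    return cities


def cities_in_country(cities, country):
    """Case-insensitive lookup of the cities recorded for a country."""
    wanted = country.strip().casefold()
    return [c for c in cities if c.country.casefold() == wanted]


def countries_in_continent(cities, continent):
    """Sorted, de-duplicated country names for a continent."""
    wanted = continent.strip().casefold()
    return sorted({c.country for c in cities
                   if c.continent.casefold() == wanted})


cities = read_cities(io.StringIO(SAMPLE_CSV))
print([c.name for c in cities_in_country(cities, "united KINGDOM")])
print(countries_in_continent(cities, "europe"))
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by something like `open(path, newline="", encoding="utf-8")`. Using `casefold()` on both sides is one simple way to meet the case-insensitivity requirement.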
Desired outputs
- Determine the city with the lowest average temperature recorded over a single year. Do the same with the highest temperature.
- Determine the city and the month where the lowest temperature was recorded. Do the same with the highest.
- For each continent, show the top 5 hottest cities (considering the average yearly temperature reported in the dataset).
- For each country, show the city (and the month) where the coldest temperature was recorded.
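The four outputs above all reduce to taking a minimum or maximum over either the yearly averages or the individual monthly readings. The sketch below shows that idea on a few made-up records. The `Record` type, the function names, and the sample values are assumptions for illustration only, and ties are resolved by taking the first match, which may not be the behaviour you want.

```python
from collections import defaultdict
from typing import NamedTuple

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


class Record(NamedTuple):
    """Hypothetical record; values below are made up, not from the dataset."""
    country: str
    city: str
    continent: str
    monthly: list   # 12 temperatures in Celsius, Jan..Dec
    yearly: float   # yearly average


RECORDS = [
    Record("United Kingdom", "London", "Europe",
           [5, 5, 7, 10, 13, 16, 18, 18, 15, 12, 8, 6], 11.1),
    Record("United Kingdom", "Edinburgh", "Europe",
           [4, 4, 6, 8, 11, 13, 15, 15, 13, 10, 6, 4], 9.1),
    Record("France", "Paris", "Europe",
           [5, 6, 9, 12, 16, 19, 21, 21, 17, 13, 8, 5], 12.7),
    Record("Kenya", "Nairobi", "Africa",
           [19, 20, 20, 19, 18, 17, 16, 16, 18, 19, 18, 18], 18.2),
]


def extreme_yearly(records, pick=min):
    """City with the lowest (pick=min) or highest (pick=max) yearly average."""
    return pick(records, key=lambda r: r.yearly).city


def extreme_month(records, pick=min):
    """(city, month, temperature) of the single coldest or hottest reading."""
    readings = [(temp, r.city, MONTHS[i])
                for r in records for i, temp in enumerate(r.monthly)]
    temp, city, month = pick(readings, key=lambda reading: reading[0])
    return city, month, temp


def hottest_per_continent(records, n=5):
    """Top-n cities per continent, ranked by yearly average (hottest first)."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r.continent].append(r)
    return {continent: [r.city for r in
                        sorted(group, key=lambda r: r.yearly, reverse=True)[:n]]
            for continent, group in grouped.items()}


def coldest_per_country(records):
    """For each country, the (city, month, temperature) of its coldest reading."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r.country].append(r)
    return {country: extreme_month(group, min)
            for country, group in grouped.items()}


print(extreme_yearly(RECORDS, min), extreme_yearly(RECORDS, max))
print(extreme_month(RECORDS, min), extreme_month(RECORDS, max))
print(hottest_per_continent(RECORDS, n=2))
print(coldest_per_country(RECORDS))
```

Passing `min` or `max` as the `pick` argument avoids writing each query twice. A pandas-based solution would work equally well, since external libraries are allowed.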
Submission
You must submit a .zip file, containing:
- An .ipynb file with your code. Please clear all cells before you submit. A .py file is also fine for this submission.
- [Optional] A README file, if you want to describe how your code should be used.
IMPORTANT: A zip file means a zip file. Other formats (e.g., RAR, 7z, GZ) will not be accepted, and your grade for the submission will be 0 (zero). Before submitting your solution, it is your responsibility to check the integrity of your file. A corrupted zip file will also lead to a grade of 0 (zero).
Your code file must be named SET11123_YOURMATR_CW2.ipynb. For instance, if your matriculation number is 40014374, then the code file must be SET11123_40014374_CW2.ipynb. Similarly, the zip file containing your code must be named SET11123_YOURMATR_CW2.zip.
Your zip file must be uploaded via Moodle.
Deadline: [Week 13] 28th April 2023 – 15:00.
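If you would like to check your archive before uploading, a small script along the following lines may help. It is a sketch built on Python's standard `zipfile` module, not an official checker: the function name `check_submission` is hypothetical, and it assumes that matriculation numbers consist only of digits and that a .py file follows the same naming pattern as the .ipynb file.

```python
import re
import tempfile
import zipfile
from pathlib import Path

# Assumption: the matriculation number is made of digits only.
NAME_PATTERN = re.compile(r"SET11123_(\d+)_CW2\.zip")


def check_submission(zip_path):
    """Return a list of problems found; an empty list means all checks passed."""
    zip_path = Path(zip_path)
    problems = []
    match = NAME_PATTERN.fullmatch(zip_path.name)
    if match is None:
        problems.append("zip name does not match SET11123_YOURMATR_CW2.zip")
    if not zipfile.is_zipfile(zip_path):
        problems.append("not a readable zip archive")
        return problems
    with zipfile.ZipFile(zip_path) as archive:
        # testzip() returns the name of the first corrupted member, or None.
        if archive.testzip() is not None:
            problems.append("archive contains a corrupted member")
        if match is not None:
            stem = f"SET11123_{match.group(1)}_CW2"
            names = {Path(name).name for name in archive.namelist()}
            if not ({stem + ".ipynb", stem + ".py"} & names):
                problems.append("no code file named like the zip")
    return problems


# Demonstration on throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "SET11123_40014374_CW2.zip"
    with zipfile.ZipFile(good, "w") as archive:
        archive.writestr("SET11123_40014374_CW2.ipynb", "{}")
    bad = Path(tmp) / "coursework.rar"
    bad.write_bytes(b"not a zip")
    good_result = check_submission(good)
    bad_result = check_submission(bad)

print(good_result)
print(bad_result)
```

Passing these checks does not guarantee that the submission is acceptable. It only catches the most common naming and corruption mistakes.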
Additional Information
- You are allowed to use any external library for your code, including those we have not seen during the module.
- If you want to use PyCharm, instead of a Jupyter Notebook, it is totally fine. In that case, you should submit a .py file within your zip.
- Make sure you add comments to your code to help me to understand your thought process. Moreover, at the top of your script file, please also add your name and matriculation number as a comment.
- If you are using Jupyter Notebook, you are also encouraged to create Markdown cells to describe your code. Markdown cells are considered as comments in your code and, as such, they will contribute towards the code quality assessment criterion.
- Make sure that your code works on any computer. You can easily check this by uploading and running your code on Google Colab.
- Feel free to use all the material on Moodle at your convenience. You are also welcome to search online and get inspiration from someone else’s work. However, in this latter case, you must state in your code (e.g., with a comment) where you took the inspiration from.
- You are welcome to ask for clarification on this coursework if any aspect of it is unclear or ambiguous.
Assessment Criteria
Description | Marks |
Data Science Skills | |
Data is read and stored properly within classes | 5 |
The classes contain all the expected methods | 5 |
The classes contain suitable properties and helper methods (e.g. constructors, getter/setters, etc.) | 2.5 |
Subtotal | 12.5 |
Functionality | |
Users can retrieve data per city | 4 |
Users can retrieve data per country | 3 |
Users can retrieve data per continent | 3 |
Subtotal | 10 |
Tasks (matched with the desired outputs above) | |
Exercise 1 | 3 |
Exercise 2 | 3 |
Exercise 3 | 4 |
Exercise 4 | 5 |
Subtotal | 15 |
Code Quality | |
The code makes use of the correct statements to solve appropriate tasks | 5 |
The code includes comments for brief descriptions | 5 |
The names of identifiers (e.g., class names, function names, variables, etc.) are meaningful | 2.5 |
Subtotal | 12.5 |
Total: | 50 |