Classification analysis

MA609 Business Analytics and Data Intelligence Week 10: TUT 10

  1. How do businesses collect and produce data? They do so via:
    1. Sales and returns transactions
    2. Bar code scans
    3. Credit card transactions
    4. GPS and RFID tracking
    5. Clicks on a webpage
  • Define data mining
  • Data mining is the process of finding and extracting useful information and insights from large datasets
  • Like geological mining
    • It is often hard, dirty work
    • It takes the right tools
  • Explain the data mining process
  • Identify Opportunity
    • Don’t dig randomly
    • Begin with the end in mind
    • What is the business problem/opportunity?
  • Collect Data
    • Decide where to dig
    • Get the right data – internally or externally
    • Millions of records aren’t required – use samples
    • 10p to 15p records are enough (where p = number of variables)
  • Understand, Explore & Prepare the Data
    • Know what the data represents
    • Make sure it is clean & complete
    • Eliminate unneeded/redundant variables
    • Transform variables as needed
    • You might spend most of your data mining time here!
  • Identify Task & Tools
    • Classification (supervised)
    • Prediction (supervised)
    • Segmentation/Clustering (unsupervised)
  • Partition Data
    • Training
    • Validation
    • Testing (optional)
  • Build & Evaluate Models
    • Try different models
    • Try different parameter settings
    • Avoid overfitting
  • Deploy Models
    • Integrate models in operational systems
    • Train users
    • Monitor results
    • Look for opportunities for continuous improvement
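
As a rough illustration of the Collect Data and Partition Data steps, the sketch below computes the 10p to 15p sample-size rule of thumb and randomly splits a set of records into training, validation, and optional testing partitions. The function names, the 50/30/20 split, and the fixed random seed are assumptions chosen for illustration; the tutorial does not prescribe them.

```python
import random


def min_sample_size(p, records_per_variable=10):
    """Rule-of-thumb sample size: 10p to 15p records, where p = number of variables."""
    return records_per_variable * p


def partition(records, train=0.6, valid=0.4, seed=0):
    """Shuffle the records and split them into training, validation, and test sets.

    Whatever is left after the training and validation shares becomes the
    (optional) test set, so it is empty when train + valid == 1.
    The proportions and the seed are illustrative assumptions.
    """
    if not 0 < train + valid <= 1:
        raise ValueError("train + valid must be in (0, 1]")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n = len(shuffled)
    n_train = round(n * train)
    n_valid = min(round(n * valid), n - n_train)
    training = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_valid]
    testing = shuffled[n_train + n_valid:]
    return training, validation, testing


# Hypothetical data set: 100 record IDs described by 8 variables.
records = list(range(100))
training, validation, testing = partition(records, train=0.5, valid=0.3, seed=1)
print(min_sample_size(8), min_sample_size(8, 15))
print(len(training), len(validation), len(testing))
```

Fitting models on the training partition and comparing them on the validation partition is what makes overfitting visible: a model that does much better on training data than on validation data has probably memorised noise.
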
  • Define classification and give a few examples of its application.

Classification determines which of m mutually exclusive groups an observation of unknown origin belongs to. Some areas of application are:

  • Character/target recognition
  • Oil/gold exploration
  • Loan approval
  • Diagnose diseases
  • Identify defects
  • Predict bond ratings
  • Fraud detection (credit card, tax, trading, etc.)
  • Predict winners of sports events
  • What are the steps to classify a record using the Full Bayes Classifier? What could be the problem in this case?
  • To classify a new record
    • Find all matching records
    • Put new record in most frequently occurring matching group
  • Problem
    • Continuous variables are unlikely to match exactly
    • Even with nominal variables, there might not be a match
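
The two steps above, and the matching problem, can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: the function name, the loan-approval records, and the choice to return `None` when nothing matches are all assumptions made for the example.

```python
from collections import Counter


def full_bayes_classify(training, new_record):
    """Classify new_record by majority vote among exactly matching training records.

    training   -- list of (predictor_tuple, group_label) pairs
    new_record -- tuple of predictor values for the record to classify
    Returns the most frequent group among the matches, or None if no training
    record matches exactly (the problem noted above).
    """
    # Step 1: find all training records whose predictors match exactly.
    matches = [label for predictors, label in training if predictors == new_record]
    if not matches:
        return None
    # Step 2: assign the most frequently occurring group among the matches.
    return Counter(matches).most_common(1)[0][0]


# Hypothetical loan-approval data: (home status, income band) -> decision.
training = [
    (("owner", "high"), "approve"),
    (("owner", "high"), "approve"),
    (("owner", "high"), "reject"),
    (("renter", "low"), "reject"),
]

print(full_bayes_classify(training, ("owner", "high")))
print(full_bayes_classify(training, ("renter", "high")))
```

The second call shows the weakness: even with only two nominal predictors, the combination ("renter", "high") never occurs in the training data, so no group can be assigned. With continuous predictors, exact matches are rarer still.
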

The remainder of today’s session is allocated to the group project. When you have finished the tutorial, please work on the project.
