ROHITJAIND/EX-10-SIMULATING-CLUSTERING-USING-WEKA-DATA-MINING-AND-ANALYSIS-TOOL

EX-10: Simulating Clustering Using WEKA Data Mining and Analysis Tool

To perform a clustering technique using the WEKA tool.                  DATE: 26.10.2023

Weka is comprehensive software that lets you preprocess big data, apply different machine learning algorithms to it, and compare the outputs. It makes it easy to work with big data and to train a machine using machine learning algorithms. This tutorial will guide you in using WEKA to meet all of these requirements. WEKA, an open-source software package, provides tools for data preprocessing, implementations of several data mining and machine learning algorithms, and visualization tools, so that you can develop machine learning techniques and apply them to real-world data mining problems. What WEKA offers is summarized in the following diagram.

CLUSTERING:

Thus, the simulation of the clustering technique was executed successfully using the WEKA tool.

Weka is tried and tested open source machine learning software that can be accessed through a graphical user interface, standard terminal applications, or a Java API. It is widely used for teaching, research, and industrial applications, contains a plethora of built-in tools for standard machine learning tasks, and additionally gives transparent access to well-known toolboxes such as scikit-learn, R, and Deeplearning4j.

Getting Started

Video from Josh Gordon, Developer Advocate for @GoogleAI.

Machine Learning without Programming

Weka can be used to build machine learning pipelines, train classifiers, and run evaluations without having to write a single line of code:

Open a dataset

First, we open the dataset that we would like to evaluate.

Choose a classifier

Second, we select a learning algorithm to use, e.g., the J48 classifier, which learns decision trees.

Evaluate predictive accuracy

Finally, we run a 10-fold cross-validation evaluation and obtain an estimate of predictive performance.
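To make the idea of a 10-fold cross-validation concrete, here is a minimal plain-Python sketch of how instances can be split into ten folds, each used once for testing while the other nine are used for training. It is an illustration only: the function names and the seed are assumptions, the split is not stratified by class, and it does not attempt to reproduce how Weka builds its folds internally.

```python
import random


def k_fold_indices(n_instances, k=10, seed=1):
    """Split instance indices 0..n-1 into k disjoint folds."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validation_splits(n_instances, k=10, seed=1):
    """Yield (train_indices, test_indices) pairs, one pair per fold."""
    folds = k_fold_indices(n_instances, k, seed)
    for i, test in enumerate(folds):
        # Training set = all folds except the one held out for testing.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set size, chosen only for the example.
splits = list(cross_validation_splits(150, k=10))
```

Every instance lands in exactly one test fold, so the performance estimate is an average over predictions made on data the model did not see during training.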

Deep Learning with Weka

WekaDeeplearning4j is a deep learning package for Weka. Deep neural networks, including convolutional networks and recurrent networks, can be trained directly from Weka's graphical user interfaces, providing state-of-the-art methods for tasks such as image and text classification.

WEKA Interoperability

WEKA can be integrated with the most popular data science tools.

Weka models can be used, built, and evaluated in R by using the RWeka package for R; conversely, R algorithms and visualization tools can be invoked from Weka using the RPlugin package for Weka.

Weka's functionality can be accessed from Python using the Python Weka Wrapper. Conversely, Python toolkits such as scikit-learn can be used from Weka.

For running Weka-based algorithms on truly large datasets, the distributed Weka for Spark package is available. It makes it possible to train any Weka classifier in Spark, for example.

Data Mining Assignment

Data mining using machine learning algorithms for automatic classification, clustering, and pattern recognition has a wide variety of applications. Weka is a collection of machine learning algorithms for data mining tasks. In this data mining project, you should use WEKA to explore the student retention data set available under the course document section of the course in the Biola Blackboard environment.

Examine and experiment with the student retention data set

  • Software: Download and install WEKA.
  • Data: Log into the Biola Blackboard environment to download the retention data sets from the Content area. Carefully read the confidentiality agreements before you use the data sets, and acknowledge the agreements in your study report. Remember to delete the data set from your computer after you finish the work. This is our agreement with Biola for using the data set.
  • Reference: Look into Part 3 of the textbook Data Mining: Practical Machine Learning Tools and Techniques for the technical details about WEKA in order to conduct the experiments. You can also find additional documentation on the WEKA website when needed.
  • Classification experiments:
    • Classification algorithms to use (under WEKA Explorer → Classify): Pick at least four classifiers from each of the following four categories of classifiers implemented in WEKA: bayes, functions, lazy, and trees. Your picks must include J48 in trees and IBk in lazy. That will give you a collection of at least 16 classifiers. For example, you may pick NaiveBayes, BayesNet, NaiveBayesSimple, and NaiveBayesUpdateable in the bayes category, and VotedPerceptron, SimpleLogistic, RBFNetwork, and SMO in the functions category, and so forth.
    • Classification experiments A: Apply the classifiers you pick to conduct classification experiments (as you did in Homework #6), using the training data set in Master Numeric Training List.arff in the numerical version folder to learn to classify the freshman list in Numeric FreshmenList.F09.arff in the numerical version folder. Try at least four different parameter settings to fine-tune the classifiers and improve their performance. For each classifier, record the confusion matrix and the estimated precision and recall based on 10-fold cross-validation.
    • Classification experiments B: Do the experiments again using the training data set in Balanced x2 Numeric Training List.arff (which artificially duplicates all the "lost" cases to increase the percentage of lost cases in the training data set) in the numerical version folder to learn to classify the freshman list in Numeric FreshmenList.F09.arff in the numerical version folder.
  • Clustering experiments:
    • Clustering algorithms to use (under WEKA Explorer → Cluster): Pick at least two clustering algorithms from WEKA. For example, you may pick EM and FarthestFirst.
    • Clustering experiments A: Apply the clustering algorithms you pick to conduct clustering experiments using the training data set in Master Numeric Training List.arff in the numerical version folder. Try at least two different parameter settings to fine-tune the parameters. Use WEKA to visually explore the resulting clusters.
  • Association experiments:
    • Association algorithms to use (under WEKA Explorer → Associate): Pick at least two association algorithms from WEKA. For example, you may pick Apriori and Tertius.
    • Association experiments A: Apply the association algorithms you pick to conduct association experiments using the training data set in Master Numeric Training List.arff in the numerical version folder. Try at least two different parameter settings to fine-tune the parameters and see the resulting association rules found.
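The balanced training set used in classification experiments B is described as duplicating the "lost" cases. The following plain-Python sketch shows what that kind of duplication does to the class proportions. It assumes that "x2" means each lost case appears twice; the field name `status`, the label values, and the toy records are all hypothetical and are not taken from the retention data set.

```python
def duplicate_minority(rows, label_key="status", minority="lost", copies=2):
    """Return a new list in which every minority-class row appears `copies` times."""
    balanced = []
    for row in rows:
        repeat = copies if row[label_key] == minority else 1
        # Copy each row so the output does not share dicts with the input.
        balanced.extend(dict(row) for _ in range(repeat))
    return balanced


def class_share(rows, label_key, value):
    """Fraction of rows whose label equals `value`."""
    return sum(1 for r in rows if r[label_key] == value) / len(rows)


# Made-up toy records: 2 "lost" and 8 "retained" cases.
toy = [{"id": i, "status": "lost" if i < 2 else "retained"} for i in range(10)]
balanced = duplicate_minority(toy)

share_before = class_share(toy, "status", "lost")
share_after = class_share(balanced, "status", "lost")
```

In this toy example the share of lost cases rises from 2 in 10 to 4 in 12. The duplicated rows carry no new information, which is worth keeping in mind when comparing the results of experiments A and B.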

What to include in your report for this data mining assignment:

  • Provide an estimate of the amount of time you spent in the work.
  • For classification experiments A and B:
    • For each individual experiment, report the confusion matrix and the estimated precision and recall of the classifier based on 10-fold cross-validation.
    • Describe the main differences you observe between the results from classification experiments A and B, and explain the differences observed.
    • If you were to provide a list of likely-to-be-lost students to the retention staff, what would the list be, based on your findings in the experiments? What are its estimated precision and recall?
  • For the clustering experiments, describe in general terms the resulting clusters and any insight you gained when you visually inspected them.
  • For the association experiments, report three or more interesting (for example, making sense intuitively) association rules you discovered in the experiments and explain why they are interesting.
  • Write down a short reflection of at least 250 words on Artificial Intelligence and data mining in the context of this assignment.
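The report asks for precision and recall alongside each confusion matrix. As a reminder of how the two relate, here is a small plain-Python sketch that derives both from a two-class confusion matrix, with rows as actual classes and columns as predicted classes. The counts and class labels are invented for illustration and are not results from the retention data.

```python
def precision_recall(confusion, positive):
    """Compute (precision, recall) for `positive` from confusion[actual][predicted]."""
    tp = confusion[positive][positive]
    # False negatives: actual positives predicted as some other class.
    fn = sum(c for pred, c in confusion[positive].items() if pred != positive)
    # False positives: other classes predicted as positive.
    fp = sum(row[positive] for actual, row in confusion.items() if actual != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented counts, for illustration only.
confusion = {
    "lost": {"lost": 30, "retained": 20},
    "retained": {"lost": 10, "retained": 140},
}
precision, recall = precision_recall(confusion, "lost")
```

With these made-up counts, 30 of the 40 students predicted as lost are actually lost (precision 0.75), and 30 of the 50 actually lost students are found (recall 0.6).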

The Weka Workbench

Weka is open-source machine learning software issued under the GNU General Public License.

Found only on the islands of New Zealand, the Weka is a flightless bird with an inquisitive nature. The name is pronounced like this, and the bird sounds like this.

IMAGES

  1. Data Mining with Weka (1.6: Visualizing your data)

  2. Data Mining with Weka (1.4: Building a classifier)

  3. WEKA demostration 1

  4. Weka data mining tutorial

  5. DATA PREPROCESSING WITH WEKA

  6. Data Mining in WEKA

VIDEO

  1. Data Mining Week 6 Assignment 6 solution || NPTEL 2024

  2. Data Mining ASSIGNMENT 1 WEEK 1 NPTEL SWAYAM 2024

  3. Day 2: Data Mining using Weka_19-03-2024

  4. ID3 using Weka

  5. Data Mining || NPTEL week 8 assignment answers 2023 #nptel #datamining #skumaredu #2023

  6. Data preprocessing using weka

COMMENTS

  1. Data Mining in WEKA

    Data Mining Process. The data mining process consists of several steps. First, data acquisition, cleaning, and integration happen. Then, because different datasets come from various sources, it is necessary to remove inconsistencies and make all of them align. Next, selection of appropriate features takes place.

  2. EX-10: SIMULATING CLUSTERING USING WEKA DATA MINING AND ...

    PROCEDURE: 1. In the WEKA explorer select the Preprocess tab. Click on the Open file ... option and select the iris.arff file in the file selection dialog. 2. Click on the Cluster TAB to apply the clustering algorithms to our loaded data. Click on the Choose button. Now, select EM as the clustering algorithm.

  3. PDF Tutorial Exercises for the Weka Explorer 17

    button near the top of the Classify tab. A dialog window appears showing various types of classifier. Click the trees entry to reveal its subentries, and click J48 to choose that classifier. Classifiers, like filters, are organized in a hierarchy: J48 has the full name weka.classifiers.trees.J48. The classifier is shown in the text box next to the Choose button: It now reads

  4. Data Mining Assignment 1

    Assignment 1: Using the WEKA Workbench. A. Become familiar with the use of the WEKA workbench to invoke several different machine learning schemes. Use latest stable version. Use both the graphical interface (Explorer) and command line interface (CLI). See Weka home page for Weka documentation.

  5. PDF Data Mining A Tutorial-Based Primer

    Here is a suggested methodology for incorporating WEKA into Chapter 4 of the text. o Use Figure 4.1 to provide a general discussion of the components of iDA. o The concept hierarchy is a common data structure for data mining. With the help of Figure 4.3, offer an overview of concept hierarchies and how they are used.

  6. How to Run Your First Classifier in Weka

    Click the " Classify " tab. This is the area for running algorithms against a loaded dataset in Weka. You will note that the " ZeroR " algorithm is selected by default. Click the " Start " button to run this algorithm. Weka Results for the ZeroR algorithm on the Iris flower dataset. The ZeroR algorithm selects the majority class in ...

  7. PDF Data Mining A Tutorial-Based Primer

    Data Mining A Tutorial-Based Primer. 4-1. July 22, 2011. Data Mining A Tutorial-Based Primer. Chapter Five using WEKA. Here is a suggested methodology for incorporating WEKA into Chapter 5 of the text. • The material in Sections 5.1 through 5.9 is not associated with a particular data mining tool. These sections can be covered without ...

  8. Weka 3

    Machine Learning Courses. We have put together several free online courses that teach machine learning and data mining using Weka. The videos for the courses are available on Youtube. The courses are hosted on the FutureLearn platform.

  9. PDF Data mining with WEKA, Part 2: Classification and clustering

    Data mining is a collective term for dozens of techniques to glean information from data and turn it into meaningful trends and rules to improve your understanding of the data. In this second article of the "Data mining with WEKA" series, we'll discuss two common data mining methods — classification and clustering — which can be used

  10. Weka 3

    The workbench for machine learning. Weka is tried and tested open source machine learning software that can be accessed through a graphical user interface, standard terminal applications, or a Java API. It is widely used for teaching, research, and industrial applications, contains a plethora of built-in tools for standard machine learning ...

  11. PDF Data mining with WEKA

    Data mining with WEKA. A use-case to help you get started. Charalampos Mavroforakis, BU CS105, Fall 2011. Starting WEKA. Open Weka: Start > All Programs > Weka 3.x.x > Weka 3.x. From the "Weka GUI Chooser", pick "Explorer". This is the main WEKA tool that we are going to use. Opening a dataset.

  12. Data Mining Assignment

    Data Mining Assignment. Data mining using machine learning algorithms for automatic classification, clustering, and pattern recognition has a wide variety of applications. ... Weka is a collection of machine learning algorithms for data mining tasks and in this data mining project you should use WEKA to explore the student retention data set ...

  13. Assignment 2

    Assignment 2, data mining: introduction to Weka. Q.1) Use the following learning schemes to analyze the zoo data (in zoo.arff): decision stump weka.classifiers. ... Decision Table - weka.classifiers.DecisionTable -R. d) C4.5 - the J48 classifier. e) PART - under "rules"

  14. ASSIGNMENT #2: Classification

    WEKA. We shall use WEKA (The workbench for machine learning) for our hands-on with classification tasks. This provides an extensive library of data-mining algorithms, and a graphical-user interface, Explorer, for running analyses over datasets. WEKA: Explorer User Guide, Version 3-4-5 (local copy). Or you can grab a newer manual from the site.

  15. Data Mining with Weka (1.3: Exploring datasets)

    Data Mining with Weka: online course from the University of Waikato. Class 1 - Lesson 3: Exploring datasets. https://weka.waikato.ac.nz/ Slides (PDF): https://www...

  16. WEKA Explorer: Visualization, Clustering, Association Rule Mining

    This tutorial explains how to perform Data Visualization, K-means Cluster Analysis, and Association Rule Mining using WEKA Explorer: In the previous tutorial, we learned about WEKA Dataset, Classifier, and J48 Algorithm for Decision Tree.. As we have seen before, WEKA is an open-source data mining tool used by many researchers and students to perform many machine learning tasks.

  17. 10

    Association Mining with Weka. Let us consider the 'to-play-or-not-to-play' dataset given in Figure 10.1 for getting hands on experience with association mining in Weka. This dataset is available as default dataset in the data folder of Weka with the file name weather.nominal.arff. This dataset has four attributes describing weather ...

  18. PDF Lab Exercise 1 Association Rule Mining with WEKA

    Exercise 3: Mining Association Rule with WEKA Explorer - Weather dataset 1. To get a feel for how to apply Apriori to prepared data set, start by mining association rules from the weather.nominal.arff data set of Lab One. Note that Apriori algorithm expects data that is purely nominal: If present, numeric attributes must be discretized first. 2.

  19. DATA MINING on WEKA

    2. WEKA. WEKA is a collection of many open-source data mining and machine learning algorithms. Created by researchers at the University of Waikato in New Zealand, it is a Java-based, open-source tool. WEKA is used for data pre-processing, classification, clustering, and association rule extraction. Its main features are as follows: 49 data preprocessing tools, 76 classification ...

  20. Text categorization with WEKA: A survey

    Given the great possibilities offered by textual categorization and, more generally, by the discipline of Text Mining, over the years numerous tools and software packages that implement the most common algorithms in these fields and facilitate their application on new data have been developed; among these we find toolkits offering a visual experience like Knime, Orange, RapidMiner and WEKA but ...

  21. Data Mining Assignments

    Assignment 0: Data Mining in the News. Assignment 1: Using the Weka Workbench (1 week) Assignment 2: Preparing the data and mining it (beginner version) (2 weeks) Assignment 3: Data Cleaning and Preparing for Modeling (intermediate version) (2 weeks) Assignment 4: Feature Reduction (2 weeks) Assignment 5: Predicting treatment outcome (1 week ...

  22. Weka 3

    Found only on the islands of New Zealand, the Weka is a flightless bird with an inquisitive nature. The name is pronounced like this, and the bird sounds like this.

  23. Assignment 9

    Assignment 9, data mining, support vector: launch the Weka tool, and then activate the environment. Open the dataset stored in the subfolder of the installed Weka. ... Data Mining (CS 619)
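Several of the entries above mention association rule mining with Apriori on nominal data. As a rough illustration of the two quantities such rules are usually ranked by, the plain-Python sketch below computes support and confidence for a single rule. The four records are invented, merely styled after weather-type attributes, and the code is not Weka's Apriori implementation.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = frozenset(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


# Invented nominal records, for illustration only.
transactions = [
    frozenset({"outlook=sunny", "humidity=high", "play=no"}),
    frozenset({"outlook=sunny", "humidity=normal", "play=yes"}),
    frozenset({"outlook=overcast", "humidity=high", "play=yes"}),
    frozenset({"outlook=rainy", "humidity=normal", "play=yes"}),
]

rule_support = support(transactions, {"humidity=normal", "play=yes"})
rule_confidence = confidence(transactions, {"humidity=normal"}, {"play=yes"})
```

Each attribute-value pair is treated as one item, which is why numeric attributes need to be discretized into nominal values before this kind of counting makes sense.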