data mining master thesis topics

Research Topics & Ideas: Data Science


PS – This is just the start…

We know it’s exciting to run through a list of research topics, but please keep in mind that this list is just a starting point. The topic ideas provided here are intentionally broad and generic, so you will need to develop them further. Nevertheless, they should inspire some ideas for your project.


Data Science-Related Research Topics

  • Developing machine learning models for real-time fraud detection in online transactions.
  • The use of big data analytics in predicting and managing urban traffic flow.
  • Investigating the effectiveness of data mining techniques in identifying early signs of mental health issues from social media usage.
  • The application of predictive analytics in personalizing cancer treatment plans.
  • Analyzing consumer behavior through big data to enhance retail marketing strategies.
  • The role of data science in optimizing renewable energy generation from wind farms.
  • Developing natural language processing algorithms for real-time news aggregation and summarization.
  • The application of big data in monitoring and predicting epidemic outbreaks.
  • Investigating the use of machine learning in automating credit scoring for microfinance.
  • The role of data analytics in improving patient care in telemedicine.
  • Developing AI-driven models for predictive maintenance in the manufacturing industry.
  • The use of big data analytics in enhancing cybersecurity threat intelligence.
  • Investigating the impact of sentiment analysis on brand reputation management.
  • The application of data science in optimizing logistics and supply chain operations.
  • Developing deep learning techniques for image recognition in medical diagnostics.
  • The role of big data in analyzing climate change impacts on agricultural productivity.
  • Investigating the use of data analytics in optimizing energy consumption in smart buildings.
  • The application of machine learning in detecting plagiarism in academic works.
  • Analyzing social media data for trends in political opinion and electoral predictions.
  • The role of big data in enhancing sports performance analytics.
  • Developing data-driven strategies for effective water resource management.
  • The use of big data in improving customer experience in the banking sector.
  • Investigating the application of data science in fraud detection in insurance claims.
  • The role of predictive analytics in financial market risk assessment.
  • Developing AI models for early detection of network vulnerabilities.


Data Science Research Ideas (Continued)

  • The application of big data in public transportation systems for route optimization.
  • Investigating the impact of big data analytics on e-commerce recommendation systems.
  • The use of data mining techniques in understanding consumer preferences in the entertainment industry.
  • Developing predictive models for real estate pricing and market trends.
  • The role of big data in tracking and managing environmental pollution.
  • Investigating the use of data analytics in improving airline operational efficiency.
  • The application of machine learning in optimizing pharmaceutical drug discovery.
  • Analyzing online customer reviews to inform product development in the tech industry.
  • The role of data science in crime prediction and prevention strategies.
  • Developing models for analyzing financial time series data for investment strategies.
  • The use of big data in assessing the impact of educational policies on student performance.
  • Investigating the effectiveness of data visualization techniques in business reporting.
  • The application of data analytics in human resource management and talent acquisition.
  • Developing algorithms for anomaly detection in network traffic data.
  • The role of machine learning in enhancing personalized online learning experiences.
  • Investigating the use of big data in urban planning and smart city development.
  • The application of predictive analytics in weather forecasting and disaster management.
  • Analyzing consumer data to drive innovations in the automotive industry.
  • The role of data science in optimizing content delivery networks for streaming services.
  • Developing machine learning models for automated text classification in legal documents.
  • The use of big data in tracking global supply chain disruptions.
  • Investigating the application of data analytics in personalized nutrition and fitness.
  • The role of big data in enhancing the accuracy of geological surveying for natural resource exploration.
  • Developing predictive models for customer churn in the telecommunications industry.
  • The application of data science in optimizing advertisement placement and reach.


Recent Data Science-Related Studies

Below, we’ve included a selection of recent studies to help refine your thinking. These are actual studies, so they can provide some useful insight into what a research topic looks like in practice.

  • Data Science in Healthcare: COVID-19 and Beyond (Hulsen, 2022)
  • Auto-ML Web-application for Automated Machine Learning Algorithm Training and evaluation (Mukherjee & Rao, 2022)
  • Survey on Statistics and ML in Data Science and Effect in Businesses (Reddy et al., 2022)
  • Visualization in Data Science VDS @ KDD 2022 (Plant et al., 2022)
  • An Essay on How Data Science Can Strengthen Business (Santos, 2023)
  • A Deep study of Data science related problems, application and machine learning algorithms utilized in Data science (Ranjani et al., 2022)
  • You Teach WHAT in Your Data Science Course?!? (Posner & Kerby-Helm, 2022)
  • Statistical Analysis for the Traffic Police Activity: Nashville, Tennessee, USA (Tufail & Gul, 2022)
  • Data Management and Visual Information Processing in Financial Organization using Machine Learning (Balamurugan et al., 2022)
  • A Proposal of an Interactive Web Application Tool QuickViz: To Automate Exploratory Data Analysis (Pitroda, 2022)
  • Applications of Data Science in Respective Engineering Domains (Rasool & Chaudhary, 2022)
  • Jupyter Notebooks for Introducing Data Science to Novice Users (Fruchart et al., 2022)
  • Towards a Systematic Review of Data Science Programs: Themes, Courses, and Ethics (Nellore & Zimmer, 2022)
  • Application of data science and bioinformatics in healthcare technologies (Veeranki & Varshney, 2022)
  • TAPS Responsibility Matrix: A tool for responsible data science by design (Urovi et al., 2023)
  • Data Detectives: A Data Science Program for Middle Grade Learners (Thompson & Irgens, 2022)
  • Machine Learning for Non-Majors: A White Box Approach (Mike & Hazzan, 2022)
  • Components of Data Science and Its Applications (Paul et al., 2022)
  • Analysis on the Application of Data Science in Business Analytics (Wang, 2022)


Data Mining


Student theses


  • 3D face reconstruction using deep learning (Student thesis: Master)
  • Achieving Long Term Fairness through Curiosity Driven Reinforcement Learning: How intrinsic motivation influences fairness in algorithmic decision making
  • Activity recognition using deep learning in videos under clinical setting
  • A data cleaning assistant (Student thesis: Bachelor)
  • A Data Cleaning Assistant for Machine Learning
  • A deep learning approach for clustering a multi-class dataset
  • Aerial imagery pixel-level segmentation
  • A framework for understanding business process remaining time predictions
  • A hybrid model for pedestrian motion prediction
  • Algorithms for center-based trajectory clustering
  • Allocation decision-making in service supply chain with deep reinforcement learning
  • Analyzing policy gradient approaches towards rapid policy transfer
  • An empirical study on dynamic curriculum learning in information retrieval
  • An explainable approach to multi-contextual fake news detection
  • An exploration and evaluation of concept based interpretability methods as a measure of representation quality in neural networks
  • Anomaly detection in image data sets using disentangled representations
  • Anomaly detection in polysomnography signals using AI
  • Anomaly detection in text data using deep generative models
  • Anomaly detection on dynamic graph
  • Anomaly detection on finite multivariate time series from semi-automated screwing applications
  • Anomaly detection on multivariate time series using GANs
  • Anomaly detection on vibration data
  • Application of P&ID symbol detection and classification for generation of material take-off documents (MTOs)
  • Applications of deep generative models to tokamak nuclear fusion
  • A similarity based meta-learning approach to building pipeline portfolios for automated machine learning
  • Aspect-based few-shot learning
  • Assessing bias and fairness in machine learning through a causal lens
  • Assessing fairness in anomaly detection: a framework for developing a context-aware fairness tool to assess rule-based models
  • A study of an open-ended strategy for learning complex locomotion skills
  • A systematic determination of metrics for classification tasks in OpenML
  • A universally applicable EMM framework
  • Automated machine learning with gradient boosting and meta-learning
  • Automated object recognition of solar panels in aerial photographs: a case study in the Liander service area
  • Automatic data cleaning
  • Automatic scoring of short open-ended questions
  • Automatic synthesis of machine learning pipelines consisting of pre-trained models for multimodal data
  • Automating string encoding in AutoML
  • Autoregressive neural networks to model electroencephalography signals
  • Balancing efficiency and fairness on ride-hailing platforms via reinforcement learning
  • Benchmarking audio deepfake detection
  • Better clustering evaluation for the OpenML evaluation engine
  • Bi-level pipeline optimization for scalable AutoML
  • Block-sparse evolutionary training using weight momentum evolution: training methods for hardware efficient sparse neural networks
  • Boolean matrix factorization and completion
  • Bootstrap hypothesis tests for evaluating subgroup descriptions in exceptional model mining
  • Bottom-up search: a distance-based search strategy for supervised local pattern mining on multi-dimensional target spaces
  • Bridging the domain-gap in computer vision tasks
  • Can time series forecasting be automated: a benchmark and analysis


Master thesis topics [closed]

I am looking for a topic for my master's thesis. I am interested in predictive analytics in marketing, HR, management, or finance, using data mining applications.

I have found two very interesting subjects: "Predicting customer churn using decision tree" and "Predicting employee turnover using decision tree". I searched hard but unfortunately couldn't find any relevant dataset to download (for example, a telecommunication customer churn dataset).

I would like to work on a similar subject using the decision tree technique.

Please suggest some topics or projects that would make a good master's thesis subject.

  • data-mining
  • predictive-modeling
  • decision-trees


2 Answers

This is the approach I took:

  • Find journals related to your field of studies
  • Skim through the proceedings, see if there are titles that catch your interest
  • Read the papers (carefully or globally) that seemed interesting
  • Carefully consider the approaches and whatever future suggestions they present in their papers
  • Think critically: What would you change? What do you want to find out? Don't limit yourself to data but rather orient from the perspective of research. Solutions for data might only become apparent when you know exactly what you want to examine.

I think this has advantages because these papers outline details regarding data as well -- perhaps you can use the same.

Present some papers and your idea to your prospective supervisor and he/she will make some suggestions. Researchers generally have a lot of knowledge about the possibilities and might even be curious about some things themselves.

Good luck! And enjoy.

Answered by lennyklb

First, talk to your thesis advisor before committing to a project. They know better than I do.

Second, just analyzing a new dataset using standard techniques doesn't make for a good master's thesis. Your project is expected to use some sort of novel approach.

With that said, I'd suggest that you start by reading up on existing decision tree techniques, learning why they work and what their flaws are, and try to find ways to overcome the flaws. Then, once you have your improvement, it should be relatively easy to find a dataset to apply it to.
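As a starting point for studying how decision trees work, here is a minimal sketch of the core step: choosing the threshold on one numeric feature that minimizes the weighted Gini impurity of the two resulting groups. The function names and the toy churn data are hypothetical and only for illustration; real tree learners repeat this search over all features and recurse on each side of the split.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`.

    If no split improves on the unsplit impurity, the threshold is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold fits between two equal values
        threshold = (low + high) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical toy data: months as a customer, and whether the customer churned.
tenure_months = [1, 2, 3, 4, 20, 24, 30, 36]
churned = [1, 1, 1, 0, 0, 0, 0, 0]

threshold, impurity = best_split(tenure_months, churned)
print(threshold, impurity)
```

Knowing that the split criterion is a greedy, one-feature-at-a-time choice is a useful lens for spotting the flaws mentioned above, such as sensitivity to small changes in the data.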

Timothy Nodine's user avatar


Data Mining: Recently Published Documents


Distance Based Pattern Driven Mining for Outlier Detection in High Dimensional Big Dataset

Detection of outliers or anomalies is one of the vital issues in pattern-driven data mining. Outlier detection identifies inconsistent behavior of individual objects. It is an important area of data mining with several applications, such as detecting credit card fraud, discovering hacking, and uncovering criminal activities. It is necessary to develop tools that uncover the critical information hidden in extensive data. This paper investigates a novel method for detecting cluster outliers in a multidimensional dataset, capable of identifying the clusters and outliers in datasets containing noise. The proposed method can detect the groups and outliers left by the clustering process, like instant irregular sets of clusters (C) and outliers (O), to boost the results. The results obtained after applying the algorithm to the dataset improved in terms of several parameters. For the comparative analysis, average accuracy and average recall are computed. The average accuracy is 74.05% for the existing COID algorithm and 77.21% for the proposed algorithm. The average recall is 81.19% for the existing algorithm and 89.51% for the proposed algorithm, which shows that the proposed method performs better than the existing COID algorithm.
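To illustrate the general idea of distance-based outlier detection, the sketch below scores each point by the distance to its k-th nearest neighbor, so that isolated points receive high scores. This is a generic textbook-style baseline, not the method proposed in the paper and not the COID algorithm; the function name, the value of `k`, and the sample points are assumptions made for the example.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the Euclidean distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical data: a tight cluster near the origin plus one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]

scores = knn_outlier_scores(points, k=2)
outlier_index = max(range(len(points)), key=scores.__getitem__)
print(outlier_index, scores)
```

This brute-force version compares every pair of points, so it is only practical for small datasets; high-dimensional big data, as targeted by the paper, needs indexing or clustering to avoid the quadratic cost.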

Implementation of Data Mining Technology in Bonded Warehouse Inbound and Outbound Goods Trade

For taxed goods, the actual freight is generally determined by multiplying the allocated freight per kilogram by the actual outgoing weight, based on the outgoing order number on the outgoing bill. Because conventional logistics cannot keep up with the rapid response that e-commerce orders demand, this work discusses the implementation of data mining technology in bonded warehouse inbound and outbound goods trade. Specifically, a bonded warehouse decision-making system with a data warehouse, conceptual model, online analytical processing system, human-computer interaction module, and web data sharing platform was developed. The statistical query module can be used to perform statistics and queries on warehousing operations. After optimization of the whole warehousing business process, it takes only 19.1 hours to obtain the actual freight, nearly one third less than before optimization. This study could create a better environment for the development of China's processing trade.

Multi-objective economic load dispatch method based on data mining technology for large coal-fired power plants

User activity classification and domain-wise ranking through social interactions.

Twitter has gained significant prevalence among users across numerous domains, in the majority of countries, and among different age groups. It serves as a real-time micro-blogging service for communication and opinion sharing. Twitter shares its data for research and study purposes by exposing open APIs, which makes it a highly suitable source of data for social media analytics. Applying data mining and machine learning techniques to tweets is attracting growing interest. The most prominent problem in social media analytics is to automatically identify and rank influencers. This research aims to detect users' topics of interest in social media and rank users based on specific topics, domains, etc. A few hybrid parameters are also distinguished in this research, based on the post's content, the post's metadata, the user's profile, and the user's network features, to capture different aspects of being influential; these are used in the ranking algorithm. The results show that the proposed approach is effective in both the classification and ranking of individuals in a cluster.

A data mining analysis of COVID-19 cases in states of United States of America

Epidemic diseases can be extremely dangerous and far-reaching in their effects. They may harm economies, businesses, the environment, people, and the workforce. In this paper, some of the factors related to the COVID-19 pandemic are examined using data mining methodologies and approaches. As a result of the analysis, some rules and insights were discovered, and the performance of the data mining algorithms was evaluated. According to the analysis results, the JRip algorithm had the highest correct classification rate and the lowest root mean squared error (RMSE). Considering classification rate and RMSE, JRip can be regarded as an effective method for understanding the factors related to deaths caused by the coronavirus.
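The two evaluation measures named in this abstract can be sketched in a few lines. The example assumes a binary classifier that outputs a probability for the positive class and that RMSE is computed on those probabilities; the abstract does not say how the authors computed it, and the labels and probabilities below are made up.

```python
import math


def classification_rate(y_true, y_pred):
    """Fraction of instances whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def rmse(y_true, y_score):
    """Root mean squared error between true values and predicted scores."""
    squared_errors = [(t - s) ** 2 for t, s in zip(y_true, y_score)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical true labels and predicted probabilities of the positive class.
y_true = [1, 0, 1, 1]
y_prob = [0.9, 0.2, 0.4, 0.8]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # 0.5 cutoff is an assumption

accuracy = classification_rate(y_true, y_pred)
error = rmse(y_true, y_prob)
print(accuracy, error)
```

A higher classification rate and a lower RMSE are both better, which is why an algorithm that leads on both, as JRip is reported to do here, is an easy one to recommend.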

Exploring distributed energy generation for sustainable development: A data mining approach

A comprehensive guideline for bengali sentiment annotation.

Sentiment Analysis (SA) is a Natural Language Processing (NLP) and Information Extraction (IE) task that primarily aims to determine whether a writer's feelings are positive or negative by analyzing a large number of documents. SA is also widely studied in the fields of data mining, web mining, text mining, and information retrieval. The fundamental task in sentiment analysis is to classify the polarity of a given content as Positive, Negative, or Neutral. Although extensive research has been conducted in this area of computational linguistics, most of it has been carried out in the context of the English language. However, Bengali sentiment expression has varying degrees of sentiment labels, which can plausibly be distinct from those of English. Therefore, it is important that sentiment assessment for the Bengali language be developed and executed properly. In sentiment analysis, the predictive power of an automatic model depends heavily on the quality of dataset annotation. Bengali sentiment annotation is a challenging task due to the diversified structures (syntax) of the language and its different degrees of innate sentiment (i.e., weakly and strongly positive/negative sentiments). Thus, in this article, we propose a novel and precise guideline for researchers, linguistic experts, and referees to annotate Bengali sentences accurately, with a view to building effective datasets for automatic sentiment prediction.

Capturing Dynamics of Information Diffusion in SNS: A Survey of Methodology and Techniques

Studying information diffusion in SNS (Social Networks Service) has remarkable significance in both academia and industry. Theoretically, it boosts the development of other subjects such as statistics, sociology, and data mining. Practically, diffusion modeling provides fundamental support for many downstream applications (e.g., public opinion monitoring, rumor source identification, and viral marketing). Tremendous efforts have been devoted to this area to understand and quantify information diffusion dynamics. This survey investigates and summarizes the emerging distinguished works in diffusion modeling. We first put forward a unified information diffusion concept in terms of three components: information, user decision, and social vectors, followed by a detailed introduction of the methodologies for diffusion modeling. And then, a new taxonomy adopting hybrid philosophy (i.e., granularity and techniques) is proposed, and we made a series of comparative studies on elementary diffusion models under our taxonomy from the aspects of assumptions, methods, and pros and cons. We further summarized representative diffusion modeling in special scenarios and significant downstream tasks based on these elementary models. Finally, open issues in this field following the methodology of diffusion modeling are discussed.

The Influence of E-book Teaching on the Motivation and Effectiveness of Learning Law by Using Data Mining Analysis

This paper studies the motivation for learning law, compares the teaching effectiveness of two teaching methods, e-book teaching and traditional teaching, and analyses the influence of e-book teaching on the effectiveness of learning law by using big data analysis. From the perspective of law student psychology, e-book teaching can attract students' attention, stimulate their interest in learning, deepen their impression of knowledge while learning, expand their knowledge, and ultimately improve their performance in practical assessment. Because the sample size is small, the results may not be fully representative. The findings offer a reference for stimulating learning motivation in law and other theoretical disciplines and provide ideas for reforming teaching modes at colleges and universities. This paper uses a decision tree algorithm from data mining for the analysis and identifies, from the students' perspective, the factors that influence law students' learning motivation and effectiveness.

Intelligent Data Mining based Method for Efficient English Teaching and Cultural Analysis

The emergence of online education has helped improve the quality of traditional English teaching greatly. However, it only moves the teaching process from offline to online, which does not really change the essence of traditional English teaching. In this work, we study an intelligent English teaching method to further improve the quality of English teaching. Specifically, a random forest is first used to analyze and extract the grammatical and syntactic features of English text. Then, a decision tree based method is proposed to predict grammar or syntax issues in the English text. The evaluation results indicate that the proposed method can effectively improve the accuracy of English grammar or syntax recognition.


T4Tutorials.com

Data Mining Research Topics for MS PhD


Here are some data mining research topics that you can choose for the research proposal of your MS or Ph.D. thesis.

This tutorial groups the research into four categories:

  • Industry-based research in data mining
  • Problem-based research in data mining
  • Topic-based research in data mining
  • 900+ research ideas in data mining

List of some famous Industries in the world for industry-based research in data mining

  • Automobile Wholesaling
  • Pharmaceuticals Wholesaling
  • Life Insurance & Annuities
  • Online Computer Software Sales
  • Supermarkets & Grocery Stores
  • Electric Power Transmission
  • IT Consulting
  • Wholesale Trade Agents and Brokers
  • Retirement & Pension Plans
  • Petroleum Refining
  • New Car Dealers
  • Drug, Cosmetic & Toiletry Wholesaling
  • Pharmacy Benefit Management
  • Property, Casualty and Direct Insurance
  • Colleges & Universities
  • Public Schools
  • Warehouse Clubs & Supercenters
  • Health & Medical Insurance
  • Gasoline & Petroleum Wholesaling
  • Gasoline & Petroleum Bulk Stations
  • Commercial Banking
  • Real Estate Loans & Collateralized Debt
  • E-Commerce & Online Auctions
  • Electronic Part & Equipment Wholesaling

List of some problems for research in data mining.

  • Crime Rate Prediction
  • Fraud Detection
  • Website Evaluation
  • Market Analysis
  • Financial Analysis
  • Customer trend analysis
  • Data Warehouse and DBMS
  • Multidimensional data model
  • OLAP operations
  • Example: loan data set
  • Data cleaning
  • Data transformation
  • Data reduction
  • Discretization and generating concept hierarchies
  • Installing Weka 3 Data Mining System
  • Experiments with Weka – filters, discretization
  • Task relevant data
  • Background knowledge
  • Interestingness measures
  • Representing input data and output knowledge
  • Visualization techniques
  • Experiments with Weka – visualization
  • Attribute generalization
  • Attribute relevance
  • Class comparison
  • Statistical measures
  • Experiments with Weka – using filters and statistics
  • Motivation and terminology
  • Example: mining weather data
  • Basic idea: item sets
  • Generating item sets and rules efficiently
  • Correlation analysis
  • Experiments with Weka – mining association rules
  • Basic learning/mining tasks
  • Inferring rudimentary rules: 1R algorithm
  • Decision trees
  • Covering rules
  • Experiments with Weka – decision trees, rules
  • The prediction task
  • Statistical (Bayesian) classification
  • Bayesian networks
  • Instance-based methods (nearest neighbor)
  • Linear models
  • Experiments with Weka – Prediction
  • Basic issues in clustering
  • First conceptual clustering system: Cluster/2
  • Partitioning methods: k-means, expectation-maximization (EM)
  • Hierarchical methods: distance-based agglomerative and divisible clustering
  • Conceptual clustering: Cobweb
  • Experiments with Weka – k-means, EM, Cobweb
  • Text mining: extracting attributes (keywords), structural approaches (parsing, soft parsing).
  • Bayesian approach to classifying text
  • Web mining: classifying web pages, extracting knowledge from the web
  • Data Mining software and applications
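Several of the topics above (item sets, generating rules, mining weather data) revolve around support and confidence. The sketch below enumerates frequent item sets by brute force and computes the confidence of a rule; it is a simplified illustration rather than an efficient Apriori implementation or Weka's algorithm, and the tiny weather-style transactions and the 0.4 support threshold are invented for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every item set with support >= min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(itemset <= t for t in transactions) / n
            if support >= min_support:
                result[itemset] = support
                found = True
        if not found:
            break  # no frequent set of this size means none of a larger size
    return result


def confidence(itemsets, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from stored supports."""
    return itemsets[antecedent | consequent] / itemsets[antecedent]


# Hypothetical weather-style transactions (one set of attribute values per day).
transactions = [
    {"sunny", "hot", "no_play"},
    {"sunny", "mild", "no_play"},
    {"rainy", "mild", "play"},
    {"overcast", "hot", "play"},
    {"overcast", "mild", "play"},
]

itemsets = frequent_itemsets(transactions, min_support=0.4)
rule_conf = confidence(itemsets, frozenset({"sunny"}), frozenset({"no_play"}))
print(len(itemsets), rule_conf)
```

The early exit relies on the fact that every subset of a frequent item set is itself frequent; efficient algorithms use the same property to prune candidates instead of enumerating all combinations.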


Thesis Topics

Prerequisites

Theses in our group usually require good knowledge of:

  • mathematics and statistics, in particular in probability theory
  • data mining and artificial intelligence
  • programming, with strong skills in Java, Rust, or Python

In a bachelor thesis you will usually:

  • read primary literature on novel methods
  • summarize foundations and reproduce the derivation of the new methods in your own words
  • implement methods from the studied literature yourself
  • experimentally compare the new method with reference methods

In a master thesis, we typically expect that you go beyond the state of the art, for example by designing a novel extension of existing methods and comparing its properties in detail.

We typically do not offer literature research topics, and rarely offer purely theoretical topics!

We have a new thesis template available.

Open topics

There are currently few open topics due to limited capacity.

  • Gaussian Mixture Modeling in Rust: In this bachelor thesis topic, the objective is to implement Gaussian mixture modeling (GMM) in Rust, with emphasis on a modular and extensible code structure that supports different cluster models (elliptical, spherical) and MAP regularization. The design can follow our existing Java implementation, but we expect a Rust implementation to improve the run time. As an extension, it is also possible to implement BETULA for accelerating GMM in Rust. Knowledge of and interest in Rust is necessary for this assignment!
  • Malkov and Yashunin: Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs (https://doi.org/10.1109/TPAMI.2018.2889473)
  • Dong, Charikar, and Li: Efficient k-nearest neighbor graph construction for generic similarity measures (https://doi.org/10.1145/1963405.1963487)
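To give a rough idea of what the Gaussian mixture topic above involves, here is a minimal sketch of the EM algorithm for a mixture with spherical clusters. It is written in Python only for brevity and is not the group's Java implementation. The function name `em_spherical_gmm`, the fixed number of iterations, and the constant `reg` added to each variance (a simplified stand-in for proper MAP regularization) are all assumptions.

```python
import math


def em_spherical_gmm(points, means, n_iter=50, reg=1e-3):
    """EM for a Gaussian mixture with spherical covariances (sigma^2 * I).

    points: list of equal-length tuples; means: initial mean per component.
    reg is added to every variance estimate (simplified regularization).
    """
    k, d, n = len(means), len(points[0]), len(points)
    means = [list(m) for m in means]
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibilities, computed from log densities for stability
        resp = []
        for x in points:
            logs = []
            for j in range(k):
                sq = sum((xi - mi) ** 2 for xi, mi in zip(x, means[j]))
                logs.append(math.log(weights[j])
                            - 0.5 * d * math.log(2 * math.pi * variances[j])
                            - sq / (2 * variances[j]))
            top = max(logs)
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate weight, mean, and one variance per component
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)  # guard empty components
            weights[j] = nj / n
            means[j] = [sum(r[j] * x[a] for r, x in zip(resp, points)) / nj
                        for a in range(d)]
            sq_sum = sum(
                r[j] * sum((xi - mi) ** 2 for xi, mi in zip(x, means[j]))
                for r, x in zip(resp, points))
            variances[j] = sq_sum / (nj * d) + reg
    return weights, means, variances


# Toy data, made up for illustration: two well-separated groups
points = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
weights, means, variances = em_spherical_gmm(points, [(0.0, 0.0), (4.0, 4.0)])
print(weights)
print(means)
```

An elliptical cluster model would replace the single variance per component with a full covariance matrix, which is where a modular code structure starts to pay off.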

Previous Topics:

As examples, some earlier topics:

  • Clustering with von Mises-Fisher Distribution
  • Choosing the number of Gaussian clusters
  • Clustering with k-nearest neighbor consistency
  • SQL Code Embeddings - Integrating Syntactical Information into Transformer Models
  • Accelerating Spherical k-means in Rust
  • Development and Evaluation of Bound-Based DBSCAN Variants
  • On-Device Robot Localization Based on Representation Learning and a Feature-Rich Industrial Floor
  • Text-Based Recommender Systems
  • Accelerating DBSCAN Using the Triangle Inequality
  • Detecting Offensive Language on Twitter Using Machine Learning
  • Fast Latent Dirichlet Allocation with Background Topics
  • Accelerated Spherical k-means Clustering
  • Evaluation of Feature Selection from Small Molecule Fingerprints by Means of Machine Learning
  • Abstractive Text Summarization using a Reinforced Variational Autoencoder
  • Efficient Approximate k-Nearest-Neighbor-Search in Large Graph Databases
  • Automated Flexibility Coordination in Smart Grids Based on Neural Networks
  • Analysis of Alternative Algorithms for k-Means Clustering
  • Leave-One-Out Optimization of Support Vector Machines for Outlier Detection
  • Analysis and Evaluation of Approximate Nearest Neighbor Search using Hierarchical Navigable Small World Graphs
  • Analysis and Evaluation of AnyDBC Clustering with Alternative Heuristics
  • Compact Circular Fingerprints for Chemical Structures using Bloom Filters
  • Development of a Greedy Algorithm for Generating Low-Voltage Electrical Grids Based on Publicly Available Data
  • Creation and Evaluation of Song Embeddings Using Word2Vec
  • k-Means Clustering with k-d Trees

Industry topics

I generally require that the industry partner first establish contact before I promise topics to students, as many industry topic ideas do not satisfy the exam regulations. In particular, a non-disclosure agreement must be checked by the legal department, and this may take several weeks. Furthermore, I have had negative experiences with poor mentoring by industry partners. Ideally, your supervisor there is an alumnus or alumna who has previously supervised theses. Please ask your industry supervisor to first establish contact, resolve any necessary legal contracts, and reach an agreement on the topic supervision.

Data Mining Research Topics

Data Mining Research Topics is a service for scholars who want support in choosing and developing a research topic. Data mining technologies extract the needed information from a large pool of data. We live in a world that is undergoing a digital revolution, and abundant data is its foundation. Because data underlies almost everything, data mining has become a universal research field that does not go out of style. Our service draws on more than a decade of experience and is updated continuously.

Our service is the product of the critical thinking and ideas of our team of experts and professionals. It is a one-stop source of information on design methodology and fast-growing technologies in the field of data mining.

Mining Research Topics

Data Mining Research Topics is our research package, in which we offer thousands of research topics for students and research scholars. Scholars look for reliable guidance to complete their projects and want to be sure they are in safe hands when framing their thesis. We offer that guidance.

If you need assistance with mining research topics, you can use our online service, which is available 24x7. For reference, here is a short definition of data mining.

“Data mining is the extraction of previously unknown but potentially important patterns from huge amounts of data. Discovering these patterns reveals new trends.” Our current interests in data mining are listed below.

Platforms We Support

  • Ubuntu Linux
  • Microsoft Windows XP
  • Red Hat Enterprise Linux
  • CentOS Linux
  • KDnuggets Poll
  • Windows Vista

Key Research Application Fields

  • Data warehouse
  • Domain driven data mining
  • Behavior informatics
  • Bioinformatics
  • Predictive analytics
  • Business intelligence
  • Big data analytics
  • Decision support systems
  • Drug discovery
  • Image Processing
  • Text mining
  • Named Entity Recognition
  • Opinion mining

Major Algorithms We Use in Data Mining Projects

Decision Tree Algorithms

  • MARS algorithm
  • Conditional decision tree
  • Classification and regression tree (CART)
  • Iterative Dichotomiser 3 (ID3)
  • C4.5 and C5.0
  • Chi-squared automatic interaction detection (CHAID)
  • Assistant decision tree learning algorithm
  • Hunt’s algorithm
  • SPRINT and SLIQ algorithms

Clustering Algorithms

  • K-means and K-means++
  • Hierarchical clustering
  • Expectation maximization
  • Spectral and canopy clustering algorithms
  • Fuzzy K-means
  • Streaming K-Means
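As a concrete reference point for the clustering algorithms listed above, the following is a minimal sketch of standard k-means (Lloyd's algorithm). The initial centers are passed in by the caller, so K-means++ seeding is not shown, and the toy points are made up for illustration.

```python
def kmeans(points, centers, n_iter=100):
    """Lloyd's algorithm: alternate assignment and mean update until stable."""
    centers = [tuple(c) for c in centers]
    assignment = None
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance
        new_assignment = [
            min(range(len(centers)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each center to the mean of its members
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return centers, assignment


# Toy data, made up for illustration
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, assignment = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centers)
print(assignment)
```

The result depends on the initial centers, which is the problem that seeding schemes such as K-means++ are designed to reduce.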

Association Rule Learning Algorithms

  • Apriori algorithm
  • Eclat algorithm
  • FP-tree / FP-growth algorithm
  • DIC algorithm
  • H-Mine algorithm

Regularization Algorithms

  • Ridge regression
  • Least Angle Regression
  • Elastic Net
  • Least absolute shrinkage and selection operator (LASSO)
  • Modular regularization algorithm
  • Machine learning algorithms (supervised and unsupervised)

Bayesian Algorithms

  • Naïve Bayes
  • Multinomial Naïve Bayes
  • Gaussian Naïve Bayes
  • K-dependence Bayesian network classifiers
  • Hybrid Bayesian algorithm
  • Complement Naïve Bayes

Ensemble Algorithms

  • Gradient Boosting Machines
  • Gradient boosted regression trees
  • Bootstrap aggregation (bagging)
  • Random forest
  • Stacked generalization
  • Bootstrap sampling
  • Bayesian averaging
  • Error-correcting output codes
  • Random subspace method

Artificial Neural Network Algorithms

  • Hopfield Network
  • Radial basis function network
  • Backpropagation neural network
  • Perceptron neural network
  • Convolutional neural network
  • Single-layer and multi-layer perceptron
  • SOM (Kohonen’s) algorithm
  • Bayesian regularized neural network

Dimensionality Reduction Algorithms

  • Sammon Mapping
  • Discriminant analysis (LDA and PDA)
  • Multidimensional scaling
  • Projection pursuit
  • Quadratic discriminant analysis
  • Principal component regression
  • Partial least squares regression
  • Mixture discriminant analysis
  • Singular value decomposition (including stochastic variants)
  • Latent Dirichlet allocation
  • Lanczos algorithm

Deep Learning Algorithms

  • Deep convolutional neural network
  • Deep Boltzmann machine
  • Stacked auto-encoders
  • Deep belief networks
  • Deep-Q-Network
  • Double Deep-Q-Network

Support for GUI Interfacing and Database

GUI interface:

  • Orange (version 3.3.6)
  • RapidMiner (version: RapidMiner Studio 7.1)
  • Oracle Data Miner GUI (version 4.0)
  • Weka (version 3.6.8)
  • Rattle GUI (version 2.6.25) [latest beta version 5.0.12]
  • MATLAB-based GUI
  • KNIME GUI (version R2011b, R2013a, etc.)

Database used:

  • Oracle Database RC
  • Apache Mahout
  • Apache-Spark
  • Apache Hadoop
  • Cassandra DB
  • Amazon Web Services

Prominent Data Mining Research Topics

  • Data integrity, privacy, and security issues in data mining
  • Mining of multi-agent data using data mining concepts
  • Developing a unifying theory of data mining
  • Data mining in network settings
  • Mining high-dimensional data and high-speed data streams
  • Mining of sequence information and time unbalanced data
  • Distributed data mining applications

We hope that the information provided here is adequate for you to gain firsthand knowledge of data mining. If it is not, you can contact us directly or seek our online guidance through mail, TeamViewer, or Skype. Trust us completely with your project, and you will not be disappointed. Ordinary minds will benefit extraordinarily from our service.

Services We Offer

  • Mathematical proof
  • Pseudo code
  • Conference paper
  • Research proposal
  • System design
  • Literature survey
  • Data collection
  • Thesis writing
  • Data analysis
  • Rough draft
  • Paper collection
  • Code and programs
  • Paper writing
  • Course work

82 Data Mining Essay Topic Ideas & Examples


  • Ethical Implications of Data Mining by Government Institutions Critics of personal data mining insist that it infringes on the rights of an individual and results in the loss of sensitive information.
  • Levi’s Company’s Data Mining & Customer Analytics Levi, the renowned name in jeans is feeling the heat of competition from a number of other brands, which have come upon the scene well after Levi’s but today appear to be approaching Levi’s market […]
  • Data Mining and Its Major Advantages Thus, it is possible to conclude that data mining is a convenient and effective way of processing information, which has many advantages.
  • Data Mining Role in Companies The increasing adoption of data mining in various sectors illustrates the potential of the technology regarding the analysis of data by entities that seek information crucial to their operations.
  • Disadvantages of Using Web 2.0 for Data Mining Applications This data can be confusing to the readers and may not be reliable. Lastly, with the use of Web 2.
  • The Data Mining Method in Healthcare and Education Thus, I would use data mining in both cases; however, before that, I would discover a way to improve the algorithms used for it.
  • Data Mining Tools and Data Mining Myths The first problem is correlated with keeping the identity of the person involved in data mining secret. One of the major myths regarding data mining is that it can replace domain knowledge.
  • Hybrid Data Mining Approach in Healthcare One of the healthcare projects that will call for the use of data mining is treatment evaluation. In this case, it is essential to realize that the main aim of health data mining is to […]
  • Terrorism and Data Mining Algorithms However, this is a necessary evil as the nation’s security has to be prioritized since these attacks lead to harm to a larger population compared to the infringements.
  • Transforming Coded and Text Data Before Data Mining However, to complete data mining, it is necessary to transform the data according to the techniques that are to be used in the process.
  • Data Mining and Machine Learning Algorithms The shortest distance of string between two instances defines the distance of measure. However, this is also not very clear as to which transformations are summed, and thus it aims to a probability with the […]
  • Summary of C4.5 Algorithm: Data Mining C4.5 algorithm: Each record from the data set should be associated with one of the offered classes, which means that one of the attributes should be considered a class label.
  • Data Mining in Social Networks: Linkedin.com One of the ways to achieve the aim is to understand how users view data mining of their data on LinkedIn.
  • Ethnography and Data Mining in Anthropology The study of cultures is of great importance under normal circumstances to enhance the understanding of the same. Data mining is the success secret of ethnography.
  • Issues With Data Mining It is necessary to note that the usage of data mining helps FBI to have access to the necessary information for terrorism and crime tracking.
  • Large Volume Data Handling: An Efficient Data Mining Solution Data mining is the process of sorting huge amount of data and finding out the relevant data. Data mining is widely used for the maintenance of data which helps a lot to an organization in […]
  • Cryptocurrency Exchange Market Prediction and Analysis Using Data Mining and Artificial Intelligence This paper aims to review the application of A.I. in the context of blockchain finance by examining scholarly articles to determine whether the A.I. algorithm can be used to analyze this financial market.
  • “Data Mining and Customer Relationship Marketing in the Banking Industry“ by Chye & Gerry First of all, the article generally elaborates on the notion of customer relationship management, which is defined as “the process of predicting customer behavior and selecting actions to influence that behavior to benefit the company”.
  • Data Mining Techniques and Applications The use of data mining to detect disturbances in the ecosystem can help to avert problems that are destructive to the environment and to society.
  • Ethical Data Mining in the UAE Traffic Department The research question identified in the assignment two is considered to be the following, namely whether the implementation of the business intelligence into the working process will beneficially influence the work of the Traffic Department […]
  • Canadian University Dubai and Data Mining The aim of mining data in the education environment is to enhance the quality of education for the mass through proactive and knowledge-based decision-making approaches.
  • Data Mining and Customer Relationship Management As such, CRM not only entails the integration of marketing, sales, customer service, and supply chain capabilities of the firm to attain elevated efficiencies and effectiveness in conveying customer value, but it obliges the organization […]
  • E-Commerce: Mining Data for Better Business Intelligence The method allowed the use of Intel and an example to build the study and the literature on data mining for business intelligence to analyze the findings.
  • Data Warehouse and Data Mining in Business The circumstances leading to the establishment and development of the concept of data warehousing was attributed to the fact that failure to have a data warehouse led to the need of putting in place large […]
  • Data Mining: Concepts and Methods Speed of data mining process is important as it has a role to play in the relevance of the data mined. The accuracy of data is also another factor that can be used to measure […]
  • Data Mining Technologies According to Han & Kamber, data mining is the process of discovering correlations, patterns, trends or relationships by searching through a large amount of data that in most circumstances is stored in repositories, business databases […]
  • Data Mining: A Critical Discussion In recent times, the relatively new discipline of data mining has been a subject of widely published debate in mainstream forums and academic discourses, not only due to the fact that it forms a critical […]
  • Commercial Uses of Data Mining Data mining process entails the use of large relational database to identify the correlation that exists in a given data. The principal role of the applications is to sift the data to identify correlations.
  • A Discussion on the Acceptability of Data Mining Today, more than ever before, individuals, organizations and governments have access to seemingly endless amounts of data that has been stored electronically on the World Wide Web and the Internet, and thus it makes much […]
  • Applying Data Mining Technology for Insurance Rate Making: Automobile Insurance Example
  • Applebee’s, Travelocity and Others: Data Mining for Business Decisions
  • Applying Data Mining Procedures to a Customer Relationship
  • Business Intelligence as Competitive Tool of Data Mining
  • Overview of Accounting Information System Data Mining
  • Applying Data Mining Technique to Disassembly Sequence Planning
  • Approach for Image Data Mining Cultural Studies
  • Apriori Algorithm for the Data Mining of Global Cyberspace Security Issues
  • Database Data Mining: The Silent Invasion of Privacy
  • Data Management: Data Warehousing and Data Mining
  • Constructive Data Mining: Modeling Consumers’ Expenditure in Venezuela
  • Data Mining and Its Impact on Healthcare
  • Innovations and Perspectives in Data Mining and Knowledge Discovery
  • Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection
  • Linking Data Mining and Anomaly Detection Techniques
  • Data Mining and Pattern Recognition Models for Identifying Inherited Diseases
  • Credit Card Fraud Detection Through Data Mining
  • Data Mining Approach for Direct Marketing of Banking Products
  • Constructive Data Mining: Modeling Argentine Broad Money Demand
  • Data Mining-Based Dispatching System for Solving the Pickup and Delivery Problem
  • Commercially Available Data Mining Tools Used in the Economic Environment
  • Data Mining Climate Variability as an Indicator of U.S. Natural Gas
  • Analysis of Data Mining in the Pharmaceutical Industry
  • Data Mining-Driven Analysis and Decomposition in Agent Supply Chain Management Networks
  • Credit Evaluation Model for Banks Using Data Mining
  • Data Mining for Business Intelligence: Multiple Linear Regression
  • Cluster Analysis for Diabetic Retinopathy Prediction Using Data Mining Techniques
  • Data Mining for Fraud Detection Using Invoicing Data
  • Jaeger Uses Data Mining to Reduce Losses From Crime and Waste
  • Data Mining for Industrial Engineering and Management
  • Business Intelligence and Data Mining – Decision Trees
  • Data Mining for Traffic Prediction and Intelligent Traffic Management System
  • Building Data Mining Applications for CRM
  • Data Mining Optimization Algorithms Based on the Swarm Intelligence
  • Big Data Mining: Challenges, Technologies, Tools, and Applications
  • Data Mining Solutions for the Business Environment
  • Overview of Big Data Mining and Business Intelligence Trends
  • Data Mining Techniques for Customer Relationship Management
  • Classification-Based Data Mining Approach for Quality Control in Wine Production
  • Data Mining With Local Model Specification Uncertainty
  • Employing Data Mining Techniques in Testing the Effectiveness of Modernization Theory
  • Enhancing Information Management Through Data Mining Analytics
  • Evaluating Feature Selection Methods for Learning in Data Mining Applications
  • Extracting Formations From Long Financial Time Series Using Data Mining
  • Financial and Banking Markets and Data Mining Techniques
  • Fraudulent Financial Statements and Detection Through Techniques of Data Mining
  • Harmful Impact Internet and Data Mining Have on Society
  • Informatics, Data Mining, Econometrics, and Financial Economics: A Connection
  • Integrating Data Mining Techniques Into Telemedicine Systems
  • Investigating Tobacco Usage Habits Using Data Mining Approach

IvyPanda. (2024, March 2). 82 Data Mining Essay Topic Ideas & Examples. https://ivypanda.com/essays/topic/data-mining-essay-topics/


Trending Data Mining Thesis Topics

Data mining is the practice of analyzing large amounts of data in order to uncover business insights that can assist firms in fixing issues, reducing risks, and embracing new possibilities. This article provides a complete picture of data mining thesis topics, with information on data mining research.

How to Implement Data Mining Thesis Topics

How does data mining work?

  • A standard data mining design begins with an appropriate statement of the business question; the appropriate data is then collected to tackle it and prepared for examination.
  • What happens in the earlier stages determines how successful the later stages are.
  • Data miners should assure the quality of the data they use as input, because bad data quality results in poor outcomes.
  • Establishing a detailed understanding of the design factors, such as the present business scenario, the project’s main business goal, and the performance objectives.
  • Identifying the data required to address the problem and collecting it from all relevant sources.
  • Addressing errors such as incomplete or duplicate data, and converting the data into a format suitable for answering the research questions.
  • Applying algorithms to find patterns in the data.
  • Determining whether and how a model’s output will contribute to achieving a business objective.
  • Iterating to identify the best method, since an iterative process is frequently needed to obtain the optimum outcome.
  • Making the project’s findings suitable for real-time decision making.
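The data preparation step described above, handling incomplete or duplicate data, can be sketched as follows. The function name `clean_records`, the choice to define duplicates by the required fields only, and the sample records are assumptions made for illustration.

```python
def clean_records(records, required_fields):
    """Drop incomplete records and duplicates, preserving input order."""
    seen = set()
    cleaned = []
    for rec in records:
        # Incomplete: a required field is missing, None, or an empty string
        if any(rec.get(f) in (None, "") for f in required_fields):
            continue
        # Duplicate: same values in all required fields as an earlier record
        key = tuple(rec.get(f) for f in required_fields)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Toy records, made up for illustration
raw = [
    {"id": 1, "age": 34, "city": "Berlin"},
    {"id": 2, "age": None, "city": "Munich"},   # incomplete
    {"id": 1, "age": 34, "city": "Berlin"},     # duplicate
    {"id": 3, "age": 29, "city": ""},           # incomplete
    {"id": 4, "age": 41, "city": "Hamburg"},
]
cleaned = clean_records(raw, ["id", "age", "city"])
print([rec["id"] for rec in cleaned])
```

Dropping incomplete records is the simplest policy; imputing missing values is a common alternative when too much data would otherwise be lost.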

The steps listed above are repeated until the best outcomes are achieved. Our engineers and developers have extensive knowledge of the tools, techniques, and approaches used in these processes. We guarantee that we will provide sound research advice on data mining thesis topics and complete your project on schedule. What are the important data mining tasks?

Data Mining Tasks 

  • Data mining is applied in many ways, including description, analysis, and summarization of data, and clarifying conceptual understanding through data description
  • Prediction, classification, dependency analysis, segmentation, and case-based reasoning are other important data mining tasks
  • Regression – prediction of numerical data (stock prices, temperatures, and total sales)
  • Data warehousing – business decision making and large-scale data mining
  • Classification – accurate prediction of target classes and their categorization
  • Association rule learning – market basket analysis tools that establish relationships between variables in a data set
  • Machine learning – statistical, probability-based decision making without explicitly programmed rules
  • Data analytics – digital data evaluation for business purposes
  • Clustering – partitioning a data set into clusters and subclasses to analyze the natural structure of the data
  • Artificial intelligence – human-like data analytics for reasoning, solving problems, learning, and planning
  • Data preparation and cleansing – converting raw data into a processed form by identifying and removing errors
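To make the association rule learning task above more concrete, here is a small sketch of the two standard measures used to rate a rule, support and confidence. The baskets are toy data made up for illustration, and the helper names are assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions.

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Toy market baskets, made up for illustration
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(baskets, {"bread", "milk"}))       # 2 of 4 baskets
print(confidence(baskets, {"bread"}, {"milk"}))  # 2 of the 3 bread baskets
```

Algorithms such as Apriori and FP-growth exist to find all itemsets above a support threshold without testing every possible combination in this brute-force way.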

You can look at our website for a more in-depth look at all of these operations. We supply you with the needed data, as well as any additional data you may need for your data mining thesis topics. We provide non-plagiarized data mining thesis assistance for any fresh idea of your choice. Let us now discuss the stages in data mining that are to be included in your thesis topics.

How to work on a data mining thesis topic? 

 The following are the important stages or phases in developing data mining thesis topics.

  • First of all, you need to identify the present demand and state the question to be addressed
  • The next step is defining or specifying the problem
  • Collection of data is the third step
  • Alternative solutions and designs have to be analyzed in the next step
  • The proposed methodology has to be designed
  • The system is then to be implemented

Usually, our experts help in writing and implementing code without hassle. By consistently following the above steps, you can develop a strong data mining thesis topic. It is also important to have a good idea of the tasks and techniques involved in data mining, which are listed below.

  • Data visualization
  • Neural networks
  • Statistical modeling
  • Genetic algorithms and neural networks
  • Decision trees and induction
  • Discriminant analysis
  • Induction techniques
  • Association rules and data visualization
  • Bayesian networks
  • Correlation
  • Regression analysis
  • Regression analysis and regression trees

If you want to select the best tool for your data mining project, evaluate its consistency and efficiency first. For this, you need enough technical data from projects executed in practice, for which you can contact us directly. Since we have delivered a large number of data mining thesis topics successfully, we can help you find better solutions to your research issues. What points should be remembered about data mining strategy?

  • Data mining strategies must be chosen before tools, to avoid using strategies that do not match the project’s actual goals.
  • The typical data mining strategy is to evaluate a variety of methods and select the one that best fits the situation.
  • Some general principles can guide the choice of effective strategies for data mining projects.
  • Decision trees are a common choice because they are easy to handle and understand.
  • They can work with both categorical and numerical data.
  • They are unaffected by extreme values, and they can function with incomplete information.
  • They can expose interactions between variables and nonlinear relationships.
  • They can handle noise in records.
  • They can process huge amounts of data.
  • Decision trees, on the other hand, have significant drawbacks.
  • Many rules are often necessary for continuous dependent variables or regression-style problems, and tiny changes in the data can result in very different tree structures.
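The sensitivity of decision trees to small changes in the data comes from how splits are chosen. The sketch below assumes an ID3-style learner that picks the attribute with the highest information gain; the attribute names and toy rows are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction from splitting rows on one categorical attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data, made up for illustration
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]
best = max(["outlook", "windy"],
           key=lambda a: information_gain(rows, labels, a))
print(best)
```

Because the split is chosen greedily, two attributes with nearly equal gain can swap places after a few records change, and every split below that node then changes as well.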

All such pros and cons of various data mining aspects are discussed on our website. We will provide you with high-quality research assistance and thesis writing assistance. You can see examples of our work in the sample theses on our website. We also offer an internal review to help you feel more confident. Let us now discuss recent data mining methodologies.

Current methods in Data Mining

  • Prediction of data (time series data mining)
  • Discriminant and cluster analysis
  • Logistic regression and segmentation

Our technical specialists usually provide accurate data, a thorough explanation, and technical notes for all of these processes and algorithms. As a result, you can get all of your questions answered in one place. Our technical team is also well-versed in current trends, allowing us to provide realistic explanations of new developments. We will now talk about the latest data mining trends.

Latest Trending Data Mining Thesis Topics

  • Visual data mining and data mining software engineering
  • Interaction and scalability in data mining
  • Exploring applications of data mining
  • Biological and visual data mining
  • Cloud computing and big data integration
  • Data security and protecting privacy in data mining
  • Novel methodologies in complex data mining
  • Multi-database and multi-relational data mining
  • Query language standardization in data mining
  • Integration of MapReduce, Amazon EC2, S3, Apache Spark, and Hadoop into data mining

These are the recent trends in data mining. We recommend that you choose the topic that interests you most. An appropriate content structure or template is essential when writing a thesis, so we lay out the plan in a chronological order suited to the study assessment. Citations are one of the most important aspects of a thesis; we focus not only on writing but also on citing essential sources in the text. Students frequently struggle to prepare an appropriate proposal when starting their thesis. We have years of experience in providing research and data mining thesis writing services to the scientific community. We will now discuss future research directions in data mining.

Future Research Directions of Data Mining

  • The outlook for data mining and data science is promising, as the volume of data continues to grow.
  • The total amount of data in the digital universe is expected to grow tenfold, from 4.4 zettabytes to 44 zettabytes.
  • It is also estimated that about 1.7 megabytes of new data will be generated every second for every person on the planet.
  • Mining algorithms have changed completely as technology has advanced, and so have the tools for extracting useful insights from data.
  • In the past, only organizations such as NASA could use their powerful computers to examine data, because the cost of producing and processing data was simply too high.
  • Organizations now use cloud-based data warehouses to carry out a wide range of tasks with machine learning, artificial intelligence, and deep learning.
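To put the tenfold growth figure above in perspective, it can be converted into an annual growth rate. The list gives no dates, so the seven-year span used here (2013 to 2020) is an assumption, and the helper name `annual_growth_rate` is hypothetical.

```python
import math


def annual_growth_rate(start, end, years):
    """Compound annual growth rate that turns `start` into `end` over `years` years."""
    return (end / start) ** (1 / years) - 1


# 4.4 ZB growing to 44 ZB; the 7-year span is an assumption, not stated in the text.
rate = annual_growth_rate(4.4, 44.0, 7)
doubling_time = math.log(2) / math.log(1 + rate)

print(f"annual growth: {rate:.1%}")
print(f"doubling time: {doubling_time:.1f} years")
```

Under that assumption the data volume grows by a little under 40% per year, which means it doubles roughly every two years.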

The Internet of Things and wearable electronics, for instance, have turned connected devices into data-generating engines that can provide extensive insight into people and organizations, provided firms can gather, store, and analyze the data quickly enough.

What should you keep in mind when choosing the best data mining thesis topic?

  • An excellent thesis topic is a broad concept that has to be developed, verified, or refuted.
  • Your thesis topic must capture your own curiosity as well as the interest of your supervisor and other academics.
  • Your thesis topic must be relevant to your studies and should be able to withstand examination.

Our engineers and experts can provide any type of research assistance with these data mining topics. We meet your university's criteria by providing multiple revisions, appropriate formatting and editing of your thesis, a comprehensive grammar check, and so on. You can therefore contact us with confidence for complete assistance with your data mining thesis.

What are the important data mining thesis topics?

Trending Data Mining Research Thesis Topics

Research Topics in Data Mining

  • Handling cost-sensitive, imbalanced, non-static data
  • Issues related to data mining and their solutions
  • Network settings in data mining and ensuring privacy, security, and integrity of data
  • Environmental and biological issues in data mining
  • Complex data mining and sequential data mining (time series data)
  • Data mining at higher dimensions
  • Multi-agent data mining and distributed data mining
  • High-speed data mining
  • Development of unified data mining theory

We currently provide full support for all parts of a research study, including project planning, technical advice, legitimate scientific data, thesis writing, paper publication, assignments, internal review, and many other services. You can therefore contact us for any kind of help with your data mining thesis topic.

Why Work With Us?

9 big reasons to select us:

  • Senior research member: Our Editor-in-Chief, who owns the website, controls and delivers all aspects of PhD Direction's service to scholars and students and oversees the management of all our clients.
  • Research experience: Our certified experts have 18+ years of experience in research and development programs (industrial research) and have guided many scholars in developing strong PhD research projects.
  • Journal member: We are associated with 200+ reputed SCI- and SCOPUS-indexed journals (SJR ranking) so that research work can be published in standard journals (your first-choice journal).
  • Book publisher: PhDdirection.com is a book publishing platform that works by subject category to help scholars and students write their books and place them in university libraries.
  • Research ethics: Our researchers uphold research ethics such as confidentiality and privacy, novelty (valuable research), plagiarism-free work, and timely delivery. Our customers are free to review the current state of their research work.
  • Business ethics: Our organization treats customer satisfaction, online and offline support, and professional delivery of work as its guiding business factors.
  • Valid references: Solid work is delivered by a young, qualified, global research team. References are the key to evaluating work more easily, because we carefully assess scholars' findings.
  • Explanations: Detailed videos, readme files, and screenshots are provided for all research projects. We offer TeamViewer support and other online channels for project explanation.
  • Paper publication: Publication in worthy journals such as IEEE, ACM, Springer, IET, and Elsevier is our main focus. We substantially reduce scholars' burden on the publication side and carry them from initial submission to final acceptance.


The chair typically offers various thesis topics each semester in the areas of computational statistics, machine learning, data mining, optimization, and statistical software. You are welcome to suggest your own topic as well.

Before you apply for a thesis topic make sure that you fit the following profile:

  • Knowledge in machine learning.
  • Good R or Python skills.

Before you start writing your thesis you must look for a supervisor within the group.

Send an email to the contact person named in the list of potential thesis topics with the following information:

  • Planned starting date of your thesis.
  • Thesis topic (from the list of thesis topics or your own suggestion).
  • Previously attended classes on machine learning and programming with R.

Your application will only be processed if it contains all required information.

Potential Thesis Topics


Below is a list of potential thesis topics. Before you start writing your thesis you must look for a supervisor within the group.

Available thesis topics

Title Type Supervisor
MA Aßenmacher
MA Casalicchio
MA Casalicchio
MA Casalicchio
MA Casalicchio
MA Casalicchio
MA Rügamer
MA Bender
MA Bothmann
MA Bothmann
MA Feurer
MA Feurer
MA Feurer
MA Feurer
MA Feurer
MA Feurer
MA Feurer

Disputation

The disputation of a thesis lasts about 60-90 minutes and consists of two parts. Only the first part is relevant for the grade and takes 30 minutes (bachelor thesis) or 40 minutes (master thesis). Here, the student is expected to summarize the main results of the thesis in a presentation. The supervisor(s) will ask questions about the content of the thesis during the presentation. In the second part (after the presentation), the supervisors will give detailed feedback and discuss the thesis with the student. This will take about 30 minutes.

  • How do I prepare for the disputation?

You have to prepare a presentation, and if there is a long gap between handing in your thesis and the disputation, you might want to reread your thesis.

  • How many slides should I prepare?

That’s up to you, but you have to respect the time limit. Preparing more than 20 slides for a Bachelor’s presentation or more than 30 slides for a Master’s is VERY likely a very bad idea.

  • Where do I present?

Bernd’s office, in front of the big TV. At least one PhD will be present, maybe more. If you want to present in front of a larger audience in the seminar room or the old library, please book the room yourself and inform us.

  • English or German?

We do not care, you can choose.

  • What do I have to bring with me?

A document for the disputation (the Prüfungsprotokoll, i.e. the examination record), which you get from the “Prüfungsamt” (the examination office; Frau Maxa or Frau Höfner). Your laptop or a USB stick with the presentation. You can also email Bernd a PDF.

  • How does the grading work?

The student will be graded regarding the quality of the thesis, the presentation and the oral discussion of the work. The grade is mainly determined by the written thesis itself, but the grade can improve or drop depending on the presentation and your answers to defense questions.

  • What should the presentation cover?

The presentation should cover your thesis, including motivation, introduction, description of new methods, and results of your research. Please do NOT explain already existing methods in detail here; put more focus on novel work and the results.

  • What kind of questions will be asked after the presentation?

The questions will be directly connected to your thesis and related theory.

Student Research Projects

We are always interested in mentoring interesting student research projects. Please contact us directly with an interesting research idea. In the future, you will also be able to find research project topics below.

Available projects

Currently we are not offering any student research projects.

For more information please visit the official web page Studentische Forschungsprojekte (Lehre@LMU)

Current Theses (With Working Titles)

Title Type
Empirical Evaluation of Methods for Discrete Time-to-event Analysis BA
Enhancing stance prediction by utilizing party manifestos MA
Examining and Mitigating Gender Bias in German Word Embeddings BA
Exploring the Effects of Domain Shift on Inferred Topics in Neural and Non-Neural Topic Models BA
Transformer Uncertainty Estimation with Stochastic Attention MA
Transfer Learning of Simulation to Hardware Direction Finding for Indoor Position MA
Reliable Self-supervised Learning for Medical Image Analysis MA
Quantification of Uncertainties via Deep Learning for Medical Image Segmentation MA
Deep Efficient Transformers for Learning Representation of Genomic Sequences MA
Self-Supervised Multimodal Metric Learning MA
Diverse Sentence Embedding for Legal Multi-Label Document Classification MA
Unsupervised Domain Adaptive Object Detection MA
Uncertainty-Aware Self-Supervised Learning MA
Data-driven Lag-lead Selection for Exposure-Lag-Response Associations BA
Probabilistic Deep Learning of Liver Failure in Therapeutical Cancer Treatment MA
Model agnostic Feature Importance by Loss Measures MA
Model-agnostic interpretable machine learning methods for multivariate MA
Time Series Forecasting MA
Normalizing Flows for Interpretability Measures MA
Representation Learning for Semi-Supervised Genome Sequence Classification MA
Neural Architecture Search for Genomic Sequence Data MA
Comparison of Machine Learning Models For Competing Risks Survival Analysis MA
Multi-accuracy calibration for survival models MA
MA

Completed Theses

Completed Theses (LMU Munich)

Title Type Completed
Domain transfer across country, time and modality in multiclass-classification BA 2022
Predicted Sentiments of Customer Texts as Covariates for Time Series Forecasting MA 2022
Gaussian Process Regression and Bayesian Deep Learning for Insurance Tariff Migration MA 2022
Transformer Model for Genome Sequence Analysis BA 2022
Self-supervised Representation Learning for Genome Sequence Data MA 2022
Self-supervised Learning Framework for Imbalanced Positive-Unlabeled Data MA 2022
A comparative Evaluation of the Utility of linguistic Features for Part-of-Speech-Tagging BA 2022
Evaluating pre-trained language models on partially unlabeled multilingual economic corpora MA 2022
How Different is Stereotypical Bias in Different Languages? Analysis Multilingual Language Models MA 2022
Leveraging pairwise constraints for topic discovery in weakly annotated text data MA 2022
Word Embedding Evaluation with Intrinsic Evaluators BA 2022
Application of neural topic models to twitter data from German politicians BA 2022
Visualizing Hyperparameter Performance Dependencies BA 2022
Deep Self-Supervised Divergence Learning MA 2021
Neural Architecture Search for Genomic Sequence Data MA 2021
Multi-state modeling in the context of predictive maintenance MA 2021
Model Based Quality Diversity Optimization MA 2021
mlr3automl - Automated Machine Learning in R MA 2021
Knowledge distillation - Compressing arbitrary learners into a neural net MA 2020
Personality Prediction Based on Mobile Gaze and Touch Data MA 2020
Identifying Subgroups induced by Interaction Effects MA 2020
Benchmarking: Tests and Visualizations MA 2019
Counterfactual Explanations MA 2019
Methodik, Anwendungen und Interpretation moderner Benchmark-Studien am Beispiel der Risikomodellierung bei akuter Cholangitis MA 2019
Machine Learning pipeline search with Bayesian Optimization and Reinforcement Learning MA 2019
Visualization and Efficient Replay Memory for Reinforcement Learning BA 2019
Neural Network Embeddings for Categorical Data BA 2019
Localizing phosphorylation sites by deep learning-based fragment ion intensity MA 2019
Average Marginal Effects in Machine Learning MA 2019
Wearable-based Severity Detection in the Context of Parkinson’s Disease Using Deep Learning Techniques MA 2018
Bayesian Optimization under Noise for Model Selection in Machine Learning MA 2018
Interpretable Machine Learning - An Application Study using the Munich Rent Index MA 2018
Automatic Gradient Boosting MA 2018
Efficient and Distributed Model-Based Boosting for Large Datasets MA 2018
Linear individual model-agnostic explanations - discussion and empirical analysis of modifications MA 2018
Extending Hyperband with Model-Based Sampling Strategies MA 2018
Reinforcement learning in R MA 2018
Anomaly Detection using Machine Learning Methods MA 2018
RNN Bandmatrix MA 2018
Configuration of deep neural networks using model-based optimization MA 2017
Kernelized anomaly detection MA 2017
Automatic model selection and hyperparameter optimization MA 2017
mlrMBO / RF distance based infill criteria MA 2017
Kostensensitive Entscheidungsbäume für beobachtungsabhängige Kosten BA 2016
Implementation of 3D Model Visualization for Machine Learning BA 2016
Eine Simulationsstudie zum Sampled Boosting BA 2016
Implementation and Comparison of Stacking Methods for Machine Learning MA 2016
Runtime estimation of ML models BA 2016
Process Mining: Checking Methods for Process Conformance MA 2016
Implementation of Multilabel Algorithms and their Application on Driving Data MA 2016
Stability Selection for Component-Wise Gradient Boosting in Multiple Dimensions MA 2016
Detecting Future Equipment Failures: Predictive Maintenance in Chemical Industrial Plants MA 2016
Fault Detection for Fire Alarm Systems based on Sensor Data MA 2016
Laufzeitanalyse von Klassifikationsverfahren in R BA 2015
Benchmark Analysis for Machine Learning in R BA 2015
Implementierung und Evaluation ergänzender Korrekturmethoden für statistische Lernverfahren bei unbalancierten Klassifikationsproblemen BA 2014

Completed Theses (Supervised by Bernd Bischl at TU Dortmund)

Title Type Completed
Anwendung von Multilabel-Klassifikationsverfahren auf Medizingerätestatusreporte zur Generierung von Reparaturvorschlägen MA 2015
Erweiterung der Plattform OpenML um Ereigniszeitanalysen MA 2015
Modellgestützte Algorithmenkonfiguration bei Feature-basierten Instanzen: Ein Ansatz über das Profile-Expected-Improvement Dipl. 2015
Modellbasierte Hyperparameteroptimierung für maschinelle Lernverfahren auf großen Daten MA 2015
Implementierung einer Testsuite für mehrkriterielle Optimierungsprobleme BA 2014
R-Pakete für Datenmanagement und -manipulation großer Datensätze BA 2014
Lokale Kriging-Verfahren zur Modellierung und Optimierung gemischter Parameterräume mit Abhängigkeitsstrukturen BA 2014
Kostensensitive Algorithmenselektion für stetige Black-Box-Optimierungsprobleme basierend auf explorativer Landschaftsanalyse MA 2013
Exploratory Landscape Analysis für mehrkriterielle Optimierungsprobleme MA 2013
Feature-based Algorithm Selection for the Traveling-Salesman-Problem BA 2013
Implementierung und Untersuchung einer parallelen Support Vector Machine in R Dipl. 2013
Sequential Model-Based Optimization by Ensembles: A Reinforcement Learning Based Approach Dipl. 2012
Vorhersage der Verkehrsdichte in Warschau basierend auf dem Traffic Simulation Framework BA 2011
Klassifikation von Blutgefäßen und Neuronen des menschlichen Gehirns anhand von ultramikroskopierten 3D-Bilddaten BA 2011
Uncertainty Sampling zur Auswahl optimaler Sampler aus der trunkierten Normalverteilung BA 2011
Over-/Undersampling für unbalancierte Klassifikationsprobleme im Zwei-Klassen-Fall BA 2010

The Research Repository @ WVU


Mining Engineering Graduate Theses and Dissertations

Theses/Dissertations from 2024

CHARACTERIZATION AND EVALUATION OF VARIOUS BIOCHAR TYPES AS GREEN ADSORBENTS FOR RARE EARTH ELEMENT RECOVERY FROM AQUEOUS SOLUTIONS , Oluwaseun Victor Famobuwa

Selective Recovery of Various Critical Metals from Acid Mine Drainage Sludge , Gorkem Gecimli

Theses/Dissertations from 2023

Development of A Hydrometallurgical Process for the Extraction of Cobalt, Manganese, and Nickel from Acid Mine Drainage Treatment Byproduct , Alejandro Agudelo Mira

Selective Recovery of Rare Earth Elements from Acid Mine Drainage Treatment Byproduct , Zeynep Cicek

Identification of Rockmass Deformation and Lithological Changes in Underground Mines by Using Slam-Based Lidar Technology , Francisco Eduardo Gil Hurtado

Analysis of the Brittle Failure Mechanism of Underground Stone Mine Pillars by Implementing Numerical Modeling in FLAC3D , Rosbel Jimenez

Analysis of the root causes of fatal injuries in the United States surface mines between 2008 and 2021 , Maria Fernanda Quintero

AUGMENTED REALITY AND MOBILE SYSTEMS FOR HEAVY EQUIPMENT OPERATORS IN SURFACE MINING , Juan David Valencia Quiceno

Theses/Dissertations from 2022

Integrated Large Discontinuity Factor, Lamodel and Stability Mapping Approach for Stone Mine Pillar Stability , Mustafa Baris Ates

Noise Exposure Trends Among Violating Coal Mines, 2000 to 2021 , Hanna Grace Davis

Calcite depression in bastnaesite-calcite flotation system using organic acids , Emmy Muhoza

Investigation of Geomechanical Behavior of Laminated Rock Mass Through Experimental and Numerical Approach , Qingwen Shi

Static Liquefaction in Tailing Dams , Jose Raul Zela Concha

Experimental and Theoretical Investigation on the Initiation Mechanism of Low-Rank Coal's Self-Heating Process , Yinan Zhang

Development of an Entry-Scale Modeling Methodology to Provide Ground Reaction Curves for Longwall Gateroad Support Evaluation , Haochen Zhao

Size effect and anisotropy on the strength of shale under compressive stress conditions , Yun Zhao

Theses/Dissertations from 2021

Evaluation of LIDAR systems for rock mass discontinuity identification in underground stone mines from 3D point cloud data , Mario Alejandro Bendezu de la Cruz

Implementing the Empirical Stone Mine Pillar Strength Equation into the Boundary Element Method Software LaModel , Samuel Escobar

Recovery of Phosphorus from Florida Phosphatic Waste Clay , Amir Eskanlou

Optimization of Operating Conditions and Design Parameters on Coal Ultra-Fine Grinding Through Kinetic Stirred Mill Tests and Numerical Modeling , Francisco Patino

The Effect of Natural Fractures on the Mechanical Behavior of Limestone Pillars: A Synthetic Rock Mass Approach Application , Mustafa Can Süner

Evaluation of Various Separation Techniques for the Removal of Actinides from A Rare Earth-Containing Solution Generated from Coarse Coal Refuse , Deniz Talan

Geology Oriented Loading Approach for Underground Coal Mines , Deniz Tuncay

Various Operational Aspects of the Extraction of Critical Minerals from Acid Mine Drainage and Its Treatment By-product , Zhongqing Xiao

Theses/Dissertations from 2020

Adaptation of Coal Mine Floor Rating (CMFR) to Eastern U.S. Coal Mines , Sena Cicek

Upstream Tailings Dam - Liquefaction , Mladen Dragic

Development, Analysis and Case Studies of Impact Resistant Steel Sets for Underground Roof Fall Rehabilitation , Dakota D. Faulkner

The influence of spatial variance on rock strength and mechanism of failure , Danqing Gao

Fundamental Studies on the Recovery of Rare Earth Elements from Acid Mine Drainage , Xue Huang

Rational drilling control parameters to reduce respirable dust during roof bolting operations , Hua Jiang

Solutions to Some Mine Subsidence Research Challenges , Jian Yang

An Interactive Mobile Equipment Task-Training with Virtual Reality , Lazar Zujovic

Theses/Dissertations from 2019

Fundamental Mechanism of Time Dependent Failure in Shale , Neel Gupta

A Critical Assessment on the Resources and Extraction of Rare Earth Elements from Acid Mine Drainage , Christopher R. Vass

Time-dependent deformation and associated failure of roof in underground mines , Yuting Xue

Theses/Dissertations from 2018

Parametric Study of Coal Liberation Behavior Using Silica Grinding Media , Adewale Wasiu Adeniji

Three-dimensional Numerical Modeling Encompassing the Stability of a Vertical Gas Well Subjected to Longwall Mining Operation - A Case Study , Bonaventura Alves Mangu Bali

Shale Characterization and Size-effect study using Scanning Electron Microscopy and X-Ray Diffraction , Debashis Das

Behaviour Of Laminated Roof Under High Horizontal Stress , Prasoon Garg

Theses/Dissertations from 2017

Optimization of Mineral Processing Circuit Design under Uncertainty , Seyed Hassan Amini

Evaluation of Ultrasonic Velocity Tests to Characterize Extraterrestrial Rock Masses , Thomas W. Edge II

A Photogrammetry Program for Physical Modeling of Subsurface Subsidence Process , Yujia Lian

An Area-Based Calculation of the Analysis of Roof Bolt Systems (ARBS) , Aanand Nandula

Developing and implementing new algorithms into the LaModel program for numerical analysis of multiple seam interactions , Mehdi Rajaeebaygi

Adapting Roof Support Methods for Anchoring Satellites on Asteroids , Grant B. Speer

Simulation of Venturi Tube Design for Column Flotation Using Computational Fluid Dynamics , Wan Wang

Theses/Dissertations from 2016

Critical Analysis of Longwall Ventilation Systems and Removal of Methane , Robert B. Krog

Implementing the Local Mine Stiffness Calculation in LaModel , Kaifang Li

Development of Emission Factors (EFs) Model for Coal Train Loading Operations , Bisleshana Brahma Prakash

Nondestructive Methods to Characterize Rock Mechanical Properties at Low-Temperature: Applications for Asteroid Capture Technologies , Kara A. Savage

Mineral Asset Valuation Under Economic Uncertainty: A Complex System for Operational Flexibility , Marcell B. B. Silveira

A Feasibility Study for the Automated Monitoring and Control of Mine Water Discharges , Christopher R. Vass

Spontaneous Combustion of South American Coal , Brunno C. C. Vieira

Calibrating LaModel for Subsidence , Jian Yang

Theses/Dissertations from 2015

Coal Quality Management Model for a Dome Storage (DS-CQMM) , Manuel Alejandro Badani Prado

Design Programs for Highwall Mining Operations , Ming Fan

Development of Drilling Control Technology to Reduce Drilling Noise during Roof Bolting Operations , Mingming Li

The Online LaModel User's & Training Manual Development & Testing , Christopher R. Newman

How to mitigate coal mine bumps through understanding the violent failure of coal specimens , Gamal Rashed

Theses/Dissertations from 2014

Effect of biaxial and triaxial stresses on coal mine shale rocks , Shrey Arora

Stability Analysis of Bleeder Entries in Underground Coal Mines Using the Displacement-Discontinuity and Finite-Difference Programs , Xu Tang

Experimental and Theoretical Studies of Kinetics and Quality Parameters to Determine Spontaneous Combustion Propensity of U.S. Coals , Xinyang Wang

Bubble Size Effects in Coal Flotation and Phosphate Reverse Flotation using a Pico-nano Bubble Generator , Yu Xiong

Integrating the LaModel and ARMPS Programs (ARMPS-LAM) , Peng Zhang

Theses/Dissertations from 2013

Column Flotation of Subbituminous Coal Using the Blend of Trimethyl Pentanediol Derivatives and Pico-Nano Bubbles , Jinxiang Chen

Applications of Surface and Subsurface Subsidence Theories to Solve Ground Control Problems , Biao Qiu

Calibrating the LaModel Program for Shallow Cover Multiple-Seam Mines , Morgan M. Sears

The Integration of a Coal Mine Emergency Communication Network into Pre-Mine Planning and Development , Mark F. Sindelar

Factors considered for increasing longwall panel width , Jack D. Trackemas

An experimental investigation of the creep behavior of an underground coalmine roof with shale formation , Priyesh Verma

Evaluation of Rope Shovel Operators in Surface Coal Mining Using a Multi-Attribute Decision-Making Model , Ivana M. Vukotic

Theses/Dissertations from 2012

Calculating the Surface Seismic Signal from a Trapped Miner , Adeniyi A. Adebisi

Comprehensive and Integrated Model for Atmospheric Status in Sealed Underground Mine Areas , Jianwei Cheng

Production and Cost Assessment of a Potential Application of Surface Miners in Coal Mining in West Virginia , Timothy A. Nolan

The Integration of Geomorphic Design into West Virginia Surface Mine Reclamation , Alison E. Sears

Truck Cycle and Delay Automated Data Collection System (TCD-ADCS) for Surface Coal Mining , Patricio G. Terrazas Prado

New Abutment Angle Concept for Underground Coal Mining , Ihsan Berk Tulu

Theses/Dissertations from 2011

Experimental analysis of the post-failure behavior of coal and rock under laboratory compression tests , Dachao Neil Nie

The influence of interface friction and w/h ratio on the violence of coal specimen failure , Simon H. Prassetyo

Theses/Dissertations from 2010

A risk management approach to pillar extraction in the Central Appalachian coalfields , Patrick R. Bucks

The Impacts of Longwall Mining on Groundwater Systems -- A Case of Cumberland Mine Panels B5 and B6 , Xinzhi Du

Evaluation of ultrafine spiral concentrators for coal cleaning , Meng Yang

Theses/Dissertations from 2009

Development of a coal reserve GIS model and estimation of the recoverability and extraction costs , Chandrakanth Reddy Apala

Application and evaluation of spiral separators for fine coal cleaning , Zhuping Che

Weak floor stability in the Illinois Basin underground coal mines , Murali M. Gadde

Design of reinforced concrete seals for underground coal mines , Rajagopala Reddy Kallu

Employing laboratory physical modeling to study the radio imaging method (RIM) , Jun Lu

Influence of cutting sequence and time effects on cutters and roof falls in underground coal mine -- numerical approach , Anil Kumar Ray

Implementing energy release rate calculations into the LaModel program , Morgan M. Sears

Modeling PDC cutter rock interaction , Ihsan Berk Tulu

Analytical determination of strain energy for the studies of coal mine bumps , Qiang Xu

Improvement of the mine fire simulation program MFIRE , Lihong Zhou

Theses/Dissertations from 2008

Program-assisted analysis of the transverse pressure capacity of block stoppings for mine ventilation control , Timothy J. Batchler

Analysis of factors affecting wireless communication systems in underground coal mines , David P. McGraw

Analysis of underground coal mine refuge shelters , Mickey D. Mitchell

Theses/Dissertations from 2007

Dolomite flotation of high magnesium phosphate ores using fatty acid soap collectors , Zhengxing Gu

Evaluation of longwall face support hydraulic supply systems , Ted M. Klemetti II

Experimental studies of electromagnetic signals to enhance radio imaging method (RIM) , William D. Monaghan

Analysis of water monitoring data for longwall panels , Joseph R. Zirkle

Theses/Dissertations from 2006

Measurements of the electrical properties of coal measure rocks , Nikolay D. Boykov


UKnowledge


Theses and Dissertations--Mining Engineering

Theses/Dissertations from 2024

THE METHODOLOGY FOR INTEGRATING ROBOTIC SYSTEMS IN UNDERGROUND MINING MACHINES , Peter Kolapo

DISCRETE ELEMENT MODELING TO PREDICT MUCKPILE PROFILES FROM CAST BLASTING , Russell Lamont

AUTONOMOUS SHUTTLE CAR DOCKING TO A CONTINUOUS MINER USING RGB-DEPTH IMAGERY , Sky Rose

Underground Blast Fragmentation Modeling for Use in the Mine-to-Mill Strategy , Lauren Shields

Theses/Dissertations from 2023

ASSESSMENT OF AIR OVERPRESSURE FROM BLASTING USING COMPUTATIONAL FLUID DYNAMICS , Cecilia Estefania Aramayo

RECOVERY OF VALUABLE METALS FROM ELECTRONIC WASTE USING A NOVEL AMMONIA-BASED HYDROMETALLURGICAL PROCESS , Peijia Lin

AN ACID BAKING APPROACH TO ENHANCE RARE EARTH ELEMENT RECOVERY FROM BITUMINOUS COAL SOURCES , Ahmad Nawab

PREDICTION OF DYNAMIC SUBSIDENCE IN THE PROXIMITY OF LONGWALL PANEL BOUNDARIES , JESUS DAVID ROMERO BENITEZ

Prediction of Blast-Induced Ground Vibrations: A Comparison Between Empirical and Artificial-Neural-Network Approaches , Luis F. Velasquez

A LABORATORY AND NUMERICAL INVESTIGATION OF THE STRENGTH OF IRREGULARLY SHAPED PILLARS , Zachary Wedding

Theses/Dissertations from 2022

DEVELOPMENT OF UNIVARIATE AND MULTIVARIATE FORECASTING MODELS FOR METHANE GAS EMISSIONS IN UNDERGROUND COAL MINES , Juan Diaz

PARAMETRIC NUMERICAL ANALYSIS OF INCLINED COAL PILLARS , Robin Flattery

Strain Energy Analysis Related To Strata Failure During Caving Operations , Caroline Gerwig

LAPTOP RECYCLING CASE STUDY: ESTIMATING THE CONTAINED VALUE AND VALUE RECOVERY PROCESS FEASIBILITY OF END-OF-LIFE CONSUMER ELECTRONICS , Zebulon Hart

INVESTIGATION INTO, & ANALYSIS OF TEMPERATURE & STRAIN DATA FOR COAL MINE SEAL MATERIAL DURING CURING , Stephanus Jaco van den Berg

Theses/Dissertations from 2021

DEVELOPMENT OF AN AUTONOMOUS NAVIGATION SYSTEM FOR THE SHUTTLE CAR IN UNDERGROUND ROOM & PILLAR COAL MINES , Vasileios Androulakis

Investigation of Coal Burst Potential Using Numerical Modeling and Rock Burst Indices , Cristian David Cardenas Triana

Capture of Respirable Dust using Maintenance Free Impingement Screen , Neeraj Kumar Gupta

OXIDATION PRETREATMENT FOR ENHANCED LEACHABILITY OF RARE EARTH ELEMENTS FROM BITUMINOUS COAL SOURCES , Tushar Gupta

AN APPROACH FOR PREDICTING FLOW CHARACTERISTICS AT THE CONTINUOUS MINER FACE , Kayla Henderson

CONCEPTS FOR DEVELOPMENT OF SHUTTLE CAR AUTONOMOUS DOCKING WITH CONTINUOUS MINER USING 3-D DEPTH CAMERA , Sibley Miller

MODELING OF RARE EARTH SOLVENT EXTRACTION PROCESS FOR FLOWSHEET DESIGN AND OPTIMIZATION , Vaibhav Kumar Srivastava

Application of a Novel Ventilation Simplification Algorithm , Caitlin V. Strong

A METHODOLOGY FOR AUTONOMOUS ROOF BOLT INSTALLATION USING INDUSTRIAL ROBOTICS , Anastasia Xenaki


University of Vienna - Main page


Open topics for theses and practical courses

Unless otherwise specified, all topics are available as a practical course (P1/P2), a Data Science project, or a bachelor or master thesis.


+ On the effectiveness and quality of outputs from large language models

Understanding the effectiveness and quality of outputs from Large Language Models (LLMs) is crucial. This project focuses on qualitative research involving an LLM integrated with a Retrieval-Augmented Generation (RAG) system. The objective is to explore and identify meaningful metrics to evaluate the system's performance and effectiveness in real-world applications. This involves evaluating potential metrics, selecting the most relevant ones, and thoroughly investigating how these metrics can be measured and interpreted. The project includes designing a study, gathering and analyzing qualitative data through methods such as interviews, and developing insights that could shape the future development and deployment of the LLM-RAG system. This project offers a unique opportunity to explore practical applications of an LLM-RAG system. The aim of the model is to support experts in answering legal questions more quickly and effectively. Also relevant is how the introduction of the model affects work itself: how work changes and how it is organized.

Contact: Sebastian Tschiatschek , Torsten Möller

+ Single-Cell Gene Expression Analysis [Practical Course or Bachelor Thesis]

In single-cell gene expression analysis, we measure the levels of genes in each cell within a biological sample. Among these genes, transcription factors play a critical role as they can regulate the expression of other genes.

  • Extracting patterns or rules within gene regulation
  • Investigating combinations of regulators, specifically transcription factors, and their effects on gene expression
  • Comparing gene targets across different species

Students participating in this project will need programming skills in Python. An interest in the interdisciplinary field of bioinformatics is advantageous. Further details on the project are available upon request and will be discussed during an initial meeting. A first overview of the topic can be found under this link ( https://ucloud.univie.ac.at/index.php/s/4zLMbNQZ2anT5Cn ).

Contact: Carolina Atria

+ Interactive Visualization of single-cell multiomics Datasets [Practical Course]

  • Connection to a glossary for definitions of technical terms
  • Embedding SVG figures for easier explanation of underlying models

Familiarity with JavaScript, and especially the D3 library, is necessary for the implementation of this project. Further details on the project are available upon request and will be discussed during an initial meeting. A first overview of the topic can be found under this link ( https://ucloud.univie.ac.at/index.php/s/4zLMbNQZ2anT5Cn ).

+ Predicting treatment outcome of antidepressant therapy from EEG recordings [Bachelor or Master Thesis]

The Kuramoto model for synchronisation can be used to calculate connectivity matrices from EEG recordings. They describe the functional interplay between brain regions. We derive these matrices from EEG recordings of each individual of a cohort of patients suffering from depression before and after 1 week of antidepressant therapy. We aim to predict the treatment outcome (responders vs. non-responders) with the connectivity matrices as our features. Possible analysis tools range from statistical tests to graph mining methods.
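As a minimal sketch of how a connectivity matrix could be turned into a feature vector for a classifier or a statistical test, the snippet below flattens the strict upper triangle of a matrix. It assumes the matrices are square and symmetric; the function name and the 3x3 example values are hypothetical and not taken from the project data.

```python
from typing import List


def upper_triangle_features(matrix: List[List[float]]) -> List[float]:
    """Flatten the strict upper triangle of a square, symmetric matrix."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("connectivity matrix must be square")
    # Entry (i, j) with i < j describes the coupling between regions i and j.
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


# Hypothetical 3-region connectivity matrix (values are made up).
example = [
    [1.0, 0.2, 0.5],
    [0.2, 1.0, 0.1],
    [0.5, 0.1, 1.0],
]
features = upper_triangle_features(example)
```

For n brain regions this yields n(n-1)/2 features per recording, so some form of feature selection or graph-based summary may be needed for small cohorts.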

Details about the EEG data structure, the connectivity matrices and the goal of the analysis can be found in this document .

Contact: Lena Bauer , Yllka Velaj

+ Predicting solar thermal heat production [Master Thesis]

With heat accounting for 50% of global energy demand, solar-thermal plants could play a crucial role in the transition to renewable energy. One of the most important questions for operators is whether the current heat production matches expectations. However, predicting the expected yield is challenging due to the constantly changing weather and demand, which significantly influence heat production. Additionally, seasonal changes in operation add to the complexity.

Do you want to contribute to a greener future? Leverage your knowledge of machine learning or physically-motivated algorithms to improve the current yield prediction. Join us in building on the successful collaboration between the University of Vienna and SOLID. Specifically, you would be embedded into the company through a (paid) internship.

Contact: Sebastian Tschiatschek

+ Fault Detection for solar thermal plants [Master Thesis]

With heat accounting for 50% of global energy demand, solar-thermal plants could play a crucial role in the transition to renewable energy. Effective monitoring is essential to quickly identify issues and maintain optimal performance. However, existing fault detection algorithms lack precision. The constantly changing operating conditions (e.g., weather, demand) and time-dependent factors (e.g., fluid heat-up) complicate fault detection for both experts and algorithms.

Do you want to contribute to a greener future? Leverage your knowledge of machine learning or physically-motivated algorithms to improve fault detection. Join us in building on the successful collaboration between the University of Vienna and SOLID. Specifically, you would be embedded into the company through a (paid) internship.

+ Automatic tracking of individual cancer organoids in 3D from optical coherence tomography images [Master Thesis]

Cancer organoids are a very useful in vitro model in cancer research because they resemble the actual morphologic and functional features of human or animal cancers. These cancer organoids may be developed in large quantities originating from the same patient cancer, therefore permitting in vitro drug screening for precision medicine. Optical coherence tomography (OCT) has been shown to provide high-throughput imaging of cancer organoids in a non-invasive and label-free fashion for longitudinal monitoring of these organoids. However, tracking each individual organoid over weeks and providing quantitative analysis is very labor intensive and time consuming. Algorithms that can automatically analyze the volumetric images acquired at different time points are urgently required to free up biological and biophotonics researchers. A master thesis project can be arranged to develop such algorithms for automatic tracking of individual organoids in 3D from OCT data. The scope and details of the project will be framed according to the background of the student in an initial meeting.

Prerequisites: The student needs solid programming skills in Python and background in machine learning and computer vision, e.g., by completing the courses "Foundations of data analysis" and "Image Processing and Image Analysis" or equivalent courses.

Supervisors: Claudia Plant, Kevin Sidak, Lukas Miklautz in collaboration with Mengyang Liu, Abigail Deloria, and Agnes Csiszar from Medical University of Vienna

Contact: Claudia Plant , Kevin Sidak , Lukas Miklautz

+ Semantic Segmentation of cancer organoids for chemotherapy treatment efficacy prediction [Master Thesis]

Cancer organoids can be cultivated in batches and undergo chemotherapy. The treatment efficacy can then be validated by imaging the treated organoids using fluorescence microscopy. For example, some cancer cells in the cancer organoids may be killed by the anti-cancer drug; these dead cells can be labeled to show up in green in fluorescence microscopy, whereas the live cancer cells can be labeled to show up in red. However, fluorescence microscopy only provides 2D imaging and is time consuming in this live/dead classification task. By using optical coherence tomography (OCT), we can image organoids treated with anti-cancer drugs in just a few seconds. However, the reconstructed OCT images only show structural information directly and lack the classification information. We assume that, using fluorescence microscopy as the ground truth, OCT images of the treated cancer organoids can also provide the classification of live and dead cancer cells. This master thesis work involves the development of classification algorithms to differentiate live cancer cells from dead cancer cells in cancer organoids imaged by OCT, using the fluorescence image data as the ground truth.

+ Text Clustering of Audit Reports from Erste Group [Practical Course or Master Thesis]

The internal audit department produces thousands of reports each year. The goal of each report is to highlight and communicate internal issues and criticalities, which are expressed through findings. Findings are presented as short texts written in a (semi-)standardized format. Even though reports, and therefore findings, are already grouped by topic/area, additional and/or independent clustering could be helpful for risk assessment, audit planning, and getting an overview. The goal of the project is to compare different clustering methodologies, ranging from classical to hierarchical to non-redundant clustering. The student should explore different text representation approaches, from classical LDA to modern LLM approaches. Clusters should contain texts describing similar or related issues and should be (at least partially) interpretable. The student is expected to use visualization techniques, like word clouds, to inspect the clusters.

Important: The student will be onboarded as an intern in the internal audit department at Erste Group Bank AG.

Prerequisites: The student needs solid programming skills in Python, background in machine learning and data mining, e.g., by completing the courses "Foundations of data analysis", "Data Mining" and "Scientific Data Management" or equivalent courses.

Supervisors: Claudia Plant and Lukas Miklautz in collaboration with Stefano Melchionna from Erste Group Bank AG

Contact: Claudia Plant , Lukas Miklautz

+ Digital Humanities - Computer Vision and Machine Learning in Archaeology [Practical Course or Bachelor Thesis]

Glass beads were among the most common grave goods in the early Middle Ages, and their number can be estimated in the millions. The color, size, shape, production technique and decoration of the beads are diverse and contain much information that is relevant for historians, e.g., regarding trade and production networks. This large scale makes the recording and digitalization of glass beads very labor intensive.

  • Image Segmentation
  • Image Classification
  • Image Clustering
  • Image Generation
  • Image Upscaling
  • Automated Object Counting

The student is expected to implement one of the tasks above and design a simple and easy-to-use graphical interface, e.g., a web-based application. The scope of the project will be discussed in an initial meeting with the supervisors.

Prerequisites: The student needs solid programming skills in Python and background in machine learning and computer vision, e.g., by completing the courses "Foundations of data analysis" and "Image Processing and Image Analysis" or equivalent courses. Additionally, interested students should be eager to participate in an interdisciplinary project at the intersection of data mining, image processing and archaeology.

+ Domain Knowledge in Performative Prediction

Performative predictions are predictions supporting decisions that influence the outcome and are ubiquitous in many real-world applications of machine learning techniques. For instance, if a loan applicant is predicted to have an elevated risk of default and as a consequence is charged higher interest rates, this likely further increases their default risk. Thus it is important to understand a prediction-based action's effect on the outcome and factor it in when deciding on the model used for making predictions.
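The feedback loop in the loan example can be illustrated with a small sketch. Both functions below are hypothetical linear rules with made-up coefficients, chosen only to show how a prediction can change the outcome it predicts; they are not part of the project description.

```python
def interest_rate(predicted_risk: float) -> float:
    """Hypothetical pricing rule: riskier applicants pay a higher rate."""
    return 0.03 + 0.2 * predicted_risk


def realized_risk(base_risk: float, rate: float) -> float:
    """Hypothetical effect of the action: a higher rate raises default risk."""
    return min(1.0, base_risk + 0.5 * rate)


def stable_prediction(base_risk: float, steps: int = 100) -> float:
    """Re-predict repeatedly until the prediction matches the risk it induces."""
    prediction = base_risk
    for _ in range(steps):
        prediction = realized_risk(base_risk, interest_rate(prediction))
    return prediction


base = 0.10                       # made-up default risk without any intervention
stable = stable_prediction(base)  # prediction that is consistent with its own effect
```

Under these assumed rules the self-consistent prediction lies above the base risk, which is the effect the paragraph describes: a model that ignores its own influence would underestimate the default risk it helps create.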

In this project you will study performative prediction when domain knowledge about the taken action's effect is available, e.g., that the default risk increases with at least a certain probability. In this setting, you will develop techniques for selecting good predictive models with the goal of steering the outcome while accommodating constraints, e.g., a maximum loan sum for all customers.

Students wanting to work on this topic are expected to have a solid basic understanding of machine learning techniques, a solid knowledge of Python, and a basic knowledge of deep learning libraries (PyTorch or TensorFlow).

+ Developing anonymized datasets for Graph Neural Networks [Bachelor Thesis]

Developing anonymized datasets based on proprietary data for session-based recommendations with Graph Neural Networks (GNNs) is crucial to adhere to GDPR requirements and protect business secrets. Using proprietary data from an industry partner, this thesis will focus on creating a publishable, anonymized dataset that retains the essential characteristics needed for effective session-based recommendation while ensuring no traceability to individual users or sensitive business information. Anonymization techniques such as data aggregation, noise addition, and generalization will be employed. Additionally, synthetic data generation methods will be explored to produce a dataset that mirrors the statistical properties of the original data without compromising confidentiality. Ensuring the preservation of graph structure and session patterns is paramount to maintaining the dataset's utility for GNN research. The project will involve a thorough evaluation of various anonymization and synthetic data generation techniques, with a focus on their effectiveness in maintaining data utility and anonymity. The expected outcome is a robustly anonymized benchmark dataset using state-of-the-art methodology for creating anonymized datasets suitable for research and publication, contributing to the broader field of machine learning on graphs in general and sequential recommendation in particular.
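A minimal sketch of two of the ideas mentioned above, generalization of rare values and replacing identifiers, is shown below. The function name, the `OTHER` token, the salt, and the threshold `k` are all hypothetical. Salted hashing is pseudonymization only, so this sketch alone should not be assumed to meet GDPR anonymization requirements.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def anonymize_sessions(
    sessions: Dict[str, List[str]], salt: str, k: int
) -> Dict[str, List[str]]:
    """Pseudonymize session IDs and generalize items seen fewer than k times."""
    item_counts = Counter(item for items in sessions.values() for item in items)
    anonymized = {}
    for session_id, items in sessions.items():
        # Replace the raw session ID with a truncated salted hash.
        pseudonym = hashlib.sha256((salt + session_id).encode("utf-8")).hexdigest()[:12]
        # Rare items could identify a session, so map them to a generic token.
        anonymized[pseudonym] = [
            item if item_counts[item] >= k else "OTHER" for item in items
        ]
    return anonymized


# Made-up click sessions: item "a" is common, "b" and "c" are rare.
raw = {"u1": ["a", "b"], "u2": ["a", "c"]}
result = anonymize_sessions(raw, salt="example-salt", k=2)
```

The order of items within each session is kept, since session-based recommendation depends on it; how much generalization the session patterns tolerate is exactly the utility question the thesis would evaluate.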

Requirements: Python, interest in data anonymization, interest in graph neural networks.

Contact: Lorenz Kummer , Nils Kriege

+ The source of errors in causal discovery [Master Thesis]

CAUSEEFFECTPAIRS, introduced by Mooij et al. in 2016, is an established benchmark data set for evaluating bivariate causal discovery methods. The data set consists of 100 different cause-effect pairs selected from 37 data sets from various domains (e.g., meteorology, biology, and medicine). This benchmark contains a heterogeneous selection of cause-effect pairs with distinct underlying data-generating processes. The objective of the work will be to disentangle the sources of errors of state-of-the-art causal inference methods based on log-likelihood scoring and independence testing. The interested student is expected to have a good knowledge of statistics and good programming skills in Python. Some background in causal inference or causal discovery is a plus. More details will be provided to interested students on request.

Contact: Katerina Schindlerova , Alexander Marx

+ Investigation into the Assumptions of Causal Learning Methods [Practical Course or Master Thesis]

Causal learning methods make several assumptions which are necessary to make useful causal inferences based on given data that go beyond statistical correlations made by conventional learning methods. In real world situations, it is often the case that these assumptions either don’t hold or become too strict or constraining. In this project we would like to explore what kind of inferences are reached if certain assumptions that are not valid for given datasets are made. We would also be interested in exploring if there could be ways to get rid of certain assumptions and still reach useful conclusions.

The interested student is expected to have basic knowledge of statistics and good programming skills in Python. Some background in causal inference or causal discovery is a plus. More details will be provided to interested students on request.

Contact: Katerina Schindlerova , Aditi Kathpalia

+ Clustering of time-series of different scalings [Practical Course or Master Thesis]

Time-series data is a collection of data points recorded over time, each associated with a specific timestamp. This form of data is prevalent in various fields such as finance, economics, meteorology, healthcare, energy, telecommunications, and transportation. Current algorithms assume that all time series share the same scaling, but real-world time series often come in different scalings, e.g., hourly, daily, or weekly weather forecasts.

This project will mainly focus on the development of a clustering algorithm that can handle time series with different scalings. Students working on this project need solid programming skills in Python. More details to the interested student will be provided on request.
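A simple baseline for comparing series with different scalings is to aggregate the finer series to the coarser resolution before clustering. The sketch below does this by block averaging. It is an assumed preprocessing step for illustration, not the algorithm the project aims to develop, and the function name and data are hypothetical.

```python
from statistics import mean
from typing import List


def downsample(values: List[float], factor: int) -> List[float]:
    """Average consecutive blocks of `factor` points (an incomplete tail is dropped)."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    usable = len(values) - len(values) % factor
    return [mean(values[i:i + factor]) for i in range(0, usable, factor)]


hourly = [float(h) for h in range(48)]  # two days of made-up hourly readings
daily = downsample(hourly, 24)          # one averaged value per day
```

Aggregation discards the within-day structure of the hourly series, which is one reason a clustering method that handles mixed scalings directly could be preferable.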

Contact: Yllka Velaj , Pascal Weber

+ Finding non-redundant patterns within time series [Practical Course or Master Thesis]

Time-series data is a collection of data points recorded over time, each associated with a specific timestamp. This form of data is prevalent in various fields such as finance, economics, meteorology, healthcare, energy, telecommunications, and transportation. Often we can gather additional information about these time series, which could improve cluster performance. This kind of clustering is also called "multi-view clustering". Unfortunately, currently, there are no algorithms for multi-view clustering of time series.

This project will mainly focus on the development of a clustering algorithm that can use additional information from an original time series to retrieve a better clustering. Students working on this project need solid programming skills in Python. More details to the interested student will be provided on request.

+ Deep representation learning and clustering for time-series data [Practical Course or Master Thesis]

The process of automatically learning features is termed representation learning and offers the advantage of eliminating the need to predetermine which features are crucial, often saving considerable effort in practical applications. Deep Clustering is a research domain that seeks to apply the successful principles of deep learning to the field of clustering. Deep clustering algorithms can autonomously extract crucial features. Additionally, deep clustering demonstrates scalability to large, high-dimensional datasets, resulting in more precise cluster assignments and predictions for new, previously unseen data points. Unfortunately, the usage of deep representation learning and deep clustering algorithms on time-series data is currently very limited.

This project will mainly focus on the development of a deep clustering algorithm for time-series data. Students working on this project need solid programming skills in Python. More details to the interested student will be provided on request.

+ Comparing clustering algorithms for time series data [Practical Course]

Clustering is a fundamental problem in data mining and allows us to discover a natural grouping of similar entities in a dataset. However, there are only a few studies that analyze the performance of clustering algorithms on time series data.

This project will mainly focus on the implementation of clustering algorithms, and on conducting an experimental study with real-world datasets and synthetic ones. Students working on this project need solid programming skills in Python. More details to the interested student will be provided on request.
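Many time-series clustering methods are built on a pairwise distance. As an illustration, the sketch below implements dynamic time warping (DTW), a commonly used time-series distance, with absolute differences as the local cost. Choosing DTW here is an assumption for the example; the project description does not name specific algorithms or distances.

```python
from typing import List


def dtw_distance(a: List[float], b: List[float]) -> float:
    """Dynamic time warping distance with absolute difference as local cost."""
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of a with the first j of b.
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats its point
                cost[i][j - 1],      # b advances, a repeats its point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[len(a)][len(b)]


same_shape = dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
shifted = dtw_distance([0.0, 0.0], [1.0, 1.0])
```

A pairwise distance matrix built this way can be passed to any clustering method that accepts precomputed distances, which makes it a convenient common ground for an experimental comparison.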

+ Exploratory data analysis of time series data [Practical Course or Bachelor Thesis]

Time-series data is a collection of data points recorded over time, each associated with a specific timestamp. This form of data is prevalent in various fields such as finance, economics, meteorology, healthcare, energy, telecommunications, and transportation.

This work will mainly focus on developing interactive software for Exploratory Data Analysis (EDA) which will allow us to understand better the main patterns hidden in the data. Students working on this project need solid programming skills in Python. More details to the interested student will be provided on request.

+ Efficient Knowledge Distillation from Graph Neural Networks for Scalable e-Commerce Recommendation Systems

Graph Neural Networks (GNNs) have significantly enhanced the capability of recommendation systems in e-commerce by leveraging collaborative filtering and session-based methodologies. These advancements have demonstrated improved recommendation accuracy by effectively capturing complex user-item interactions within graph structures. However, the deployment of GNN-based systems at scale encounters substantial challenges, primarily due to their computational intensity and reliance on GPU hardware, which becomes a bottleneck in scenarios dealing with millions of requests per second. This thesis aims to address the scalability issues inherent in GNNs for e-commerce recommendation systems by exploring knowledge distillation techniques. The objective is to transfer knowledge from complex GNN models to simpler, more computationally efficient representations that can be easily managed and scaled on CPU servers and database environments. This research will involve a comprehensive evaluation of existing knowledge distillation methods applied to GNNs, focusing on their effectiveness in preserving the recommendation quality while significantly reducing computational requirements. A novel methodology for fair comparison of these techniques shall be developed, taking into account factors such as recommendation accuracy, computational efficiency, and scalability. The outcome of this thesis is expected to provide a systematic approach for making GNN-based recommendation systems more practical for large-scale e-commerce applications, thereby bridging the gap between advanced machine learning models and their real-world applicability in resource-constrained environments.
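To make the idea of knowledge distillation concrete, the sketch below computes a generic soft-target distillation loss: the Kullback-Leibler divergence between temperature-softened teacher and student score distributions, scaled by the squared temperature. This is the common textbook formulation and is used here as an assumption; the thesis may evaluate other, GNN-specific variants. The function names and logits are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: List[float], student_logits: List[float], temperature: float = 2.0
) -> float:
    """T^2 * KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl


# Made-up item scores from a large teacher model and a small student model.
agree = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
disagree = distillation_loss([3.0, 1.0, 0.0], [0.0, 1.0, 3.0])
```

The loss is zero when the student reproduces the teacher's scores and grows as their rankings diverge, so it can be minimized to train a cheaper student model.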

Requirements: Python, interest in recommendation systems, interest in knowledge distillation

+ Applications of data mining / machine learning in weather prediction and climate science at GeoSphere Austria

Within the Post Processing unit at GeoSphere Austria we aim to continually update and improve our methods with state-of-the-art techniques in AI and machine learning. Coming mainly from a meteorological perspective and expertise, we offer, supported by the data mining group, to co-supervise student projects with a fairly open choice of topic within our working field. Our unit works on enhancing numerical forecasts with statistical approaches, which in recent years have evolved towards advanced machine learning. Primary use cases involve supporting the renewable energy sector. Our technology stack is mainly based on Python, with backends written in C++ or Fortran for heavy tasks. For student work, Python (enabling the use of state-of-the-art libraries such as TensorFlow, PyTorch, etc.) is the preferred working environment. Within this frame, you are free to choose a topic fitting your background, research interest, course emphasis and extent (bachelor's thesis, PR, master's thesis). Depending on your background, recent machine learning approaches or more classical statistical and data mining approaches from the literature may be applied; if needed, computing resources will be arranged by the University. We also encourage you to split or follow up a topic over several courses, such as PR and thesis. Bringing your own ideas or building on your individual background is very welcome and will be supported; however, note that we cannot offer payment, office space, or resources.

If you would like to work with us, please send us a message to arrange an online meeting to set your topic, at least two weeks before the end of the course unenrolment period (u:space-Abmeldung). We will discuss possible topics and basic questions about your interests and previous experience with statistics and machine learning (please send these ahead by email before the meeting and come prepared). After our discussion we will provide the most important keywords from it (take your own notes, too) and point to literature and datasets suitable for the work. Afterwards, you will write a 0.5 to 1 page abstract on your chosen topic (including research objective, datasets, and models used), to be agreed with your official supervisor (Prof. Claudia Plant) one day before the end of the course unenrolment period.

Apply until: 2 weeks before the end of the course unenrolment period (u:space-Abmeldung). Note: Because of the external supervision, finish your work at least 3 weeks before the end of the semester!

Contact: Petrina Papazek , www.geosphere.at

+ Knowledge Discovery From Deep Learning Models

Human cells need to precisely regulate the amount of proteins produced from each gene. Many of the regulatory mechanisms are governed by specific properties of the messenger RNA (mRNA). However, only a few of these properties have been studied in detail. This project employs deep learning to identify new and relevant RNA features that impact translation and stability. Deep learning models will be trained to predict the lifetime of an mRNA and its protein expression from mRNA sequence and structure information. The models will then be analyzed using techniques from the field of explainable AI to extract which mRNA features were crucial for the prediction.

Students working on this project should be knowledgeable in machine learning, have solid programming skills, have a desire to work on interpretable deep learning, and be keen to use machine learning to advance science and the discovery of biological knowledge.

+ Inverse Reinforcement Learning Under Embodiment Mismatch

Inverse reinforcement learning is about identifying the reward function optimized by an agent (e.g., an AI system or a human) through its behavior. This is for instance important to enable the transfer of a reinforcement learning agent’s capabilities to new environments and value alignment. Most commonly, inverse reinforcement learning techniques assume that the agent demonstrating optimized behavior and the agent learning from these demonstrations about the reward function have the same capabilities in terms of actions and observations. This assumption is however unrealistic and you will challenge it in this project.

This project considers inverse reinforcement learning in settings in which there is some form of embodiment mismatch between the expert demonstrator and the learning agent. The scope of the project is to study how existing algorithms perform in this setting and propose modifications to existing algorithms to achieve better performance, e.g., in the form of demonstrations tailored to the learner, learning about the mismatch, or similar.

+ Causal Abstractions in Reinforcement Learning

While reinforcement learning has shown impressive performance in various applications, the learned agent’s behavior often lacks interpretability and planning is computationally demanding. One avenue to approach these challenges is learning abstractions, i.e., aggregating different states of an environment into a meta-state such that the size of the resulting meta-state space is reduced. Such abstractions are most useful if they satisfy certain properties, e.g., the resulting model is Markovian or allows for a causal interpretation.
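State aggregation can be sketched on a small Markov chain. The snippet below maps states to meta-states and computes transition probabilities between meta-states, assuming every state inside a meta-state is weighted equally. The uniform weighting, the function name, and the three-state chain are assumptions made for this illustration.

```python
from collections import defaultdict
from typing import Dict


def abstract_transitions(
    transitions: Dict[str, Dict[str, float]], state_to_meta: Dict[str, str]
) -> Dict[str, Dict[str, float]]:
    """Aggregate state transitions into meta-state transitions (uniform weights)."""
    members = defaultdict(list)
    for state, meta in state_to_meta.items():
        members[meta].append(state)
    abstract = {}
    for meta, states in members.items():
        row = defaultdict(float)
        for state in states:
            for next_state, prob in transitions[state].items():
                # Each member state contributes equally to its meta-state's row.
                row[state_to_meta[next_state]] += prob / len(states)
        abstract[meta] = dict(row)
    return abstract


# Hypothetical 3-state chain; s0 and s1 are merged into meta-state "A".
chain = {
    "s0": {"s0": 0.5, "s1": 0.5},
    "s1": {"s1": 0.5, "s2": 0.5},
    "s2": {"s2": 1.0},
}
mapping = {"s0": "A", "s1": "A", "s2": "B"}
abstract = abstract_transitions(chain, mapping)
```

In this example s0 can never reach B directly while s1 can, so the merged meta-state A only approximates the original dynamics. This is the kind of case where the Markov property mentioned above can be lost.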

In this project you will implement and compare existing abstraction techniques for reinforcement learning and extend a selected one to allow for hierarchical abstraction to enable interpretation and planning at different levels of granularity while ensuring the aforementioned properties.

+ Understanding AI Systems Supporting Sequential Decision-Making

Explainable Artificial Intelligence (XAI) focuses on providing explanations for domain experts and data scientists. However, 'lay people', i.e., people without or only with limited technical knowledge, are often supported by AI systems in public institutions in their work, e.g., to make decisions about loan applications. Thus it is important that these lay people can understand the AI systems to a certain extent and understand the impact of these systems on the decisions they make.

In this project, you will study how to best make AI systems comprehensible to lay people in sequential decision-making settings, in particular in settings motivated by the usage of AI systems in public institutions. It will involve developing AI models, interpretability techniques, and conducting user studies with lay people.

+ Learning with Imbalanced Molecular Data

Learning problems for molecular data are often imbalanced, i.e., there is a small subset of molecules having a desired property while most molecules do not have this property. Typically, the minority class is of special interest in the considered application. However, most machine-learning algorithms do not work well in such settings. In this project, you will explore and compare several general-purpose methods for imbalanced data and develop specific data augmentation methods for molecular data.
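One general-purpose baseline for imbalanced data is random oversampling, which duplicates minority-class samples until all classes have the same size. The sketch below is such a baseline only; the SMILES strings are real molecules, but their labels are made up for illustration, and the function name is hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple

Sample = Tuple[str, int]  # (SMILES string, class label)


def random_oversample(samples: List[Sample], seed: int = 0) -> List[Sample]:
    """Duplicate randomly chosen minority samples until all classes are equal in size."""
    rng = random.Random(seed)
    by_label: Dict[int, List[Sample]] = {}
    for sample in samples:
        by_label.setdefault(sample[1], []).append(sample)
    target = max(len(group) for group in by_label.values())
    balanced: List[Sample] = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw with replacement to fill the gap to the largest class.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


# Made-up labels: 1 = molecule has the desired property, 0 = it does not.
data = [("CCO", 0), ("CCC", 0), ("CCN", 0), ("c1ccccc1", 0), ("CC(=O)O", 1)]
balanced = random_oversample(data)
counts = Counter(label for _, label in balanced)
```

Because oversampling only repeats existing molecules, it adds no new chemical information, which is one motivation for molecule-specific data augmentation.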

Contact: Nils Kriege

+ Internal Evaluation for Density-based Clustering

We develop a new internal evaluation measure for clusterings based on recent research [*] and compose a survey of current evaluation methods. All methods will be compared systematically and extensively for different settings and concepts of density. Furthermore, we will investigate state-of-the-art benchmark data sets w.r.t. their suitability for the evaluation of density-based clustering algorithms considering our new measure.

[*] Beer, Anna, et al. "Connecting the Dots--Density-Connectivity Distance unifies DBSCAN, k-Center and Spectral Clustering." Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2023.

Contact: Anna Beer

+ 3d-Shape Descriptors and Molecular Descriptors for Clustering [Practical Course or Master Thesis]

The data considered in this thesis comes from the field of Molecular Dynamics (MD) and describes the positions of a protein's atoms over time. Clustering timesteps makes it possible to detect different states of a protein, which is important for understanding the protein folding process. For example, these states can serve as the basis for a Markov State Model of the protein's trajectories.

In this work, we compare results based on the usually applied molecular descriptors with 3d-Shape descriptors that were recently developed for general point clouds.
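
To illustrate how cluster labels of timesteps can feed a Markov State Model, here is a minimal sketch that estimates a row-normalized transition matrix from a sequence of state labels at lag 1. The label sequence is invented for illustration, and `transition_matrix` is a hypothetical helper; real MSM pipelines typically add lag-time selection and further validation steps.

```python
def transition_matrix(state_sequence, n_states):
    """Estimate transition probabilities between states from consecutive timesteps."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(state_sequence, state_sequence[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows of states that are never left stay all-zero.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Toy trajectory (assumption): cluster label of each timestep.
labels = [0, 0, 1, 1, 1, 0, 2, 2, 0, 0]
T = transition_matrix(labels, 3)
for row in T:
    print(row)
```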

+ Graph Neural Networks and Molecular Fingerprints

Early graph neural networks (GNNs) were motivated by circular chemical fingerprints, generalizing their concept by introducing learnable parameters. However, recent studies show that current state-of-the-art GNN architectures are still outperformed in molecular property prediction by concrete implementations of circular fingerprints combined with a multilayer perceptron. The task of this project is to explore the reasons for this through a detailed analysis of both methods. You will investigate a concrete implementation of a chemical fingerprint and develop a new neural version, starting by reproducing the fingerprint and adding neural modules stepwise. The different stages should be investigated in an ablation study.
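
The idea behind circular fingerprints can be sketched in a few lines: every atom starts with an identifier, identifiers are repeatedly re-hashed together with those of the neighbors, and all identifiers seen are folded into a fixed-length bit vector. The code below is a simplified illustration under stated assumptions (atom symbols only, no bond orders, CRC32 as the hash, hypothetical function names); it is not a reimplementation of any specific published fingerprint.

```python
import zlib


def stable_hash(obj):
    """Deterministic hash (unlike Python's built-in str hash, which is salted per process)."""
    return zlib.crc32(repr(obj).encode("utf-8"))


def circular_fingerprint(atoms, bonds, radius=2, n_bits=64):
    """Fold iteratively refined atom-neighborhood identifiers into a bit vector."""
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    ids = [stable_hash(symbol) for symbol in atoms]
    features = set(ids)
    for _ in range(radius):
        # New identifier: own identifier plus the sorted identifiers of all neighbors.
        ids = [
            stable_hash((ids[i], tuple(sorted(ids[j] for j in neighbors[i]))))
            for i in range(len(atoms))
        ]
        features.update(ids)
    bits = [0] * n_bits
    for feature in features:
        bits[feature % n_bits] = 1
    return bits


# Heavy atoms of ethanol, written in two different atom orders.
fp_a = circular_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
fp_b = circular_fingerprint(["O", "C", "C"], [(0, 1), (1, 2)])
print(fp_a == fp_b)
```

Sorting the neighbor identifiers makes the result independent of the atom numbering. A neural variant would replace the fixed hash and the folding step with learnable functions.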

+ Building self-explanatory transparent models

Explainable Artificial Intelligence (XAI) focuses on providing explanations for domain experts and data scientists. However, 'lay people', i.e., people with little or no technical knowledge, are often subject to the consequences of deploying algorithmic decision-making systems in public institutions, such as the COMPAS recidivism model. Finding new ways to render machine learning models comprehensible for lay people is therefore a current and highly relevant challenge. This project will focus on approaches for building machine learning models that are in themselves interpretable and explainable for lay people, requiring little or no external explanation. It will involve both programming and conducting user studies with lay people.

Contact: Timothée Schmude, Sebastian Tschiatschek

+ Attentive pixel prediction (Reinforcement Learning)

Auxiliary prediction tasks have been the source of many recent performance improvements for reinforcement learning agents, particularly when working with image inputs. These tasks are usually provided as expert knowledge by the algorithm designer and force the agent to pay attention to various aspects of the environment which might be helpful for solving the underlying task. In this project you will investigate whether such prediction tasks can also be learned by the agent itself. For this, you will first visualize what an RL agent actually "attends to" by investigating heatmaps of its first network layer. Then you will devise a method allowing the agent itself to decide which pixels of the state it wants to learn to reconstruct.

Required skills: Good programming skills, basic background in deep learning

Contact: Timo Klein, Sebastian Tschiatschek

+ The Complexity of Computing the Graph Edit Distance

The graph edit distance (GED) quantifies the dissimilarity of two graphs as the minimum cost of a sequence of edit operations turning one graph into the other. The problem complexity depends on the choice of the edit operation costs and the properties of the considered graphs. It has been shown that the classical NP-hard subgraph isomorphism problem (SI) and the maximum common subgraph problem (MCS) can be reduced to computing the GED. As a consequence, computing the GED also is NP-hard. However, for SI and MCS a fine distinction of the complexity depending on parameters such as treewidth, degree, and their combination is known. The goal of this project is to investigate the complexity of computing the graph edit distance in restricted graph classes identifying polynomial-time solvable and NP-hard cases.
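
To make the definition concrete, the following sketch computes the exact GED of two tiny node-labeled, undirected graphs by brute force. It assumes unit costs for every node and edge operation and graphs without self-loops; `graph_edit_distance` and `DUMMY` are names chosen for this illustration. Because it enumerates all node mappings, the running time grows factorially, which is in line with the hardness results mentioned above.

```python
from itertools import permutations

DUMMY = object()  # placeholder node used to model node insertions and deletions


def graph_edit_distance(labels1, edges1, labels2, edges2):
    """Exact GED with unit costs, found by trying every node mapping (tiny graphs only)."""
    n1, n2 = len(labels1), len(labels2)
    n = n1 + n2
    # Pad both graphs so that every node may also be mapped to a dummy node.
    padded1 = list(labels1) + [DUMMY] * n2
    padded2 = list(labels2) + [DUMMY] * n1
    target_edges = {frozenset(e) for e in edges2}
    best = float("inf")
    for perm in permutations(range(n)):
        # Label mismatch = substitution; real-to-dummy = deletion; dummy-to-real = insertion.
        node_cost = sum(1 for i in range(n) if padded1[i] != padded2[perm[i]])
        mapped_edges = {frozenset((perm[u], perm[v])) for u, v in edges1}
        # Edges present on only one side must be deleted or inserted.
        edge_cost = len(mapped_edges ^ target_edges)
        best = min(best, node_cost + edge_cost)
    return best


# Path C-C-O versus a triangle of three C atoms.
path = (["C", "C", "O"], [(0, 1), (1, 2)])
triangle = (["C", "C", "C"], [(0, 1), (1, 2), (0, 2)])
print(graph_edit_distance(*path, *triangle))
```

In this example one label substitution and one edge insertion are needed, so the distance is 2.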

Requirements: Strong interest in graph algorithms and theoretical computer science

+ Reinforcement Learning for improving mental health treatments

In this project/thesis you will study the potential of using reinforcement learning (RL) for improving mental health treatments. In particular, you will investigate the potential of RL algorithms for optimizing exposure therapy protocols, which are usually used for the treatment of anxiety disorders, such as phobias.

Specifically, you will conduct simulation studies in which a computational associative learning model (such as the latent cause model) will be a stand-in for the patients, and in which an RL algorithm will make modifications to the exposure protocol with the goal of robustly extinguishing negative associations, which play a crucial role in anxiety disorders. The investigation can start with standard model-free RL approaches, but is then planned to proceed with model-based augmentations, such as equipping the RL algorithm with "Theory-of-Mind"-type capabilities.

Students working on this project need basic background knowledge in machine learning, solid programming skills, a desire to work on reinforcement learning, and willingness to learn about models of human cognition.

+ Efficient algorithms for uncertain graphs [Master Thesis]

Uncertain graphs represent an important data model for real-world networks, where one can assign an independent probability of existence on every edge. Uncertainty, represented by the edge probability, may arise due to several reasons, e.g., errors in measurements, data integration from inconsistent and ambiguous sources, lack of precise information needs, inference and prediction models, and explicit manipulation for privacy purposes. Due to their rich expressiveness and utility in a wide range of applications, uncertain graphs have prompted a great deal of research by the data mining research community.
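
The data model can be illustrated with a small Monte Carlo sketch: each edge is kept independently with its existence probability, yielding one "possible world", and a query such as reachability is averaged over many sampled worlds. The toy graph, the function names, and the choice of reachability as the query are assumptions made for this example.

```python
import random
from collections import deque


def sample_possible_world(edge_probs, rng):
    """Keep each edge independently with its existence probability."""
    return [edge for edge, p in edge_probs.items() if rng.random() < p]


def is_reachable(edges, source, target):
    """Breadth-first search on an undirected edge list."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def reachability_probability(edge_probs, source, target, n_samples=20000, seed=0):
    """Estimate the probability that target is reachable from source."""
    rng = random.Random(seed)
    hits = sum(
        is_reachable(sample_possible_world(edge_probs, rng), source, target)
        for _ in range(n_samples)
    )
    return hits / n_samples


# Toy uncertain graph (assumption): a path a-b-c with two independent edges.
toy = {("a", "b"): 0.5, ("b", "c"): 0.5}
print(reachability_probability(toy, "a", "c"))
```

For this path the exact answer is 0.5 * 0.5 = 0.25, so the estimate should land close to that value.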

This work will mainly focus on designing efficient algorithms for selecting and modifying the probability of a given number of edges in an uncertain graph. The developed algorithm will be evaluated against other existing methods. An experimental study will be conducted on real-world datasets. Students working on this project need basic knowledge in graph theory and algorithms, and solid programming skills in Python, C/C++ or Java.

Contact: Yllka Velaj, Claudia Plant

+ Common Subgraph Problems in Tree-Like Graphs

For two given graphs, G and H, the maximum common subgraph problem (MCS) asks for the largest graph contained in both G and H. An important application occurs in cheminformatics, where the similarity of molecular graphs needs to be quantified. Unfortunately, MCS is NP-hard unless additional constraints regarding the properties of G, H, and the subgraph are enforced. A polynomial-time solvable case requires that all these graphs are trees. Known negative complexity results for classes of tree-like graphs set narrow boundaries for generalization.

The project aims to investigate particular tree-like graph classes to design, implement, and experimentally evaluate new polynomial-time algorithms. New NP-hardness proofs for specific cases should refine the complexity status further.

Requirements: Strong interest in algorithms and graph theory

+ Dynamic Orbit Partition [Practical Course or Master Thesis]

A graph isomorphism is a mapping between the nodes of two graphs that preserves the edges. An automorphism is an isomorphism of a graph to itself. The (non-trivial) automorphisms intuitively represent the symmetries of a graph. Two vertices are considered equivalent if there is an automorphism that maps one vertex to the other. The corresponding equivalence classes are referred to as the orbit partition. The orbit partition is of interest for symmetry breaking techniques for various combinatorial problems. Identifying and pruning symmetric branches in the search tree based on the symmetries of the input instance can lead to drastic speed-ups. This project aims to develop dynamic algorithms for maintaining the orbit partition when the input graph changes. The focus of the project can be on the development and theoretical analysis of algorithms or their implementation and experimental evaluation.
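
For very small graphs, the orbit partition can be computed directly from the definition. The sketch below enumerates all node permutations, keeps those that map the edge set onto itself (the automorphisms), and groups each vertex with all of its images. It is a static brute-force illustration with a hypothetical function name, not one of the dynamic algorithms the project aims for.

```python
from itertools import permutations


def orbit_partition(n, edges):
    """Orbits of the automorphism group of an undirected graph on nodes 0..n-1."""
    edge_set = {frozenset(e) for e in edges}
    orbit = [{v} for v in range(n)]
    for perm in permutations(range(n)):
        image = {frozenset((perm[u], perm[v])) for u, v in edges}
        if image == edge_set:  # perm preserves the edges, so it is an automorphism
            for v in range(n):
                orbit[v].add(perm[v])
    # Vertices of the same orbit collected the same set, so duplicates collapse.
    return sorted({frozenset(o) for o in orbit}, key=min)


path = [(0, 1), (1, 2), (2, 3)]  # path 0-1-2-3
star = [(0, 1), (0, 2), (0, 3)]  # center 0 with three leaves
print(orbit_partition(4, path))
print(orbit_partition(4, star))
```

On the path, the two end vertices form one orbit and the two inner vertices another; on the star, the center is alone and the three leaves are equivalent.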

Requirements: Strong interest in algorithms

Contact: Nils Kriege, Christian Schulz, Kathrin Hanauer

+ Interpretable and Explainable Deep Learning

In this project you will study novel approaches for learning interpretable and explainable deep learning models with the goal of supporting users with different skill levels and knowledge. The models will be evaluated on applications with large societal impact, e.g., labour market data.

+ Dynamic Information Acquisition in Questionnaires

In many areas of life, questionnaires or collections of measurements are used to gather information for decision making: questionnaires are used to measure the success of marketing campaigns or customer satisfaction, and medical tests are conducted to decide on the best treatment for a patient. Common to these applications is that many questions need to be answered or many tests must be conducted. Often this is not because all the information is needed for deriving a decision but rather because the course of action is standardized (e.g., a survey always consists of the same questions).

In this project you will evaluate and develop methods for adaptive information acquisition that acquire the information necessary for decision making sequentially and adaptively, e.g., not all questions in a questionnaire are asked in all cases and the order of the questions can be changed. You will consider and evaluate recent machine learning models for this problem but also investigate how these methods can be extended to account for dependencies in the information that can be required, e.g., dependencies that only allow for a specific order of some questions of a questionnaire, the influence of how questions are asked on the answers, etc.

+ Deep Probabilistic Clustering for Heterogeneous and Incomplete Data

Clustering is the task of finding groups in a data set. As an unsupervised learning approach it is important for exploratory data analysis, e.g., for finding commonalities between patients suffering from some disease. Often such data is high-dimensional and contains complicated non-linear relationships, which makes it difficult to apply standard clustering approaches. Deep clustering algorithms have been proposed recently to approach this kind of setting by combining clustering and deep learning in a common framework. An open problem that has not been sufficiently addressed in recent deep clustering research is the handling of incomplete and heterogeneous data (e.g., survey data, blood tests, MRI images). Incomplete data is often an issue in practice, especially for health data where not everything about a patient’s history is known.

In this project you will study the clustering of incomplete data with heterogeneous data types with a particular focus on missingness and on determining which information is relevant for cluster assignments. You will develop an algorithm based on previous work and evaluate it on synthetic and real-world data. Some research questions of interest would be for instance: How do cluster assignments change when given new information? Or, given an uncertain cluster assignment (e.g. sick or healthy patient) which information would we need to query to make a decision?

+ Multi-agent Teaching Primitives

Enabling multiple interacting agents to convey their knowledge and skills through teaching.

+ Abstraction in Reinforcement Learning

Learning abstractions of large state and action spaces in order to facilitate more efficient planning and exploration.

+ Bivariate causal measures: Experiments with existing causal methods in python on synthetic and real data [Practical Course]

Bivariate causal measures: Experiments with existing causal methods in python on synthetic and real data.

Contact: Katerina Schindlerova

+ Oracle analysis of distant supervision errors

Please see [Moodle] for a detailed description of this and other Natural Language Processing topics.

Contact: Benjamin Roth

+ Weakly supervised discourse relation prediction

+ Incomplete schema relation clustering

+ Weakly supervised learning with latent class predictions

+ Gradient matching for semi-supervised learning

+ Threshold-finding for knowledge-base completion using Gaussian processes

+ Path-based knowledge-base completion

+ Better sentence representations based on BERT

+ Explainable policies for game play [Master Thesis]

Reinforcement learning has shown remarkable successes in playing board games like Go but also in playing computer games like DOTA. While these results are impressive, the learned strategies (policies) are typically not directly interpretable by humans. However, interpretability is important for instance to enable better collaboration of humans with AI agents or to assess safety of the learned behavior.

In this project you will make such learned policies easier for humans to understand by incorporating constraints on their structure. You will develop and implement a novel policy training algorithm and evaluate it on simple computer games.

Students wanting to work on this topic are expected to have solid coding skills in Python, basic knowledge in PyTorch or TensorFlow, and be curious to learn about reinforcement learning and its applications.

+ The Cost of Feedback

Reinforcement learning has shown remarkable successes in application domains like game play and robotics. These successes were achieved by performing large numbers of interactions of a learning agent with the environment. In these interactions, the agent constantly receives information about rewarding behavior. However, in many applications involving human users (e.g., personal assistants), the reward information is explicitly or implicitly provided by these human users. Therefore, this information must be treated as a scarce and valuable resource, and the (cognitive) cost of providing this information should be accounted for.

In this project you will study the cost of providing different types of feedback used for reinforcement learning in user studies. To this end, you will implement a simple reinforcement learning environment for which a human user can provide reward information in various forms (e.g., comparisons, sorted lists, etc.) and compare the cost of providing this feedback as well as the usefulness of the feedback for training the reinforcement learning agent. Depending on whether you are doing a P1/P2/Bachelor Thesis/Master Thesis, the scope of the project will be adjusted.

Students wanting to work on this topic are expected to have solid coding skills in Python, basic knowledge about the development of interactive web pages, and be curious to learn about reinforcement learning and its applications.

+ Clustering of physiological, behavioral and climatic data of wild boars (Sus scrofa) [Bachelor or Master thesis]

This thesis will mainly focus on developing a database combining data retrieved from bio-loggers (body temperature, heart rate, acceleration data, location data), reproductive success and climate data collected during a research project on wild boars at the Research Institute of Wildlife Ecology, University of Veterinary Medicine. The aim of this work is to obtain a functional database that allows researchers to access and evaluate data on different aspects of climate change effects on wild boar ecology in the long term.

If the student is interested, this basic approach can be broadened by using and adapting, e.g., a machine learning process for the interpretation of acceleration data. We have already used machine learning to classify video-taped behaviors (lying, walking, trotting, lactating) using acceleration data (collected via ear-tags). However, it would be very interesting to see whether machine learning could also help us detect daily and annual patterns of activity in these data, affected by, e.g., high ambient temperatures.

We would be happy to find an interested student to support the ecological evaluation of these data.

Contact: Claudia Bieber, Claudia Plant

+ Reinforcement Learning from Implicit and Explicit Feedback

Recent advances in reinforcement learning have enabled super-human performance of AI agents in many applications, e.g., game play. In this project you will work on building theory and/or models for reinforcement learning from implicit and explicit feedback. In particular, you will consider the setting in which no rewards are observed during execution of the agent but only cumulative reward information is obtained at the end of the execution. As a practical example, assume playing a computer game in which you cannot observe the score while you play the game but only see your achieved score at the very end. This setting has important applications in human-in-the-loop settings in which feedback can be expensive to obtain.

Students working on this project need basic background knowledge in machine learning, solid programming skills in Python, and the desire to work on reinforcement learning.

+ Machine Learning for Personalized Education [Practical Course or Bachelor Thesis]

In this project you will work on predicting students' responses to mathematical questions using machine learning models. This is important to assess a student's skills and provide personalized educational resources. We use real-world data provided by the "Diagnostic Questions: Predicting Student Responses and Measuring Question Quality" challenge at NeurIPS'20. The challenge consists of four different tasks from which a subset can be selected for the project, depending on background knowledge and programming skills.

Students working on this project need basic background knowledge in machine learning, programming skills in Python, and the desire to develop and test machine learning models.

+ Reward Inference for Sequential Decision Making from Diverse and Implicit Feedback [Master Thesis]

Automated sequential decision making is an important application of machine learning systems in which such a system needs to select a sequence of actions step by step to optimize a reward/utility function. For instance, in autonomous driving, such a system needs to execute a sequence of steering, braking and acceleration actions, or in a medical intensive care setting, such a system needs to execute a sequence of measurement and treatment actions.

One challenge in realizing such automated sequential decision making systems is the definition of the reward/utility function. For example, in autonomous driving it is hard to specify all the factors which define good driving behavior. In such settings, automatically inferring the reward/utility function from users’ feedback can be beneficial.

This project investigates approaches for reward/utility inference from diverse and implicit feedback, building on ideas for inverse reinforcement learning, active learning, implicit feedback, etc.

Interested students are expected to have solid mathematical and machine learning skills, and have experience in Python and deep learning (using PyTorch or TensorFlow).

+ Imitation Learning Under Domain Mismatch

Reinforcement learning has been successfully used to solve certain challenging sequential decision making problems in recent years. The employed techniques commonly require (i) huge amounts of interactions with the environment and (ii) clearly specified reward signals to work well. In many applications, however, one or both of these requirements are not met. In such cases, imitation learning can be an efficient approach to sequential decision making problems: an expert demonstrates near-optimal behavior and a learning agent attempts to mimic this behavior.

This project considers imitation learning in settings in which there is some form of mismatch between the expert demonstrator and the learning agent. The scope of the project is to study how existing algorithms perform in this setting and to propose modifications to existing algorithms to achieve better performance.

Students wanting to work on this topic are expected to have a basic understanding of machine learning techniques, solid knowledge of Python and basic knowledge of deep learning libraries (PyTorch or TensorFlow).

+ Posterior Consistency in Partial Variational Autoencoders

Variational Autoencoders (VAEs) are powerful deep generative models that have been successfully applied in a wide range of machine learning applications. Recently, the Partial VAE (PVAE), a variant of VAEs that can process partially observed inputs has been proposed and its effectiveness for data imputation has been demonstrated. Key to the fast training of VAEs and PVAEs is the amortized prediction of posterior distributions from observations. In PVAEs, these posterior distributions are predicted from partial observations.

This project aims at studying the consistency of these posterior distributions for different patterns of missing data. The insights are used to create/train better inference models and thereby improve the quality of PVAEs.

Students wanting to work on this topic are expected to have solid mathematical skills, a basic understanding of machine learning techniques and good programming skills in Python.

+ Pharmacoinformatics Research Group [Practical Course]

The Pharmacoinformatics Research Group is led by Univ.-Prof. Dr. Gerhard Ecker at the Department of Pharmaceutical Chemistry.

Following a holistic pharmacoinformatic approach we combine structural modeling of proteins, structure-based drug design, chemometric and  in silico  chemogenomic methods, statistical modeling and machine learning approaches to develop predictive computational systems for transporters and ion channels.

We work with workflow management systems like KNIME for data integration, we do statistical analysis in R, we program predictive models in Python and at times offer these tools to fellow researchers, and this is the part where you come in: We often need help in making our tools openly accessible, such as translating them into a web service or turning them into software.

For a recent tool take a look at our  LiverTox Workspace .

Contact: Jana Gurinova, Claudia Plant

+ Causal inference among climatological time series with extreme events [Master thesis]

This work will focus mainly on the implementation of an algorithm for causal detection among processes, as well as on testing it on heavy-tailed probability distributions and on climatological data following these distributions, provided by ZAMG (Zentralanstalt für Meteorologie und Geodynamik). More details will be provided to interested students on request.

  • Melanija Kraljevska (Data Science), Master Thesis: "Classification of treatment response in depression patients using motif discovery" (supervised by Claudia Plant and co-supervised by Katerina Schindlerova), winter term 2023/2024
  • Luis Caumel Morales (Data Science), Master Thesis: "Clustering of Wind Related Time Series in a Wind Turbine Farm" (supervised by Claudia Plant and co-supervised by Katerina Schindlerova), winter term 2023/2024
  • Rainer Wöss, Bachelor Thesis: "Visualization of spatio-temporal influences of wind related meteorogical variables in a wind turbine farm in Andau", (supervised by Katerina Schindlerova), summer term 2023
  • Alexander Pintsuk, Bachelor Thesis: "Visualization of causal inference for wind turbine extreme events", (supervised by Katerina Schindlerova), winter term 2022/2023
  • Christina Pacher (Scientific Computing), Master Thesis: "Analysis of an EEG Database of Depression Patients by means of Graphical Granger Causality" (supervised by Claudia Plant and co-supervised by Katerina Schindlerova), winter term 2022/2023
  • Mykola Lazarenko (Business Analytics), Master Thesis: "Clustering brain regions by similar interaction patterns based on multivariate neural signals for identifying the response to antidepressants" (supervised by Claudia Plant and co-supervised by Katerina Schindlerova), winter term 2022/2023
  • Wei Chen, Bachelor Thesis: "Mining Brain Networks", winter term 2022/2023
  • Kejsi Hoxhallari, Bachelor Thesis: "Statistical validation and visualization of causal inference with extremes in wind-turbine data set", winter term 2022/2023
  • Daan Scheepens, Master Thesis: "A deep convolutional RNN model for spatio-temporal prediction of wind speed extremes in the short-to-medium range for wind energy applications", winter term 2021/2022
  • Yigit Berkay Bozkurt, Bachelor Thesis: "Anomaly Detection by Heterogenous Graphical Granger Causality and its Application to Climate Data", 2019
  • Christina Pacher, Bachelor Thesis: "Clustering Weather Stations: A Clustering Application for Meteorological Data", summer term 2019
  • Thomas Spendlhofer, Bachelor Thesis: "Evaluating the usage of Tensor Processing Units (TPUs) for unsupervised learning on the example of the k-means algorithm", summer term 2019
  • Ernst Naschenweng, Bachelor Thesis: "A cache optimized implementation of the Floyd-Warshall Algorithm", summer term 2018
  • Hermann Hinterhauser, Bachelor Thesis: "ITGC: Information-theoretic grid-based clustering", summer term 2018, accepted paper in EDBT 2019 ( download available here )
  • Mahmoud A. Ibrahim, Bachelor Thesis: "Parameter Free Mixed-Type Density-Based Clustering", winter term 2017/2018, accepted paper in DEXA 2018 ( download available here )
  • Markus Tschlatscher: "Space-Filling Curves for Cache Efficient LU Decomposition", winter term 2017/2018
  • Theresa Fruhwuerth, Master Thesis: "Uncovering High Resolution Mass Spectrometry Patterns through Audio Fingerprinting and Periodicity Mining Algorithms: An Exploratory Analysis", summer term 2017
  • Robert Fritze, PR1 "Combining spatial information and optimization for locating emergency medical service stations: A case study for Lower Austria", summer term 2017
  • Alexander Pfundner, PR2 "Integration of Density-based and Partitioning-based Clustering Methods", summer term 2017
  • Anton Kovác, Katerina Hlavácková-Schindler, Erasmus project, "Graphical Granger Causality for Detection Temporal Anomalies in EEG Data", winter term 2016/2017 ( download available here )

Practical Course, Bachelor or Master Thesis: Kernelization for the Maximum Common Subgraph Problem

Given two graphs, the maximum common subgraph problem asks for the largest graph that is contained in both as a subgraph. This NP-hard problem is highly relevant in many applications such as computational drug discovery. The goal of this project is to develop scalable algorithms following the concept of kernelization, i.e., the (iterative) reduction of the problem to smaller instances.

A classical technique reduces the maximum common subgraph problem to finding a maximum clique in the product graph of the two input graphs. Equivalently, a maximum independent set of the complement of the product graph can be determined instead. For this problem algorithms based on kernelization have been shown to be highly efficient in practice recently. In this project the properties of the product graph and its complement should be studied theoretically and the performance of the reduction should be investigated in practice. The kernelization could be improved further using specific properties of the product graph.
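
The reduction can be sketched as follows, assuming unlabeled undirected graphs, the modular product as the product graph, and the induced variant of the common subgraph problem. The clique search is a plain branch-and-bound for illustration only and involves no kernelization; all function names are hypothetical.

```python
from itertools import product


def modular_product(n_g, edges_g, n_h, edges_h):
    """Adjacency sets of the modular product of graphs G and H."""
    eg = {frozenset(e) for e in edges_g}
    eh = {frozenset(e) for e in edges_h}
    vertices = list(product(range(n_g), range(n_h)))
    adj = {p: set() for p in vertices}
    for (u1, v1) in vertices:
        for (u2, v2) in vertices:
            if u1 != u2 and v1 != v2:
                # Compatible pairs: both node pairs are adjacent, or both are non-adjacent.
                if (frozenset((u1, u2)) in eg) == (frozenset((v1, v2)) in eh):
                    adj[(u1, v1)].add((u2, v2))
    return adj


def max_clique(adj):
    """Maximum clique by simple branch-and-bound (exponential time in general)."""
    best = []

    def extend(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = list(clique)
        if len(clique) + len(candidates) <= len(best):
            return  # this branch cannot beat the best clique found so far
        remaining = set(candidates)
        for v in sorted(candidates):
            remaining.discard(v)
            extend(clique + [v], remaining & adj[v])

    extend([], set(adj))
    return best


triangle = (3, [(0, 1), (1, 2), (0, 2)])
path = (3, [(0, 1), (1, 2)])
mapping = max_clique(modular_product(*triangle, *path))
print(len(mapping), mapping)
```

Each vertex of the clique is a pair (node of G, node of H), so the clique directly encodes the node mapping of a common induced subgraph. For the triangle and the path, the largest common induced subgraph is a single edge, so the clique has two vertices.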

Students wanting to work on this topic are expected to have experience in graph algorithms and solid programming skills in C++.

Practical Course, Bachelor or Master Thesis: Learning with Reduced Molecular Graphs

Drug discovery at its early stage can greatly benefit from machine learning methods. As molecules are structured objects consisting of atoms and chemical bonds, they cannot be represented by vectors in a straightforward way, but are adequately modeled as graphs. Recent advances in machine learning with graphs led to well-engineered methods applicable to graphs annotated with node and edge attributes, e.g., graph neural networks. For molecules, different graph representations exist. The most natural approach is to represent atoms as nodes and bonds as edges. However, different so-called reduced graph models exist, where groups of atoms are represented by a single node and their properties by node attributes. The goal of this project is to compare the performance of the various graph learning techniques with different graph models. Based on the results, tailored combinations of methods and representations should be developed.

Students wanting to work on this topic are expected to have experience in machine learning and basic knowledge on graph theory and algorithms.

ERCIS Master Theses Series

In 2024 we established a new publication option for excellent master’s theses within the network, the “ERCIS Master Theses Series.”

Each thesis, published as ERCIS master thesis…

  • is an excellent thesis, nominated by the supervisor,
  • deals with a relevant topic within the field of Information Systems,
  • is assigned its own DOI, making it an official publication,
  • is indexed in relevant directories, including e.g. Google Scholar.

With this series, we want to give our excellent students in the network the opportunity to publish outstanding work and contribute to the dissemination.

An overview of all published theses can be found here: https://doi.org/10.17879/86928653283

ERCIS Master Thesis Issue 1

How to publish

The publication process is rather straightforward. A nomination comes along with

  • the thesis as PDF, simply the way it has been submitted, stripped of any personal data (i.e., email address, home address, phone number),
  • a laudation text from the nominator,
  • a signed consent form, providing the publication server of the University of Münster with the non-exclusive right to publish the thesis, as well as the licence under which the author requests the thesis to be published,
  • a video of the author, briefly presenting the thesis.

We will add a cover page to the thesis and fill in the details like the URN and DOI.

Get in contact with us if you would like to nominate a candidate!

© 2024 European Research Center For Information Systems

COMMENTS

  1. Research Topics & Ideas: Data Science

    Data Science-Related Research Topics. Developing machine learning models for real-time fraud detection in online transactions. The use of big data analytics in predicting and managing urban traffic flow. Investigating the effectiveness of data mining techniques in identifying early signs of mental health issues from social media usage.

  2. Latest Research and Thesis topics in Data Mining

    Topics to study in data mining. Data mining is a relatively new thing and many are not aware of this technology. This can also be a good topic for M.Tech thesis and for presentations. Following are the topics under data mining to study: Fraud Detection. Crime Rate Prediction.

  3. Data Mining

    Student thesis: Master. File. A Study of an Open-Ended Strategy for Learning Complex Locomotion Skills Zhou, F. (Author), Vanschoren, J. (Supervisor 1), 31 Aug 2021. Student thesis: Master. ... including those for text and data mining, AI training, and similar technologies. For all open access content, the Creative Commons licensing terms apply

  4. data mining

    First, talk to your thesis advisor before committing to a project. They know better than I do. Secondly, just analyzing a new dataset using standard techniques doesn't make for a good masters thesis. Your project is expected to use some sort of novel approach.

  5. data mining Latest Research Papers

    The accurate average value is 74.05% of the existing COID algorithm, and our proposed algorithm has 77.21%. The average recall value is 81.19% and 89.51% of the existing and proposed algorithm, which shows that the proposed work efficiency is better than the existing COID algorithm. Download Full-text.

  6. Dissertations / Theses on the topic 'Data mining'

    Consult the top 50 dissertations / theses for your research on the topic 'Data mining'. Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

  7. Data Mining Thesis Topics

    Data Mining Thesis. Data Mining Thesis ideas and topics are drafted by us, so you can get a unique, clean and informative thesis that earns you a higher grade. Data mining is an important domain that deals with the detection of unrecognized or hidden patterns. Encompassing the research methodology, we suggest an extensive summary based on a ...

  8. MASTER'S THESIS

    Data mining is an area where computer science, machine learning and statistics meet ... This thesis for the degree of Master in Science and Engineering at Luleå University of Technology was made at the company Ågrenshuset Prepress IT AB in Örnsköldsvik. This particular topic about analysing data was selected after a discussion with ...

  9. (PDF) APPLYING DATA MINING TECHNIQUES OVER BIG DATA

    Data mining is concerned with knowledge discovery and finding patterns in datasets through a process of applying the model to the data [13]. The model, the heart of the data mining process, is ...

  10. PDF The application of data mining methods

    This thesis first introduces the basic concepts of data mining, such as the definition of data mining, its basic function, common methods and basic process, and two common data mining methods, classification and clustering. Then a data mining application in network is discussed in detail, followed by a brief introduction on data mining ...

  11. PDF MASTER'S THESIS Deep Learning for text data mining: Solving ...

    In automatic text data mining, the current challenge was to have a rich text type classification. A "rich" classification means going beyond basic categories like "string", "number", "date", etc. that are common in programming languages and databases. A "rich" classification should be able to detect categories of

  12. Data Mining Research Topics for MS PhD

    Applying data mining to telecom churn management. A data mining approach to the prediction of corporate failure. Algorithms and applications for spatial data mining. Mining educational data to analyze students' performance. An attacker's view of distance preserving maps for privacy preserving data mining.

  13. Thesis Topics

    Thesis Topics Prerequisites. Theses in our group usually require good knowledge of: mathematics and statistics, in particular probability theory; data mining and artificial intelligence; strong programming skills in Java, Rust, Python; In a bachelor thesis you will usually: read primary literature on novel methods

  14. Innovative Research Topics on Data Mining (Latest Titles)

    Research Topics on Data Mining offers you creative ideas to give your research a strong start. We have 100+ world-class professionals who contribute their innovative ideas to your research project. We have conducted 500+ workshops throughout the world, and a large number of researchers and students ...

  15. Innovative Data Mining Research Topics (Research Guidance)

    Data Mining Research Topics is our research package, in which we offer thousands of research topics for students and research scholars. Scholars always seek reliable guidance for completing their projects. They want to make sure that they are in safe hands when it comes to framing their thesis. We tell you that you can also cease your search ...

  16. 82 Data Mining Essay Topic Ideas & Examples

    Commercial Uses of Data Mining. The data mining process entails the use of large relational databases to identify the correlations that exist in a given data set. The principal role of the applications is to sift the data to identify correlations. A Discussion on the Acceptability of Data Mining.

  17. Trending Data Mining Thesis Topics

    Integration of MapReduce, Amazon EC2, S3, Apache Spark, and Hadoop into data mining. These are the recent trends in data mining. We insist that you choose one of the topics that interest you the most. Having an appropriate content structure or template is essential while writing a thesis.

  18. Statistical Learning and Data Science Chair :: Theses

    The chair typically offers various thesis topics each semester in the areas of computational statistics, machine learning, data mining, optimization and statistical software. ... (bachelor thesis) and 40 minutes (master thesis). Here, the student is expected to summarize the main results of the thesis in a presentation. The supervisor(s) will ...

  19. Mining Engineering Graduate Theses and Dissertations

    Truck Cycle and Delay Automated Data Collection System (TCD-ADCS) for Surface Coal Mining, Patricio G. Terrazas Prado. New Abutment Angle Concept for Underground Coal Mining, Ihsan Berk Tulu. Theses/Dissertations from 2011: Experimental analysis of the post-failure behavior of coal and rock under laboratory compression tests, Dachao ...

  20. Theses and Dissertations--Mining Engineering

    INVESTIGATION INTO, & ANALYSIS OF TEMPERATURE & STRAIN DATA FOR COAL MINE SEAL MATERIAL DURING CURING, Stephanus Jaco van den Berg. Theses/Dissertations from 2021: DEVELOPMENT OF AN AUTONOMOUS NAVIGATION SYSTEM FOR THE SHUTTLE CAR IN UNDERGROUND ROOM & PILLAR COAL MINES, Vasileios Androulakis.

  21. Dissertations / Theses: 'Data mining and knowledge discovery ...

    Using data mining techniques with clinical data has become an interesting tool to prevent, diagnose or treat CVD. In this thesis, Knowledge Discovery and Data Mining (KDD) was employed to analyse clinical and demographic data, which could be used to diagnose coronary artery disease (CAD).

  22. Open topics for theses and practical courses

    Unless otherwise specified, all topics are available as practical course (P1/P2), Data Science projects, bachelor or master thesis. Work group: Data Mining. Natural Language Processing. Machine Learning with Graphs. Probabilistic and Interactive Machine Learning. Database Techniques for Data Mining. Scalable Algorithms for Graph Mining.

  23. ERCIS Master Theses Series

    ERCIS Master Theses Series In 2024 we established a new publication option for excellent master's theses within the network, the "ERCIS Master Theses Series." Each thesis, published as ERCIS master thesis… is an excellent thesis, nominated by the supervisor, deals with a relevant topic within the field of Information Systems, is assigned its own DOI, […]