Grad Coach

Research Topics & Ideas: Data Science

50 Topic Ideas To Kickstart Your Research Project

Research topics and ideas about data science and big data analytics

If you’re just starting out exploring data science-related topics for your dissertation, thesis or research project, you’ve come to the right place. In this post, we’ll help kickstart your research by providing a hearty list of data science and analytics-related research ideas, including examples from recent studies.

PS – This is just the start…

We know it’s exciting to run through a list of research topics, but please keep in mind that this list is just a starting point. The topic ideas provided here are intentionally broad and generic, so you will need to develop them further. Nevertheless, they should inspire some ideas for your project.

To develop a suitable research topic, you’ll need to identify a clear and convincing research gap, and a viable plan to fill that gap. If this sounds foreign to you, check out our free research topic webinar that explores how to find and refine a high-quality research topic, from scratch. Alternatively, consider our 1-on-1 coaching service.


Data Science-Related Research Topics

  • Developing machine learning models for real-time fraud detection in online transactions.
  • The use of big data analytics in predicting and managing urban traffic flow.
  • Investigating the effectiveness of data mining techniques in identifying early signs of mental health issues from social media usage.
  • The application of predictive analytics in personalizing cancer treatment plans.
  • Analyzing consumer behavior through big data to enhance retail marketing strategies.
  • The role of data science in optimizing renewable energy generation from wind farms.
  • Developing natural language processing algorithms for real-time news aggregation and summarization.
  • The application of big data in monitoring and predicting epidemic outbreaks.
  • Investigating the use of machine learning in automating credit scoring for microfinance.
  • The role of data analytics in improving patient care in telemedicine.
  • Developing AI-driven models for predictive maintenance in the manufacturing industry.
  • The use of big data analytics in enhancing cybersecurity threat intelligence.
  • Investigating the impact of sentiment analysis on brand reputation management.
  • The application of data science in optimizing logistics and supply chain operations.
  • Developing deep learning techniques for image recognition in medical diagnostics.
  • The role of big data in analyzing climate change impacts on agricultural productivity.
  • Investigating the use of data analytics in optimizing energy consumption in smart buildings.
  • The application of machine learning in detecting plagiarism in academic works.
  • Analyzing social media data for trends in political opinion and electoral predictions.
  • The role of big data in enhancing sports performance analytics.
  • Developing data-driven strategies for effective water resource management.
  • The use of big data in improving customer experience in the banking sector.
  • Investigating the application of data science in fraud detection in insurance claims.
  • The role of predictive analytics in financial market risk assessment.
  • Developing AI models for early detection of network vulnerabilities.


Data Science Research Ideas (Continued)

  • The application of big data in public transportation systems for route optimization.
  • Investigating the impact of big data analytics on e-commerce recommendation systems.
  • The use of data mining techniques in understanding consumer preferences in the entertainment industry.
  • Developing predictive models for real estate pricing and market trends.
  • The role of big data in tracking and managing environmental pollution.
  • Investigating the use of data analytics in improving airline operational efficiency.
  • The application of machine learning in optimizing pharmaceutical drug discovery.
  • Analyzing online customer reviews to inform product development in the tech industry.
  • The role of data science in crime prediction and prevention strategies.
  • Developing models for analyzing financial time series data for investment strategies.
  • The use of big data in assessing the impact of educational policies on student performance.
  • Investigating the effectiveness of data visualization techniques in business reporting.
  • The application of data analytics in human resource management and talent acquisition.
  • Developing algorithms for anomaly detection in network traffic data.
  • The role of machine learning in enhancing personalized online learning experiences.
  • Investigating the use of big data in urban planning and smart city development.
  • The application of predictive analytics in weather forecasting and disaster management.
  • Analyzing consumer data to drive innovations in the automotive industry.
  • The role of data science in optimizing content delivery networks for streaming services.
  • Developing machine learning models for automated text classification in legal documents.
  • The use of big data in tracking global supply chain disruptions.
  • Investigating the application of data analytics in personalized nutrition and fitness.
  • The role of big data in enhancing the accuracy of geological surveying for natural resource exploration.
  • Developing predictive models for customer churn in the telecommunications industry.
  • The application of data science in optimizing advertisement placement and reach.

Recent Data Science-Related Studies

While the ideas we’ve presented above are a decent starting point for finding a research topic, they are fairly generic and non-specific. So, it helps to look at actual studies in the data science and analytics space to see how this all comes together in practice.

Below, we’ve included a selection of recent studies to help refine your thinking. These are actual studies, so they can provide some useful insight as to what a research topic looks like in practice.

  • Data Science in Healthcare: COVID-19 and Beyond (Hulsen, 2022)
  • Auto-ML Web-application for Automated Machine Learning Algorithm Training and evaluation (Mukherjee & Rao, 2022)
  • Survey on Statistics and ML in Data Science and Effect in Businesses (Reddy et al., 2022)
  • Visualization in Data Science VDS @ KDD 2022 (Plant et al., 2022)
  • An Essay on How Data Science Can Strengthen Business (Santos, 2023)
  • A Deep study of Data science related problems, application and machine learning algorithms utilized in Data science (Ranjani et al., 2022)
  • You Teach WHAT in Your Data Science Course?!? (Posner & Kerby-Helm, 2022)
  • Statistical Analysis for the Traffic Police Activity: Nashville, Tennessee, USA (Tufail & Gul, 2022)
  • Data Management and Visual Information Processing in Financial Organization using Machine Learning (Balamurugan et al., 2022)
  • A Proposal of an Interactive Web Application Tool QuickViz: To Automate Exploratory Data Analysis (Pitroda, 2022)
  • Applications of Data Science in Respective Engineering Domains (Rasool & Chaudhary, 2022)
  • Jupyter Notebooks for Introducing Data Science to Novice Users (Fruchart et al., 2022)
  • Towards a Systematic Review of Data Science Programs: Themes, Courses, and Ethics (Nellore & Zimmer, 2022)
  • Application of data science and bioinformatics in healthcare technologies (Veeranki & Varshney, 2022)
  • TAPS Responsibility Matrix: A tool for responsible data science by design (Urovi et al., 2023)
  • Data Detectives: A Data Science Program for Middle Grade Learners (Thompson & Irgens, 2022)
  • MACHINE LEARNING FOR NON-MAJORS: A WHITE BOX APPROACH (Mike & Hazzan, 2022)
  • COMPONENTS OF DATA SCIENCE AND ITS APPLICATIONS (Paul et al., 2022)
  • Analysis on the Application of Data Science in Business Analytics (Wang, 2022)

As you can see, these research topics are a lot more focused than the generic topic ideas we presented earlier. So, for you to develop a high-quality research topic, you’ll need to get laser-focused on a specific context with clearly defined variables of interest. In the video below, we explore some other important things you’ll need to consider when crafting your research topic.

Get 1-On-1 Help

If you’re still unsure about how to find a quality research topic, check out our Research Topic Kickstarter service, which is the perfect starting point for developing a unique, well-justified research topic.





37 Research Topics In Data Science To Stay On Top Of

Stewart Kaplan

February 22, 2024

As a data scientist, staying on top of the latest research in your field is essential.

The data science landscape changes rapidly, and new techniques and tools are constantly being developed.

To keep up with the competition, you need to be aware of the latest trends and topics in data science research.

In this article, we will provide an overview of 37 hot research topics in data science.

We will discuss each topic in detail, including its significance and potential applications.

These topics could be an idea for a thesis or simply topics you can research independently.

Stay tuned – this is one blog post you don’t want to miss!

37 Research Topics in Data Science

1.) Predictive Modeling

Predictive modeling is a significant portion of data science and a topic you must be aware of.

Simply put, it is the process of using historical data to build models that can predict future outcomes.

Predictive modeling has many applications, from marketing and sales to financial forecasting and risk management.

As businesses increasingly rely on data to make decisions, predictive modeling is becoming more and more important.

While it can be complex, predictive modeling is a powerful tool that gives businesses a competitive advantage.
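To make the idea concrete, here’s a tiny, made-up sketch of predictive modeling: fit a straight line to some (invented) historical sales numbers and use it to predict the next period.

```python
# Minimal sketch: fit a line to historical monthly sales (hypothetical
# numbers) and predict the next period from it.
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

history = [10, 12, 14, 16, 18]      # sales for months 0..4
a, b = fit_line(range(5), history)
next_month = a * 5 + b              # predict month 5
print(round(next_month, 1))         # perfectly linear data -> 20.0
```

Real predictive models are far richer than a straight line, but the workflow is the same: learn parameters from the past, then apply them to the future.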


2.) Big Data Analytics

These days, it seems like everyone is talking about big data.

And with good reason – organizations of all sizes are sitting on mountains of data, and they’re increasingly turning to data scientists to help them make sense of it all.

But what exactly is big data? And what does it mean for data science?

Simply put, big data is a term used to describe datasets that are too large and complex for traditional data processing techniques.

Big data typically refers to datasets of a few terabytes or more.

But size isn’t the only defining characteristic – big data is also characterized by its high Velocity (the speed at which data is generated), Variety (the different types of data), and Volume (the sheer amount of information).

Given the enormity of big data, it’s not surprising that organizations are struggling to make sense of it all.

That’s where data science comes in.

Data scientists use various methods to wrangle big data, including distributed computing and other decentralized technologies.

With the help of data science, organizations are beginning to unlock the hidden value in their big data.

By harnessing the power of big data analytics, they can improve their decision-making, better understand their customers, and develop new products and services.
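The distributed-computing trick mentioned above usually boils down to a split-apply-combine pattern. Here’s a toy, single-machine illustration of it (a MapReduce-style word count over made-up text shards):

```python
# Toy illustration of the split-apply-combine pattern behind distributed
# word counting: each "worker" counts its own shard, then the partial
# counts are merged, exactly as a MapReduce job would do at scale.
from collections import Counter

def map_count(shard):
    return Counter(shard.split())

def reduce_counts(partials):
    total = Counter()
    for p in partials:
        total.update(p)
    return total

shards = ["big data big value", "data science loves big data"]
partials = [map_count(s) for s in shards]   # would run on separate workers
totals = reduce_counts(partials)            # merged on a reducer
print(totals["big"], totals["data"])        # -> 3 3
```

At real scale, frameworks like Hadoop or Spark do the sharding, shuffling, and fault tolerance for you, but the logic is this simple at its core.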

3.) Auto Machine Learning

Auto machine learning (AutoML) is a research topic in data science concerned with automating the machine learning pipeline itself: feature engineering, model selection, and hyperparameter tuning.

This area of research is vital because it spares data scientists from hand-crafting a new modeling workflow for every dataset.

This frees us to focus on higher-level tasks, such as framing the problem and validating the results.

AutoML systems can learn from data in a largely hands-off way while still providing incredible insights.

This makes them a valuable tool both for experienced data scientists and for practitioners who don’t have the skills to run a full analysis on their own.
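At its heart, an AutoML loop is just "try candidates, score them on held-out data, keep the best". Here’s a deliberately trivial sketch of that loop, with two toy candidate "models" (a constant predictor and a last-value predictor) on invented numbers:

```python
# Hedged sketch of what an AutoML loop automates: try several candidate
# models, score each on held-out data, keep the best. The "models" here
# are deliberately trivial stand-ins for real estimators.
def constant_model(train):
    mean = sum(train) / len(train)
    return lambda: mean            # always predicts the training mean

def last_value_model(train):
    last = train[-1]
    return lambda: last            # always predicts the last seen value

train, holdout = [1, 2, 3, 4], [5, 5]

best_name, best_err = None, float("inf")
for name, build in [("constant", constant_model), ("last_value", last_value_model)]:
    predict = build(train)
    err = sum((y - predict()) ** 2 for y in holdout) / len(holdout)
    if err < best_err:
        best_name, best_err = name, err

print(best_name)   # last_value predicts 4, closer to 5 than the mean 2.5
```

Production AutoML systems search over thousands of real models and hyperparameter settings, but the selection logic is exactly this loop.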


4.) Text Mining

Text mining is a research topic in data science that deals with extracting structured information from unstructured text.

This area of research is important because it allows us to get as much information as possible from the vast amount of text data available today.

Text mining techniques can extract information from text data, such as keywords, sentiments, and relationships.

This information can be used for various purposes, such as model building and predictive analytics.
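The simplest form of the keyword extraction mentioned above is term-frequency counting after stopword removal. A quick, self-contained sketch (the stopword list and sample sentence are made up):

```python
# Minimal keyword extraction by term frequency: lowercase the text,
# pull out word tokens, drop a tiny stopword list, count what's left.
import re
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "to", "is", "in"}

def keywords(text, k=3):
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(k)]

doc = "The mining of text data is the mining of value in the data."
print(keywords(doc, 2))   # -> ['mining', 'data']
```

Real text-mining pipelines add stemming, phrase detection, and weighting schemes like TF-IDF on top of this basic idea.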

5.) Natural Language Processing

Natural language processing is a data science research topic that analyzes human language data.

This area of research is important because it allows us to understand and make sense of the vast amount of text data available today.

Natural language processing techniques can build predictive and interactive models from any language data.

Natural language processing is a broad field, and recent advances like GPT-3 have pushed this topic to the forefront.


6.) Recommender Systems

Recommender systems are an exciting topic in data science because they allow us to make better products, services, and content recommendations.

Businesses can better understand their customers and their needs by using recommender systems.

This, in turn, allows them to develop better products and services that meet the needs of their customers.

Recommender systems are also used to recommend content to users.

This can be done on an individual level or at a group level.

Think about Netflix, for example, always knowing what you want to watch!

Recommender systems are a valuable tool for businesses and users alike.
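To show the core mechanic, here’s a tiny item-based recommender sketch: compare items by the cosine similarity of their rating columns, then suggest the item most similar to one the user already liked. All users, movies, and ratings below are invented:

```python
# Sketch of an item-based recommender: a users-by-items rating table,
# cosine similarity between item columns, and a recommendation based on
# the most similar item. Data is fabricated for illustration.
import math

ratings = {            # user -> {item: rating}
    "ana":  {"matrix": 5, "inception": 4, "notebook": 1},
    "ben":  {"matrix": 4, "inception": 5, "notebook": 1},
    "cara": {"matrix": 1, "inception": 1, "notebook": 5},
}

def item_vector(item):
    return [ratings[u].get(item, 0) for u in sorted(ratings)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def most_similar(item, candidates):
    return max(candidates, key=lambda c: cosine(item_vector(item), item_vector(c)))

print(most_similar("matrix", ["inception", "notebook"]))   # -> inception
```

Netflix-scale systems use far more sophisticated models (matrix factorization, deep learning), but "people who rated this similarly also liked..." is the same intuition.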

7.) Deep Learning

Deep learning is a research topic in data science that deals with artificial neural networks.

These networks are composed of multiple layers, and each layer is formed from various nodes.

Deep learning networks can learn rich representations directly from raw data, without hand-engineered features.

This makes them a valuable tool for data scientists looking to build models that can learn from data independently.

The deep learning network has become very popular in recent years because of its ability to achieve state-of-the-art results on various tasks.

There seems to be a new SOTA deep learning algorithm research paper on  https://arxiv.org/  every single day!


8.) Reinforcement Learning

Reinforcement learning is a research topic in data science that deals with algorithms that learn through trial-and-error interactions with their environment, guided by rewards and penalties.

This area of research is essential because it allows us to develop algorithms that learn non-greedy, long-horizon approaches to decision-making, allowing businesses and companies to win in the long term rather than just the short term.
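The smallest possible example of learning from interaction is an epsilon-greedy bandit: the agent mostly exploits its current value estimates but occasionally explores, so it doesn’t stay greedy forever. The two-armed "bandit" below is invented (arm 1 pays more on average):

```python
# Minimal epsilon-greedy bandit: the agent learns action values purely
# from interaction, occasionally exploring so it is not greedy forever.
import random

random.seed(0)
q = [0.0, 0.0]                     # estimated value of each arm
counts = [0, 0]
epsilon, true_means = 0.1, [0.2, 0.8]

for _ in range(2000):
    if random.random() < epsilon:
        arm = random.randrange(2)              # explore
    else:
        arm = max((0, 1), key=lambda a: q[a])  # exploit current estimates
    reward = true_means[arm] + random.gauss(0, 0.1)
    counts[arm] += 1
    q[arm] += (reward - q[arm]) / counts[arm]  # incremental mean update

print(q[1] > q[0])   # the agent learns to prefer the better arm
```

Full reinforcement learning adds states and delayed rewards on top of this, but the explore-versus-exploit tension is already visible here.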

9.) Data Visualization

Data visualization is an excellent research topic in data science because it allows us to see our data in a way that is easy to understand.

Data visualization techniques can be used to create charts, graphs, and other visual representations of data.

This allows us to see the patterns and trends hidden in our data.

Data visualization is also used to communicate results to others.

This allows us to share our findings with others in a way that is easy to understand.

There are many ways to contribute to and learn about data visualization.

Some ways include attending conferences, reading papers, and contributing to open-source projects.
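Even without a plotting library, the core idea of visualization (mapping numbers to visual marks so patterns jump out) fits in a few lines. Here’s a dependency-free text bar chart over invented quarterly sales; real work would use something like matplotlib:

```python
# A dependency-free sketch of visualization: map each value to a bar
# whose length is proportional to the value, so trends are visible at
# a glance. Data is invented.
def bar_chart(data, width=20):
    top = max(data.values())
    lines = []
    for label, value in data.items():
        bar = "#" * round(width * value / top)
        lines.append(f"{label:>6} | {bar} {value}")
    return "\n".join(lines)

sales = {"Q1": 10, "Q2": 25, "Q3": 40, "Q4": 15}
print(bar_chart(sales))
```

The research questions start where this sketch ends: which visual encodings actually help humans spot patterns, and which mislead them.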


10.) Predictive Maintenance

Predictive maintenance is a hot topic in data science because it allows us to prevent failures before they happen.

This is done using data analytics to predict when a failure will occur.

This allows us to take corrective action before the failure actually happens.

While this sounds simple, keeping recall high while avoiding false alarms is challenging, and the area is wide open for advancement.
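That recall-versus-false-alarm tension is exactly what precision and recall measure. A quick sketch with invented failure labels and model alerts:

```python
# The trade-off in numbers: precision = how many alarms were real,
# recall = how many real failures were caught. Labels are invented.
def precision_recall(actual, predicted):
    tp = sum(a and p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    return tp / (tp + fp), tp / (tp + fn)

failures = [0, 0, 1, 0, 1, 0, 0, 1]   # 1 = machine actually failed
alerts   = [0, 1, 1, 0, 1, 0, 0, 0]   # 1 = model raised an alarm
p, r = precision_recall(failures, alerts)
print(round(p, 2), round(r, 2))
```

A maintenance model that alarms constantly gets perfect recall and terrible precision; one that never alarms gets the reverse. Research progress means pushing both up at once.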

11.) Financial Analysis

Financial analysis is an older topic that has been around for a while but is still a great field where contributions can be felt.

Current researchers are focused on analyzing macroeconomic data to make better financial decisions.

This is done by analyzing the data to identify trends and patterns.

Financial analysts can use this information to make informed decisions about where to invest their money.

Financial analysis is also used to predict future economic trends.

This allows businesses and individuals to prepare for potential financial hardships, and enables companies to build cash reserves during good economic conditions.

Overall, financial analysis is a valuable tool for anyone looking to make better financial decisions.


12.) Image Recognition

Image recognition is one of the hottest topics in data science because it allows us to identify objects in images.

This is done using artificial intelligence algorithms that can learn from data and understand what objects you’re looking for.

This allows us to build models that can accurately recognize objects in images and video.

This is a valuable tool for businesses and individuals who want to be able to identify objects in images.

Think about security, identification, routing, traffic, etc.

Image Recognition has gained a ton of momentum recently – for a good reason.

13.) Fraud Detection

Fraud detection is a great topic in data science because it allows us to catch fraudulent activity before serious damage is done.

This is done by analyzing data to look for patterns and trends that are associated with fraud.

Once our machine learning model recognizes some of these patterns in real time, it can immediately flag the transaction.

This allows us to take corrective action before the fraud causes real harm.

Fraud detection is a valuable tool for anyone who wants to protect themselves from potential fraudulent activity.
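One of the simplest pattern-based rules is an outlier test: flag transactions whose amount sits far from the account’s usual behavior. A sketch with fabricated amounts:

```python
# A simple version of pattern-spotting for fraud: flag amounts that are
# many standard deviations from the account's norm (a z-score rule).
# The transaction history is fabricated.
import statistics

def flag_outliers(amounts, z_cut=3.0):
    mean = statistics.mean(amounts)
    sd = statistics.stdev(amounts)
    return [a for a in amounts if abs(a - mean) / sd > z_cut]

history = [20, 22, 19, 21, 23, 20, 22, 950]   # one suspicious spike
print(flag_outliers(history, z_cut=2.0))      # -> [950]
```

Production fraud systems combine hundreds of such signals in learned models, precisely because real fraudsters work hard not to look like outliers on any single one.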


14.) Web Scraping

Web scraping is a controversial topic in data science because it allows us to collect data from the web, which is usually data you do not own.

This is done by extracting data from websites using scraping tools that are usually custom-programmed.

This allows us to collect data that would otherwise be inaccessible.

For obvious reasons, web scraping is a unique tool – giving you data your competitors would have no chance of getting.

I think there is an excellent opportunity to create new and innovative ways to make scraping accessible for everyone, not just those who understand Selenium and Beautiful Soup.
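For a self-contained taste of what scrapers do, here’s link extraction using only the standard library’s `html.parser` (the HTML snippet is made up; a real scraper would fetch live pages, respecting robots.txt, typically with requests and Beautiful Soup):

```python
# Extract every link from an HTML snippet using only the standard
# library. Real scrapers fetch pages over the network first; parsing
# out the data you want is the step shown here.
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href")

page = '<html><body><a href="/about">About</a> <a href="/jobs">Jobs</a></body></html>'
collector = LinkCollector()
collector.feed(page)
print(collector.links)   # -> ['/about', '/jobs']
```

Tools like Beautiful Soup and Selenium wrap this parsing (and browser automation) in friendlier APIs, which is exactly the accessibility gap mentioned above.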

15.) Social Media Analysis

Social media analysis is not new; many people have already created exciting and innovative algorithms to study this.

However, it is still a great data science research topic because it allows us to understand how people interact on social media.

This is done by analyzing data from social media platforms to look for insights, bots, and recent societal trends.

Once we understand these practices, we can use this information to improve our marketing efforts.

For example, if we know that a particular demographic prefers a specific type of content, we can create more content that appeals to them.

Social media analysis is also used to understand how people interact with brands on social media.

This allows businesses to understand better what their customers want and need.

Overall, social media analysis is valuable for anyone who wants to improve their marketing efforts or understand how customers interact with brands.


16.) GPU Computing

GPU computing is a fun new research topic in data science because it allows us to process data much faster than traditional CPUs.

Due to how GPUs are made, they’re incredibly proficient at intense matrix operations, outperforming traditional CPUs by very high margins.

While the computation is fast, the coding is still tricky.

There is an excellent research opportunity to bring these speedups to non-deep-learning workloads, allowing the rest of data science to take advantage of GPU computing as well.

17.) Quantum Computing

Quantum computing is a new research topic in data science and physics because it allows us to process data much faster than traditional computers.

It also opens the door to new types of data.

There are simply some problems that can’t be solved efficiently on a classical computer.

For example, if you wanted to understand how a single atom moved around, a classical computer couldn’t handle this problem.

You’ll need to utilize a quantum computer to handle quantum mechanics problems.

This may be the “hottest” research topic on the planet right now, with some of the top researchers in computer science and physics worldwide working on it.

You could be too.


18.) Genomics

Genomics may be the only research topic that can compete with quantum computing regarding the “number of top researchers working on it.”

Genomics is a fantastic intersection of data science because it allows us to understand how genes work.

This is done by sequencing the DNA of different organisms to look for insights into our and other species.

Once we understand these patterns, we can use this information to improve our understanding of diseases and create new and innovative treatments for them.

Genomics is also used to study the evolution of different species.

Genomics is the future and a field begging for new and exciting research professionals to take it to the next step.

19.) Location-based services

Location-based services are an old and time-tested research topic in data science.

Since GPS and 4G cell phone reception became a thing, we’ve been trying to stay informed about how humans interact with their environment.

This is done by analyzing data from GPS tracking devices, cell phone towers, and Wi-Fi routers to look for insights into how humans interact.

Once we understand these practices, we can use this information to improve our geotargeting efforts, improve maps, find faster routes, and improve cohesion throughout a community.

Location-based services are used to understand the user, something every business could always use a little bit more of.

While a seemingly “stale” field, location-based services have seen a revival period with self-driving cars.


20.) Smart City Applications

Smart city applications are all the rage in data science research right now.

By harnessing the power of data, cities can become more efficient and sustainable.

But what exactly are smart city applications?

In short, they are systems that use data to improve city infrastructure and services.

This can include anything from traffic management and energy use to waste management and public safety.

Data is collected from various sources, including sensors, cameras, and social media.

It is then analyzed to identify tendencies and habits.

This information can make predictions about future needs and optimize city resources.

As more and more cities strive to become “smart,” the demand for data scientists with expertise in smart city applications is only growing.

21.) Internet Of Things (IoT)

The Internet of Things, or IoT, is an exciting new research topic at the intersection of data science and sustainability.

IoT is a network of physical objects embedded with sensors and connected to the internet.

These objects can include everything from alarm clocks to refrigerators; they’re all connected to the internet.

That means that they can share data with computers.

And that’s where data science comes in.

Data scientists are using IoT data to learn everything from how people use energy to how traffic flows through a city.

They’re also using IoT data to predict when an appliance will break down or when a road will be congested.

Really, the possibilities are endless.

With such a wide-open field, it’s easy to see why IoT is being researched by some of the top professionals in the world.


22.) Cybersecurity

Cybersecurity is a relatively new research topic within data science, but it’s already garnering a lot of attention from businesses and organizations.

After all, with the increasing number of cyber attacks in recent years, it’s clear that we need to find better ways to protect our data.

While most cybersecurity work focuses on infrastructure, data scientists can mine historical attack data for patterns that reveal potential exploits and help protect their companies.

Sometimes, looking at a problem from a different angle helps, and that’s what data science brings to cybersecurity.

Also, data science can help to develop new security technologies and protocols.

As a result, cybersecurity is a crucial data science research area and one that will only become more important in the years to come.

23.) Blockchain

Blockchain is an incredible new research topic in data science for several reasons.

First, it is a distributed database technology that enables secure, transparent, and tamper-proof transactions.

Did someone say transmitting data?

This makes it an ideal platform for tracking data and transactions in various industries.

Second, blockchain is powered by cryptography, which not only makes it highly secure but is also familiar ground for data scientists.

Finally, blockchain is still in its early stages of development, so there is much room for research and innovation.

As a result, blockchain is a great new research topic in data science that promises to revolutionize how we store, transmit and manage data.


24.) Sustainability

Sustainability is a relatively new research topic in data science, but it is gaining traction quickly.

To keep up with this demand, The Wharton School of the University of Pennsylvania has  started to offer an MBA in Sustainability .

This demand isn’t shocking, and some of the reasons include the following:

Sustainability is an important issue that is relevant to everyone.

Datasets on sustainability are constantly growing and changing, making it an exciting challenge for data scientists.

There hasn’t been a “set way” to approach sustainability from a data perspective, making it an excellent opportunity for interdisciplinary research.

As data science grows, sustainability will likely become an increasingly important research topic.

25.) Educational Data

Education has always been a great topic for research, and with the advent of big data, educational data has become an even richer source of information.

By studying educational data, researchers can gain insights into how students learn, what motivates them, and what barriers these students may face.

Besides, data science can be used to develop educational interventions tailored to individual students’ needs.

Imagine being the researcher that helps that high schooler pass mathematics; what an incredible feeling.

With the increasing availability of educational data, data science has enormous potential to improve the quality of education.


26.) Politics

As data science continues to evolve, so does the scope of its applications.

Originally used primarily for business intelligence and marketing, data science is now applied to various fields, including politics.

By analyzing large data sets, political scientists (data scientists with a cooler name) can gain valuable insights into voting patterns, campaign strategies, and more.

Further, data science can be used to forecast election results and understand the effects of political events on public opinion.

With the wealth of data available, there is no shortage of research opportunities in this field.

As data science evolves, so does our understanding of politics and its role in our world.

27.) Cloud Technologies

Cloud technologies are a great research topic.

It allows for the outsourcing and sharing of computer resources and applications all over the internet.

This lets organizations save money on hardware and maintenance costs while providing employees access to the latest and greatest software and applications.

I believe there is an argument that AWS could be the greatest and most technologically advanced business ever built (Yes, I know it’s only part of the company).

Besides, cloud technologies can help improve team members’ collaboration by allowing them to share files and work on projects together in real-time.

As more businesses adopt cloud technologies, data scientists must stay up-to-date on the latest trends in this area.

By researching cloud technologies, data scientists can help organizations to make the most of this new and exciting technology.


28.) Robotics

Robotics has recently become a household name, and it’s for a good reason.

First, robotics deals with controlling and planning physical systems, an inherently complex problem.

Second, robotics requires various sensors and actuators to interact with the world, making it an ideal application for machine learning techniques.

Finally, robotics is an interdisciplinary field that draws on various disciplines, such as computer science, mechanical engineering, and electrical engineering.

As a result, robotics is a rich source of research problems for data scientists.

29.) Healthcare

Healthcare is an industry that is ripe for data-driven innovation.

Hospitals, clinics, and health insurance companies generate a tremendous amount of data daily.

This data can be used to improve the quality of care and outcomes for patients.

This is perfect timing, as the healthcare industry is undergoing a significant shift towards value-based care, which means there is a greater need than ever for data-driven decision-making.

As a result, healthcare is an exciting new research topic for data scientists.

There are many different ways in which data can be used to improve healthcare, and there is a ton of room for newcomers to make discoveries.


30.) Remote Work

There’s no doubt that remote work is on the rise.

In today’s global economy, more and more businesses are allowing their employees to work from home or anywhere else they can get a stable internet connection.

But what does this mean for data science? Well, for one thing, it opens up a whole new field of research.

For example, how does remote work impact employee productivity?

What are the best ways to manage and collaborate on data science projects when team members are spread across the globe?

And what are the cybersecurity risks associated with working remotely?

These are just a few of the questions that data scientists will be able to answer with further research.

So if you’re looking for a new topic to sink your teeth into, remote work in data science is a great option.

31.) Data-Driven Journalism

Data-driven journalism is an exciting new field of research that combines the best of both worlds: the rigor of data science with the creativity of journalism.

By applying data analytics to large datasets, journalists can uncover stories that would otherwise be hidden.

And telling these stories compellingly can help people better understand the world around them.

Data-driven journalism is still in its infancy, but it has already had a major impact on how news is reported.

In the future, it will only become more important as data becomes increasingly central to how journalists work.

It is an exciting new topic and research field for data scientists to explore.


32.) Data Engineering

Data engineering is a staple in data science, focusing on efficiently managing data.

Data engineers are responsible for developing and maintaining the systems that collect, process, and store data.

In recent years, there has been an increasing demand for data engineers as the volume of data generated by businesses and organizations has grown exponentially.

Data engineers must be able to design and implement efficient data-processing pipelines and have the skills to optimize and troubleshoot existing systems.

If you are looking for a challenging research topic that could have an immediate, worldwide impact, then improving on or innovating a new approach to data engineering would be a good start.
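To make the idea concrete, here is a minimal sketch (plain Python, with invented stage names and data) of the kind of composable data-processing pipeline data engineers design and maintain:

```python
def clean(records):
    """Normalize whitespace and case."""
    return (r.strip().lower() for r in records)

def drop_empty(records):
    """Filter out blank records."""
    return (r for r in records if r)

def pipeline(records, *stages):
    """Chain the stages lazily, then materialize the result."""
    for stage in stages:
        records = stage(records)
    return list(records)

raw = ["  Alice ", "", "BOB", "  ", "carol"]
out = pipeline(raw, clean, drop_empty)  # ["alice", "bob", "carol"]
```

Real pipelines add scheduling, monitoring, and failure recovery on top of this shape, but the stage-composition pattern is the same.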

33.) Data Curation

Data curation has been a hot topic in the data science community for some time now.

Curating data involves organizing, managing, and preserving data so researchers can use it.

Data curation can help to ensure that data is accurate, reliable, and accessible.

It can also help to prevent research duplication and to facilitate the sharing of data between researchers.

Data curation is a vital part of data science. In recent years, there has been an increasing focus on data curation, as it has become clear that it is essential for ensuring data quality.

As a result, data curation is now a major research topic in data science.

There are numerous books and articles on the subject, and many universities offer courses on data curation.

Data curation is an integral part of data science and will only become more important in the future.


34.) Meta-Learning

Meta-learning is gaining a ton of steam in data science. It’s learning how to learn.

So, if you can learn how to learn, you can learn anything much faster.

Meta-learning is mainly used in deep learning, as applications outside of this are generally pretty hard.

In deep learning, many parameters need to be tuned for a good model, and there’s usually a lot of data.

You can save time and effort if you can automatically and quickly do this tuning.

In machine learning, meta-learning can improve models’ performance by sharing knowledge between different models.

For example, if you have a bunch of different models that all solve the same problem, you can use meta-learning to share knowledge between them and improve the overall performance of the group.

I don’t know how anyone looking for a research topic could stay away from this field; it’s what the  Terminator  warned us about!
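Jokes aside, the "learning how to learn" idea can be sketched in a few lines. Below is a toy example (pure Python, made-up task family): an outer loop learns which learning rate makes an inner gradient-descent loop converge best across several tasks.

```python
def inner_train(lr, target, steps=20):
    """Inner loop: plain gradient descent on f(x) = (x - target)^2."""
    x = 0.0
    for _ in range(steps):
        x -= lr * 2 * (x - target)
    return (x - target) ** 2  # final loss on this task

def meta_learn(tasks, candidate_lrs):
    """Outer loop: 'learn how to learn' by picking the learning rate
    that minimizes the average final loss across all tasks."""
    best, best_loss = None, float("inf")
    for lr in candidate_lrs:
        avg = sum(inner_train(lr, t) for t in tasks) / len(tasks)
        if avg < best_loss:
            best, best_loss = lr, avg
    return best

best_lr = meta_learn(tasks=[1.0, 2.0, 3.0],
                     candidate_lrs=[0.01, 0.1, 0.3, 0.9])
```

Real meta-learning tunes far more than one hyperparameter (and often learns initializations, as in MAML), but the nested optimization structure is the same.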

35.) Data Warehousing

A data warehouse is a system used for data analysis and reporting.

It is a central data repository created by combining data from multiple sources.

Data warehouses are often used to store historical data, such as sales data, financial data, and customer data.

This data type can be used to create reports and perform statistical analysis.

Data warehouses also store data that the organization is not currently using.

This type of data can be used for future research projects.

Data warehousing is an incredible research topic in data science because it offers a variety of benefits.

Data warehouses help organizations to save time and money by reducing the need for manual data entry.

They also help to improve the accuracy of reports and provide a complete picture of the organization’s performance.

Data warehousing feels like one of the weakest parts of the Data Science Technology Stack; if you want a research topic that could have a monumental impact, data warehousing is an excellent place to look.
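As a toy illustration of "combining data from multiple sources into a central repository", here is a hypothetical mini-ETL in plain Python (source names, fields, and figures are all invented):

```python
# Invented source systems with different schemas
crm_rows = [{"cust": "A", "region": "EU"}, {"cust": "B", "region": "US"}]
sales_rows = [{"customer_id": "A", "amount": 120.0},
              {"customer_id": "A", "amount": 80.0},
              {"customer_id": "B", "amount": 50.0}]

def load_warehouse(crm, sales):
    """Extract from both sources, transform to one unified schema,
    and load into a single central repository keyed by customer."""
    warehouse = {r["cust"]: {"region": r["region"], "total": 0.0} for r in crm}
    for s in sales:
        warehouse[s["customer_id"]]["total"] += s["amount"]
    return warehouse

wh = load_warehouse(crm_rows, sales_rows)
```

A production warehouse adds dimensional modeling, history tracking, and incremental loads, but the extract-transform-load shape is the same.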


36.) Business Intelligence

Business intelligence aims to collect, process, and analyze data to help businesses make better decisions.

Business intelligence can improve marketing, sales, customer service, and operations.

It can also be used to identify new business opportunities and track competition.

At its core, BI is another tool in your company’s toolbox for continuing to dominate your market.

Data science is the perfect tool for business intelligence because it combines statistics, computer science, and machine learning.

Data scientists can use business intelligence to answer questions like, “What are our customers buying?” or “What are our competitors doing?” or “How can we increase sales?”

Business intelligence is a great way to improve your business’s bottom line and an excellent opportunity to dive deep into a well-respected research topic.
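For instance, the question "What are our customers buying?" is, at its smallest, just an aggregation. A hedged sketch with made-up order data:

```python
from collections import Counter

# Made-up order log: (customer, product)
orders = [("alice", "laptop"), ("bob", "phone"), ("alice", "phone"),
          ("carol", "phone"), ("bob", "laptop"), ("dave", "phone")]

# "What are our customers buying?" answered as a simple aggregation
top_products = Counter(product for _, product in orders).most_common(2)
```

BI platforms wrap this kind of aggregation in dashboards, drill-downs, and scheduled reports, but the underlying query is this simple.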

37.) Crowdsourcing

One of the newest areas of research in data science is crowdsourcing.

Crowdsourcing is a process of sourcing tasks or projects to a large group of people, typically via the internet.

This can be done for various purposes, such as gathering data, developing new algorithms, or even just for fun (think: online quizzes and surveys).

But what makes crowdsourcing so powerful is that it allows businesses and organizations to tap into a vast pool of talent and resources they wouldn’t otherwise have access to.

And with the rise of social media, it’s easier than ever to connect with potential crowdsource workers worldwide.

Imagine finding innovative ways to improve how people work together at this scale.

Research in this area could have a huge effect.


Final Thoughts, Are These Research Topics In Data Science For You?

Thirty-seven different research topics in data science are a lot to take in, but we hope you found a research topic that interests you.

If not, don’t worry – there are plenty of other great topics to explore.

The important thing is to get started with your research and find ways to apply what you learn to real-world problems.

We wish you the best of luck as you begin your data science journey!

Other Data Science Articles

We love talking about data science; here are a couple of our favorite articles:

  • Why Are You Interested In Data Science?


Data Science Masters Theses

The Master of Science in Data Science program requires the successful completion of 12 courses to obtain a degree. These requirements cover six core courses, a leadership or project management course, two required courses corresponding to a declared specialization, two electives, and a capstone project or thesis. This collection contains a selection of masters theses or capstone projects by MSDS graduates.


List of Best Research and Thesis Topic Ideas for Data Science in 2022

In an era driven by digital and technological transformation, businesses actively seek skilled and talented data science professionals capable of leveraging data insights to enhance business productivity and achieve organizational objectives. In keeping with the increasing demand for data science professionals, universities offer various data science and big data courses to prepare students for the tech industry. Research projects are a crucial part of these programs, and a well-executed data science project can make your CV appear more robust and compelling. A broad range of data science topics offer exciting possibilities for research, but choosing data science research topics can be a real challenge for students. After all, a good research project relies first and foremost on data analytics research topics that draw upon both mono-disciplinary and multi-disciplinary research to explore endless possibilities for real-world applications.

As one of the top masters and PhD online dissertation writing services, we are geared to assist students through the entire research process, from initial conception to final execution, to ensure that you have a truly fulfilling and enriching research experience. These resources are also helpful for students taking online classes.

By taking advantage of our best research topics in data science, you can be assured of producing an innovative research project that will impress your research professors and make a huge difference in attracting the right employers.

Get an Immediate Response

Discuss your requirements with our writers

Get 3 Customized Research Topics within 24 Hours


Data science thesis topics

We have compiled a list of data science research topics for students studying data science that can be utilized in data science projects in 2022. Our team of professional data experts have brought together masters and MBA thesis topics in data science that cater to the core areas driving the field of data science and big data, relieve your research anxieties, and provide a solid grounding for an interesting research project. The article features data science thesis ideas that can be immensely beneficial for students, as they cover a broad research agenda for the future of data science. These ideas have been drawn from the 8 V’s of big data, namely Volume, Value, Veracity, Visualization, Variety, Velocity, Viscosity, and Virility, which provide interesting and challenging research areas for prospective researchers in their masters or PhD thesis. Overall, the general big data research topics can be divided into distinct categories to facilitate the research topic selection process.

  • Security and privacy issues
  • Cloud Computing Platforms for Big Data Adoption and Analytics
  • Real-time data analytics for processing of image, video, and text
  • Modeling uncertainty

How “The Research Guardian” Can Help You a Lot!

Our top thesis writing experts are available 24/7 to assist you with the right university projects, whether it’s a critical literature review or completing your PhD or masters level thesis.

Data Science PhD Research Topics

The article will also guide students engaged in doctoral research by introducing them to an outstanding list of data science thesis topics that can lead to major real-time applications of big data analytics in your research projects.

  • Intelligent traffic control; gathering and monitoring traffic information using CCTV images.
  • Asymmetric protected storage methodology over multi-cloud service providers in Big data.
  • Leveraging disseminated data over big data analytics environment.
  • Internet of Things.
  • Large-scale data system and anomaly detection.

What makes us a unique research service for your research needs?

We offer all-round, superb research services with a distinguished track record of helping students secure their desired grades in research projects in big data analytics, paving the way for a promising career ahead. These are the features that set us apart in the market for research services and that effectively deal with all significant issues in your research.

  • Plagiarism-free: We strictly adhere to a non-plagiarism policy in all our research work to provide you with well-written, original content with a low similarity index to maximize the chances of acceptance of your research submissions.
  • Publication: We don’t just suggest PhD data science research topics; our PhD consultancy services take your research to the next level by ensuring its publication in well-reputed journals. A PhD thesis is indispensable for a PhD degree, and with our premier PhD thesis services that tackle all aspects of research writing and cater to the essential requirements of journals, we will bring you closer to your dream of being a PhD in the field of data analytics.
  • Research ethics: Solid research ethics lie at the core of our services, where we actively seek to protect the privacy and confidentiality of the technical and personal information of our valued customers.
  • Research experience: We take pride in our world-class team of computing industry professionals equipped with the expertise and experience to assist in choosing data science research topics and in subsequent phases of research, including finding solutions, code development, and final manuscript writing.
  • Business ethics: We are driven by a business philosophy that’s wholly committed to achieving total customer satisfaction by providing constant online and offline support and timely submissions so that you can keep track of the progress of your research.

Now, we’ll proceed to cover specific research problems encompassing both data analytics research topics and big data thesis topics that have applications across multiple domains.

Get Help from Expert Thesis Writers!

TheResearchGuardian.com provides expert thesis assistance for university students at any level. Our thesis writing service has been serving students since 2011.

Multi-modal Transfer Learning for Cross-Modal Information Retrieval

Aim and Objectives

The research aims to examine and explore the use of the cross-modal retrieval (CMR) approach in enabling a flexible retrieval experience by combining abundant multimedia data across different modalities.

  • Develop methods to enable learning across different modalities in shared cross-modal spaces comprising texts and images, and consider the limitations of existing cross-modal retrieval algorithms.
  • Investigate the presence and effects of bias in cross-modal transfer learning and suggest strategies for bias detection and mitigation.
  • Develop a tool with query expansion and relevance feedback capabilities to facilitate search and retrieval of multi-modal data.
  • Investigate methods of multi-modal learning and elaborate on the importance of multi-modal deep learning in providing a comprehensive learning experience.

The Role of Machine Learning in Facilitating the Implementation of Scientific Computing and Software Engineering

  • Evaluate how machine learning leads to improvements in computational tools and thus aids in the implementation of scientific computing.
  • Evaluating the effectiveness of machine learning in solving complex problems and improving the efficiency of scientific computing and software engineering processes.
  • Assessing the potential benefits and challenges of using machine learning in these fields, including factors such as cost, accuracy, and scalability.
  • Examining the ethical and social implications of using machine learning in scientific computing and software engineering, such as issues related to bias, transparency, and accountability.

Trustworthy AI

The research aims to explore the crucial role of data science in advancing scientific goals and solving problems, as well as the implications involved in the use of AI systems, especially with respect to ethical concerns.

  • Investigate the value of digital infrastructures available through open data in aiding the sharing and interlinking of data for enhanced global collaborative research efforts.
  • Provide explanations of the outcomes of a machine learning model for a meaningful interpretation, to build trust among users in the reliability and authenticity of data.
  • Investigate how formal models can be used to verify and establish the efficacy of results derived from probabilistic models.
  • Review the concept of Trustworthy computing as a relevant framework for addressing the ethical concerns associated with AI systems.

The Implementation of Data Science and Its Impact on the Management Environment and Sustainability

The aim of the research is to demonstrate how data science and analytics can be leveraged in achieving sustainable development.

  • To examine the implementation of data science using data-driven decision-making tools
  • To evaluate the impact of modern information technology on management environment and sustainability.
  • To examine the use of data science in achieving more effective and efficient environmental management.
  • Explore how data science and analytics can be used to achieve sustainability goals across three dimensions of economic, social and environmental.

Big data analytics in healthcare systems

The aim of the research is to examine the application of big data in creating smart healthcare systems and how it can lead to more efficient, accessible, and cost-effective health care.

  • Identify the potential areas or opportunities for big data to transform the healthcare system, such as diagnosis, treatment planning, or drug development.
  • Assessing the potential benefits and challenges of using AI and deep learning in healthcare, including factors such as cost, efficiency, and accessibility
  • Evaluating the effectiveness of AI and deep learning in improving patient outcomes, such as reducing morbidity and mortality rates, improving accuracy and speed of diagnoses, or reducing medical errors
  • Examining the ethical and social implications of using AI and deep learning in healthcare, such as issues related to bias, privacy, and autonomy.

Large-Scale Data-Driven Financial Risk Assessment

The research aims to explore the possibility offered by big data in a consistent and real time assessment of financial risks.

  • Investigate how the use of big data can help to identify and forecast risks that can harm a business.
  • Categorize the types of financial risks faced by companies.
  • Describe the importance of financial risk management for companies in business terms.
  • Train a machine learning model to classify transactions as fraudulent or genuine.
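The last objective above can be sketched end to end on toy data. The example below (pure Python, invented transactions and features) trains a tiny logistic-regression-style classifier to separate fraudulent from genuine transactions; a real system would use far richer features and an established library:

```python
import math

# Invented toy transactions: (amount_in_thousands, is_night), label 1 = fraud
data = [((0.1, 0), 0), ((0.2, 0), 0), ((0.3, 0), 0),
        ((5.0, 1), 1), ((7.5, 1), 1), ((6.0, 1), 1)]

w = [0.0, 0.0]  # feature weights
b = 0.0         # bias
lr = 0.5        # learning rate

def predict_prob(x):
    """Probability that transaction x is fraudulent (sigmoid of a linear score)."""
    z = w[0] * x[0] + w[1] * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))

# Stochastic gradient descent on the log loss
for _ in range(200):
    for x, y in data:
        err = predict_prob(x) - y
        w[0] -= lr * err * x[0]
        w[1] -= lr * err * x[1]
        b -= lr * err
```

After training, large night-time transactions score well above 0.5 and small daytime ones well below it, which is all "classify transactions as fraudulent or genuine" means at this scale.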

Scalable Architectures for Parallel Data Processing

Big data has exposed us to an ever-growing volume of data that cannot be handled through traditional data management and analysis systems. This has given rise to scalable system architectures that efficiently process big data and exploit its true value. The research aims to analyze the current state of practice in scalable architectures and identify common patterns and techniques for designing scalable architectures for parallel data processing.

  • To design and implement a prototype scalable architecture for parallel data processing
  • To evaluate the performance and scalability of the prototype architecture using benchmarks and real-world datasets
  • To compare the prototype architecture with existing solutions and identify its strengths and weaknesses
  • To evaluate the trade-offs and limitations of different scalable architectures for parallel data processing
  • To provide recommendations for the use of the prototype architecture in different scenarios, such as batch processing, stream processing, and interactive querying
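The scatter/gather pattern at the heart of most scalable parallel-processing architectures can be sketched with the standard library alone (the chunk size, worker count, and workload here are arbitrary stand-ins):

```python
from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk):
    """Stand-in for per-worker computation (here, a partial sum of squares)."""
    return sum(x * x for x in chunk)

def parallel_sum_squares(data, workers=4):
    """Scatter: split data into chunks; gather: merge partial results."""
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(process_chunk, chunks))

result = parallel_sum_squares(list(range(1, 101)))
```

Distributed engines such as Spark apply the same split-process-merge shape across machines rather than threads, with fault tolerance and shuffles layered on top.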

Robotic manipulation modelling

The aim of this research is to develop and validate a model-based control approach for robotic manipulation of small, precise objects.

  • Develop a mathematical model of the robotic system that captures the dynamics of the manipulator and the grasped object.
  • Design a control algorithm that uses the developed model to achieve stable and accurate grasping of the object.
  • Test the proposed approach in simulation and validate the results through experiments with a physical robotic system.
  • Evaluate the performance of the proposed approach in terms of stability, accuracy, and robustness to uncertainties and perturbations.
  • Identify potential applications and areas for future work in the field of robotic manipulation for precision tasks.

Big data analytics and its impacts on marketing strategy

The aim of this research is to investigate the impact of big data analytics on marketing strategy and to identify best practices for leveraging this technology to inform decision-making.

  • Review the literature on big data analytics and marketing strategy to identify key trends and challenges
  • Conduct a case study analysis of companies that have successfully integrated big data analytics into their marketing strategies
  • Identify the key factors that contribute to the effectiveness of big data analytics in marketing decision-making
  • Develop a framework for integrating big data analytics into marketing strategy.
  • Investigate the ethical implications of big data analytics in marketing and suggest best practices for responsible use of this technology.

Looking For Customize Thesis Topics?

Take a review of different varieties of thesis topics and samples from our website TheResearchGuardian.com on multiple subjects for every educational level.

Platforms for large scale data computing: big data analysis and acceptance

To investigate the performance and scalability of different large-scale data computing platforms.

  • To compare the features and capabilities of different platforms and determine which is most suitable for a given use case.
  • To identify best practices for using these platforms, including considerations for data management, security, and cost.
  • To explore the potential for integrating these platforms with other technologies and tools for data analysis and visualization.
  • To develop case studies or practical examples of how these platforms have been used to solve real-world data analysis challenges.

Distributed data clustering

Distributed data clustering can be a useful approach for analyzing and understanding complex datasets, as it allows for the identification of patterns and relationships that may not be immediately apparent.

To develop and evaluate new algorithms for distributed data clustering that are efficient and scalable.

  • To compare the performance and accuracy of different distributed data clustering algorithms on a variety of datasets.
  • To investigate the impact of different parameters and settings on the performance of distributed data clustering algorithms.
  • To explore the potential for integrating distributed data clustering with other machine learning and data analysis techniques.
  • To apply distributed data clustering to real-world problems and evaluate its effectiveness.
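A minimal sketch of what "distributed" means here: one k-means iteration split map-reduce style, where each node computes partial sums for its own points and a coordinator merges them (1-D toy data and two simulated nodes, all invented):

```python
def assign(point, centroids):
    """Index of the nearest centroid (squared distance, 1-D)."""
    return min(range(len(centroids)), key=lambda i: (point - centroids[i]) ** 2)

def map_partition(points, centroids):
    """Per-node map step: partial (sum, count) for each centroid."""
    sums, counts = [0.0] * len(centroids), [0] * len(centroids)
    for p in points:
        i = assign(p, centroids)
        sums[i] += p
        counts[i] += 1
    return sums, counts

def reduce_partials(partials):
    """Coordinator reduce step: merge partials into new centroids."""
    k = len(partials[0][0])
    sums = [sum(p[0][i] for p in partials) for i in range(k)]
    counts = [sum(p[1][i] for p in partials) for i in range(k)]
    return [sums[i] / counts[i] if counts[i] else 0.0 for i in range(k)]

# Two simulated nodes holding different shards of 1-D data
centroids = [0.0, 10.0]
node_a = [1.0, 2.0, 9.0]
node_b = [0.0, 11.0, 10.0]
partials = [map_partition(node_a, centroids), map_partition(node_b, centroids)]
new_centroids = reduce_partials(partials)  # [1.0, 10.0]
```

Because only the small (sum, count) pairs cross the network rather than the raw points, this pattern scales to datasets far larger than any single machine's memory.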

Analyzing and predicting urbanization patterns using GIS and data mining techniques

The aim of this project is to use GIS and data mining techniques to analyze and predict urbanization patterns in a specific region.

  • To collect and process relevant data on urbanization patterns, including population density, land use, and infrastructure development, using GIS tools.
  • To apply data mining techniques, such as clustering and regression analysis, to identify trends and patterns in the data.
  • To use the results of the data analysis to develop a predictive model for urbanization patterns in the region.
  • To present the results of the analysis and the predictive model in a clear and visually appealing way, using GIS maps and other visualization techniques.
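The regression-analysis objective can be illustrated with a toy least-squares trend fit (all numbers invented); a real study would fit on GIS-derived features such as population density and land use rather than a single series:

```python
# Invented survey data: built-up area (km^2) per year since first survey
years = [0, 1, 2, 3, 4]
area = [10.0, 12.0, 14.0, 16.0, 18.0]

# Ordinary least squares for slope and intercept
n = len(years)
mean_x, mean_y = sum(years) / n, sum(area) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, area)) \
        / sum((x - mean_x) ** 2 for x in years)
intercept = mean_y - slope * mean_x

# Extrapolate one step: predicted built-up area in year 5
forecast = intercept + slope * 5
```

With this perfectly linear toy series the fit recovers a slope of 2 km^2 per year and forecasts 20 km^2 for the next survey year.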

Use of big data and IOT in the media industry

Big data and the Internet of Things (IoT) are emerging technologies that are transforming the way information is collected, analyzed, and disseminated in the media sector. The aim of the research is to understand how big data and IoT are used to shape information flow in the media industry.

  • Identifying the key ways in which big data and IoT are being used in the media sector, such as for content creation, audience engagement, or advertising.
  • Analyzing the benefits and challenges of using big data and IoT in the media industry, including factors such as cost, efficiency, and effectiveness.
  • Examining the ethical and social implications of using big data and IoT in the media sector, including issues such as privacy, security, and bias.
  • Determining the potential impact of big data and IoT on the media landscape and the role of traditional media in an increasingly digital world.

Exigency computer systems for meteorology and disaster prevention

The research aims to explore the role of exigency computer systems to detect weather and other hazards for disaster prevention and response

  • Identifying the key components and features of exigency computer systems for meteorology and disaster prevention, such as data sources, analytics tools, and communication channels.
  • Evaluating the effectiveness of exigency computer systems in providing accurate and timely information about weather and other hazards.
  • Assessing the impact of exigency computer systems on the ability of decision makers to prepare for and respond to disasters.
  • Examining the challenges and limitations of using exigency computer systems, such as the need for reliable data sources, the complexity of the systems, or the potential for human error.

Network security and cryptography

Overall, the goal of research is to improve our understanding of how to protect communication and information in the digital age, and to develop practical solutions for addressing the complex and evolving security challenges faced by individuals, organizations, and societies.

  • Developing new algorithms and protocols for securing communication over networks, such as for data confidentiality, data integrity, and authentication
  • Investigating the security of existing cryptographic primitives, such as encryption and hashing algorithms, and identifying vulnerabilities that could be exploited by attackers.
  • Evaluating the effectiveness of different network security technologies and protocols, such as firewalls, intrusion detection systems, and virtual private networks (VPNs), in protecting against different types of attacks.
  • Exploring the use of cryptography in emerging areas, such as cloud computing, the Internet of Things (IoT), and blockchain, and identifying the unique security challenges and opportunities presented by these domains.
  • Investigating the trade-offs between security and other factors, such as performance, usability, and cost, and developing strategies for balancing these conflicting priorities.
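As a concrete example of the "data integrity and authentication" goal in the first objective, here is message authentication with HMAC-SHA256 from Python's standard library (the key and message are made-up):

```python
import hashlib
import hmac

# Made-up shared key and message for illustration
key = b"shared-secret"
message = b"transfer 100 to account 42"

# Sender computes an authentication tag over the message
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key, message, tag):
    """Receiver recomputes the tag and compares in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)
```

Any change to the message (or the key) produces a different tag, so a tampered message fails verification; `compare_digest` avoids timing side channels that a plain `==` comparison would leak.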

Meet Our Professionals From Renowned Universities

Related Topics

  • Sports Management Research Topics
  • Special Education Research Topics
  • Software Engineering Research Topics
  • Primary Education Research Topics
  • Microbiology Research Topics
  • Luxury Brand Research Topics
  • Cyber Security Research Topics
  • Commercial Law Research Topics
  • Change Management Research Topics
  • Artificial intelligence Research Topics


Thesis/Capstone for Master's in Data Science | Northwestern SPS - Northwestern School of Professional Studies


Data Science

Capstone and Thesis Overview

Capstone and thesis are similar in that they both represent a culminating, scholarly effort of high quality. Both should clearly state a problem or issue to be addressed. Both will allow students to complete a larger project and produce a product or publication that can be highlighted on their resumes. Students should consider the factors below when deciding whether a capstone or thesis may be more appropriate to pursue.

A capstone is a practical or real-world project that can emphasize preparation for professional practice. A capstone is more appropriate if:

  • you don't necessarily need or want the experience of the research process or writing a big publication
  • you want more input on your project, from fellow students and instructors
  • you want more structure to your project, including assignment deadlines and due dates
  • you want to complete the project or graduate in a timely manner

A student can enroll in MSDS 498 Capstone in any term. However, capstone specialization courses can provide a unique student experience and may be offered only twice a year. 

A thesis is an academic-focused research project with broader applicability. A thesis is more appropriate if:

  • you want to get a PhD or other advanced degree and want the experience of the research process and writing for publication
  • you want to work individually with a specific faculty member who serves as your thesis adviser
  • you are more self-directed, are good at managing your own projects with very little supervision, and have a clear direction for your work
  • you have a project that requires more time to pursue

Students can enroll in MSDS 590 Thesis as long as there is an approved thesis project proposal, identified thesis adviser, and all other required documentation at least two weeks before the start of any term.

From Faculty Director, Thomas W. Miller, PhD

Tom Miller

“Capstone projects and thesis research give students a chance to study topics of special interest to them. Students can highlight analytical skills developed in the program. Work on capstone and thesis research projects often leads to publications that students can highlight on their resumes.”

A thesis is an individual research project that usually takes two to four terms to complete. Capstone course sections, on the other hand, represent a one-term commitment.

Students need to evaluate their options prior to choosing a capstone course section because capstones vary widely from one instructor to the next. There are both general and specialization-focused capstone sections. Some capstone sections offer individual research projects, others offer team research projects, and a few give students a choice of individual or team projects.

Students should refer to the SPS Graduate Student Handbook for more information regarding registration for either MSDS 590 Thesis or MSDS 498 Capstone.

Capstone Experience

If students wish to engage with an outside organization to work on a project for capstone, they can refer to this checklist and lessons learned for some helpful tips.

Capstone Checklist

  • Start early — set aside a minimum of one to two months prior to the capstone quarter to determine the industry and modeling interests.
  • Networking — pitch your idea to potential organizations for projects and focus on the business benefits you can provide.
  • Permission request — make sure your final project can be shared with others in the course and the information can be made public.
  • Engagement — engage with the capstone professor prior to and immediately after getting the dataset to ensure appropriate scope for the 10 weeks.
  • Teambuilding — recruit team members who have similar interests for the type of project during the first week of the course.

Capstone Lesson Learned

  • Access to company data can take longer than expected; not having this access before or at the start of the term can severely delay the progress
  • Project timeline should align with coursework timeline as closely as possible
  • One point of contact (POC) for business facing to ensure streamlined messages and more effective time management with the organization
  • Expectation management on both sides: (business) this is pro-bono (students) this does not guarantee internship or job opportunities
  • Data security/masking not executed in time can risk the opportunity completely

Publication of Work

Northwestern University Libraries offers an option for students to publish their master’s thesis or capstone in Arch, Northwestern’s open access research and data repository.

Benefits for publishing your thesis:

  • Your work will be indexed by search engines and discoverable by researchers around the world, extending your work’s impact beyond Northwestern
  • Your work will be assigned a Digital Object Identifier (DOI) to ensure perpetual online access and to facilitate scholarly citation
  • Your work will help accelerate discovery and increase knowledge in your subject domain by adding to the global corpus of public scholarly information

Get started:

  • Visit Arch online
  • Log in with your NetID
  • Describe your thesis: title, author, date, keywords, rights, license, subject, etc.
  • Upload your thesis or capstone PDF and any related supplemental files (data, code, images, presentations, documentation, etc.)
  • Select a visibility: Public, Northwestern-only, Embargo (i.e. delayed release)
  • Save your work to the repository

Your thesis manuscript or capstone report will then be published on the MSDS page. You can view other published work here .

For questions or support in publishing your thesis or capstone, please contact [email protected].


This collection of MIT Theses in DSpace contains selected theses and dissertations from all MIT departments. Please note that this is NOT a complete collection of MIT theses. To search all MIT theses, use MIT Libraries' catalog .

MIT's DSpace contains more than 58,000 theses completed at MIT, dating as far back as the mid-1800s. Theses in this collection have been scanned by the MIT Libraries or submitted in electronic format by thesis authors. Since 2004, all new master's and PhD theses have been scanned and added to this collection after degrees are awarded.

MIT Theses are openly available to all readers. Please share how this access affects or benefits you. Your story matters.

If you have questions about MIT theses in DSpace, please contact [email protected]. See also Access & Availability Questions or About MIT Theses in DSpace.

If you are a recent MIT graduate, your thesis will be added to DSpace within 3-6 months after your graduation date. Please email [email protected] with any questions.

Permissions

MIT Theses may be protected by copyright. Please refer to the MIT Libraries Permissions Policy for permission information. Note that the copyright holder for most MIT theses is identified on the title page of the thesis.

Theses by Department

  • Comparative Media Studies
  • Computation for Design and Optimization
  • Computational and Systems Biology
  • Department of Aeronautics and Astronautics
  • Department of Architecture
  • Department of Biological Engineering
  • Department of Biology
  • Department of Brain and Cognitive Sciences
  • Department of Chemical Engineering
  • Department of Chemistry
  • Department of Civil and Environmental Engineering
  • Department of Earth, Atmospheric, and Planetary Sciences
  • Department of Economics
  • Department of Electrical Engineering and Computer Science
  • Department of Humanities
  • Department of Linguistics and Philosophy
  • Department of Materials Science and Engineering
  • Department of Mathematics
  • Department of Mechanical Engineering
  • Department of Nuclear Science and Engineering
  • Department of Ocean Engineering
  • Department of Physics
  • Department of Political Science
  • Department of Urban Studies and Planning
  • Engineering Systems Division
  • Harvard-MIT Program of Health Sciences and Technology
  • Institute for Data, Systems, and Society
  • Media Arts & Sciences
  • Operations Research Center
  • Program in Real Estate Development
  • Program in Writing and Humanistic Studies
  • Science, Technology & Society
  • Science Writing
  • Sloan School of Management
  • Supply Chain Management
  • System Design & Management
  • Technology and Policy Program

Collections in this community

  • Doctoral Theses
  • Graduate Theses
  • Undergraduate Theses

Recent Submissions

  • The properties of amorphous and microcrystalline Ni-Nb alloys
  • Towards Biologically Plausible Deep Neural Networks
  • Randomized Data Structures: New Perspectives and Hidden Surprises


Department of Computer Science

Thesis Projects and Research in Data Science

The Master's thesis is a mandatory course of the Master's program in Data Science. The thesis is supervised by a professor on the data science faculty list.

Research in Data Science is a core elective for students in Data Science under the supervision of a data science professor.

Research in Data Science

The project is independent work under the supervision of a member of the data science faculty.

Only students who have passed at least one core course in Data Management and Processing, and one core course in Data Analysis can start with a research project.

Before starting, the project must be registered in mystudies, and a project description must be submitted at the start of the project to the studies administration by e-mail (see Contact in the right column for the address).

Master's Thesis

The Master's Thesis requires 6 months of full time study/work, and we strongly discourage you from attending any courses in parallel. We recommend that you acquire all course credits before the start of the Master’s thesis. The topic for the Master’s thesis must be chosen within Data Science.

Before starting a Master’s thesis, it is important to agree with your supervisor on the task and the assessment scheme. Both have to be documented thoroughly. You electronically register the Master’s thesis in mystudies.

It is possible to complete the Master’s thesis in industry provided that a professor involved in the Data Science Master’s program supervises the thesis and your tutor approves it.

Further details on internal regulations of the Master’s thesis can be downloaded from the following website: www.inf.ethz.ch/studies/forms-and-documents.html .

Overview Master's Theses Projects

Chair of programming methodology.

  • Prof. Dr. Martin Vechev

Institute for Computing Platform

  • Prof. Dr. Gustavo Alonso
  • Prof. Dr. Torsten Hoefler
  • Prof. Dr. Ana Klimovic
  • Prof. Dr. Timothy Roscoe

Institute for Machine Learning

  • Prof. Dr. Valentina Boeva
  • Prof. Dr. Joachim Buhmann
  • Prof. Dr. Ryan Cotterell    
  • Prof. Dr. Menna El-Assady
  • Prof. Dr. Niao He
  • Prof. Dr. Thomas Hofmann
  • Prof. Dr. Andreas Krause
  • Prof. Dr. Fernando Perez Cruz
  • Prof. Dr. Gunnar Rätsch
  • Prof. Dr. Mrinmaya Sachan
  • Prof. Dr. Bernhard Schölkopf
  • Prof. Dr. Julia Vogt

Institute for Pervasive Computing

  • Prof. Dr. Otmar Hilliges

Institute of Computer Systems

  • Prof. Dr. Markus Püschel

Institute of Information Security

  • Prof. Dr. David Basin
  • Prof. Dr. Srdjan Capkun
  • Prof. Dr. Florian Tramèr

Institute of Theoretical Computer Science

  • Prof. Dr. Bernd Gärtner

Institute of Visual Computing

  • Prof. Dr. Markus Gross
  • Prof. Dr. Marc Pollefeys
  • Prof. Dr. Olga Sorkine
  • Prof. Dr. Siyu Tang

Disney Research Zurich

  • Prof. Dr. Robert Sumner

Automatic Control Laboratory

  • Prof. Dr. Florian Dörfler
  • Prof. Dr. John Lygeros

Communication Technology Laboratory

  • Prof. Dr. Helmut Bölcskei

Computer Engineering and Networks Laboratory

  • Prof. Dr. Laurent Vanbever
  • Prof. Dr. Roger Wattenhofer

Computer Vision Laboratory

  • Prof. Dr. Ender Konukoglu
  • Prof. Dr. Luc Van Gool
  • Prof. Dr. Fisher Yu

Institute for Biomedical Engineering

  • Prof. Dr. Klaas Enno Stephan

Integrated Systems Laboratory

  • Prof. Dr. Luca Benini
  • Prof. Dr. Christoph Studer

Signal and Information Processing Laboratory (ISI)

  • Prof. Dr. Amos Lapidoth
  • Prof. Dr. Hans-Andrea Loeliger

D-MATH does not publish Master's thesis projects. In case of interest, contact the professor directly.

FIM - Institute for Mathematical Research

  • Prof. Dr. Alessio Figalli

Financial Mathematics

  • Prof. Dr. Josef Teichmann

Institute for Operations Research

  • Prof. Dr. Robert Weismantel
  • Prof. Dr. Rico Zenklusen

RiskLab Switzerland

  • Prof. Dr. Patrick Cheridito
  • Prof. Dr. Mario Valentin Wüthrich

Seminar for Applied Mathematics

  • Prof. Dr. Rima Alaifari
  • Prof. Dr. Siddhartha Mishra

Seminar for Statistics

  • Prof. Dr. Afonso Bandeira
  • Prof. Dr. Peter Bühlmann
  • Prof. Dr. Yuansi Chen
  • Prof. Dr. Nicolai Meinshausen
  • Prof. Dr. Jonas Peters
  • Prof. Dr. Johanna Ziegel

Law, Economics, and Data Science Group

  • Prof. Dr. Elliott Ash (D-GESS)

Institute for Geodesy and Photogrammetry

  • Prof. Dr. Konrad Schindler (D-BSSE)

Thesis Option

Data Science master’s students can choose to satisfy the research experience requirement by selecting the thesis option. Students will spend the majority of their second year working on a substantial data science project that culminates in the submission and oral defense of a master’s thesis. While all thesis projects must be related to data science, students are given leeway in finding a project in a domain of study that fits with their background and interest.

All students choosing the thesis option must find a research advisor and submit a thesis proposal by mid-April of their first year of study. Thesis proposals will be evaluated by the Data Science faculty committee and only those students whose proposals are accepted will be allowed to continue with the thesis option.  

To account for the time spent on thesis research, students choosing the thesis option are able to substitute three required courses (the Capstone and two "free" elective courses, as defined in the final bullet point on the degree requirement page) with AC 302.

In Applied Computation

  • How to Apply
  • Learning Outcomes
  • Master of Science Degree Requirements
  • Master of Engineering Degree Requirements
  • CSE courses
  • Degree Requirements
  • Data Science courses
  • Data Science FAQ
  • Secondary Field Requirements
  • Advising and Other Activities
  • Alumni Stories
  • Financing the Degree
  • Student FAQ


Bachelor and Master Thesis

We offer a variety of cutting-edge and exciting research topics for Bachelor's and Master's theses. We cover a wide range of topics from Data Science, Natural Language Processing, Argument Mining, the Use of AI in Business, Ethics in AI and Multimodal AI. We are always open to suggestions for your own topics, so please feel free to contact us. We supervise students from all disciplines of business administration, business informatics, computer science and industrial engineering.

Thesis Topics

Example topics could be:

  • Conversational Artificial Intelligence in Insurance and Finance
  • Natural Language Processing for Understanding Financial Narratives: An Overview
  • Ethics at the Intersection of Finance and AI: A Comprehensive Literature Review
  • Explainable Natural Language Processing for Credit Risk Assessment Models: A Literature Review

Thesis Template

  • Latex Template for bachelor and master theses
  • How to use the latex template

Q1: How many pages do I need to write?

A: In general, the number of pages is only a poor indicator of the quality of a thesis. However, as a rule of thumb, bachelor theses should have around 30 pages, while master theses should be around 60 pages of main content (that is, without the appendix and lists of tables, symbols, figures, references etc.).

Q2: How often should I meet with my supervisor?

A: Your supervisors are typically very busy people. However, don't hesitate to ask in case you have questions. For instance, if you are unsure of some requirements, or in case you have methodological problems, it is absolutely necessary to talk to your supervisor. As a rule of thumb, you should meet at least three times (once in the beginning, once in the middle, and once before the submission).

Q3: Am I allowed to use any AI models in the process of writing my thesis?

A: In general, we neither forbid nor recommend the use of AI for writing support. However, if you use AI, please inform your supervisor. Also, you need to adhere to the recommendations on the use of AI writing assistants given by the faculty.

Q4: How much time do I have?

A: The exact timing is dependent on your study program! Thus, please check the examination requirements before the official start of your thesis -- you are responsible for sticking to the rules.

MSc in Data Science, Project Guide, 2018-2019

NEW: List of project areas is available!

Introduction

The project is an essential component of the Masters course. It is a substantial piece of full-time independent research in some area of data science. You will carry out your project under the individual supervision of a member of CDT staff.

The project will occupy a large part of your time during the Spring semester, and 100% of your time from late May/early June — once your examinations have completed — until mid-August. A dissertation describing the work must be submitted by a deadline in mid-August.

Choosing a Project

You are expected to choose a project at the end of Semester 1. Students are expected to find their own projects in consultation with supervisors. To help with this, staff will post some project ideas in late October. These will be indicative of their areas of interest, but they shouldn't be interpreted as a fixed menu; they are simply the starting point for discussion. The procedure for project selection is:

  • You should identify some research areas that interest you, on the basis of your coursework so far, your independent reading, the guest lectures in IRDS, and the set of project ideas proposed by staff in late October.
  • Arrange meetings with supervisors in those research areas to discuss potential MSc projects. Often supervisors will have several potential project ideas in mind, but you should of course bring up any potential directions that you have been thinking about.
  • IMPORTANT: Once you have identified a project and a supervisor who is willing to take you on, you will need to fill out a brief form identifying the topic and supervisor. The deadline for this is the 12th of December, 2018.
  • The project proposals will all be reviewed for suitability by the CDT project coordinator. However, your proposal is not a contract and we are not going to hold you to it. It should simply represent a good-faith attempt to identify a topic of mutual interest to you and your supervisor.

Schedule and Important Dates

The overall schedule is: You will meet with supervisors during Semester 1 and select a project shortly after Semester 1 classes end. Once you have selected a project, we recommend that you get a head start on your project over the winter break. During Semester 2, you will work approximately 50% on coursework and 50% on your project. After classes end in Semester 2, you will have a revision period for your exams — during this period we recommend that you focus on your exams. Once the exams complete, you should return to your project work, spending 100% time on it until the final deadline in mid-August.

Here are the important dates and deadlines for 2018-19:

  • November -- You should start meeting with potential MSc supervisors now (if you have not begun already)
  • 12 December 2018 -- MSc project selections due (RTDS students).
  • 11 January 2019, noon -- Interim Report due (RTDS+ students).
  • 1 March 2019, noon -- Interim Report due (RTDS students).
  • April - May 2019 -- Revision period and exams. During this period we would not expect you to be making much progress on your project.
  • late May 2019 -- Begin full-time work on project.
  • mid-August 2019 (exact date TBD, probably 16 Aug) -- Deadline for submission of dissertation.
  • October 2019 -- Board of Examiners meets and marks announced.

Supervision

As part of choosing a project, you will also choose a supervisor. Your supervisor gives technical advice and also assists you in planning the research. Students should expect approximately weekly meetings with their supervisor. Backup supervisors may be allocated to cover periods of absence of the supervisor, if necessary.

Interim Report

At the beginning of March (or early January for RTDS+), you will submit an interim report about how your project has gone so far. This should be 6-8 pages. This report will not form part of the mark; it is solely for feedback, so it is in your best interest to complete it. The report should describe the research problem that you are considering, explain why it is important, what methods you expect to use, how you expect to evaluate your results, what results you have been able to obtain so far, and what your plans are for the summer. You should write this in such a way that you can re-use the text in your final MSc project report.

Relationship to Your PhD Project

The MSc project is designed to be a first research project that prepares you for the more extended work that you will do in your PhD. The project is intended to be novel research — we hope that in some cases the MSc projects will lead to publishable results, although this is not required and will not always be possible, depending on the nature of the project. Your supervisor should help you identify a topic that has the potential to lead into a larger PhD project, should you decide to continue research in the area.

That said, it is not required that your PhD research be in the same area as your MSc research. Some students will indeed continue their PhD work with the same research area and supervisor as their MSc. Others will choose a different PhD supervisor. Both of these outcomes are expected and are perfectly fine.

Of course if you do already have a good idea about your intended PhD topic, you will want to take this into account when selecting your MSc topic — whether it be to choose a topic in the same area, or to choose a topic that will provide you with complementary experience.

Projects with External Collaborators

Some students may wish to undertake a project which relates to the activities of one of our external partners. Alternatively, some projects that supervisors suggest to you may have a natural relationship with one of the CDT partners. This is encouraged. A student undertaking such a project will still need to find an academic supervisor who is willing to take on the project. During the project phase, students working on such projects have both an academic supervisor and a designated contact at the partner organization.

We strongly encourage you to discuss your projects with other students, talk informally about your progress, and get advice from your peers about any issues. Last year this happened as part of the CDT Tea meetings; this year, we will discuss whether to continue this or to have more formal tutorials.

The Dissertation

  • Title page with abstract.
  • Introduction : an introduction to the document, clearly stating the hypothesis or objective of the project, motivation for the work and the results achieved. The structure of the remainder of the document should also be outlined.
  • Background : background to the project, previous work, exposition of relevant literature, setting of the work in the proper context. This should contain sufficient information to allow the reader to appreciate the contribution you have made.
  • Description of the work undertaken : this may be divided into chapters describing the conceptual design work and the actual implementation separately. Any problems or difficulties and the suggested solutions should be mentioned. Alternative solutions and their evaluation should also be included.
  • Analysis or Evaluation : results and their critical analysis should be reported, whether the results conform to expectations or otherwise and how they compare with other related work. Where appropriate evaluation of the work against the original objectives should be presented.
  • Conclusion : concluding remarks and observations, unsolved problems, suggestions for further work.
  • Bibliography .

In addition, the dissertation must be accompanied by a statement declaring that the student has read and understood the University's plagiarism guidelines.

In the acknowledgments section of your dissertation, in addition to thanking anyone that you wish, you should also acknowledge the funding sources that have supported you during the year. Please follow these instructions for acknowledging your funding sources . You should get to know them well as you will also need to follow them for every paper that you publish during your PhD.

Students should write as they go , but should also budget several weeks towards the end of the project to focus on writing. Where appropriate the dissertation may additionally contain appendices in which relevant program listings, experimental data, circuit diagrams, formal proofs, etc. may be included. However, students should keep in mind that they are marked on the quality of the dissertation, not its length.

The dissertation must be word-processed using either LaTeX or a system with similar capabilities. The LaTeX thesis template can be found via the local packages web page. You don't have to use these packages, but your thesis must match the style (i.e., font size, text width, etc.) shown in the sample output for an Informatics thesis.
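As an illustrative sketch only (the actual class and styling come from the Informatics template on the local packages page, not from this outline), a top-level LaTeX source following the chapter structure described earlier might look like:

```latex
\documentclass[12pt]{report} % placeholder class; substitute the Informatics thesis template
\begin{document}

\title{Dissertation Title}
\author{Author Name}
\maketitle

\begin{abstract}
One-paragraph summary of the hypothesis, methods, and results.
\end{abstract}

% Chapters mirroring the required dissertation structure
\chapter{Introduction}
\chapter{Background}
\chapter{Description of the Work Undertaken}
\chapter{Analysis and Evaluation}
\chapter{Conclusion}

\bibliographystyle{plain}
\bibliography{references}

\end{document}
```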

Computing Resources

Many projects will require computing resources. Please see the CDT handbook for information about what computing resources are available to CDT students.

If a project requires anything more, this needs to be requested at the time of writing the proposal, and the supervisor needs to explicitly ask for additional resources if necessary (start by talking to the CDT projects organizer, below).

Technical problems during project work are considered only for resources we provide. No technical support, compensation for lost data, or extensions for time lost due to technical problems with external hardware or software will be given, except where this is explicitly stated as part of a project specification and adequately resourced at the start of the project.

Students must submit their project by the deadline in mid-August (see above). Students need to submit a hard copy, an electronic copy, and archived software as detailed below.

  • Hard Copy. Two printed copies of the dissertation, bound with the soft covers provided by the School, must be submitted to the ITO before the deadline.
  • Electronic Copy. Students must follow the instructions for how to submit their project electronically. Please use the online submission form that is linked from there.
  • Software. Students are required to preserve any software they have generated, source, object and make files, together with any associated data that has been accumulated. When you submit the electronic copy of your thesis you will also be asked to provide an archive file (tar or zip) containing all the project materials. You should create a directory, for example named PROJECT , in your file space specifically for the purpose. Please follow the accepted practice of creating a README file which documents your files and their function. This directory should be compressed and then submitted, together with the electronic version of the thesis, via the online submission webpage. See these instructions for how to submit your project electronically.
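As a minimal sketch of the software submission step above (directory and file names are examples, not prescribed by the School), the archive could be prepared like this:

```shell
# Create a directory holding all project materials (the name PROJECT is an example).
mkdir -p PROJECT/src PROJECT/data

# Document the files and their function in a README, as the accepted practice suggests.
printf 'src/  - source, object and make files\ndata/ - accumulated experimental data\n' > PROJECT/README

# Compress the directory for submission (a zip archive works equally well).
tar -czf PROJECT.tar.gz PROJECT
```

The resulting PROJECT.tar.gz is what you would upload together with the electronic copy of the thesis via the online submission webpage.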

Project Assessment

  • Understanding of the problem
  • Completion of the work
  • Quality of the work
  • Quality of the dissertation
  • Knowledge of the literature
  • Critical evaluation of previous work
  • Critical evaluation of own work
  • Justification of design decisions
  • Solution of conceptual problems
  • Amount of work
  • Evidence of outstanding merit e.g. originality
  • Inclusion of material worthy of publication

The project involves both the application of skills learned in the past and the acquisition of new skills. It allows students to demonstrate their ability to organise and carry out a major piece of work according to sound scientific and engineering principles. The types of activity involved in each project will vary but all will typically share the following features:

  • Research the literature and gather background information
  • Analyse requirements, compare alternatives and specify a solution
  • Design and implement the solution
  • Experiment and evaluate the solution
  • Develop written and oral presentation skills

You may have noticed that there is both a 90pt version of the project (RTDS) and a 120pt version (RTDS+). The 120pt version is for students who arrive with a previous Master's degree in an area related to data science, along with a clear project and a supervisor in mind, and who therefore want to take fewer classes and do a larger project. If you wish to choose this option, you must speak to the CDT Year 1 organizer during course registration; see the MSc by Research Course Handbook for more information.

The RTDS+ project works the same as the RTDS project, except that: (a) You are expected to have selected a supervisor by 21 September; (b) You should commence work on your project part-time in the autumn; (c) You should submit an interim report by 11 January; and (d) The markers will look to see evidence of more work or a more advanced project, commensurate to the additional amount of time you have had. For example, a larger project might make a larger research contribution, apply more advanced methodology, contain more extensive experimental evaluation, etc.

This page is currently maintained by Adam Lopez .


The chair typically offers various thesis topics each semester in the areas of computational statistics, machine learning, data mining, optimization and statistical software. You are welcome to suggest your own topic as well.

Before you apply for a thesis topic make sure that you fit the following profile:

  • Knowledge in machine learning.
  • Good R or python skills.

Before you start writing your thesis you must look for a supervisor within the group.

Send an email to the contact person listed in the potential thesis topics files with the following information:

  • Planned starting date of your thesis.
  • Thesis topic (of the list of thesis topics or your own suggestion).
  • Previously attended classes on machine learning and programming with R.

Your application will only be processed if it contains all required information.

Potential Thesis Topics


Below is a list of potential thesis topics. Before you start writing your thesis you must look for a supervisor within the group.

Available thesis topics

Disputation

The disputation of a thesis lasts about 60-90 minutes and consists of two parts. Only the first part is relevant for the grade; it takes 30 minutes (bachelor thesis) or 40 minutes (master thesis). Here, the student is expected to summarize the main results of the thesis in a presentation. The supervisor(s) will ask questions regarding the content of the thesis in between. In the second part (after the presentation), the supervisors will give detailed feedback and discuss the thesis with the student. This will take about 30 minutes.

  • How do I prepare for the disputation?

You have to prepare a presentation, and if there is a bigger time gap between handing in your thesis and the disputation, you might want to reread your thesis.

  • How many slides should I prepare?

That's up to you, but you have to respect the time limit. Preparing more than 20 slides for a Bachelor's presentation or more than 30 slides for a Master's is very likely a bad idea.

  • Where do I present?

Bernd's office, in front of the big TV. At least one PhD student will be present, maybe more. If you want to present in front of a larger audience in the seminar room or the old library, please book the room yourself and inform us.

  • English or German?

We do not care, you can choose.

  • What do I have to bring with me?

A document (Prüfungsprotokoll), which you get from the "Prüfungsamt" (Frau Maxa or Frau Höfner), for the disputation. Your laptop or a USB stick with the presentation. You can also email Bernd a PDF.

  • How does the grading work?

The student will be graded on the quality of the thesis, the presentation, and the oral discussion of the work. The grade is mainly determined by the written thesis itself, but it can improve or drop depending on the presentation and your answers to the defense questions.

  • What should the presentation cover?

The presentation should cover your thesis, including motivation, introduction, description of new methods, and results of your research. Please do NOT explain already existing methods in detail here; put more focus on your novel work and the results.

  • What kind of questions will be asked after the presentation?

The questions will be directly connected to your thesis and related theory.

Student Research Projects

We are always interested in mentoring interesting student research projects. Please contact us directly with an interesting research idea. In the future, you will also be able to find research project topics below.

Available projects

Currently we are not offering any student research projects.

For more information please visit the official web page Studentische Forschungsprojekte (Lehre@LMU)

Current Theses (With Working Titles)

Completed Theses

Completed Theses (LMU Munich)

Completed Theses (Supervised by Bernd Bischl at TU Dortmund)

Department of Data Science

Master Theses

Below you find our current topic proposals as pdf-files.

If you are interested in a certain topic, please send an e-mail to wima-abschlussarbeiten[at]lists.fau.de. Please refrain from writing emails to other addresses.

Your e-mail should include

  • your transcript of records
  • a letter of motivation (approximately half a page)
  • desired date at which you want to start
  • latest possible date of submission.

In your letter of motivation please state which of the topic proposals you are interested in. If none of these proposals interest you please state which type of thesis you desire (e.g. literature study) and which field you are interested in.

Topic proposals (with corresponding advisers)

  • Optimization of Optical Particle Properties under Uncertainty (Frauke Liers)
  • Analysis and Prediction of Asynchronous Event Sequences Considering Uncertainty @ Medical Technology (Frauke Liers, thesis together with Siemens Healthineers, Erlangen)
  • Optimized Qubit Routing for Commuting Gates (Frauke Liers)

Furthermore, students are welcome to contact abschlussarbeiten[at]lists.fau.de to jointly define a thesis topic in one of the following areas:

  • Optimization under uncertainty (Frauke Liers)
  • Integration of data analysis with optimization (Frauke Liers)

Previous Theses

  • Adviser: Alexander Martin
  • Adviser: Kevin-Martin Aigner, Frauke Liers
  • Adviser: Jan Rolfes, Timm Oertel
  • Adviser: Jan Rolfes, Frauke Liers
  • Adviser: Jan Rolfes, Jana Dienstbier, Frauke Liers
  • Adviser: Martina Kuchlbauer, Frauke Liers
  • Adviser: Martina Kuchlbauer, Jana Dienstbier, Frauke Liers
  • Adviser: Yiannis Giannakopoulos
  • Adviser: Andreas Bärmann, Alexander Martin
  • Adviser: Jan Krause, Andreas Bärmann, Alexander Martin
  • Adviser: Christian Biefel, Frauke Liers
  • Adviser: Jonasz Staszek, Alexander Martin
  • Adviser: Lukas Hümbs, Alexander Martin
  • Adviser: Kristin Braun, Frauke Liers
  • Adviser: Lukas Glomb, Florian Rösel, Frauke Liers
  • Adviser: Bismark Singh, Alexander Martin
  • Adviser: Kristin Braun, Johannes Thürauf, Robert Burlacu, Frauke Liers
  • Adviser: Oskar Schneider, Alexander Martin
  • Optimization of energy supply in critical infrastructures using battery electric vehicles Adviser: Bismark Singh, Alexander Martin
  • Approximations to the Clustered Traveling Salesman Problem with an Application in Perm, Russia Adviser: Bismark Singh, Alexander Martin
  • Decomposition methods for energy optimization models Adviser: Bismark Singh, Alexander Martin
  • Aircraft Trajectory Optimization and Disjoint Paths Adviser: Benno Hoch, Frauke Liers
  • Obere und untere Schranken für das Set-Cover mittels Lasserre Hierachie Adviser: Jan Rolfes, Alexander Martin
  • Separation Algorithms and Reformulations for Single-Item Lot-Sizing with Non-Delivery Penalties Adviser: Dieter Weninger
  • Optimales Scheduling an Maschinen Adviser: Kevin-Martin Aigner, Jan Rolfes, Alexander Martin
  • Optimization of scenario-expanded tail assignment problems including maintenance Adviser: Lukas Glomb, Florian Rösel, Frauke Liers
  • Anti-Lifting: Sparsifizierung bei gemischt-ganzzahligen Optimierungsproblemen Adviser: Katrin Halbig, Alexander Martin
  • Solving Mixed-Integer Problems using Machine Learning for the Optimization of Energy Production Adviser: Christian Biefel, Lukas Hümbs, Alexander Martin
  • Preprocessing Techniques for Mixed-Integer Bilevel Problems Adviser: Thomas Kleinert, Dieter Weninger, Alexander Martin
  • Projection and Farkas Type Lemmas for Mixed Integer Programs Adviser: Richard Krug, Alexander Martin
  • Gamma-robuste lineare Komplementaritätssysteme Adviser: Vanessa Krebs, Martin Schmidt
  • Verseiloptimierung (Kooperation mit LEONI) Adviser: Alexander Martin
  • Data-based Methods for Chance Constraints in DC Optimal Power Flow with Extension to Curtailment Adviser: Kevin-Martin Aigner, Frauke Liers
  • Different Concepts of Distributionally Robust Vehicle Routing Problems Adviser: Sebastian Tschuppik, Dennis Adelhütte, Frauke Liers
  • Optimierungsmethoden für Logistikprozesse im Krankenhaus Adviser: Andreas Bärmann, Dieter Weninger, Alexander Martin
  • Lagrange Relaxierung Energienetze Kooperation Jülich Adviser: Johannes Thürauf, Lars Schewe
  • The price of robustness in the European entry-exit market Adviser: Thomas Kleinert, Frauke Liers
  • Kosteneffizienter Betrieb von Smart Grids mit Gomory Schnittebenen Adviser: Martin Schmidt, Galina Orlinskaya
  • On finding sparse descriptions of polyhedra with mixed-integer programming Adviser: Alexander Martin, Patrick Gemander, Oskar Schneider
  • Mathematische Optimierung für chromatographische Verfahren zur Trennung von Stoffgemischen Adviser: Frauke Liers, Robert Burlacu
  • Machine Learning gestützte Prognose der Performance zukünftiger Lieferungen unter Verwendung von adaptiven Algorithmen und geeigneten Datenstrukturen im Transportmanagement Adviser: Frauke Liers, Andreas Bärmann
  • Discrete optimization for optimal train control Adviser: Alexander Martin, Andreas Bärmann
  • Optimierte Flottenplanung in der Luftfahrt unter Berücksichtigung der Betankungsstrategie Adviser: Alexander Martin, Andreas Bärmann
  • Lipschitzoptimierung am Beispiel des europäischen Gasmarktes Adviser: Martin Schmidt, Thomas Kleinert
  • Graphzerlegungen und Alternating Direction Methode für Gasnetzwerke Adviser: Martin Schmidt
  • Multikriterielle Optimierung für Graphendekompositionen in der Gasnetzoptimierung Adviser: Martin Schmidt
  • Robuste Gleichgewichtsprobleme im Energiebereich Adviser: Martin Schmidt, Vanessa Krebs
  • MIP Methoden in der Fördertechnik Adviser: Alexander Martin, Andreas Bärmann, Patrick Gemander
  • Verwendung von SVMs für medizinische Diagnostik Adviser: Frauke Liers, Dieter Weninger
  • Ausbauplanung für städtische Verkehrsnetze Adviser: Alexander Martin, Andreas Bärmann
  • Zuschnittoptimierung und Parametervariation in der Flachglasindustrie Adviser: Lars Schewe
  • Ermittlung optimaler Höchstabfluggewichte unter Unsicherheit Adviser: Alexander Martin, Lena Hupp, Martin Weibelzahl
  • Online-Optimierung in Hinblick auf Prognoseunsicherheiten bei erneuerbaren Energien mittels basisorientierter Szenarienreduktion Adviser: Alexander Martin, Christoph Thurner
  • A variable decomposition algorithm for production planning Adviser: Alexander Martin, Dieter Weninger
  • Mixed integer moving horizon control for flexible energy storage systems Adviser: Martin Schmidt
  • Mathematische Analyse von Kompaktheitsmaßen in der Gebietsplanung anhand eines Modells zur Dienstleisterauswahl bei Transportausschreibungen Adviser: Alexander Martin
  • Methoden zur Laufzeitverbesserung eines Mixed-Integer Program in der Entsorgungslogistik Adviser: Alexander Martin
  • Aktuelle Erkenntnisse bei Pivotregeln des Simplexverfahrens Adviser: Frauke Liers
  • Radius of robust feasibility for the robust stochastic nomination validation problem in passive gas networks Adviser: Frauke Liers, Denis Aßmann
  • Mathematische Modelle und Optimierung für die automatische Permutation von Schließanlagen Adviser: Alexander Martin
  • Mathematische Modellierung von Stromnetzen: Ein Vergleich AC- und DC-Modell hinsichtlich Investitionsentscheidungen Adviser: Frauke Liers
  • Diskrete Optimierung im Immobilien-Investing Adviser: Alexander Martin
  • Polyedrische und komplexitätstheoretische Untersuchungen von bipartiten Matchingproblemen mit quadratischen Termen Adviser: Frauke Liers
  • Discrete Selection of Diameters for Constructing Optimal Hydrogen Pipeline Networks Adviser: Lars Schewe
  • Optimierte Tourenplanung im Krankentransport unter Berücksichtigung von Zeitfenstern Adviser: Frauke Liers
  • Optimale Preiszonen und Investitionsentscheidungen unter Berücksichtigung von Stromspeichern – Eine modelltheoretische Analyse des Strommarkts Adviser: Alexander Martin
  • Robuste Eigenanteilplanung und Belegungsplanung sowie Personalplanung für ein Pflegeheim Adviser: Frauke Liers
  • Dynamische automatisierte Rampensteuerung Adviser: Alexander Martin
  • A combinatorial splitting algorithm for checking feasibility of passive gas networks under uncertain injection patterns Adviser: Frauke Liers, Denis Aßmann
  • Integrated Optimization Problems in the Airline Industry Adviser: Alexander Martin
  • Mathematische Modellierung eines Produktionshochlaufs bei Kromberg & Schubert Adviser: Frauke Liers
  • On the uniqueness of competitive market equilibria on DC networks Adviser: Martin Schmidt
  • The Clique-Problem under Multiple-Choice Constraints with Cycle-Free Dependency Graphs Adviser: Alexander Martin, Andreas Bärmann
  • A Decomposition Approach for a Multilevel Graph Partitioning Model of the German Electricity Market Adviser: Martin Schmidt
  • Zyklisches Scheduling in der Kirche – Mathematische Modellierung und Optimierung Adviser: Alexander Martin, Thorsten Ramsauer
  • Optimierung von Flugbahnen: Ein gemischt-ganzzahliges Modell zur Berechnung von optimalen Trajektorien-Netzwerken Adviser: Frauke Liers
  • Robuste Optimierungsmethoden für Nominierungsvalidierung in Gasnetzwerken bei Nachfrageunsicherheiten Adviser: Frauke Liers, Denis Aßmann
  • Stable Set Problem with Multiple Choice Constraints on Staircase Graphs Adviser: Alexander Martin
  • Anwendung von robusten Flussproblemen für die optimale Speichersteuerung im Smart Grid Adviser: Frauke Liers
  • Das Sternsingerproblem: Planung, Modellierung und mathematische Optimierung Adviser: Alexander Martin, Martin Weibelzahl
  • A Bilevel Optimization Model for Holy Mass Planning Adviser: Alexander Martin, Martin Weibelzahl
  • Optimal Personnel Management in Church: A Robust Optimization Approach for Operative and Strategic Planning Adviser: Frauke Liers, Martin Weibelzahl
  • Active-Passive-Vehicle-Routing-Problem Adviser: Alexander Martin, Michael Drexl (Fraunhofer SCS)
  • Robuste Optimierung in der Flugplanung: Entwicklung eines statischen sowie eines zeitexpandierten Modells zur robusten Zeitfenster-Zuordnung in der prätaktischen Phase Adviser: Frauke Liers
  • Personaleinsatzplanung im Einzelhandel unter Berücksichtigung von Unsicherheiten mithilfe mathematischer Optimierung Adviser: Alexander Martin, Falk Meyerholz (Fraunhofer ILS)
  • Clustering von Bahnweichen und Analyse von Störungen zur Optimierung der Instandhaltungsmaßnahmen Adviser: Frauke Liers, Thomas Böhm (DLR)
  • Mikroökonomische Haushaltstheorie unter Unsicherheit: mathematische Perspektive Adviser: Alexander Martin, Martin Weibelzahl
  • Lösung von realen Probleminstanzen bei der Tourenplanung in der ambulanten Pflege mit Hilfe eines Cluster-First-Route-Second-Ansatzes Adviser: Alexander Martin
  • Revenue Management als Netzwerkproblem unter Unsicherheiten mit Anwendung im Fernbusmarkt Adviser: Alexander Martin, Lars Schewe
  • Das Partial Digest Problem mit absoluten Fehlern als Matching und Anwendung der Lagrange-Relaxierung Adviser: Frauke Liers
  • Optimierung im Produktionsablauf bei Elektrolux Rothenburg – Laserline (schriftliche Hausarbeit im Rahmen der Ersten Staatsprüfung für das Lehramt an Gymnasien in Bayern) Adviser: Alexander Martin
  • Memetische Optimierung des Generalized Travelling Salesman Problems Adviser: Alexander Martin
  • Estimating the optimal schedule of a Vehicle Routing Problem arising in Bulk Distribution Network Optimisation Adviser: Alexander Martin
  • Nominierungsvalidierung bei Gasnetzen: Einfluss und mögliche Behandlung von Unsicherheiten Adviser: Frauke Liers
  • Ein exaktes Lösungsverfahren für das Optimierungsproblem des Partial Digest mit absoluten Fehlern Adviser: Frauke Liers
  • Mathematische Optimierung des Bidmanagements in der Reisebranche Adviser: Alexander Martin, Lars Schewe
  • Optimierung der Staffeleinteilung in der Fußball-Landesliga Bayern und Konzipierung vereinsfreundlicher Spielpläne Adviser: Alexander Martin, Andreas Heidt
  • Potentialanalyse für die Transportlogistik im Krankenhauswesen Adviser: Alexander Martin, Andrea Peter
  • Anpassungstests mit Nuisanceparametern für das lineare Regressionsmodell Adviser: Alexander Martin
  • Robuste Optimierung für Scheduling Probleme im Luftverkehrsmanagement Adviser: Alexander Martin, Andreas Heidt
  • Reisewegbasierte Flottenoptimierung bei differenzierter Passagiernachfrage Adviser: Alexander Martin
  • Gemischt bivariate Verteilungen unter Verwendung einer doppelt-stochastischen Summe und ihre Anwendungen Adviser: Alexander Martin, Ingo Klein
  • Optimierung einer Indoor-Navigation am Flughafen München mittels adaptivem, heuristischem Dijkstra-Verfahren basierend auf partieller Gridgraphenstruktur Adviser: Alexander Martin, Andreas Bärmann
  • A Comparison of cutting-plane closures in R² and R³ Adviser: Alexander Martin, Sebastian Pokutta
  • Tourenplanung bei der Abokiste Adviser: Alexander Martin, Andreas Bärmann
  • Klassifizierung und Strukturanalyse von Produktionsplanungsmodellen, die durch gemischt-ganzzahlig-lineare Programme modelliert sind. Adviser: Alexander Martin, Dieter Weninger
  • Performance-oriented Optimization Techniques for Facility Design Floorplanning Problems Adviser: Alexander Martin, Stefan Schmieder
  • Standortoptimierung als Entscheidungshilfe für Familien bei der Wahl eines Wohnortes unter Berücksichtigung der Infrastruktur Adviser: Alexander Martin
  • Kapazitätsbestimmung in linearen Netzwerken Adviser: Alexander Martin, Lars Schewe
  • Portfoliooptimierung – Der Ansatz von Markowitz unter realen Nebenbedingungen Adviser: Alexander Martin
  • Entwicklung eines Steuerungstools für Cross-Docking Prozesse bei der BMW Group Adviser: Alexander Martin
  • Modellierung und Lösung eines mehrperiodischen deterministischen Standortplanungsproblems unter volatilen Bedarfsmengen Adviser: Alexander Martin
  • Analytische Optimierung von Netzschutzkennlinien Adviser: Alexander Martin, Lars Schewe
  • Hedging von Katastrophenrisiken durch den Einsatz von Industry Loss Warranties Adviser: Alexander Martin, Nadine Gatzert
  • Anwendung mathematischer Werkzeuge zur Umlaufoptimierung am praktischen Beispiel Adviser: Alexander Martin
  • Optimization Methods for Asset Liability Management in a Non-Life Insurance Company Adviser: Alexander Martin, Nadine Gatzert
  • Parametrisierung von 3D-Blattmodellen zur Detektion und Ergänzung unvollständiger Messdaten Adviser: Alexander Martin, Günther Greiner
  • Conducting Optimal Risk Classification for Substandard Annuities in the Presence of Underwriting Risk Adviser: Alexander Martin
  • Incorporating Convex Hulls into an Algorithmic Approach for Territory Design Problems Adviser: Alexander Martin, Sonja Friedrich
  • Expanding Branch & Bound for binary integer programs with a pseudo-Boolean solver and a SAT-based presolver Adviser: Alexander Martin
  • “Effiziente randomisierte Algorithmen für das Erfüllbarkeitsproblem” – Beschleunigung durch Variableninvertierung Adviser: Alexander Martin
  • Preprocessingansätze für die Planung von gekoppelten Strom-, Gas- und Wärmenetzen  Adviser: Alexander Martin, Andrea Zelmer, Debora Mahlke
  • Die kontinuierliche Analyse der kooperativen Effizienz in Gesundheit mit Hilfe parametrischer und nicht-parametrischer Verfahren Adviser: Alexander Martin, Freimut Bodendorf
  • Incorporating Convex hulls into an Algorithmic Approach for Territory Design Problems Adviser: Alexander Martin
  • Auffalten von orthogonalen Bäumen Adviser: Alexander Martin, Ute Günther
  • Heuristic Approaches for the Gate Assignment Problem Adviser: Alexander Martin, Andrea Peter
  • Synchronisation von Tauschpunkten für Flugbesatzungen und technische Anforderungen in der Rotationsplanung von  Verkehrsflugzeugen Adviser: Alexander Martin, Andrea Peter in co-operation with Lufthansa Passage
  • Verteilte vs. ganzheitliche Optimierung in der Luftfahrt Adviser: Alexander Martin, Sebastian Pokutta, Andrea Peter in co-operation with DLR Braunschweig
  • Optimale Schichtenerstellung zur Personalbedarfsermittlung Adviser: Alexander Martin, Henning Homfeld in co-operation with DB Schenker Rail
  • Polyedrische Untersuchungen von Multiple Knapsack Ungleichungen Adviser: Alexander Martin, Henning Homfeld
  • Using vehicle routing heuristics to estimate costs in gas cylinder delivery Adviser: Alexander Martin in co-operation with Linde Gas
  • An Overview of Algorithms for Graph Reliability and Possible Transfer to Dynamic Graph Reliability Adviser: Alexander Martin, Sebastian Pokutta, Nicole Nowak
  • Eine Verallgemeinerung des Quadratic Bottleneck Assignment Problem und Anwendung Adviser: Alexander Martin, Lars Schewe, Sonja Friedrich
  • Vergleich von Optimierungsmodellen beim Erdgashandel Adviser: Alexander Martin, Johannes Müller
  • Branch and Cut Verfahren in der Standortplanung Adviser: Wolfgang Domschke, Alexander Martin
  • Evolution of the Performance of Separate Scheduling Solvers Forced to Cooperate Adviser: Alexander Martin, Andrea Peter
  • Heuristische Ansätze für den Umgang mit Fertigungsrestriktionen in der Herstellung von Blechprofilen Adviser: Alexander Martin, Ute Günther
  • An overview of algorithms for graph reliability and possible transfer to dynamic graph reliability Adviser: Alexander Martin, Nicole Ziems
  • Analyzing and modeling of selected parameters of the facade construction of a building with respect to the sustainability and efficiency of the building Adviser: Alexander Martin
  • Preprocessingtechniken in der Gasnetzwerkoptimierung Adviser: Alexander Martin, Björn Geißler
  • Optimierungsmethoden für die Kopplung von Day-Ahead-Strommärkten Adviser: Alexander Martin, Antonio Morsi, Björn Geißler
  • Ein pfadbasiertes Modell für das Routing von Güterwagen im Einzelwagenverkehr Adviser: Alexander Martin, Henning Homfeld
  • Azyklische Fahrzeugeinsatz- und Instandhaltungsoptimierung im Schienenpersonennahverkehr Adviser: Alexander Martin, Henning Homfeld
  • Ein Arboreszenzmodell für das Leitwegproblem Adviser: Alexander Martin, Henning Homfeld
  • Approximation einer Hyperbel in der diskreten Optimierung Adviser: Alexander Martin, Henning Homfeld
  • Dynamische Programmierung in der Gasnetzwerkoptimierung Adviser: Alexander Martin, Susanne Moritz, Björn Geißler, Antonio Morsi
  • Lösungsmethoden für das Pin Assignment Problem Adviser: Alexander Martin, Antonio Morsi, Björn Geißler
  • Exploiting Heuristics for the Vehicle Routing Problem to estimate Gas Delivery Costs Adviser: Alexander Martin
  • Ganzzahlige Optimierung zur Bestimmung konsistenter Eröffnungspreise von Futures-Kontrakten und ihrer Kombinationen Adviser: Alexander Martin
  • Verfahren zur Lösung des soft rectangle packing problem Adviser: Alexander Martin, Armin Fügenschuh
  • Optimierung der Leitwegeplanung im Schienengüterverkehr Adviser: Alexander Martin, Armin Fügenschuh, Henning Homfeld
  • Optimierungsmethoden zur Berechnung von Cross-Border-Flow beim Market-Coupling im europäischen Stromhandel Adviser: Alexander Martin, Antonio Morsi, Björn Geißler
  • Gemischt-ganzzahliges Modell zur Entwicklung optimaler Erneuerungsstrategien für Wasserversorgungsnetze Adviser: Alexander Martin, Antonio Morsi
  • Algorithmische Behandlung des All-Different Constraints im Branch&Cut Adviser: Alexander Martin, Thorsten Gellermann
  • Empiric Analysis of Convex Underestimators in Mixed Integer Nonlinear Optimization Adviser: Alexander Martin, Thorsten Gellermann
  • Partial Reverse Search Adviser: Alexander Martin, Lars Schewe
  • Zufallsbasierte Heuristik für gekoppelte Netzwerke in der dezentralen Energieversorgung Adviser: Alexander Martin, Debora Mahlke, Andrea Zelmer
  • Polyedrische Untersuchungen an einem stochastischen Optimierungsproblem aus der regenerativen Energieversorgung Adviser: Alexander Martin, Debora Mahlke, Andrea Zelmer
  • Relax & Fix Heuristik für ein stochastisches Problem aus der regenerativen Energieversorgung Adviser: Alexander Martin, Debora Mahlke, Andrea Zelmer
  • Test Sets for Spanning Tree Problems with Side Constraints Adviser: Alexander Martin, Ute Günther
  • Polyedrische Untersuchungen zur Kostenoptimierung der Geldautomatenbefüllung Adviser: Alexander Martin, Ute Günther
  • Effiziente stückweise lineare Approximation bivariater Funktionen Adviser: Ulrich Reif, Armin Fügenschuh, Andrea Peter
  • The 3-Steiner Ratio in Octilinear Geometry Adviser: Karsten Weihe, Alexander Martin
  • An empirical investigation of local search algorithms to minimize the weighted number of tardy jobs in Single Machine Scheduling Adviser: T. Stützle, Alexander Martin
  • Optimization of Collateralization concerning Large Exposures Adviser: S. Dewal, Alexander Martin
  • Optimierungsmodelle zur Linienbündelung im ÖPNV Adviser: Alexander Martin, Armin Fügenschuh
  • Solving dynamic Scheduling Problems with Unary Resources Adviser: Alexander Martin
  • Parameteranalyse in der Optimierungssoftware Carmen-PAC Adviser: Armin Fügenschuh, Alexander Martin
  • Ein Data Mining Ansatz zur Abschätzung von zyklischen Werkstoffkennwerten Adviser: Armin Fügenschuh
  • Automatische Parameteroptimierung im Crew Assignment System Carmen Adviser: Armin Fügenschuh, Alexander Martin
  • Branch and Price-Verfahren für Losgrößenprobleme Adviser: Wolfgang Domschke, Alexander Martin
  • Pin Assignment im Multilayer Chip Design Adviser: Alexander Martin, Karsten Weihe
  • Augmentierende Vektoren mit beschränktem Support Adviser: Alexander Martin, Karsten Weihe
  • An LP-based Rounding Approach to Coupled Supply Network Planning Adviser: Alexander Martin, Debora Mahlke, Andrea Zelmer
  • Bounded Diameter Minimum Spanning Tree Adviser: Alexander Martin, Ute Günther
  • Degree and Diameter Bounded Minimum Spanning Trees Adviser: Alexander Martin, Ute Günther
  • Ein MILP, ein MINLP und ein graphentheoretischer Ansatz für die Free-Flight Optimierung Adviser: Alexander Martin, Armin Fügenschuh
  • Vehicle Routing for Mobile Nurses  Adviser: Alexander Martin, Armin Fügenschuh
  • Linearization Methods for the Optimization of Screening Processes in the Recovered Paper Production Adviser: Mirjam Duer, Armin Fügenschuh
  • Solving Real-World Vehicle Routing Problems using MILP and PGreedy Heuristics Adviser: Alexander Martin, Armin Fügenschuh
  • Supporting Geo-based Routing in Pub/Sub Middleware Adviser: A. Buchmann, Alexander Martin
  • Towards Adaptive Optimization of Advice Dispatch Adviser: Alexander Martin, M. Mezini
  • Optimierung von Lokumläufen in Schienengüterverkehr Adviser: Alexander Martin, Armin Fügenschuh
  • An Approximation Algorithm for Edge-Coloring of Multigraphs Adviser: Alexander Martin, Daniel Junglas
  • Modelling nonlinear stock holding costs in a facility location problem arising in supply network optimisation Adviser: Alexander Martin, Björn Samuelsson
  • Selected General Purpose Heuristics for Solving Mixed Integer Programs  Adviser: Alexander Martin, Marzena Fügenschuh
  • Ein Genetischer Algorithmus für das Proteinfaltungsproblem im HP-Modell Adviser: Alexander Martin
  • Algorithmic Approaches for Two Fundamental Optimization Problems: Workload-Balancing And Planar Steiner Trees Adviser: Alexander Martin, Matthias Müller-Hannemann
  • Stundenplan-Optimierung: Modelle und Software Adviser: Alexander Martin, Armin Fügenschuh
  • Vehicle Routing: Modelle und Software Adviser: Alexander Martin, Armin Fügenschuh
  • Parametrized GRASP Heuristics for Combinatorial Optimization Problems  Adviser: Alexander Martin, Armin Fügenschuh
  • Optimal Unrolling of Integral Branched Sheet Metal Components Adviser: Alexander Martin, Daniel Junglas
  • Leerwagenoptimierung im Schienengüterverkehr Adviser: Alexander Martin, Armin Fügenschuh, Gerald Pfau, DB AG
  • Mathematische Modelle und Methoden in der Entscheidungsfindung im Supply Chain Management  Adviser: Alexander Martin, Simone Göttlich
  • Modifikation des Approximationsalgorithmus von Hart und Istrail für das Proteinfaltungsproblem im HP-Modell Adviser: Alexander Martin, Agnes Dittel
  • Ein Verbesserungsalgorithmus der Proteinfaltung mit dem HP-Modell von Ken Dill Adviser: Alexander Martin, Agnes Dittel
  • Entwurf und Evaluation von MILP-Modellierungen zur Optimierung einer synchronisierten Abfüll- und Verpackungsstufe in der Produktionsfeinplanung  Adviser: Alexander Martin, Heinrich Braun, SAP AG, Thomas Kasper, SAP AG
  • Integration von Strafkosten für zu niedrige Sicherheitsbestände bei Losgrößenmodellen  Adviser: Hartmut Stadtler, Institut für Betriebswirtschaftslehre, Christian Seipl, Institut für Betriebswirtschaftslehre, Alexander Martin
  • Vergleich von Algorithmen zur Lösung ganzzahliger linearer Ungleichungssysteme mit höchstens zwei Variablen pro Ungleichung  Adviser: Alexander Martin, Armin Fügenschuh
  • Schaltbedingungen bei der Optimierung von Gasnetzen: Polyedrische Untersuchungen und Schnittebenen  Adviser: Alexander Martin, Susanne Moritz
  • Der Simulated Annealing Algorithmus zur transienten Optimierung von Gasnetzen  Adviser: Alexander Martin, Susanne Moritz
  • Kantenfärbung in Multigraphen Adviser: Alexander Martin, Daniel Junglas
  • Didaktische Aspekte einer Einbeziehung von Geschichte in den Mathematikunterricht am Beispiel von Kartographie Adviser: Alexander Martin
  • Vehicle Routing Adviser: Alexander Martin, Armin Fügenschuh
  • Ein genetischer Algorithmus zur Lösung eines multiplen Traveling Salesman Problems mit gekoppelten Zeitfenstern Adviser: Alexander Martin, Armin Fügenschuh
  • Integrierte Optimierung von Schulanfangszeiten und des Nahverkehrsangebots – ein Constraint-Programming Ansatz im Vergleich zu Ganzzahliger Optimierung Adviser: Alexander Martin, Armin Fügenschuh
  • Heuristic methods for site selection, installation selection and mobile assignment in UMTS  Adviser: Alexander Martin, Armin Fügenschuh
  • Optimization of the School Bus Traffic in Rural Areas – Modeling and Solving a Distance Constrained, Capacitated Vehicle Routing Problem with Pickup and Delivery, Flexible Time Windows and Several Time Constraints  Adviser: Alexander Martin, Armin Fügenschuh
  • Augmentierungsverfahren für Standortplanungsprobleme  Adviser: Alexander Martin, Armin Fügenschuh
  • Chain-3 Constraints for an IP Model of Ken A. Dill’s HP Lattice model  Adviser: Alexander Martin, Armin Fügenschuh
  • Stundenplangenerierung an einer Grundschule Adviser: Alexander Martin, Armin Fügenschuh, Agnes Dittel
  • Vergleichende Untersuchung von Heuristiken für das Routing- und Wellenlängenzuordnungsproblem bei rein transparenten optischen Telekommunikationsnetzen Adviser: Alexander Martin, Manfred Körkel
  • Short Chain Constraints for an IP Model of Ken A. Dill’s HP-Lattice Model Adviser: Alexander Martin, Armin Fügenschuh
  • Eine Rundeheuristik für Ganzzahlige Programme Adviser: Alexander Martin, Armin Fügenschuh
  • Anwendung von Neuronalen Netzen zur Beschleunigung von Branch & Bound-Verfahren der Kombinatorischen Optimierung Adviser: M. Grötschel, Alexander Martin, K. Obermayer

Instructions for MSc Thesis

Before the thesis.

Before you start work on your thesis, it is important to put some thought into the choice of topic and familiarize yourself with the criteria and procedure. To do that, follow these steps, in this order:

Step 0: Read the university instructions .

Read the MSc thesis instructions and grading criteria on the university website. Computer Science Master's program: [link]. Data Science Master's program: [link].

Step 1: Choose a topic .

Choose a topic among the ones listed on the group's webpage [link].

You can also propose your own topic. In this case, you must explain what the main contribution of the thesis will be and identify at least one scientific publication that is related to the topic you propose.

Step 2: Contact us .

Submit the application form [link] to let us know of your interest in doing your thesis in the group. Note: if you contact us, please be ready to start work on the thesis within one month.

Step 3: Agree on the topic .

We have a brief discussion about the topic and devise a high-level plan for the thesis work and content. We also agree on a start date, when you begin work on the thesis. In addition, you should contact a second evaluator for the thesis.

Thesis timeline

Below you find the milestones that follow once you have started work on the thesis. In parentheses, you find an estimate of when each milestone occurs. The thesis work ends when you submit the thesis for approval. The total duration from start to end should be about four months.

Milestone #0: Thesis outline (at most 3 weeks from the start) .

You create a first outline of the thesis. The outline should contain the titles of the chapters, along with a (tentative) list of sections and contents. An indicative template for the outline is shown below on this page.

Milestone #1: A draft with first results (about 2 months from start) .

All chapters should contain some readable content (not necessarily polished). Most importantly, some results should already be described. Ideally, you should be able to complete and refine the results within one more month.

Milestone #2: A draft with all results (about 1 month before the end).

Most content should now be in the draft. Some polishing remains and some results may still be refined. Notify the second evaluator that you are near the end of the thesis work. Optionally, you may send the thesis draft and receive preliminary comments from the second evaluator.

Milestone #3: Submit the thesis for approval (end of thesis work).

You will receive a grade and comments after the next program board's meeting.

Supervision

What you can expect from the supervisor:

  • Comments for the thesis draft after each milestone (see timeline above) and, if necessary, a meeting.
  • Suggestions for how to proceed in cases when you encounter a major hurdle.

In addition, you are welcome to participate in the group meetings and discuss your thesis work with other group members.

Note, however, that one of the grading criteria for the thesis is whether you worked independently; in the end, the thesis should be your own work.

Template for Thesis Outline

Below you find a suggested template for the outline of the thesis. You may adapt it to your work, of course (e.g., change chapter titles or structure).

Abstract

A summary of the thesis that mentions: the broader topic of the thesis and why it is important; the research question or technical problem addressed by the thesis; the main thesis contributions (e.g., data gathering, developed methods and algorithms, experimental evaluation); and the results.

Chapter 1: Introduction

The introduction should motivate the thesis and give a longer summary. It should be written in a way that allows anyone in your program to understand it, even if they are not experts in the topic.

  • What is the broader topic of the thesis?
  • Why is it important?
  • What research question(s) or technical problems does the thesis address?
  • What are the most related works from the literature on the topic? How does the thesis differ from what has already been done?
  • What are the main thesis contributions (e.g., data gathering, developed methods and algorithms, experimental evaluation)?
  • What are the results?

Chapter 2: Related literature

Organize this chapter in sections, with one section for each research area that is related to your thesis. For each research area, cite all the publications that are related to your topic, and describe at least the most important of them.

Chapter 3: Preliminaries

In this chapter, place the information that is necessary for you to describe the contributions and results of the thesis. It may be different from thesis to thesis, but could include sections about:

  • Setting. Define the terms and notation you will be using. State any assumptions you make across the thesis.
  • Background on Methods. Describe existing methods from the literature (e.g., algorithms or ML models) that you use for your work.
  • Data (esp. for a Data Science thesis). If the main contribution is data analysis, then describe the data here, before the analysis.

Chapter 4: Methodological contribution

For a Computer Science thesis, this part typically describes the algorithm(s) developed for the thesis. For a Data Science thesis, this part typically describes the method for the analysis.

Chapter 5: Results

This chapter describes the results obtained when the methods of Chapter 4 are used on data.

For a Computer Science thesis, this part typically describes the performance of the developed algorithm(s) on various synthetic and real datasets. For a Data Science thesis, this part typically describes the findings of the analysis.

The chapter should also describe what insights are obtained from the results.

Chapter 6: Conclusion

  • Summarize the contribution of the thesis.
  • Provide an evaluation: are the results conclusive, are there limitations in the contribution?
  • How would you extend the thesis, what can be done next on the same topic?

Assignment Help


Latest Data Science Dissertation Topics to Grab Reader's Attention

Choose from the Best Data Science Dissertation Topics

Table of Contents

  • Explore Transformative Patterns
  • Innovation of New Products
  • Best Data Science Dissertation Topics
  • Trending Data Science Dissertation Topics
  • Compelling Data Science Dissertation Topics
  • Data Science Dissertation Topics to Score Well
  • Unique Data Science Dissertation Topics
  • Pick the Meaningful Topic
  • Work with Consistent Data
  • Ease the Complexity of the Model
  • Acknowledge the Day-to-Day Problems

Data science is one of the fastest-growing fields today, which makes it a fitting subject for students. However, earning a degree in the field is no easy feat, as you have to overcome several hurdles. One such challenge is drafting an ideal dissertation. Writing the paper becomes much easier once you have a precise topic, so this blog will help you explore data science dissertation topics to ease your workload. To begin with, here is an insight into what data science is and why it matters.


What Is Data Science and Its Importance?

Data science is a field that studies data to extract valuable insights for a business. In other words, it uses scientific techniques to evaluate and draw meaningful information from an ocean of data. According to professional data science dissertation writers, it is an interdisciplinary approach that combines the practices and principles of several fields: mathematics, statistics, computer engineering, and artificial intelligence. You can engage with all of these during the dissertation writing process, which is a crucial part of academics. Furthermore, the importance and use of data science are increasing day by day, enabling businesses to:

Explore Transformative Patterns

Data science enables a business to discover new relationships and patterns that have the power to transform the organisation and take it to new heights. It can also identify how to manage resources at lower cost to achieve higher profits.

Innovation of New Products

Data science can reveal gaps and problems in existing information that might otherwise go unnoticed. You can do this by evaluating purchase decisions, consumer preferences, business processes, and more.

After gaining insight into data science and its importance, it is time to move ahead. Writing a dissertation is a necessary part of your academic journey. Although it is a challenging task, referring to online dissertation help can guide you onto the right track. Next, explore the topics you can use to frame your dissertation and impress your professor.

A List of Latest Data Science Dissertation Topics

In this section of the blog, you will find dissertation topics in data science on which you can build your paper. They have been shortlisted by experts to help you leave an impression on your professor and grab your readers' attention.

Here are hand-picked dissertation topics for data science that can help you grab the reader's attention quickly and without much effort.

1. Compare the implementation of data science in various investigations concerning wildfires.

2. Explain K-means clustering from the perspective of online spherical K-means.

3. Explore how linear and nonlinear regression analyses' efficacy can be increased.

4. Evaluate platforms for big data computing: big data analytics and their adoption.

5. Discuss the best data management strategies for modern enterprises to use.

As you know, trends change rapidly in every field, and you have to keep up with them to grow. In this section, you will find some of the most trending data science dissertation ideas.

6. Explain massive data processing and the appropriate key management system.

7. Discuss the deep learning process and its relevance in the field.

8. What is the application of big data in improving supply chain management of an institution?

9. Analyse the implementation of data science in economic theory.

10. What is the use of big data analytics to power AI and ML?


Attracting readers and keeping them engaged to the end of the document is the most challenging task. But if you have chosen an ideal MSc data science dissertation topic, you can manage it easily. Here are some options:

11. Explain the Hadoop programming and the map-reduce architecture.

12. What is hyper-personalisation and its importance in the field?

13. Explore the value big data provides to innovation management.

14. Perform a comparative study on the implementation of data science in the teaching profession.

15. Overview of data valuation and why it matters in data management.

Apart from studying the subject, the motive behind writing a dissertation is to score well. To make the paper effective, select a topic with the potential to fetch you good grades. Here are some appropriate data science dissertation ideas:

16. Have a discussion about the MATLAB code for decision trees along with semantic data governance.

17. What is the necessity of big data technologies for modern businesses?

18. State the societal implications of using predictive analytics within education.

19. Mention the association rule learning regarding data mining.

20. Give an overview of the relevance of Artificial Intelligence.


Uniqueness is the key to an ideal dissertation. In this section, you will find unique data science dissertation topics that will help you achieve that goal.

21. What is the implementation of data science, and how does it impact the management environment and sustainability?

22. How to apply attribute-access or role-based access control in an organisation?

These dissertation topics for data science should ease the process of selecting a topic. Next, look at the technique you can use to find the perfect data science research topic for your paper.

How to Choose a Data Science Dissertation Topic?

This section of the blog will help you plan your topic selection process and smooth the path. Read on for the procedure to follow when selecting data science dissertation topics:

As you know, there is a nearly endless list of data science dissertation topics to choose from and build your paper on. Opt for one that both matches your interests and is currently trending. If this proves challenging, checking examples of dissertations can come to your rescue.

Given the variety of data science dissertation topics available, choose one with sufficient data behind it: some topics simply do not have enough reliable information available to research. To avoid getting stuck midway, make sure the theme you pick has a consistent flow of information.

When finalising a data science dissertation topic, ensure it does not involve an overly complex model. For the sake of uniqueness, students sometimes choose topics with complicated theories, which leaves them struggling and confused while writing the paper. For a smooth process, work on something of manageable complexity.

When selecting a theme, stay informed about the everyday problems faced by your target audience; the available data science dissertation examples can help you understand this better. This matters because a relatable topic grabs the audience's attention faster and lets them connect with it easily. To do this, keep building your knowledge of the field you are working in.

These are some easy steps to follow when selecting an ideal dissertation topic for data science. If you are still struggling, you can seek professional help.

Stuck with Data Science Dissertation Topics? We Can Help

The data science dissertation topics listed in this blog should be more than enough to find one that fits your area of interest. However, if you are still stuck and want to explore more, our team of experts is here, and we can also guide you through other challenging stages of the project. When you seek our data science dissertation help, you will get the following:

  • 100% Unique Data
  • On-Time Delivery
  • 24/7 Assistance
  • Affordable Prices
  • Six Free Tools
  • Flawless Content

That is not all: you can avail yourself of several other benefits when you hire us to assist you. Moreover, seeking help from the experts at the Assignment Desk will not burn a hole in your pocket, as the prices will fit your budget.


Technical University of Munich

  • Data Analytics and Machine Learning Group
  • TUM School of Computation, Information and Technology

Open Topics

We offer multiple Bachelor's/Master's theses, Guided Research projects, and IDPs in the area of data mining/machine learning. A non-exhaustive list of open topics follows.

If you are interested in a thesis or a guided research project, please send your CV and transcript of records to Prof. Stephan Günnemann via email and we will arrange a meeting to talk about the potential topics.

Graph Neural Networks for Spatial Transcriptomics

Type:  Master's Thesis

Prerequisites:

  • Strong machine learning knowledge
  • Proficiency with Python and deep learning frameworks (PyTorch, TensorFlow, JAX)
  • Knowledge of graph neural networks (e.g., GCN, MPNN)
  • Optional: Knowledge of bioinformatics and genomics

Description:

Spatial transcriptomics is a cutting-edge field at the intersection of genomics and spatial analysis, aiming to understand gene expression patterns within the context of tissue architecture. Our project focuses on leveraging graph neural networks (GNNs) to unlock the full potential of spatial transcriptomic data. Unlike traditional methods, GNNs can effectively capture the intricate spatial relationships between cells, enabling more accurate modeling and interpretation of gene expression dynamics across tissues. We seek motivated students to explore novel GNN architectures tailored for spatial transcriptomics, with a particular emphasis on addressing challenges such as spatial heterogeneity, cell-cell interactions, and spatially varying gene expression patterns.
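To make the setting concrete, here is a minimal sketch (our own illustration, not project code) of one graph-convolution step over a k-nearest-neighbour graph of cells, where the node features stand in for gene expression profiles; all names and sizes are toy assumptions:

```python
import numpy as np

def knn_adjacency(coords, k=2):
    """Build a symmetric k-nearest-neighbour adjacency from 2D cell coordinates."""
    n = len(coords)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # a cell is not its own neighbour
    A = np.zeros((n, n))
    for i in range(n):
        A[i, np.argsort(d[i])[:k]] = 1.0
    return np.maximum(A, A.T)            # symmetrise

def gcn_layer(A, X, W):
    """One GCN step: degree-normalised neighbourhood averaging, linear map, ReLU."""
    A_hat = A + np.eye(len(A))                      # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(1)))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

# toy data: 5 cells with 2D positions and 3-gene expression profiles
coords = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [1.1, 1.0]])
X = np.random.default_rng(0).random((5, 3))
W = np.random.default_rng(1).random((3, 4))
H = gcn_layer(knn_adjacency(coords), X, W)
```

Real spatial-transcriptomics models (e.g. SpaGCN, DeepST from the references below) build on this propagation pattern but add domain-specific graph construction and objectives.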

Contact : Filippo Guerranti , Alessandro Palma

References:

  • Cell clustering for spatial transcriptomics data with graph neural network
  • Unsupervised spatially embedded deep representation of spatial transcriptomics
  • SpaGCN: Integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network
  • DeepST: identifying spatial domains in spatial transcriptomics by deep learning
  • Deciphering spatial domains from spatially resolved transcriptomics with an adaptive graph attention auto-encoder

  • GCNG: graph convolutional networks for inferring gene interaction from spatial transcriptomics data

Generative Models for Drug Discovery

Type: Master's Thesis / Guided Research

Prerequisites:

  • Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
  • Knowledge of graph neural networks (e.g. GCN, MPNN)
  • No formal education in chemistry, physics or biology needed!

Effectively designing molecular geometries is essential to advancing pharmaceutical innovation, a domain that has received great attention through the success of generative models. These models promise a more efficient exploration of the vast chemical space and the generation of novel compounds with specific properties by leveraging their learned representations, potentially leading to the discovery of molecules with unique properties that would otherwise go undiscovered. Our topics lie at the intersection of generative models, such as diffusion/flow-matching models, and graph representation learning, e.g., graph neural networks. The focus of our projects can be model development with an emphasis on downstream tasks (e.g., diffusion guidance at inference time) and a better understanding of the limitations of existing models.

Contact :  Johanna Sommer , Leon Hetzel

References:

  • Equivariant Diffusion for Molecule Generation in 3D
  • Equivariant Flow Matching with Hybrid Probability Transport for 3D Molecule Generation
  • Structure-based Drug Design with Equivariant Diffusion Models

Efficient Machine Learning: Pruning, Quantization, Distillation, and More - DAML x Pruna AI

Type: Master's Thesis / Guided Research / Hiwi

Prerequisites:

  • Strong knowledge in machine learning
  • Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)

The efficiency of machine learning algorithms is commonly evaluated by looking at target performance, speed, and memory footprint metrics. Reducing the costs associated with these metrics is of primary importance for real-world applications with limited resources (e.g. embedded systems, real-time predictions). In this project, you will work in collaboration with the DAML research group and the Pruna AI startup on investigating solutions to improve the efficiency of machine learning models using techniques like pruning, quantization, distillation, and more.
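As a hedged illustration of two of the techniques named above (a toy sketch on a plain weight matrix, not the project's codebase), magnitude pruning zeroes the smallest weights and uniform quantization snaps weights to an n-bit grid:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    return weights * (np.abs(weights) > threshold)

def uniform_quantize(weights, n_bits=8):
    """Symmetric uniform quantisation: map to an n-bit integer grid and back."""
    scale = np.abs(weights).max() / (2 ** (n_bits - 1) - 1)
    return np.round(weights / scale) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((10, 10))
W_pruned = magnitude_prune(W, 0.5)       # half the entries become exactly zero
W_quant = uniform_quantize(W, n_bits=8)  # every entry moves by at most scale/2
```

In practice one would use framework tooling (e.g. PyTorch's pruning utilities) and retrain after compression; this sketch only shows the arithmetic core of the two ideas.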

Contact: Bertrand Charpentier

References:

  • The Efficiency Misnomer
  • A Gradient Flow Framework for Analyzing Network Pruning
  • Distilling the Knowledge in a Neural Network
  • A Survey of Quantization Methods for Efficient Neural Network Inference

Deep Generative Models

Type:  Master Thesis / Guided Research

Prerequisites:

  • Strong machine learning and probability theory knowledge
  • Knowledge of generative models and their basics (e.g., Normalizing Flows, Diffusion Models, VAE)
  • Optional: Neural ODEs/SDEs, Optimal Transport, Measure Theory

With recent advances, such as Diffusion Models, Transformers, Normalizing Flows, Flow Matching, etc., the field of generative models has gained significant attention in the machine learning and artificial intelligence research community. However, many problems and questions remain open, and the application to complex data domains such as graphs, time series, point processes, and sets is often non-trivial. We are interested in supervising motivated students to explore and extend the capabilities of state-of-the-art generative models for various data domains.

Contact : Marcel Kollovieh , David Lüdke

References:

  • Flow Matching for Generative Modeling
  • Auto-Encoding Variational Bayes
  • Denoising Diffusion Probabilistic Models 
  • Structured Denoising Diffusion Models in Discrete State-Spaces

Active Learning for Multi-Agent 3D Object Detection

Type: Master's Thesis  Industrial partner: BMW 

Prerequisites: 

  • Strong knowledge in machine learning 
  • Knowledge in Object Detection 
  • Excellent programming skills 
  • Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch) 

Description: 

In autonomous driving, state-of-the-art deep neural networks are used for perception tasks such as 3D object detection. To deliver promising results, these networks often require large amounts of complex annotated data for training, and these annotations are often costly and redundant. Active learning is used to select the most informative samples for annotation and to cover a dataset with as little annotated data as possible.

The objective is to explore active learning approaches for 3D object detection using combined uncertainty- and diversity-based methods.
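A toy sketch of such a combined acquisition function (our own hedged illustration, not the industrial pipeline): predictive entropy scores per-sample uncertainty, and a greedy distance term enforces diversity within the selected batch. The data and the additive trade-off are assumptions for illustration:

```python
import numpy as np

def entropy(probs):
    """Predictive entropy per sample; higher means the model is less certain."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def select_batch(probs, features, batch_size):
    """Greedy acquisition: start with the most uncertain sample, then add samples
    scoring high on (uncertainty + distance to the already-selected set)."""
    scores = entropy(probs)
    selected = [int(np.argmax(scores))]
    while len(selected) < batch_size:
        chosen = features[np.array(selected)]
        dists = np.linalg.norm(
            features[:, None, :] - chosen[None, :, :], axis=-1
        ).min(axis=1)                          # distance to nearest selected sample
        combined = scores + dists              # simple additive trade-off
        combined[np.array(selected)] = -np.inf # never re-pick a sample
        selected.append(int(np.argmax(combined)))
    return selected

# 4 unlabeled samples: softmax outputs and embedding-space features
probs = np.array([[0.5, 0.5], [0.9, 0.1], [0.6, 0.4], [0.99, 0.01]])
features = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
batch = select_batch(probs, features, batch_size=2)
```

Note how the second pick favours the distant cluster even though a nearby sample has slightly higher entropy; that is the diversity term at work.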

Contact: Sebastian Schmidt

References: 

  • Exploring Diversity-based Active Learning for 3D Object Detection in Autonomous Driving   
  • Efficient Uncertainty Estimation for Semantic Segmentation in Videos   
  • KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection
  • Towards Open World Active Learning for 3D Object Detection   

Graph Neural Networks

Type:  Master's thesis / Bachelor's thesis / guided research

Prerequisites:

  • Knowledge of graph/network theory

Graph neural networks (GNNs) have recently achieved great successes in a wide variety of applications, such as chemistry, reinforcement learning, knowledge graphs, traffic networks, or computer vision. These models leverage graph data by updating node representations based on messages passed between nodes connected by edges, or by transforming node representation using spectral graph properties. These approaches are very effective, but many theoretical aspects of these models remain unclear and there are many possible extensions to improve GNNs and go beyond the nodes' direct neighbors and simple message aggregation.
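For readers new to the area, the core message-passing idea can be sketched in a few lines. This is a hedged toy version with a fixed averaging/combination rule, not any particular published architecture:

```python
import numpy as np

def message_passing_step(adj, h):
    """One round of message passing: every node averages its neighbours'
    features (the incoming message) and mixes the result with its own state."""
    h_new = np.empty_like(h)
    for v, neighbours in adj.items():
        msg = h[neighbours].mean(axis=0) if neighbours else np.zeros_like(h[v])
        h_new[v] = 0.5 * h[v] + 0.5 * msg   # fixed combine rule for illustration
    return h_new

# path graph 0 - 1 - 2 with scalar node features
adj = {0: [1], 1: [0, 2], 2: [1]}
h = np.array([[0.0], [1.0], [2.0]])
h1 = message_passing_step(adj, h)
```

Learned GNNs replace the fixed averaging and mixing with trainable functions, and stacking several such steps is exactly what lets information travel beyond a node's direct neighbours.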

Contact: Simon Geisler

References:

  • Semi-supervised classification with graph convolutional networks
  • Relational inductive biases, deep learning, and graph networks
  • Diffusion Improves Graph Learning
  • Weisfeiler and leman go neural: Higher-order graph neural networks
  • Reliable Graph Neural Networks via Robust Aggregation

Physics-aware Graph Neural Networks

Type:  Master's thesis / guided research

Prerequisites:

  • Proficiency with Python and deep learning frameworks (JAX or PyTorch)
  • Knowledge of graph neural networks (e.g. GCN, MPNN, SchNet)
  • Optional: Knowledge of machine learning on molecules and quantum chemistry

Deep learning models, especially graph neural networks (GNNs), have recently achieved great successes in predicting quantum mechanical properties of molecules. There are many applications for these models, such as finding the best method of chemical synthesis or selecting candidates for drugs, construction materials, batteries, or solar cells. However, GNNs have only been proposed in recent years, and many open questions remain about how to best represent and leverage quantum mechanical properties and methods.

Contact: Nicholas Gao

References:

  • Directional Message Passing for Molecular Graphs
  • Neural message passing for quantum chemistry
  • Learning to Simulate Complex Physics with Graph Network
  • Ab initio solution of the many-electron Schrödinger equation with deep neural networks
  • Ab-Initio Potential Energy Surfaces by Pairing GNNs with Neural Wave Functions
  • Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds

Robustness Verification for Deep Classifiers

Type: Master's thesis / Guided research

Prerequisites:

  • Strong machine learning knowledge (at least equivalent to IN2064 plus an advanced course on deep learning)
  • Strong background in mathematical optimization (preferably combined with Machine Learning setting)
  • Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
  • (Preferred) Knowledge of training techniques to obtain classifiers that are robust against small perturbations in data

Description: Recent work shows that deep classifiers suffer in the presence of adversarial examples: misclassified points that are very close to the training samples or even visually indistinguishable from them. This undesired behaviour constrains the deployment of promising neural-network-based classification methods in safety-critical scenarios. Therefore, new training methods should be proposed that promote (or, preferably, ensure) robust behaviour of the classifier around training samples.
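To illustrate why such guarantees are needed, here is a minimal fast-gradient-sign-style attack on a logistic-regression "classifier" (a hedged toy sketch with invented weights; real verification work targets deep networks and certified bounds):

```python
import numpy as np

def fgsm_perturb(x, w, b, y, eps):
    """Fast-gradient-sign perturbation for logistic regression: shift the input
    by eps in the direction that increases the cross-entropy loss."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))   # predicted P(class = 1)
    grad_x = (p - y) * w                     # gradient of the loss w.r.t. the input
    return x + eps * np.sign(grad_x)

w = np.array([2.0, -1.0])
b = 0.0
x = np.array([1.0, 1.0])   # clean point: logit = 1 > 0, classified as class 1
x_adv = fgsm_perturb(x, w, b, y=1.0, eps=0.6)   # small shift flips the sign of the logit
```

Robustness verification asks the converse question: for which radius eps can we *prove* that no such perturbation changes the prediction.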

Contact: Aleksei Kuvshinov

References (Background):

  • Intriguing properties of neural networks
  • Explaining and harnessing adversarial examples
  • SoK: Certified Robustness for Deep Neural Networks
  • Certified Adversarial Robustness via Randomized Smoothing
  • Formal guarantees on the robustness of a classifier against adversarial manipulation
  • Towards deep learning models resistant to adversarial attacks
  • Provable defenses against adversarial examples via the convex outer adversarial polytope
  • Certified defenses against adversarial examples
  • Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks

Uncertainty Estimation in Deep Learning

Type: Master's Thesis / Guided Research

Prerequisites:

  • Strong knowledge in probability theory

Safe prediction is a key feature of many intelligent systems. Classically, machine learning models compute output predictions regardless of the underlying uncertainty of the encountered situations. In contrast, aleatoric and epistemic uncertainty bring knowledge about undecidable and uncommon situations. The uncertainty view can substantially help to detect and explain unsafe predictions, and therefore make ML systems more robust. The goal of this project is to improve uncertainty estimation in ML models across various types of tasks.
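A small sketch of the aleatoric/epistemic split mentioned above, using a deep ensemble (the decomposition is standard; the two-member ensembles and their numbers are our own toy assumption):

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a categorical distribution (last axis)."""
    p = np.clip(p, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

def uncertainty_split(ensemble_probs):
    """Decompose an ensemble's predictive uncertainty for one input.
    total = entropy of the mean prediction; aleatoric = mean member entropy;
    epistemic = total - aleatoric (i.e., member disagreement)."""
    mean_p = ensemble_probs.mean(axis=0)
    total = entropy(mean_p)
    aleatoric = entropy(ensemble_probs).mean()
    return total, aleatoric, total - aleatoric

# two ensembles of two members each: one agrees, one disagrees
agree = np.array([[0.9, 0.1], [0.9, 0.1]])
disagree = np.array([[0.99, 0.01], [0.01, 0.99]])
_, _, epi_agree = uncertainty_split(agree)
_, _, epi_disagree = uncertainty_split(disagree)
```

High epistemic uncertainty (disagreement) flags inputs the model has not really learned about, which is exactly the signal one wants for detecting unsafe predictions.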

Contact: Tom Wollschläger ,   Dominik Fuchsgruber ,   Bertrand Charpentier

References:

  • Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
  • Predictive Uncertainty Estimation via Prior Networks
  • Posterior Network: Uncertainty Estimation without OOD samples via Density-based Pseudo-Counts
  • Evidential Deep Learning to Quantify Classification Uncertainty
  • Weight Uncertainty in Neural Networks

Hierarchies in Deep Learning

Type:  Master's Thesis / Guided Research

Multi-scale structures are ubiquitous in real-life datasets. As an example, phylogenetic nomenclature naturally reveals a hierarchical classification of species based on their evolutionary history. Learning multi-scale structures can help to reveal natural and meaningful organisation in the data and to obtain compact data representations. The goal of this project is to leverage multi-scale structures to improve the speed, performance, and understanding of deep learning models.

Contact: Marcel Kollovieh , Bertrand Charpentier

References:

  • Tree Sampling Divergence: An Information-Theoretic Metric for Hierarchical Graph Clustering
  • Hierarchical Graph Representation Learning with Differentiable Pooling
  • Gradient-based Hierarchical Clustering
  • Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space

Available Master's thesis topics in machine learning


Here we list topics that are available. You may also be interested in our list of completed Master's theses .

Learning and inference with large Bayesian networks

Most learning and inference tasks with Bayesian networks are NP-hard. Therefore, one often resorts to using different heuristics that do not give any quality guarantees.

Task: Evaluate quality of large-scale learning or inference algorithms empirically.

Advisor: Pekka Parviainen

Sum-product networks

Traditionally, probabilistic graphical models use a graph structure to represent dependencies and independencies between random variables. Sum-product networks are a relatively new type of graphical model in which the graph structure models computations rather than the relationships between variables. The benefit of this representation is that inference (computing conditional probabilities) can be done in linear time with respect to the size of the network.

Potential thesis topics in this area: a) Compare inference speed with sum-product networks and Bayesian networks. Characterize situations when one model is better than the other. b) Learning the sum-product networks is done using heuristic algorithms. What is the effect of approximation in practice?
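To make the linear-time claim above tangible, here is a hedged toy SPN over two binary variables, evaluated in a single bottom-up pass; the node encoding, structure, and weights are invented for illustration:

```python
import numpy as np

# Node encodings: ('leaf', var, {value: prob}), ('prod', children),
# ('sum', children, weights).  Evaluation visits every node once,
# so a full pass is linear in the size of the network.

def spn_eval(node, x):
    """Evaluate the SPN at the complete evidence x (dict: variable -> value)."""
    kind = node[0]
    if kind == 'leaf':
        _, var, probs = node
        return probs[x[var]]
    if kind == 'prod':
        return float(np.prod([spn_eval(c, x) for c in node[1]]))
    _, children, weights = node
    return sum(w * spn_eval(c, x) for c, w in zip(children, weights))

def leaf(var, p_one):
    return ('leaf', var, {0: 1.0 - p_one, 1: p_one})

# an equal-weight mixture of two product distributions over binary X0, X1
spn = ('sum',
       [('prod', [leaf(0, 0.9), leaf(1, 0.9)]),
        ('prod', [leaf(0, 0.1), leaf(1, 0.1)])],
       [0.5, 0.5])
```

For instance, P(X0=1, X1=1) = 0.5 · 0.81 + 0.5 · 0.01 = 0.41, and the probabilities of the four assignments sum to one, as a valid SPN requires. A Bayesian network encoding the same joint would generally need (worst-case exponential) marginalisation for such queries, which is the comparison topic (a) points at.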

Bayesian Bayesian networks

The naming of Bayesian networks is somewhat misleading because there is nothing Bayesian in them per se; a Bayesian network is just a representation of a joint probability distribution. One can, of course, use a Bayesian network while doing Bayesian inference. One can also learn Bayesian networks in a Bayesian way: that is, instead of finding a single optimal network, one computes the posterior distribution over networks.

Task: Develop algorithms for Bayesian learning of Bayesian networks (e.g., MCMC, variational inference, EM)

Large-scale (probabilistic) matrix factorization

The idea behind matrix factorization is to represent a large data matrix as a product of two or more smaller matrices. Such factorizations are often used in, for example, dimensionality reduction and recommendation systems. Probabilistic matrix factorization methods can be used to quantify uncertainty in recommendations. However, large-scale (probabilistic) matrix factorization is computationally challenging.

Potential thesis topics in this area: a) Develop scalable methods for large-scale matrix factorization (non-probabilistic or probabilistic), b) Develop probabilistic methods for implicit feedback (e.g., a recommendation engine when there are no ratings, only knowledge of whether a customer has bought an item)
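As a baseline for the non-probabilistic setting, a hedged gradient-descent sketch that factors a small matrix R into two rank-r factors; the toy "ratings" matrix, learning rate, and regularisation are assumptions, and a scalable method would replace this dense full-batch loop:

```python
import numpy as np

def factorize(R, rank, steps=3000, lr=0.02, reg=0.001, seed=0):
    """Approximate R (n x m) as U @ V.T by gradient descent on the squared
    reconstruction error with light L2 regularisation."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((m, rank))
    for _ in range(steps):
        E = R - U @ V.T                  # residual matrix
        U += lr * (E @ V - reg * U)
        V += lr * (E.T @ U - reg * V)
    return U, V

R = np.outer([1.0, 2.0, 3.0], [1.0, 0.0, 1.0])   # a rank-1 "ratings" matrix
U, V = factorize(R, rank=2)                      # recovers R up to small error
```

The probabilistic variants in topic (b) put priors on U and V and infer posteriors instead of point estimates, which is what yields uncertainty estimates for the recommendations.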

Bayesian deep learning

Standard deep neural networks do not quantify uncertainty in predictions. On the other hand, Bayesian methods provide a principled way to handle uncertainty. Combining these approaches leads to Bayesian neural networks. The challenge is that Bayesian neural networks can be cumbersome to use and difficult to learn.

The task is to analyze Bayesian neural networks and different inference algorithms in some simple setting.

Deep learning for combinatorial problems

Deep learning is usually applied in regression or classification problems. However, there has been some recent work on using deep learning to develop heuristics for combinatorial optimization problems; see, e.g., [1] and [2].

Task: Choose a combinatorial problem (or several related problems) and develop deep learning methods to solve them.

References: [1] Vinyals, Fortunato and Jaitly: Pointer networks. NIPS 2015. [2] Dai, Khalil, Zhang, Dilkina and Song: Learning Combinatorial Optimization Algorithms over Graphs. NIPS 2017.

Advisors: Pekka Parviainen, Ahmad Hemmati

Estimating the number of modes of an unknown function

Mode seeking considers estimating the number of local maxima of a function f. Sometimes one can find modes by, e.g., looking for points where the derivative of the function is zero. However, often the function is unknown and we have only access to some (possibly noisy) values of the function. 

In topological data analysis, we can analyze topological structures using persistent homology. For 1-dimensional signals, this translates into looking at the birth/death persistence diagram, i.e. the birth and death of connected topological components as we expand the space around each point where we have observed our function. These observations turn out to be closely related to the modes (local maxima) of the function. A recent paper [1] proposed an efficient method for mode seeking.

In this project, the task is to extend the ideas from [1] to get a probabilistic estimate on the number of modes. To this end, one has to use probabilistic methods such as Gaussian processes.
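The deterministic core of this kind of mode counting can be sketched as follows (a hedged toy implementation of 0-dimensional superlevel-set persistence via union-find; the method in [1] adds statistical guarantees beyond this sketch):

```python
def persistent_modes(values, min_persistence=0.0):
    """Count the modes of a sampled 1D signal whose persistence (height of the
    local maximum minus the level at which its component merges into a taller
    one) exceeds min_persistence."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    parent, birth, persistences = {}, {}, []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    for i in order:                          # sweep the superlevel sets top-down
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):             # 1D signal: only index neighbours
            if j in parent:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # the component with the lower peak dies at the current level
                    young, old = (ri, rj) if birth[ri] < birth[rj] else (rj, ri)
                    persistences.append(birth[young] - values[i])
                    parent[young] = old
    persistences.append(float('inf'))        # the global maximum never dies
    return sum(p > min_persistence for p in persistences)
```

For example, `persistent_modes([0, 1, 0.95, 1.05, 0], 0.5)` treats the dip at 0.95 as noise and reports a single mode, while a well-separated second peak survives the threshold. A probabilistic extension would place a model such as a Gaussian process over the noisy samples and propagate that uncertainty into the mode count.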

[1] U. Bauer, A. Munk, H. Sieling, and M. Wardetzky. Persistence barcodes versus Kolmogorov signatures: Detecting modes of one-dimensional signals. Foundations of Computational Mathematics, 17:1–33, 2017.

Advisors:  Pekka Parviainen ,  Nello Blaser

Causal Abstraction Learning

We naturally make sense of the world around us by working out causal relationships between objects and by representing these objects in our minds with different degrees of approximation and detail. Both processes are essential to our understanding of reality, and likely to be fundamental for developing artificial intelligence. The first process may be expressed using the formalism of structural causal models, while the second can be grounded in the theory of causal abstraction [1].

This project will consider the problem of learning an abstraction between two given structural causal models. The primary goal will be the development of efficient algorithms able to learn a meaningful abstraction between the given causal models.

[1] Rubenstein, Paul K., et al. "Causal consistency of structural equation models." arXiv preprint arXiv:1707.00819 (2017).

Advisor: Fabio Massimo Zennaro

Causal Bandits

"Multi-armed bandit" is an informal name for slot machines, and the formal name of a large class of problems where an agent has to choose an action among a range of possibilities without knowing the ensuing rewards. Multi-armed bandit problems are among the most essential reinforcement learning problems, where an agent directly faces an exploration-exploitation trade-off.

This project will consider a class of multi-armed bandits where an agent, upon taking an action, interacts with a causal system [1]. The primary goal will be the development of learning strategies that take advantage of the underlying causal system in order to learn optimal policies in the shortest amount of time.

[1] Lattimore, Finnian, Tor Lattimore, and Mark D. Reid. "Causal bandits: Learning good interventions via causal inference." Advances in Neural Information Processing Systems 29 (2016).
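Before any causal structure is added, the classical trade-off can be sketched with an epsilon-greedy agent on Bernoulli arms (a hedged toy baseline with invented arm probabilities; the project itself concerns causal bandits as in [1]):

```python
import random

def epsilon_greedy_bandit(arm_means, steps=5000, epsilon=0.1, seed=42):
    """Minimal epsilon-greedy agent on Bernoulli arms: with probability epsilon
    explore a uniformly random arm, otherwise exploit the current best estimate."""
    rng = random.Random(seed)
    counts = [0] * len(arm_means)
    estimates = [0.0] * len(arm_means)
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(arm_means))          # explore
        else:
            arm = max(range(len(arm_means)), key=lambda a: estimates[a])  # exploit
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # running mean
    return counts, estimates

counts, estimates = epsilon_greedy_bandit([0.2, 0.8])
```

A causal bandit replaces the independent arms with interventions on a causal graph, so that observing one intervention's outcome also carries information about the others; that shared structure is what lets such strategies beat this baseline.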

Causal Modelling for Battery Manufacturing

Lithium-ion batteries are poised to be one of the most important sources of energy in the near future. Yet the process of manufacturing these batteries is very hard to model and control. Optimizing the different phases of production to maximize the lifetime of the batteries is a non-trivial challenge, since physical models are limited in scope and collecting experimental data is extremely expensive and time-consuming [1].

This project will consider the problem of aggregating and analyzing data regarding a few stages in the process of battery manufacturing. The primary goal will be the development of algorithms for transporting and integrating data collected in different contexts, as well as the use of explainable algorithms to interpret them.

[1] Niri, Mona Faraji, et al. "Quantifying key factors for optimised manufacturing of Li-ion battery anode and cathode via artificial intelligence." Energy and AI 7 (2022): 100129.

Advisor: Fabio Massimo Zennaro ,  Mona Faraji Niri

Reinforcement Learning for Computer Security

The field of computer security presents a wide variety of challenging problems for artificial intelligence and autonomous agents. Guaranteeing the security of a system against attacks and penetrations by malicious hackers has always been a central concern of this field, and machine learning could now offer a substantial contribution. Security capture-the-flag simulations are particularly well-suited as a testbed for the application and development of reinforcement learning algorithms [1].

This project will consider the use of reinforcement learning for the preventive purpose of testing systems and discovering vulnerabilities before they can be exploited. The primary goal will be the modelling of capture-the-flag challenges of interest and the development of reinforcement learning algorithms that can solve them.

[1] Erdodi, Laszlo, and Fabio Massimo Zennaro. "The Agent Web Model--Modelling web hacking for reinforcement learning." arXiv preprint arXiv:2009.11274 (2020).

Advisor: Fabio Massimo Zennaro ,  Laszlo Tibor Erdodi

Approaches to AI Safety

The world and the Internet are increasingly populated by artificial autonomous agents carrying out tasks on our behalf. Many of these agents are given an objective and learn their behaviour by trying to achieve that objective as well as they can. However, this approach cannot guarantee that an agent, while learning its behaviour, will not take actions with unforeseen and undesirable effects. Research in AI safety tries to design autonomous agents that will behave in a predictable and safe way [1].

This project will consider specific problems and novel solutions in the domain of AI safety and reinforcement learning. The primary goal will be the development of innovative algorithms and their implementation within established frameworks.

[1] Amodei, Dario, et al. "Concrete problems in AI safety." arXiv preprint arXiv:1606.06565 (2016).

Reinforcement Learning for Super-modelling

Super-modelling [1] is a technique designed for combining complex dynamical models: pre-trained models are aggregated, with messages and information exchanged in order to synchronize the behavior of the different models and produce more accurate and reliable predictions. Super-models are used, for instance, in weather and climate science, where pre-existing models are ensembled together and their states dynamically aggregated to generate more realistic simulations.

This project will consider how reinforcement learning algorithms may be used to solve the coordination problem among the individual models forming a super-model. The primary goal will be the formulation of the super-modelling problem within the reinforcement learning framework and the study of custom RL algorithms to improve the overall performance of super-models.

[1] Schevenhoven, Francine, et al. "Supermodeling: improving predictions with an ensemble of interacting models." Bulletin of the American Meteorological Society 104.9 (2023): E1670-E1686.

Advisor: Fabio Massimo Zennaro ,  Francine Janneke Schevenhoven

The Topology of Flight Paths

Air traffic data tells us the position, direction, and speed of an aircraft at a given time. In other words, if we restrict our focus to a single aircraft, we are looking at a multivariate time series, and we can visualize the flight path geometrically as a curve above the earth's surface. Topological data analysis (TDA) provides methods for analysing the shape of data, so TDA may help us extract meaningful features from air traffic data. Although typical flight path shapes may not be particularly intriguing in themselves, we can attempt to identify unusual patterns or "abnormal" manoeuvres, such as aborted landings, go-arounds, or diversions.

Advisor:  Odin Hoff Gardå , Nello Blaser

Automatic hyperparameter selection for isomap

Isomap is a non-linear dimensionality reduction method with two free hyperparameters (number of nearest neighbors and neighborhood radius). Different hyperparameters result in dramatically different embeddings. Previous methods for hyperparameter selection have focused on choosing a single optimal value. In this project, you will explore the use of persistent homology to find parameter ranges that result in stable embeddings. The project has theoretical and computational aspects.
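
A minimal illustration of why the neighborhood hyperparameter matters (this shows only Isomap's first step, the k-NN graph, not the full embedding): with too small a k the graph disconnects, and with too large a k it "short-circuits" across clusters. The points and k values below are illustrative.

```python
import math

# Two well-separated rows of points standing in for two clusters
points = [(float(x), 0.0) for x in range(5)] + [(float(x), 10.0) for x in range(5)]

def knn_components(pts, k):
    """Build a symmetric k-NN graph and count its connected components."""
    adj = {i: set() for i in range(len(pts))}
    for i, p in enumerate(pts):
        nbrs = sorted((j for j in range(len(pts)) if j != i),
                      key=lambda j: math.dist(p, pts[j]))[:k]
        for j in nbrs:
            adj[i].add(j)
            adj[j].add(i)
    seen, comps = set(), 0
    for i in range(len(pts)):            # depth-first search per component
        if i in seen:
            continue
        comps += 1
        stack = [i]
        while stack:
            u = stack.pop()
            if u not in seen:
                seen.add(u)
                stack.extend(adj[u] - seen)
    return comps

print(knn_components(points, 2))  # 2: the two clusters stay separate
print(knn_components(points, 6))  # 1: neighborhoods reach across the gap
```

Persistent homology of such graphs across a range of k is one way to detect where the embedding's structure is stable, which is the point of the project.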

Advisor: Nello Blaser

Validate persistent homology

Persistent homology is a generalization of hierarchical clustering that finds more structure than just the clusters. Traditionally, hierarchical clustering has been evaluated using resampling methods and assessing stability properties. In this project you will generalize these resampling methods to develop novel stability properties that can be used to assess persistent homology. This project has theoretical and computational aspects.

Topological Anscombe quartet

This topic is based on the classical Anscombe's quartet and on families of point sets with identical 1D persistence ( https://arxiv.org/abs/2202.00577 ). The goal is to generate more interesting datasets using the simulated annealing methods presented in ( http://library.usc.edu.ph/ACM/CHI%202017/1proc/p1290.pdf ). This project is mostly computational.
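
The core of the simulated-annealing approach is a perturb-and-accept loop: jitter one point at a time and accept the move only if the chosen summary statistics stay within a tolerance of the originals. The sketch below covers only the statistics part (mean and standard deviation of a 1D sample, with made-up tolerances); extending the acceptance test with a persistence constraint is the point of the project.

```python
import random
import statistics

random.seed(1)

xs = [float(i) for i in range(20)]                  # toy starting dataset
target_mean, target_sd = statistics.mean(xs), statistics.stdev(xs)
TOL = 0.05

def perturb(values):
    """Jitter one randomly chosen point by a small amount."""
    out = list(values)
    i = random.randrange(len(out))
    out[i] += random.uniform(-0.5, 0.5)
    return out

for _ in range(2000):
    cand = perturb(xs)
    # accept only moves that preserve the statistics within tolerance
    if (abs(statistics.mean(cand) - target_mean) < TOL and
            abs(statistics.stdev(cand) - target_sd) < TOL):
        xs = cand

moved = sum(1 for i, x in enumerate(xs) if abs(x - i) > 1e-9)
print(moved, abs(statistics.mean(xs) - target_mean) < TOL)
```

The dataset drifts in shape while its summary statistics stay (approximately) fixed, which is exactly the "same stats, different graphs" effect the referenced paper exploits.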

Persistent homology vectorization with cycle location

There are many methods of vectorizing persistence diagrams, such as persistence landscapes, persistence images, PersLay and statistical summaries. Recently we have designed algorithms to in some cases efficiently detect the location of persistence cycles. In this project, you will vectorize not just the persistence diagram, but additional information such as the location of these cycles. This project is mostly computational with some theoretic aspects.
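
One way to picture the goal is a feature vector that combines standard persistence statistics with the locations of the representative cycles. The diagram format `(birth, death, (x, y))` and the chosen features below are a made-up convention for this sketch, not an existing library API.

```python
def vectorize(diagram):
    """Summarize a location-augmented persistence diagram as a flat vector."""
    lifetimes = [d - b for b, d, _ in diagram]
    xs = [loc[0] for _, _, loc in diagram]
    ys = [loc[1] for _, _, loc in diagram]
    n = len(diagram)
    return [
        n,                          # number of features
        sum(lifetimes) / n,         # mean persistence
        max(lifetimes),             # most persistent cycle
        sum(xs) / n, sum(ys) / n,   # where the cycles live, on average
    ]

# toy diagram: two cycles with births, deaths, and cycle locations
diagram = [(0.0, 1.0, (0.0, 0.0)), (0.5, 2.5, (4.0, 4.0))]
print(vectorize(diagram))
```

Richer versions of the same idea would weight locations by persistence, or rasterize them into a persistence-image-like grid.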

Divisive covers

Divisive covers are a divisive technique for generating filtered simplicial complexes. They originally used a naive way of dividing data into a cover. In this project, you will explore different methods of dividing space, based on principal component analysis, support vector machines and k-means clustering. In addition, you will explore methods of using divisive covers for classification. This project will be mostly computational.

Learning Acquisition Functions for Cost-aware Bayesian Optimization

This is a follow-up project of an earlier Master thesis that developed a novel method for learning Acquisition Functions in Bayesian Optimization through the use of Reinforcement Learning. The goal of this project is to further generalize this method (more general input, learned cost-functions) and apply it to hyperparameter optimization for neural networks.

Advisors: Nello Blaser , Audun Ljone Henriksen

Stable updates

This is a follow-up project of an earlier Master thesis that introduced and studied empirical stability in the context of tree-based models. The goal of this project is to develop stable update methods for deep learning models. You will design several stable methods and empirically compare them (in terms of loss and stability) with a baseline and with one another.

Advisors:  Morten Blørstad , Nello Blaser

Multimodality in Bayesian neural network ensembles

One method to assess uncertainty in neural network predictions is to use dropout or noise generators at prediction time and run every prediction many times. This leads to a distribution of predictions. Informatively summarizing such probability distributions is a non-trivial task and the commonly used means and standard deviations result in the loss of crucial information, especially in the case of multimodal distributions with distinct likely outcomes. In this project, you will analyze such multimodal distributions with mixture models and develop ways to exploit such multimodality to improve training. This project can have theoretical, computational and applied aspects.
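
The failure mode of mean/std summaries is easy to demonstrate without any neural network. Below, repeated dropout-style predictions are simulated as draws from a bimodal distribution: the mean lands between the two likely outcomes, while even a coarse histogram reveals both modes. A mixture model would replace the histogram step in the actual project; everything here is illustrative.

```python
import random
import statistics
from collections import Counter

random.seed(0)

# Simulated "many predictions of the same input": two distinct likely outcomes
preds = [random.gauss(1.0, 0.1) for _ in range(500)] + \
        [random.gauss(5.0, 0.1) for _ in range(500)]

mean = statistics.fmean(preds)           # ~3.0: near neither likely outcome
hist = Counter(round(p) for p in preds)  # coarse histogram, bin width 1
modes = [b for b, c in hist.items() if c > 100]

print(round(mean), sorted(modes))
```

The mean (about 3) is a value the model almost never predicts, whereas the two modes (about 1 and 5) carry the information worth keeping and exploiting during training.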

Learning a hierarchical metric

Often, labels have defined relationships to each other, for instance in a hierarchical taxonomy. For example, ImageNet labels are derived from the WordNet graph, and biological species are taxonomically related and can have similarities depending on life stage, sex, or other properties.

ArcFace is an alternative loss function that aims for an embedding that is more generally useful than one trained with softmax. It is commonly used in metric learning and few-shot learning settings.

Here, we will develop a metric learning method that learns from data with hierarchical labels. Using multiple ArcFace heads, we will simultaneously learn to place representations to optimize the leaf label as well as intermediate labels on the path from leaf to root of the label tree. Using taxonomically classified plankton image data, we will measure performance as a function of ArcFace parameters (sharpness/temperature and margins -- class-wise or level-wise), and compare the results to existing methods.
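
The ArcFace margin itself is one line of math: with a unit-norm embedding and class centre, the target-class logit cos(θ) is replaced by cos(θ + m), scaled by s. The sketch below shows just that adjustment; the margin and scale values are illustrative placeholders, and in the hierarchical setting each ArcFace head (one per tree level) would get its own m and s.

```python
import math

def arcface_logit(cos_theta, margin=0.5, scale=64.0, target=True):
    """ArcFace adjustment: penalize the target class by an angular margin."""
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    return scale * math.cos(theta + margin) if target else scale * cos_theta

cos_theta = 0.8                      # embedding fairly close to its class centre
plain = 64.0 * cos_theta             # ordinary cosine logit
with_margin = arcface_logit(cos_theta)
print(with_margin < plain)           # the margin makes the target class harder
```

Because the margin shrinks the target logit, the loss keeps pulling embeddings toward their class centre, producing the tighter, more transferable clusters that metric learning relies on.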

Advisor: Ketil Malde ( [email protected] )

Self-supervised object detection in video

One challenge with learning object detection is that in many scenes that stretch off into the distance, annotating small, far-off, or blurred objects is difficult. It is therefore desirable to learn from incompletely annotated scenes, and one-shot object detectors may suffer from incompletely annotated training data.

To address this, we will use a region-proposal algorithm (e.g. SelectiveSearch) to extract potential crops from each frame. Classification will be based on two approaches: a) training on annotated fish versus random similarly-sized crops without annotations, and b) using a self-supervised method to build a representation for crops and building a classifier for the extracted regions. The method will be evaluated against one-shot detectors and other training regimes.

If successful, the method will be applied to fish detection and tracking in videos from baited and unbaited underwater traps, and used to estimate abundance of various fish species.

See also: Benettino (2016): https://link.springer.com/chapter/10.1007/978-3-319-48881-3_56

Representation learning for object detection

While traditional classifiers work well with data that is labeled with disjoint classes and reasonably balanced class abundances, reality is often less clean. An alternative is to learn a vector space embedding that reflects semantic relationships between objects, and to derive classes from this representation. This is especially useful for few-shot classification (i.e. very few examples in the training data).

The task here is to extend a modern object detector (e.g. Yolo v8) to output an embedding of the identified object. Instead of a softmax classifier, we can learn the embedding in a supervised manner (using annotations on frames) by attaching an ArcFace or other supervised metric learning head. Alternatively, the representation can be learned from tracked detections over time using e.g. a contrastive loss function to keep the representation for an object (approximately) constant over time. The performance of the resulting object detector will be measured on underwater videos, targeting species detection and/or individual recognition (re-ID).

Time-domain object detection

Object detectors for video are normally trained on still frames, but it is evident (from human experience) that using time domain information is more effective. I.e., it can be hard to identify far-off or occluded objects in still images, but movement in time often reveals them.

Here we will extend a state of the art object detector (e.g. yolo v8) with time domain data. Instead of using a single frame as input, the model will be modified to take a set of frames surrounding the annotated frame as input. Performance will be compared to using single-frame detection.
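
The input-side change is simple to state: instead of one frame, the detector receives a short window of frames stacked along the channel axis. The toy sketch below uses plain nested lists for small grayscale "frames"; in practice this would be a tensor handed to the detector backbone, and the window radius is an illustrative choice.

```python
def stack_window(frames, center, radius=1):
    """Return frames[center-radius .. center+radius] as one multi-channel input."""
    window = frames[center - radius: center + radius + 1]
    assert len(window) == 2 * radius + 1, "window must not run off the clip"
    return window  # shape: (channels = 2*radius + 1, height, width)

# a toy clip: 5 frames, each a 4x4 grid filled with the frame index
clip = [[[i] * 4 for _ in range(4)] for i in range(5)]
x = stack_window(clip, center=2)
print(len(x), len(x[0]), len(x[0][0]))  # 3 4 4
```

Only the annotated centre frame carries labels; the surrounding frames supply the motion cues the detector can learn to exploit.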

Large-scale visualization of acoustic data

The Institute of Marine Research has decades of acoustic data collected in various surveys. These data are in the process of being converted to data formats that can be processed and analyzed more easily using packages like Xarray and Dask.

The objective is to make these data more accessible to regular users by providing a visual front end. The user should be able to quickly zoom in and out, perform selection, export subsets, apply various filters and classifiers, and overlay annotations and other relevant auxiliary data.

Learning acoustic target classification from simulation

Broadband echosounders emit a complex signal that spans a large frequency band. Different targets will reflect, absorb, and generate resonance at different amplitudes and frequencies, and it is therefore possible to classify targets at much higher resolution and accuracy than before. Due to the complexity of the received signals, deriving effective profiles that can be used to identify targets is difficult.

Here we will use simulated frequency spectra from geometric objects with various shapes, orientations, and other properties. We will train ML models to estimate (recover) the geometric and material properties of objects based on these spectra. The resulting model will be applied to real broadband data and compared to traditional classification methods.

Online learning in real-time systems

Build a model of the drilling process by using the virtual simulator OpenLab ( https://openlab.app/ ) for real-time data generation, combined with online learning techniques. The student will also do a short survey of existing online learning techniques and learn how to cope with errors and delays in the data.
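
As a warm-up for the online-learning part, the sketch below updates a linear model one sample at a time by stochastic gradient descent, as it would with a streaming data source. The "drilling" data here is synthetic (y = 2x + 1 plus noise); OpenLab would supply the real stream, and the learning rate is an illustrative choice.

```python
import random

random.seed(0)

w, b, lr = 0.0, 0.0, 0.05

def stream(n):
    """Synthetic stand-in for a real-time data feed."""
    for _ in range(n):
        x = random.uniform(-1, 1)
        yield x, 2.0 * x + 1.0 + random.gauss(0, 0.05)

for x, y in stream(5000):
    err = (w * x + b) - y        # prediction error on the newest sample
    w -= lr * err * x            # gradient step, then the sample is discarded
    b -= lr * err

print(round(w, 1), round(b, 1))  # approximately 2.0 and 1.0
```

Coping with errors and delays in the stream, as the project description asks, amounts to deciding which samples to trust and when to apply their updates.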

Advisor: Rodica Mihai

Building a finite state automaton for the drilling process by using queries and counterexamples

Datasets will be generated using the virtual simulator OpenLab ( https://openlab.app/ ). The student will study the datasets and decide upon a good setting for extracting a finite state automaton for the drilling process. The student will also do a short survey of existing techniques for extracting finite state automata from process data. One relevant technique uses exact learning and abstraction to extract a deterministic finite automaton describing the state dynamics of a given trained RNN, with Angluin's L* algorithm as the learner and the trained RNN as the oracle; it extracts accurate automata efficiently even when the state vectors are large and require fine differentiation (see the corresponding arXiv preprint).

Scaling Laws for Language Models in Generative AI

Large Language Models (LLM) power today's most prominent language technologies in Generative AI like ChatGPT, which, in turn, are changing the way that people access information and solve tasks of many kinds.

Recent interest in scaling laws for LLMs has produced insights into how well models perform as a function of factors such as how much training data is used, how powerful the models are, or how much computational cost is allocated. (See, for example, Kaplan et al., "Scaling Laws for Neural Language Models", 2020.)

In this project, the task will be to study scaling laws for different language models with respect to one or more modeling factors.
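
Scaling laws are commonly summarised as power laws, L(C) = a · C^(−α), which appear as straight lines in log-log space. The sketch below recovers the exponent from synthetic (compute, loss) pairs by least squares on the logarithms; with real language models, the pairs would come from training runs, and the constants here are made up.

```python
import math
import random

random.seed(0)

# Synthetic scaling data: loss = a * compute**(-alpha), with small log-noise
alpha_true, a = 0.07, 10.0
data = [(c, a * c ** (-alpha_true) * math.exp(random.gauss(0, 0.01)))
        for c in (1e3, 1e4, 1e5, 1e6, 1e7)]

# Ordinary least squares on (log compute, log loss)
xs = [math.log(c) for c, _ in data]
ys = [math.log(l) for _, l in data]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)

print(round(-slope, 2))  # estimated exponent, close to alpha_true
```

Comparing fitted exponents across models or across factors (data, parameters, compute) is one concrete way to frame the project's research question.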

Advisor: Dario Garigliotti

Applications of causal inference methods to omics data

Many hard problems in machine learning are directly linked to causality [1]. The graphical causal inference framework developed by Judea Pearl can be traced back to pioneering work by Sewall Wright on path analysis in genetics and has inspired research in artificial intelligence (AI) [1].

The Michoel group has developed the open-source tool Findr [2] which provides efficient implementations of mediation and instrumental variable methods for applications to large sets of omics data (genomics, transcriptomics, etc.). Findr works well on a recent data set for yeast [3].

We encourage students to explore promising connections between the fields of causal inference and machine learning. Feel free to contact us to discuss projects related to causal inference. Possible topics include: a) improving methods based on structural causal models, b) evaluating causal inference methods on data for model organisms, c) comparing methods based on causal models and neural network approaches.

References:

1. Schölkopf B, Causality for Machine Learning, arXiv (2019):  https://arxiv.org/abs/1911.10500

2. Wang L and Michoel T. Efficient and accurate causal inference with hidden confounders from genome-transcriptome variation data. PLoS Computational Biology 13:e1005703 (2017).  https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005703

3. Ludl A and Michoel T. Comparison between instrumental variable and mediation-based methods for reconstructing causal gene networks in yeast. arXiv:2010.07417  https://arxiv.org/abs/2010.07417

Advisors: Adriaan Ludl ,  Tom Michoel

Space-Time Linkage of Fish Distribution to Environmental Conditions

Conditions in the marine environment, such as temperature and currents, influence the spatial distribution and migration patterns of marine species. Hence, understanding the link between environmental factors and fish behavior is crucial for predicting, e.g., how fish populations may respond to climate change. Deriving this link is challenging because it requires analysis of two types of datasets: (i) large environmental datasets (currents, temperature) that vary in space and time, and (ii) sparse and sporadic spatial observations of fish populations.

Project goal   

The primary goal of the project is to develop a methodology that helps predict how the spatial distributions of two fish stocks (capelin and mackerel) change in response to variability in the physical marine environment (ocean currents and temperature). The information can also be used to optimize data collection by minimizing the time spent in spatial sampling of the populations.

The project will focus on the use of machine learning and/or causal inference algorithms. As a first step, we will use synthetic (fish and environmental) data from analytic models that couple the two data sources. Because the 'truth' is known, we can judge the efficiency and error margins of the methodologies. We will then apply the methodologies to real-world (empirical) observations.

Advisors:  Tom Michoel , Sam Subbey . 

Towards precision medicine for cancer patient stratification

On average, a drug or a treatment is effective in only about half of patients who take it. This means patients need to try several until they find one that is effective at the cost of side effects associated with every treatment. The ultimate goal of precision medicine is to provide a treatment best suited for every individual. Sequencing technologies have now made genomics data available in abundance to be used towards this goal.

In this project we will focus specifically on cancer. Most cancer patients receive a particular treatment based on the cancer type and stage, though different individuals react differently to the same treatment. It is now well established that genetic mutations cause cancer growth and spreading, and importantly, these mutations differ between individual patients. The aim of this project is to use genomic data for better stratification of cancer patients, in order to predict the treatment most likely to work. Specifically, the project will use machine learning approaches to integrate genomic data and build a classifier for stratification of cancer patients.

Advisor: Anagha Joshi

Unraveling gene regulation from single cell data

Multi-cellularity is achieved by precise control of gene expression during development and differentiation, and aberrations of this process lead to disease. A key regulatory process acts at the transcriptional level, where epigenetic and transcriptional regulators control the spatial and temporal expression of target genes in response to environmental, developmental, and physiological cues obtained from a signalling cascade. The rapid advances in sequencing technology have now made it feasible to study this process through the genome-wide patterns of diverse epigenetic and transcription factors, including at the single-cell level.

Single-cell RNA sequencing is highly important, particularly in cancer, as it allows exploration of heterogeneous tumor samples; this heterogeneity obstructs therapeutic targeting and leads to poor survival. Despite huge clinical relevance and potential, analysis of single-cell RNA-seq data is challenging. In this project, we will develop strategies to infer gene regulatory networks using network inference approaches (both supervised and unsupervised). These will be tested primarily on single-cell datasets in the context of cancer.

Developing a Stress Granule Classifier

To carry out the multitude of functions 'expected' from a human cell, the cell employs a strategy of division of labour, whereby sub-cellular organelles carry out distinct functions. Thus we traditionally understand organelles as distinct units, defined both functionally and physically, with a distinct shape and size range. More recently, a new class of organelles has been discovered that are assembled and dissolved on demand and are composed of liquid droplets or 'granules'. Granules show many properties characteristic of liquids, such as flow and wetting, but they can also assume many shapes and indeed also fluctuate in shape. One such liquid organelle is a stress granule (SG).

Stress granules are pro-survival organelles that assemble in response to cellular stress and important in cancer and neurodegenerative diseases like Alzheimer's. They are liquid or gel-like and can assume varying sizes and shapes depending on their cellular composition. 

In a given experiment we are able to image the entire cell over a time series of 1000 frames; from which we extract a rough estimation of the size and shape of each granule. Our current method is susceptible to noise and a granule may be falsely rejected if the boundary is drawn poorly in a small majority of frames. Ideally, we would also like to identify potentially interesting features, such as voids, in the accepted granules.

We are interested in applying a machine learning approach to develop a descriptor for a 'classic' granule and furthermore classify them into different functional groups based on disease status of the cell. This method would be applied across thousands of granules imaged from control and disease cells. We are a multi-disciplinary group consisting of biologists, computational scientists and physicists. 

Advisors: Sushma Grellscheid , Carl Jones

Machine Learning based Hyperheuristic algorithm

Develop a machine learning based hyper-heuristic algorithm to solve a pickup and delivery problem. A hyper-heuristic is a heuristic that chooses heuristics automatically: it seeks to automate the process of selecting, combining, generating or adapting several simpler heuristics to efficiently solve computational search problems [Handbook of Metaheuristics]. There may be multiple heuristics for solving a problem, each with its own strengths and weaknesses. In this project, we want to use machine-learning techniques to learn the strengths and weaknesses of each heuristic while using them in an iterative search for high-quality solutions, and then use them intelligently for the rest of the search. As new information is gathered during the search, the hyper-heuristic algorithm automatically adjusts the heuristics.
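
A minimal sketch of the selection layer of a hyper-heuristic: treat the low-level heuristics as bandit arms, score each by the average improvement it has produced so far, and choose greedily with occasional exploration. The three "heuristics" below are stand-ins that return a simulated improvement; in the real project they would modify a pickup-and-delivery solution.

```python
import random

random.seed(0)

def h_swap():     return random.gauss(1.0, 0.3)   # strong on this instance
def h_relocate(): return random.gauss(0.2, 0.3)
def h_reverse():  return random.gauss(0.1, 0.3)

heuristics = [h_swap, h_relocate, h_reverse]
counts, means, eps = [0, 0, 0], [0.0, 0.0, 0.0], 0.1

for it in range(1000):
    if it < 3:                                      # try each heuristic once
        i = it
    elif random.random() < eps:                     # explore
        i = random.randrange(3)
    else:                                           # exploit the best so far
        i = max(range(3), key=lambda j: means[j])
    gain = heuristics[i]()
    counts[i] += 1
    means[i] += (gain - means[i]) / counts[i]       # running average reward

best = counts.index(max(counts))
print(best)  # the strongest heuristic ends up selected most often
```

A learned hyper-heuristic replaces the fixed epsilon-greedy rule with a model that predicts which heuristic will pay off given the current search state.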

Advisor: Ahmad Hemmati

Machine learning for solving satisfiability problems and applications in cryptanalysis

Advisor: Igor Semaev

Hybrid modeling approaches for well drilling with Sintef

Several topics are available.

"Flow models" are first-principles models simulating the flow, temperature and pressure in a well being drilled. Our project is exploring "hybrid approaches" where these models are combined with machine learning models that either learn from time series data from flow model runs or from real-world measurements during drilling. The goal is to better detect drilling problems such as hole cleaning, make more accurate predictions and correctly learn from and interpret real-word data.

A "surrogate model" is an ML model that learns to mimic the flow model from the model's inputs and outputs. Use cases for surrogate models include model predictions where speed is favoured over accuracy, and exploration of the parameter space.

Surrogate models with active Learning

While it is possible to produce a nearly unlimited amount of training data by running the flow model, the surrogate model may still perform poorly if it lacks training data in the part of the parameter space it operates in, or if it "forgets" areas of the parameter space after being fed too much data from a narrow range of parameters.

The goal of this thesis is to build a surrogate model (with any architecture) for some restricted parameter range and implement an active learning approach in which the ML model requests more runs from the flow model in the parts of the parameter space where they are needed most. The end result should be a surrogate model that is fast and performs acceptably well over the whole defined parameter range.
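
The active-learning loop can be sketched in a few lines of query-by-committee style pseudocode made runnable: an "ensemble surrogate" (here, several noisy linear fits) is evaluated across the parameter range, and the next flow model run is requested where the ensemble members disagree the most. The flow model is faked by a simple function, and the deliberately poor model class makes the disagreement visible; everything is illustrative.

```python
import random

random.seed(0)

def flow_model(x):                     # stand-in for the expensive simulator
    return x ** 2

train_x = [0.0, 0.5, 1.0]              # initial, narrow set of simulator runs

def fit_member():
    """Fit y = w*x + b by least squares on noisy samples of the flow model."""
    pts = [(x, flow_model(x) + random.gauss(0, 0.05)) for x in train_x]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    w = sum((x - mx) * (y - my) for x, y in pts) / sum((x - mx) ** 2 for x, _ in pts)
    return lambda x, w=w, b=my - w * mx: w * x + b

ensemble = [fit_member() for _ in range(5)]

def spread(x):
    """Ensemble variance: a simple proxy for surrogate uncertainty."""
    ys = [m(x) for m in ensemble]
    mu = sum(ys) / len(ys)
    return sum((y - mu) ** 2 for y in ys) / len(ys)

candidates = [i / 10 for i in range(31)]   # parameter range 0..3
next_x = max(candidates, key=spread)
print(next_x)  # far outside the training range, where members disagree most
```

In the thesis, the requested point would be fed back to the flow model, its output added to the training set, and the loop repeated until the surrogate is acceptable over the whole range.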

Surrogate models trained via adversarial learning

How best to train surrogate models from runs of the flow model is an open question. This master thesis would use an adversarial learning approach to build a surrogate model whose output becomes indistinguishable, to its "adversary", from the output of an actual flow model run.

GPU-based Surrogate models for parameter search

While CPU speed largely stalled 20 years ago in terms of single-core working frequency, multi-core CPUs and especially GPUs took off and delivered increases in computational power by parallelizing computations.

Modern machine learning, such as deep learning, takes advantage of this boom in computing power by running on GPUs.

The SINTEF flow models, in contrast, are software programs that run on a CPU and do not utilize multi-core functionality. A model run advances time-step by time-step, and each time step relies on the results of the previous one. The flow models are therefore fundamentally sequential and not well suited to massive parallelization.

It is however of interest to run different model runs in parallel, to explore parameter spaces. The use cases for this includes model calibration, problem detection and hypothesis generation and testing.

The task of this thesis is to implement an ML-based surrogate model in such a way that many surrogate model outputs can be produced at the same time on a single GPU. This will likely entail some trade-off with model size and perhaps some coding tricks.

Uncertainty estimates of hybrid predictions

When using predictions from an ML model trained on time series data, it is useful to know whether they are accurate and should be trusted. The student is challenged to develop hybrid approaches that incorporate estimates of uncertainty. Components could include reporting the variance of ML ensembles trained on a diversity of time series data, implementation of conformal predictions, and analysis of training data parameter ranges versus the current input. The output should be a "traffic light" signal roughly indicating the accuracy of the predictions.
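
A toy version of the proposed traffic light: several ensemble members predict the next value, and the spread of their predictions is mapped to green, yellow or red. The thresholds below are arbitrary placeholders; calibrating them (for example with conformal prediction) is part of the project.

```python
import statistics

def traffic_light(predictions, yellow=0.5, red=2.0):
    """Map ensemble disagreement to a coarse trust signal."""
    spread = statistics.pstdev(predictions)
    if spread < yellow:
        return "green"
    if spread < red:
        return "yellow"
    return "red"

print(traffic_light([10.1, 10.0, 9.9, 10.2]))   # members agree
print(traffic_light([10.0, 14.0, 6.0, 11.0]))   # members diverge
```

The same interface could combine several uncertainty components (ensemble variance, conformal interval width, distance from the training distribution) into one signal for the driller.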

Transfer learning approaches

We assume an ML model is to be used for time series prediction.

It is possible to train an ML model on a wide range of scenarios in the flow models, but we expect that, to perform well, the model also needs to see model runs representative of the type of well and drilling operation it will be used in. In this thesis the student implements a transfer learning approach, where the model is first trained on general model runs and then fine-tuned on the most representative data set.

(Bonus 1: implementing one-shot learning. Bonus 2: using real-world data in the fine-tuning stage.)

ML capable of reframing situations

When a human oversees an operation like well drilling, she has a mental model of the situation and new data such as pressure readings from the well is interpreted in light of this model. This is referred to as "framing" and is the normal mode of work. However, when a problem occurs, it becomes harder to reconcile the data with the mental model. The human then goes into "reframing", building a new mental model that includes the ongoing problem. This can be seen as a process of hypothesis generation and testing.

A computer model, however, lacks re-framing. A flow model will keep making predictions under the assumption of no problems, and a separate alarm system will use the deviation between the model predictions and reality to raise an alarm. This is in a sense how all alarm systems work, but it means that the human must discard the computer model as a tool at the same time as she is handling a crisis.

The student is given access to a flow model and a surrogate model that can learn from model runs both with and without hole cleaning problems, and is challenged to develop a hybrid approach in which the ML + flow model continuously performs hypothesis generation and testing and is able to "switch" into predicting a hole cleaning problem and different remediations of it.

Advisor: Philippe Nivlet at Sintef, together with an advisor from UiB

Explainable AI at Equinor

The project Machine Teaching for XAI (see  https://xai.w.uib.no ) offers a master thesis in collaboration between UiB and Equinor.

Advisor: One of Pekka Parviainen/Jan Arne Telle/Emmanuel Arrighi + Bjarte Johansen from Equinor.

Explainable AI at Eviny

The project Machine Teaching for XAI (see  https://xai.w.uib.no ) offers a master thesis in collaboration between UiB and Eviny.

Advisor: One of Pekka Parviainen/Jan Arne Telle/Emmanuel Arrighi + Kristian Flikka from Eviny.

If you want to suggest your own topic, please contact Pekka Parviainen ,  Fabio Massimo Zennaro or Nello Blaser .


What to Expect from an Online Data Science Master’s Degree

February 28, 2024

The exponential growth of data utilization has led to an unprecedented demand for skilled professionals who can navigate, analyze, and derive meaningful insights from this wealth of information. Experts predict there could be up to 1.4 million new jobs created in data science and data analytics between 2024 and 2027, according to the World Economic Forum .

A master’s degree in data science can equip you with the dynamic skills and knowledge required for these in-demand roles—and fully online degree programs can enable you to level up your skill set from the comfort of your home, according to your schedule.

But not all online graduate programs will provide you with the supportive, cutting-edge data science training you’ll need to be successful. Join us as we outline what you should be looking for in a high-quality online data science master’s degree.

3 Features of a top data science master’s program

A career in data science carries great potential for tangible impact. Whether it’s improving healthcare outcomes, optimizing business operations, or enhancing user experiences, data science professionals can make meaningful contributions to real-world problem solving.

If you’re looking for both a chance to gain the expertise needed to pursue a career in this booming field and the flexibility of virtual learning, you’ll need to find an online graduate program that offers the following elements:

1. A comprehensive curriculum of relevant, in-demand courses

Examining a program’s curriculum will give you a good idea of the depth and breadth of knowledge students at that school will acquire during their studies. A high-quality data science master’s degree curriculum should possess several key elements to meet the demands of today’s industry and academic standards.

Most importantly, it should offer a comprehensive foundation in fundamental data science concepts. The core curriculum within Utica University’s Master of Science (MS) in Data Science program, for example, includes the following courses:

  • Introduction to Data Science
  • Statistical Methods
  • Data Mining
  • Machine Learning
  • Data Visualization

In addition to covering the foundational concepts of data science, the most impactful master’s programs also include online coursework related to advanced topics and emerging trends in the field. Learning about things like deep learning, natural language processing, and big data technologies is beneficial in preparing students to meet current market demands.

You’ll also want an online program with a curriculum that adapts to the landscape of data science by regularly updating its content. In such a rapidly evolving field, the most effective degree programs are those that can quickly incorporate new technologies, tools, and methodologies. This ensures that graduates are well equipped with the most up-to-date knowledge and skills being sought by employers.

Finally, it’s worth looking for a data science master’s program that prioritizes practical application through hands-on projects, internships, or real-world case studies. Utica University students, for example, can choose between a three-credit capstone or thesis project in which they’re given the chance to demonstrate their newly acquired expertise in a hands-on practicum or research project.

2. Opportunities for career-focused specialization

All data science master’s program graduates should exit their programs with the same suite of core competencies. Beyond enabling students with this foundational skill set, however, the best online data science master’s degrees also provide specialized opportunities for career development.

The field of data science is incredibly multifaceted, which is why the program at Utica University offers online students the opportunity to customize their degree experiences to fit their personal career goals.

Students can pursue a general track within the MS in Data Science program, which covers a breadth of different industry-relevant topics. There are also opportunities for Utica students to select from these in-demand specializations:

  • Business Analytics: Students learn how to use demographic trends, census information, and other databases to predict market workforce developments and risks.
  • Cybersecurity: Students become familiar with state-of-the-art cybersecurity and computer forensic practices and learn how to apply their data science knowledge to identify malicious activities and protect organizations from cyber-attacks.
  • Financial Crime: Students learn how to apply preventative and investigative approaches to financial compliance violations; they also master tactics for managing economic crime.

Students can opt to pursue multiple specializations—or none at all. Even without formally committing to a specialized track, Utica University affords data science students the opportunity to personalize their curricular experience by electing to take courses that align with their specific needs or interests.

3. Access to expert guidance & personalized support

The infrastructure that supports an online data science master’s degree doesn’t solely stem from the program’s curricular offerings and opportunities for career-focused study. The faculty and staff in place to support you can have just as strong an influence on your experience as an online graduate student.

As you examine different programs, it’s worth seeking out instructors with proven experience as practitioners in the field. Faculty members who can help students gain the practical skills needed to apply complex theories often have a greater impact. It also helps when professors are actively involved in the field, with their fingers on the pulse of emerging data science trends.

At Utica University, our online MS in Data Science faculty are well-versed in the challenges that working with data can present, with a wealth of hands-on experience in healthcare, geographic information systems (GIS), urban-rural dynamics, and cybersecurity. Students benefit from their decades of collective experience and graduate with a network of veteran data professionals who can help them in their careers.

Each online student at Utica University also receives the support of an individualized team made up of academic advisors, Student Success Coaches, career services, and more. Whether students need academic assistance, professional guidance, access to mental health services, or technology support, there is a suite of resources available to help every step of the way.

Unlock your career potential with an online master’s in data science

As you seek to expand your career prospects as a data science professional, an online MS in Data Science can equip you with the foundational knowledge you’ll need to be successful in many in-demand career paths.

By pursuing your graduate degree in a virtual classroom environment, you’ll have the opportunity to gain the complex data science skills you need without disrupting your life. And now that you know which features to prioritize in a top-tier online data science master’s program, you can take the next step forward in your journey toward career success.

Learn more about how Utica University can help you get there by visiting our online Master of Science in Data Science program page.

