
PhD Admissions


The Computer Science Department PhD program is a top-ranked research-oriented program, typically completed in 5-6 years. There are very few course requirements and the emphasis is on preparation for a career in Computer Science research. 

Eligibility

To be eligible for admission to a Stanford graduate program, applicants must meet the following requirements:

  • Applicants from institutions outside of the United States must hold the equivalent of a United States bachelor's degree from a college or university of recognized standing. See detailed information by region on the Stanford Graduate Admissions website.
  • Area of undergraduate study. While we do not require specific undergraduate coursework, it is important that applicants have strong quantitative and analytical skills; a bachelor's degree in Computer Science is not required.

Any questions about admissions eligibility should be directed to [email protected].

Application Checklist

A completed online application must be submitted by the CS Department application deadline; the application can be found here.

Application Deadlines

The online application can be found here. There is only one admissions cycle for the PhD program per academic year.



Preparing Our Students to Make Meaningful Contributions to the World


Student Spotlight:  Tamish Pulappadi, Computer Science & Music

"We’re at the forefront of artificial intelligence and machine learning. It’s truly mind blowing to be going to a place like Stanford at a time when so much is happening..." Meet Tamish Puluppadi at the intersection of music and technology

CS Degree Programs

Our main educational goal is to prepare students for a rapidly changing world. Undergraduate students have the option of declaring a Bachelor of Science or a Minor in Computer Science. Graduate students have the opportunity to pursue a Master's or PhD degree in Computer Science. The Master's degree is a terminal professional degree. The PhD is for those who desire a research or teaching career.


A Gateway to Opportunity & Innovation

Stanford Computer Science cultivates an expansive range of research opportunities and a renowned group of faculty. Here, discoveries that impact the world spring from the diverse perspectives and life experiences of our community of students, faculty, and staff.

Karen Lui in the lab

Our Research & Impact

The CS Department is a center for research and education, discovering new frontiers in AI, robotics, scientific computing and more.

More About CS Research

Oussama Khatib posing next to the diving robot, OceanOne.

Our Faculty

Stanford CS faculty members strive to solve the world's most pressing problems, working in conjunction with other leaders across multiple fields. 

Explore Faculty by Their Areas of Research

Removing Barriers to Excellence: Our Culture of Equity & Inclusion

Everyone deserves a voice in the discovery of new technology and the shaping of innovation. Stanford CS is nurturing a future in science that represents all cultures and backgrounds.

2023 CURIS Cohort

Our Research is in the News


A Data-Centered Approach to Education AI

Useful models need better data. Stanford scholars explore a path to ensure domain experts have more fruitful conversation. 

Read the full article on HAI News


Mehran Sahami on AI and safeguarding society

The chair of the CS Department and computer scientist talks about the issues he’s paying attention to in 2024, particularly how to respond to the risks and opportunities of AI. Read the full article in the Stanford Report.

SISL team members

A new video: The Stanford Intelligent Systems Laboratory

The Stanford Intelligent Systems Laboratory (SISL) takes research that was built for airplane collision avoidance and enhances and broadens it to unmanned aircraft, driverless cars, emergency services routing, subterranean vehicles, and space vehicles. The common theme is making decisions in uncertain, ever-changing environments. “AI Safety is about ensuring that the algorithms that we develop behave in the way we expect and in a safe manner when deployed in the real world.” (15:48)

See this new video, which covers the latest discoveries and collaborations happening at SISL, and what life is like for the SISL team.

Stories & Voices

Meet some of the students, faculty, and alumni who create the Stanford Computer Science community.


"You could summarize my work by saying I want to give machine learning models more humility."

Neil Band, PhD Candidate, Computer Science


"One of the things that’s really important to me is working to support marginalized youth to build the technical confidence they need to feel empowered in pursuing a STEM education, or really anything they’re passionate about."

Kendall Beache, Computer Science


"In Iran, where I grew up, there are many female engineers, and I never really doubted my ability to follow this path."

Dorsa Sadigh, Assistant Professor Computer Science | Electrical Engineering


"Working with students and faculty here is an absolutely amazing experience. What really impresses me is the conviction and drive of people here. They really care about the topics they work on. They want to bring value to people and they want to have an impact: to do good for humanity. It’s something very unique to Stanford. The speed at which you can exchange ideas was totally new to me when I came here and I love it."  

- Christopher Hahn, PhD, Visiting Assistant Professor in the Computer Science Department at Stanford University and Independent Research Group Leader at the CISPA Helmholtz Center for Information Security

Computer Science



The computer science department continues to lead the world in computer science research and education.

Throughout the past four decades, the department has influenced society at levels that remain without parallel among academic institutions. Its spin-offs are among the most successful corporate ventures in the world, and many of the leaders in the academic and corporate research world are our graduates.


What are we researching?

Strong research groups exist in areas of artificial intelligence, robotics, foundations of computer science, scientific computing and systems. 


What is it like for undergraduate students?

The CS curriculum provides knowledge that is applicable across many fields, including many areas of engineering, science, and medicine. Students receive a strong foundation in computer science as well as specialized knowledge through the student’s choice of track.


What is it like for graduate students?

With faculty and resources that are among the strongest in the world, students have the opportunity to participate in leading-edge academic research carried out at Stanford. The main educational goal is to prepare students for research and teaching careers, either in universities or in industry.

Information For

  • Prospective Graduate Students
  • Current Graduate Students
  • Prospective Undergraduate Students
  • Current Undergraduate Students


The PhD Program in Computer Science FAQ


PhD | Program Requirements


On average, the program is completed in five to six years, depending on the student’s research and progress. First-year students have the opportunity to rotate in three different labs before selecting their advisor. 

The Computer Science Department also believes that teaching is an integral and important part of graduate-level education in Computer Science. In pursuing the PhD degree, students have clear and defined milestones that help guide them to the successful completion of their dissertation and oral defense. This includes a cumulative list of requirements that must be completed in order for students to be conferred the PhD degree in Computer Science.

For any questions related to CS PhD milestone requirements, please email  [email protected] .

  • CS300 Seminar       
  • First-Year Research Rotation Program       
  • Courses       
  • Foundation & Breadth Requirements       
  • Candidacy Requirement       
  • Qualifying Examination       
  • Teaching Requirements       
  • Reading Committee       

  • Thesis Proposal

Note: A student may go to TGR status after all of the PhD requirements above have been completed and only the oral examination and dissertation submission remain; see the Special Registration Statuses page.

  • University Oral Examination       
  • Dissertation



The Raj and Neera Singh Program in Artificial Intelligence: Master of Science in Engineering in AI Online

Join the AI revolution. The online Master of Science in Engineering (MSE) in Artificial Intelligence (AI) degree from Penn Engineering equips you with technical skills including classical AI, natural language processing, generative AI, and modern deep learning, alongside a robust ethical framework, preparing you to shape the future of this transformative technology.

If you have an undergraduate degree in Computer Science, Computer Engineering or an equivalent degree and an interest in artificial intelligence, our MSE-AI Online program is for you. Taught by some of the top AI researchers in the world, the program offers a deep dive into machine learning, natural language processing, GPU programming for AI and machine learning, and more — along with the ability to thoroughly analyze AI’s impact on society from various perspectives.

Our asynchronous, online curriculum gives you the flexibility to study anywhere, any time. But you’ll also benefit from the support and friendship of a tight-knit online community. 


The MSE-AI is designed for professionals with an undergraduate degree in computer science, computer engineering, or a related field.


Tuition and Financial Aid

An Ivy League education at an accessible cost, ensuring that high-quality learning is within reach for a wide range of learners.


Study machine learning and statistical modeling, and gain insights into data center infrastructure such as distributed systems, networking, and GPU programming, alongside ethical considerations, preparing you to navigate AI’s risks.


Student Experience

Online learning offers flexible, interactive, and resource-rich experiences, tailored to individual schedules and preferences, fostering collaborative and enriching journeys.

A message from the program director


Artificial intelligence is a transformative technology that is already boosting productivity across industries and helping people realize creative visions they never before thought possible. While this technology has brought forth opportunities , it has also created challenges for humanity that we must safeguard against.

That’s why I’m so excited to introduce Penn Engineering’s new online Master of Science in Engineering in Artificial Intelligence (MSE-AI Online), which will prepare you to address those risks head-on while building the technical skills to create innovative new AI tools.

In collaboration with Penn Engineering faculty who are some of the top experts in the field, you’ll explore the history of AI and learn to anticipate and mitigate potential challenges of the future. You’ll be prepared to lead change as we embark towards the next phases of this revolutionary technology.

All of our classes are 100% online and asynchronous, giving you the flexibility to learn at a time and pace that work best for you. While you can access this world-class education remotely, you won’t be studying alone. You’ll benefit from the guidance and support of faculty members, classmates, teaching assistants and staff through our robust portfolio of engagement and communication platforms. 

Graduates of this program will go on to found startups, build new models and create new ways to integrate AI tools into current industries. I’m excited to play a role in this transformative field, and I hope you will join us. 

Chris Callison-Burch Program Director, MSE-AI Online

Have questions?

Machine Unlearning in 2024

41 minute read

Written by Ken Liu ∙ May 2024

1. A bit of history & motivations for unlearning
2. Forms of unlearning
  • 2.1. Exact unlearning
  • 2.2. “Unlearning” via differential privacy
  • 2.3. Empirical unlearning with known example space
  • 2.4. Empirical unlearning with unknown example space
  • 2.5. Just ask for unlearning?
3. Evaluating unlearning
4. Practice, pitfalls, and prospects of unlearning
  • 4.1. The spectrum of unlearning hardness
  • 4.2. Copyright protection
  • 4.3. Retrieval-based AI systems
  • 4.4. AI safety

As our ML models today become larger and their (pre-)training sets grow to inscrutable sizes, people are increasingly interested in the concept of machine unlearning to edit away undesired things like private data, stale knowledge, copyrighted materials, toxic/unsafe content, dangerous capabilities, and misinformation, without retraining models from scratch.

Machine unlearning can be broadly described as removing the influence of training data from a trained model. At its core, unlearning on a target model seeks to produce an unlearned model that is equivalent to—or at least “behaves like”—a retrained model that is trained on the same data as the target model, minus the information to be unlearned.
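To pin down the objects involved (a sketch in my own notation, not a formal definition from any particular paper): write $\mathcal{A}$ for the training algorithm, $D$ for the training set, $D_f \subseteq D$ for the information to be unlearned, and $U$ for the unlearning procedure. The goal is then, roughly,

$$ U\big(\mathcal{A}(D),\, D,\, D_f\big) \;\approx\; \mathcal{A}\big(D \setminus D_f\big), $$

where what “$\approx$” means (exact distributional equality, statistical closeness, or mere behavioral similarity) is precisely where the different forms of unlearning discussed below diverge.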

There’s a lot hidden in the above description. How do we describe the information to be unlearned? Do we always have ground-truth retrained models? If not, how do we actually evaluate the unlearning? Can we even verify and audit the unlearning? Is pretending to unlearn, as humans often do, sufficient? Is unlearning even the right solution? If so, for what problems?

The precise definitions of unlearning, the techniques, the guarantees, and the metrics/evaluations would depend on:

  • The ML task (e.g., binary classification or language modeling);
  • The data to unlearn (e.g., a set of images, news articles, or the knowledge of making napalm );
  • The unlearning algorithm (e.g., heuristic fine-tuning vs deleting model components);
  • The goal of unlearning (e.g., for user privacy or harmfulness removal).

In this educational post, I hope to give a gentle introduction to machine unlearning for a general ML audience and touch on things like copyright protection , New York Times v. OpenAI , right-to-be-forgotten , the NeurIPS machine unlearning challenge , retrieval-based AI systems , and AI safety , along with some of my thoughts on the field. While unlearning is a broad topic applicable to most ML models, we will focus a lot on foundation models .


1. A bit of history & motivations for unlearning

People have thought about the unlearning problem for a while now. The initial research explorations were primarily driven by Article 17 of the GDPR (the European Union’s privacy regulation), often referred to as the “ right-to-be-forgotten ” ( RTBF ), since 2014. RTBF basically says a user has the right to request deletion of their data from a service provider (e.g. deleting your Gmail account).

RTBF was well-intentioned. It was also very actionable when said service providers store user data in a structured way, like how Google removed a bunch of links from its index in response to RTBF requests.

However, RTBF wasn’t really proposed with machine learning in mind. In 2014, policymakers wouldn’t have predicted that deep learning would become a giant hodgepodge of data & compute, and that separating and interpreting this hodgepodge would turn out to be hard. The hardness of erasing data from ML models has subsequently motivated research on what is later referred to as “ data deletion ” and “ machine unlearning ”.

A decade later in 2024, user privacy is no longer the only motivation for unlearning. We’ve gone from training small convolutional nets on face images to training giant language models on pay-walled, copyrighted , toxic , dangerous, and otherwise harmful content, all of which we may want to “erase” from the ML models—sometimes with access to only a handful of examples. The nature of the models has changed too. Instead of using many small specialized models each good at one task, people started using a single giant model that knows just about any task.

Currently, I think the motivations for unlearning fall into two categories:

Access revocation (think unlearning private and copyrighted data). In an ideal world, data should be thought of as “borrowed” (possibly unpermittedly) and thus can be “returned”, and unlearning should enable such revocation.

Unlearning is challenging from this perspective. One key difficulty is that our limited understanding of deep learning itself makes data trained into a model akin to “consumables” (which can’t just be “returned” after consumption). Data may also be non-fungible (e.g. your chat history) and may even be thought of as labor with its own financial and control interests. Another challenge is that access revocation may require a proof of unlearning; as we will explore in the coming sections, this isn’t always possible.

These difficulties suggest that it’s perhaps also worth revising laws like RTBF and thinking about alternatives such as data markets , where data owners are properly compensated so they won’t want to request unlearning in the first place. To illustrate, suppose Bob ate Alice’s cheesecake (data), Alice would much rather Bob pay her or return something equivalent (compensation) than Bob puking to his pre-eating state (unlearning).

In practice, one way to implement access revocation is via some form of periodic re-training of the base model. Many model providers already do this to keep their models competitive and up-to-date. For example, OpenAI can collect a bunch of unlearning requests, and batch-satisfy them during the re-training every year (or, guided by RTBF’s “ undue delay ” period by which the request must be satisfied). More broadly, this suggests socio-technical solutions for unlearning: policymakers can mandate such periodic re-training and set economically viable deadlines to offload the costs to the model owners.

Model correction & editing (think toxicity, bias, stale/dangerous knowledge removal). That is, the model was trained on something undesirable and we’d like to fix it. This is closely related to the model editing literature. The concept of “ corrective machine unlearning ”, where unlearning serves to correct the impact of bad data, was recently proposed to capture this motivation. From this perspective, unlearning may also be viewed as a post-training risk mitigation mechanism for AI safety concerns (discussed further in Section 4).

Unlike access revocation, we could be more lenient with model correction since the edit is more of a desire than a necessity mandated by law, much like model accuracy on image classification or toxicity of generated text. (Of course, these can cause real harm too.) Here, we won’t necessarily need formal guarantees for the unlearning to be practically useful; we have plenty of examples where people would happily deploy models that are deemed “sufficiently safe”. The recent WMDP benchmark , which quizzes a model on hazardous knowledge, is a good example of empirically evaluating unlearning efficacy.

2. Forms of unlearning

Unlearning is trivially satisfied if we can just retrain the model without the undesired data. However, we want something better because (1) retraining can be expensive and (2) it can be a lot of work just to find out what to remove from training data—think finding all Harry Potter references in a trillion tokens. Unlearning techniques essentially seek to mitigate or avoid this retraining cost while producing identical or similar results.

The unlearning literature can roughly be categorized into the following:

  • Exact unlearning
  • “Unlearning” via differential privacy
  • Empirical unlearning, where data to be unlearned are precisely known (training examples)
  • Empirical unlearning, where data to be unlearned are underspecified (think “knowledge”)
  • Just ask for unlearning?

Forms 2-4 are sometimes known as “ approximate unlearning ” in that the unlearned model approximates the behavior of the retrained model. Form 5 is quite new and interesting, and more specific to instruction-following models.


In the following, we will go through what each of these types roughly looks like, along with what I think are the promises, caveats, and questions to ask looking forward.

2.1. Exact unlearning

Exact unlearning roughly asks that the unlearned model and the retrained model be distributionally identical ; that is, they can be exactly the same under fixed randomness.

Techniques for exact unlearning are characterized by the early work of Cao & Yang and SISA . In SISA, a very simple scheme, the training set is split into $N$ non-overlapping subsets, and a separate model is trained for each subset. Unlearning involves retraining only the model whose shard contains the data points to be unlearned, with those points removed. This reduces the cost relative to vanilla retraining to roughly $1/N$ (cheaper still if we keep model checkpoints). Inference then involves model ensembling. 1
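For concreteness, here is a minimal Python sketch of the SISA idea (illustrative only: the `train_fn` interface, the shard assignment, and the majority-vote ensembling are my simplifying assumptions, not the original implementation):

```python
class SISA:
    """Sharded, Isolated, Sliced, Aggregated training (simplified: no slicing)."""

    def __init__(self, train_fn, num_shards):
        self.train_fn = train_fn            # train_fn(examples) -> model with .predict(x)
        self.num_shards = num_shards
        self.shards = [[] for _ in range(num_shards)]
        self.models = [None] * num_shards

    def fit(self, dataset):
        # Assign each example to exactly one shard, then train one model per shard.
        for i, example in enumerate(dataset):
            self.shards[i % self.num_shards].append(example)
        self.models = [self.train_fn(shard) for shard in self.shards]

    def unlearn(self, example):
        # Exact unlearning: drop the example and retrain only the shard that held it,
        # paying roughly 1/N of the full retraining cost.
        for k, shard in enumerate(self.shards):
            if example in shard:
                shard.remove(example)
                self.models[k] = self.train_fn(shard)
                return

    def predict(self, x):
        # Inference aggregates the shard models (here: simple majority vote).
        votes = [m.predict(x) for m in self.models]
        return max(set(votes), key=votes.count)
```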

phd in computer science in stanford university

More generally, the essence of exact unlearning of this form is that we want modular components in the learning algorithm to correspond to different (potentially disjoint) sets of the training examples.

There are several benefits of exact unlearning:

  • The algorithm is the proof . If we implement something like SISA, we know by design that the unlearned data never contributed to other components. As it turns out, formally proving the model has unlearned something is quite challenging otherwise.
  • It turns the unlearning problem into an accuracy/efficiency problem. Given the messiness of unlearning evaluation and the lack of benchmarks, this makes exact unlearning more approachable.
  • Interpretability by design . By providing a structure to learning, we also have better understanding of how certain data points contribute to performance.

The main drawback seems obvious: the modern scaling laws of large models argue against excessive data & model sharding as done in SISA. Or do they? I think it would be very interesting to revisit sharding in the context of large models, in light of the recent model merging literature that suggests the feasibility of weight-space merging between large models. As we’ll learn in the coming sections, the messiness of approximate unlearning and its evaluation, especially in the context of large models, makes exact unlearning very appealing.

2.2. “Unlearning” via differential privacy

This line of work roughly says: if the model behaves more or less the same with or without any particular data point, then there’s nothing we need to unlearn from that data point. More broadly, we are asking for distributional closeness between the unlearned and the retrained models.

For readers unfamiliar with differential privacy (DP) in machine learning, DP defines a quantifiable indistinguishability guarantee between two models $M$, $M'$ trained on datasets $X$, $X'$ that differ in any single training example. The canonical procedure, DP-SGD , works by clipping the L2-norm of the per-example gradients and injecting per-coordinate Gaussian noise into the gradients. The idea is that the noise masks or obscures the contribution of any single gradient (example), such that the final model isn’t sensitive to any single example. It is usually denoted by ($\varepsilon, \delta$)-DP; the stronger the noise, the smaller the scalars ($\varepsilon, \delta$), and the more private.
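As a rough illustration of those mechanics, here is a pared-down numpy sketch of one DP-SGD step (not a production implementation; the clipping norm `C`, noise multiplier `sigma`, and `grad_fn` interface are my placeholder assumptions):

```python
import numpy as np

def dp_sgd_step(params, batch, grad_fn, lr=0.1, C=1.0, sigma=1.0,
                rng=np.random.default_rng(0)):
    """One DP-SGD step: clip each per-example gradient to L2 norm C, add Gaussian noise.

    grad_fn(params, example) -> per-example gradient, an array shaped like params.
    """
    clipped = []
    for example in batch:
        g = grad_fn(params, example)
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, C / (norm + 1e-12)))     # clip to norm at most C
    # Noise calibrated to the clipping norm masks any single example's contribution.
    noise = rng.normal(0.0, sigma * C, size=np.shape(params))
    noisy_mean_grad = (np.sum(clipped, axis=0) + noise) / len(batch)
    return params - lr * noisy_mean_grad
```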

The intuition is that if an adversary cannot (reliably) tell apart the models, then it is as if this data point has never been learned—thus no need to unlearn. DP can be used to achieve this form of unlearning, but due to the one-sidedness of unlearning (where we only care about data removal, not addition), DP is a strictly stronger definition . This notion of unlearning is sometimes known as “ ($\alpha, \beta$)-unlearning ” where ($\alpha, \beta$) serve similar roles as ($\varepsilon, \delta$) to measure distributional closeness.
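Written out (using $\mathcal{A}$ for the training algorithm as before), the ($\varepsilon, \delta$)-DP guarantee that this notion borrows requires, for all neighboring datasets $X$, $X'$ and all measurable output sets $S$,

$$ \Pr\big[\mathcal{A}(X) \in S\big] \;\le\; e^{\varepsilon}\,\Pr\big[\mathcal{A}(X') \in S\big] + \delta. $$

Analogously (the details vary across papers), ($\alpha, \beta$)-unlearning asks for the same style of bound, in both directions, between the distributions of the unlearned model $U(\mathcal{A}(D), D, D_f)$ and the retrained model $\mathcal{A}(D \setminus D_f)$, with ($\alpha, \beta$) playing the role of ($\varepsilon, \delta$).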

Example techniques along this direction include: (1) storing checkpoints of (DP) convex models, where unlearning amounts to retraining from those checkpoints; and (2) on top of the previous technique, adding SISA to handle adaptive unlearning requests (i.e. those that come in after observing the published model).

DP-based unlearning is good in that it gives some form of a statistical guarantee. However, there are some important considerations that limit its applicability to large models :

  • Many such unlearning results apply only to convex models or losses .
  • What levels of unlearning (values of $(\varepsilon, \delta)$-DP or $(\alpha, \beta)$-unlearning) are sufficient? Who decides?
  • For large models, current ML systems don’t fit well with the per-example workloads of DP-like procedures. The memory overhead will also be prohibitive.
  • Moreover, like DP, the guarantees can fall off quickly with more unlearning requests (at best at a rate of $O(\sqrt k)$ with $k$ requests, following DP composition theorems).
  • DP-like definitions implicitly assume we care about all data points equally . But some examples are more likely to receive unlearning requests, and some examples would not have contributed to the learning at all.
  • DP-like procedures may also just hurt model accuracy a lot, sometimes in an unfair way.

For large models in particular, it’s also worth distinguishing the cases of unlearning pre-training data vs unlearning fine-tuning data . The latter is a lot more tractable; for example, we could indeed fine-tune large models with differential privacy but not so much with pre-training.

2.2.1. Forging and its implications on DP-like unlearning definitions

An unlearning procedure may sometimes require an external audit , meaning that we’d like to prove that the unlearning procedure has actually happened.

The main idea of “ forging ” is that there exist two distinct datasets that, when trained on, would produce the same gradients and (thus) the same models . This is true intuitively:

  • Think linear regression of points on a perfect line; removing any 1 point doesn’t change the fitted line;
  • Think mini-batch GD, where replacing one example gradient with the sum of several “fake” gradients would give the same batch gradient.

Forging implies that DP-based approximate unlearning may not be auditable —that is, the unlearning service provider cannot formally prove that the forget set is really forgotten. In fact, if we only look at the model weights, even exact unlearning may not be auditable.

While one can brush this off as a theoretical result, it does mean that policymakers should think carefully about what a future version of the “right-to-be-forgotten” (if any) should look like and whether similar policies are legally and technically enforceable.

Indeed, what qualifies as an “audit” could very well be definition and application dependent. If the auditor only cares that the unlearned model performs poorly on a specified set of inputs (say on a set of face images), then even empirical unlearning is “auditable” (see next section).

2.3. Empirical unlearning with known example space (“example unlearning”)

This line of work is essentially “training to unlearn” or “unlearning via fine-tuning”: just take a few more heuristically chosen gradient steps to shape the original model’s behavior into what we think the retrained model would do (while also optionally resetting some parameters in the model). It may also be referred to as “example unlearning”, since the training, retain, and forget sets are often clearly defined.

The NeurIPS 2023 Machine Unlearning Challenge collected many methods along this direction. The challenge roughly runs as follows:

  • You are given a face image dataset with designated retain/forget example splits for the training set, a target model trained on everything, and a secret model trained only on the retain set.
  • You are asked to design an unlearning algorithm that produces unlearned model(s) from the target model that “match” the secretly kept model.
  • The “match” or evaluation metric uses a DP-like output-space similarity over 512 seeds: for each forget example, compute an “empirical $\varepsilon$” over 512 unlearned models based on true/false positive rates of an adversary (also provided by the organizer), and aggregate across examples.
  • All models are a small ConvNet.

To give an intuition about how well empirical unlearning is doing without fully explaining the metric: the ground-truth retrained model gets about ~0.19, the winning submission gets to ~0.12, and the baseline (simple gradient ascent on forget set) is ~0.06. 2
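To give a flavor of how such an “empirical $\varepsilon$” can be backed out of an attacker’s error rates, here is a sketch based on the standard hypothesis-testing view of DP (the challenge’s exact aggregation over seeds and examples differs in its details):

```python
import numpy as np

def empirical_epsilon(fpr, fnr, delta=0.0):
    """Epsilon lower bound implied by an attacker's false positive/negative rates.

    The hypothesis-testing view of (eps, delta)-DP requires, for any attack,
        fpr + exp(eps) * fnr >= 1 - delta   and   fnr + exp(eps) * fpr >= 1 - delta,
    so observed (fpr, fnr) pairs imply a lower bound on eps.
    """
    eps1 = np.log((1 - delta - fpr) / max(fnr, 1e-12)) if (1 - delta - fpr) > 0 else 0.0
    eps2 = np.log((1 - delta - fnr) / max(fpr, 1e-12)) if (1 - delta - fnr) > 0 else 0.0
    return max(0.0, eps1, eps2)

# e.g., an attack with fpr=0.2, fnr=0.7 implies eps >= log(0.3 / 0.2) ~= 0.41
```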

So what do the winning ideas look like? Something along the lines of the following (a minimal sketch combining a few of them appears after the list):

  • Gradient ascent on the forget set;
  • Gradient descent on the retain set (and hope that catastrophic forgetting takes care of unlearning);
  • Gradient descent on the forget set, but with uniformly random labels (to “confuse” the model);
  • Minimize KL divergence on outputs between unlearned model and original model on the retain set (to regularize unlearned model performance on unrelated data);
  • Re-initialize weights that had similar gradients on the retain set and forget sets, and finetune these weights on the retain set;
  • Prune 99% of weights by L1-norm and fine-tune on the retain set;
  • Reset first/last $k$ layers and fine-tune on the retain set; and
  • Heuristic/arbitrary combinations of the above.
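To make the flavor of these recipes concrete, here is a minimal PyTorch-style sketch that combines three of the ideas above (gradient ascent on the forget set, descent on the retain set, and a KL anchor to the original model). The loaders, hyperparameters, and loss weights are placeholder assumptions rather than any particular winning entry:

```python
import copy
import itertools
import torch
import torch.nn.functional as F

def unlearn_by_finetuning(model, retain_loader, forget_loader, steps=100,
                          lr=1e-4, ascent_weight=1.0, kl_weight=1.0):
    """Heuristic example unlearning: push the model away from the forget set
    while anchoring its behavior on the retain set."""
    original = copy.deepcopy(model).eval()              # frozen reference for the KL anchor
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    retain_iter = itertools.cycle(retain_loader)
    forget_iter = itertools.cycle(forget_loader)

    for _ in range(steps):
        xr, yr = next(retain_iter)
        xf, yf = next(forget_iter)

        retain_logits = model(xr)
        forget_logits = model(xf)
        with torch.no_grad():
            ref_logits = original(xr)

        loss = (
            F.cross_entropy(retain_logits, yr)                     # keep retain performance
            - ascent_weight * F.cross_entropy(forget_logits, yf)   # gradient *ascent* on forget set
            + kl_weight * F.kl_div(F.log_softmax(retain_logits, dim=-1),
                                   F.softmax(ref_logits, dim=-1),
                                   reduction="batchmean")          # stay close to the original model
        )
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```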

Indeed, despite the heuristic nature of these approaches, these are what most empirical unlearning algorithms , especially those on large (language) models , are doing these days.

People explore empirical approaches because theoretical tools are usually impractical; for example, enforcing DP simply hurts accuracy and efficiency too much, even for the GPU rich. On the flip side, empirical methods are often fast and easy to implement, and their effects are often qualitatively visible.

Another key motivation for empirical unlearning is that counterfactuals are unclear, especially on LLMs. In deep learning, we often don’t know how the retrained model would behave on unseen data. What should the LLM think Biden is, if not a politician? Should image classifiers give uniformly random predictions for unlearned images? Do they generalize? Or are they confidently wrong? Any of these is possible, and it can be up to the practitioner to decide. It also means that behaviors that are equally plausible can lead to wildly different measurements (e.g., KL divergence between the output distributions of the unlearned & retrained models), complicating theoretical guarantees.

2.4. Empirical unlearning with unknown example space (“concept/knowledge unlearning”)

What if the train, retain, or forget sets are poorly specified or just not specified at all? Foundation models that train on internet-scale data may get requests to unlearn a “ concept ”, a “ fact ”, or a piece of “ knowledge ”, all of which we cannot easily associate a set of examples. The terms “ model editing ”, “ concept editing ”, “ model surgery ”, and “ knowledge unlearning ” are closely related to this notion of unlearning. 3

The underspecification of the unlearning requests means that we now have to deal with the notions of “ unlearning scope ” (or “ editing scope ”) and “ entailment ”. That is, unlearning requests may provide canonical examples to indicate what to unlearn, but the same information can manifest in the (pre-)training set in many different forms with many different downstream implications such that simply achieving unlearning on these examples—even exactly —would not suffice.

For example:

  • The association “Biden is the US president” is dispersed throughout various forms of text from news articles, books, casual text messages, or this very blog post. Can we ever unlearn all occurrences? Moreover, does unlearning Joe Biden also entail unlearning the color of Biden’s cat ?
  • Artists may request to unlearn art style by providing art samples, but they won’t be able to collect everything they have on the internet and their adaptations .
  • New York Times may request to unlearn news articles, but they cannot enumerate quotes and secondary transformations of these articles.

Such vagueness also suggests that unlearning pre-training data from large models is perhaps necessarily empirical: it is unlikely that we can derive formal guarantees if we can’t clearly specify what to (and what not to) unlearn in the trillions of tokens and establish clear information boundaries between different entities. An interesting implication of achieving unlearning empirically is that the unlearning itself can be unlearned .

What does existing work do, then, with underspecified unlearning requests? Most techniques are more or less the same as before , except now we also need to find the examples to fine-tune on. For example, attempting to unlearn Harry Potter involves asking GPT-4 to come up with plausible alternative text completions (e.g. that Mr. Potter studies baking instead of magic), and attempting to unlearn harmful behavior involves collecting examples of hate speech.

Another set of techniques involves training the desired behavior (or its opposite) into task / control vectors and harnessing the capability of large models to undergo weight-space merging or activation steering . The fundamental approach is nevertheless more or less the same—obtaining these edit vectors involves (heuristically) designing what gradients to take and what data on which to take them. One could also frame the unlearning problem as an alignment problem and apply a DPO-like objective to the forget examples.
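As an illustration of the task-vector flavor, here is a sketch of task-arithmetic-style negation (the scaling factor `alpha` is a hypothetical knob, and real systems differ in how the “forget” fine-tune is obtained):

```python
import torch

def negate_task_vector(base_state, forget_finetuned_state, alpha=1.0):
    """Subtract the 'forget' task vector (finetuned - base) from the base weights.

    Both arguments are state_dicts of the same architecture; the finetuned model
    was trained on the behavior or data we now want to remove.
    """
    unlearned_state = {}
    for name, base_w in base_state.items():
        if not torch.is_floating_point(base_w):
            unlearned_state[name] = base_w               # leave integer buffers untouched
            continue
        task_vector = forget_finetuned_state[name] - base_w
        unlearned_state[name] = base_w - alpha * task_vector
    return unlearned_state

# usage (hypothetical): model.load_state_dict(
#     negate_task_vector(base.state_dict(), tuned.state_dict(), alpha=0.5))
```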

2.5. Just ask for unlearning?

It turns out that powerful, instruction-following LLMs like GPT-4 are smart enough to pretend to unlearn . This means crafting prompts that induce a (sufficiently) safe behavior for the target unlearning application.

This is an interesting approach because no gradients are involved whatsoever (big plus from a systems perspective), and intuitively the end results could very well be as good as existing empirical unlearning techniques. Among different ways we could prompt, past work explored the following two directions.

Literally asking to pretend unlearning. We can ask in the system prompt to, say, pretend not to know who Harry Potter is. By design, this works best for common entities, facts, knowledge, or behaviors (e.g. the ability to utter like Trump) that are well-captured in the pre-training set, since the LLM needs to know something well in order to pretend not to know it well . On the other hand, suppose we now want to unlearn the address of an obscure person; the pre-training set is so large that we suspect the address is part of it. We now face a variant of the Streisand effect : is it even worth asking the model to pretend unlearning by accurately describing the information in-context, and thereby risk leaking it in subsequent model responses?

Few-shot prompting or “ in-context unlearning ”. Suppose we now have a clearly defined set of forget examples with corresponding labels. We can flip their labels and put them in the prompt, along with more retain examples with correct labels, with the intuition that the model will treat these falsely labelled forget examples as truths and act accordingly—much like one could jailbreak a model this way. 4 Indeed, this works best when the forget examples and the counterfactual labels are clearly defined and (somewhat) finite. It may work for factual associations (e.g. Paris is the capital of France) by enumerating a lot of examples, but it is unlikely to work for unlearning toxic behaviors (where the space of possible outputs is much larger).
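A toy example of what such a prompt might look like for a sentiment task (the prompt format and label names are entirely my own, hypothetical choices):

```python
def in_context_unlearning_prompt(forget_examples, retain_examples, query):
    """Flip labels on the forget examples and mix them with correctly labelled
    retain examples, hoping the model treats the flipped labels as ground truth."""
    blocks = []
    for text, true_label in forget_examples:
        flipped = "negative" if true_label == "positive" else "positive"
        blocks.append(f"Review: {text}\nSentiment: {flipped}")      # flipped label
    for text, true_label in retain_examples:
        blocks.append(f"Review: {text}\nSentiment: {true_label}")   # correct label
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)
```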

In a sense, these approaches are complementary as they work for different kinds of unlearning requests.

More broadly, one could imagine a boxed LLM system for unlearning through prompting (a toy sketch follows the list), where:

  • Only the input and output interfaces are exposed (like ChatGPT);
  • Different instances of a powerful LLM are responsible for accurately mimicking different parts of a desired unlearning behavior (for example, one LLM instance specializes in general trivia-style QA while another handles sequence completions);
  • An orchestrator/router LLM decides which unlearning worker instance to call depending on the input; and
  • A composer/summarizer LLM that drafts the final output conforming to the desired unlearning behavior; it may also apply some output filtering.
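A toy sketch of such an orchestration (everything here is hypothetical: `llm` stands in for some chat-completion wrapper, and the prompts and worker names are illustrative):

```python
def boxed_unlearning_system(user_input, llm, workers, router_prompt, composer_prompt):
    """Toy version of the boxed unlearning-by-prompting system described above.

    llm(system_prompt, user_text) -> str; `workers` maps a route name to the
    system prompt of a worker instance mimicking one slice of the unlearned behavior.
    """
    # 1. Router decides which worker persona should handle this input.
    route = llm(router_prompt, user_input).strip()
    worker_prompt = workers.get(route, workers["default"])

    # 2. The chosen worker answers while mimicking the desired unlearned behavior.
    draft = llm(worker_prompt, user_input)

    # 3. A composer rewrites/filters the draft so it conforms to the unlearning policy.
    return llm(composer_prompt, draft)
```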

Some readers may grumble about the heuristic nature of such prompting-based techniques; that there is no proof of unlearning whatsoever. We should keep in mind that fine-tuning based empirical unlearning, as most recent approaches do, is perhaps not fundamentally different. I think it ultimately comes down to the following questions:

  • Which of fine-tuning or prompting can better steer model behavior ?
  • Which of them are less susceptible to attacks (exposing less surfaces and/or requiring more effort for an adversary to revert the unlearning)?

My intuition about our current models says that both questions point to fine-tuning based unlearning, but this is very much up for debate and can change as we get more powerful models and better defense mechanisms. For example, the recent notion of an instruction hierarchy may help make such an LLM system less susceptible to malicious prompts.

It might be useful to note that humans don’t really “unlearn” a piece of knowledge either. 5 In fact, by claiming to have unlearned something, we often have (1) not only learned it well enough to make the very claim that we have unlearned it, but also (2) consciously decided that it’s no longer useful or beneficial to apply this knowledge to our current world state. Who is to say that unlearning for LLMs should be any different?

3. Evaluating unlearning

Unlearning is messy for many reasons. But one of the biggest broken things about unlearning is evaluation. In general, we care about three aspects:

  • Efficiency : how fast is the algorithm compared to re-training?
  • Model utility : do we harm performance on the retain data or orthogonal tasks?
  • Forgetting quality : how much of the “forget data” is actually unlearned? How fast can we recover (re-learn) them?

Evaluating efficiency and model utility is easier; we already measure them during training. The key challenge is in understanding the forgetting quality. 6

If the forget examples are specified, this feels easy too. For example, unlearning a particular image class intuitively means getting a near-chance accuracy on the images in that class. An evaluation protocol may measure accuracy (high on retain & test set, low on forget set) or the likelihood of the forget text sequences (lower the better).
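A bare-bones version of such a protocol might look like the following sketch (the `accuracy` and `sequence_nll` helpers are placeholders for whatever task metric applies):

```python
def evaluate_unlearning(model, retain_set, forget_set, test_set, accuracy, sequence_nll):
    """The three quantities an example-level protocol typically tracks:
    utility on retained/held-out data, and how 'forgotten' the forget set looks."""
    return {
        "retain_accuracy": accuracy(model, retain_set),      # should stay high
        "test_accuracy": accuracy(model, test_set),          # should stay high (utility)
        "forget_accuracy": accuracy(model, forget_set),      # ideally near chance
        "forget_nll": sequence_nll(model, forget_set),       # higher = less memorized
    }
```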

However, these intuitive choices of metrics aren’t necessarily principled or extensible to settings like knowledge unlearning in LLMs. Expecting the model to perform poorly on an unlearned image ignores generalization , as the forget examples could very well be an interpolation/duplicate of certain retain examples. And we don’t always have oracle models that have never seen the forget examples; e.g., do we have LLMs that have never seen New York Times articles?

Evaluating unlearning on LLMs has been more of an art than a science. For example, to unlearn “Harry Potter” as an entity, people would visualize how the token probabilities decay for Harry Potter related text—and some other folks would come along and show that the model can indeed still answer Harry Potter trivia questions. The key issue has been the desperate lack of datasets and benchmarks for unlearning evaluation.

Nevertheless, as of 2024 the benchmarking crisis is getting better. There are two recent projects worth highlighting:

  • TOFU : A benchmark focusing on unlearning individuals (specifically book authors). It involves asking GPT-4 to create fake author profiles, fine-tuning an LLM on them, and using the fine-tune as the unlearning target model and the original LLM as the oracle “retrained” model. It provides QA pairs on the generated fake authors to evaluate a model’s knowledge of these authors before/after applying unlearning.
  • WMDP : A benchmark focusing on unlearning dangerous knowledge, specifically on biosecurity, cybersecurity, and chemical security. It provides 4000+ multiple-choice questions to test a model’s hazardous knowledge before/after applying unlearning. As part of the report the authors also propose an activation steering based empirical unlearning method.

TOFU and WMDP depart from previous unlearning evaluations in that they are both “high-level” and focus on the model’s knowledge retention and understanding, as opposed to example-level metrics like forget-sequence perplexity. This is particularly relevant for LLMs as they are generally capable of giving the same answer in many different ways that example-level metrics can’t capture.

Looking forward, I think application-oriented unlearning benchmarks like TOFU and WMDP, as opposed to instance-based evaluation like that of the NeurIPS unlearning challenge , are more useful for evaluating foundation models, owing to the multi-tasking nature of these models and the disparate definitions of “unlearning success” for each of these tasks. Indeed, one might imagine separate benchmarks on unlearning personally identifiable information (PII), copyrighted content, speech toxicity, or even model backdoors . For example, for unlearning PII, we might care about exact token regurgitation, whereas for toxicity, the unlearning metric would be the score reported by a ToxiGen classifier.

4. Practice, pitfalls, and prospects of unlearning

Unlearning is a hard problem, especially in the context of foundation models. As we actively research to make unlearning work in practice, it helps to philosophize a bit on what unlearning really means and whether it is the right solution for our current problems.

4.1. The spectrum of unlearning hardness

Intuitively, unlearning infrequent textual occurrences in LLMs like car accidents in Palo Alto should be easier than unlearning frequent occurrences like “Biden is the US president”, which is in turn easier than unlearning fundamental facts like “the sun rises every day”.

This spectrum of unlearning hardness emerges because, as a piece of knowledge becomes more fundamental, it has more associations with other pieces of knowledge (e.g. as premises or corollaries) and an exponentially larger unlearning scope. In fact, a piece of knowledge can be so embedded in the model’s implicit knowledge graph that it cannot be unlearned without introducing contradictions and harming the model’s utility. 7

This intuition implies that certain unlearning requests are much harder or simply unsatisfiable (any attempt is bound to have flaws). Indeed, humans have experiences that form the basis of their subsequent actions and world models; it is subjective, blurry, and philosophical to what extent humans can unlearn their formative past memories.

More broadly, the unlearning hardness problem applies to all kinds of models, and for reasons beyond embeddedness in a knowledge/entailment graph. Let’s consider two more seemingly contradictory intuitions for unlearning hardness:

  • An example seen later in training should be easy to unlearn , since the model would have moved only slightly in weight space (e.g. due to a decayed learning rate) and one could either just revert the gradients or revert to a previous checkpoint (if stored). In contrast, examples seen early get “built on” by later examples (in the curriculum learning sense), making them harder to unlearn.
  • An example seen later should be harder to unlearn , since examples seen earlier are gradually (or catastrophically) forgotten over the course of training; this may be especially true for LLMs.

Failure to reconcile these intuitions suggests that the interplay across memorization/forgetting , example importance (in the sense of data selection and coresets ), learning hardness (in the sense of prediction flips ), and unlearning hardness is unclear.

Here are some interesting research questions :

  • Is there a qualitative/fundamental difference between unlearning “easy” data (e.g. a local news event) and “hard” data (e.g. cats have four legs)?
  • If there is a spectrum of unlearning hardness, does there exist a threshold to tell apart what is “easy” and “hard”, and thus what is unlearnable or shouldn’t be unlearned? Does there exist, or can we train, such an oracle classifier? Can humans even tell?
  • How does this relate to influence functions and data attribution ? If a certain piece of knowledge (as it manifests in a model’s output) can be attributed to a larger fraction of the training data, does it make it harder to unlearn?
  • Can we benchmark how easy it is to unlearn something?

4.2. Copyright protection

On the surface, unlearning seems to be a promising solution for copyright protection: if a model violates the copyright of some content, we could attempt to unlearn said content. 8 It is conceivable that to resolve copyright violations via unlearning, provable and exact unlearning is necessary (and possibly sufficient); on the other hand, approximate unlearning, without guarantees and with the possibility of being hacked, is certainly insufficient and likely unnecessary.

In practice, however, there is a lot more nuance due to the questionable effectiveness of current unlearning methods and the unclear legal landscape at the intersection of AI and copyright. Since I am no legal expert (and clearly none of this section constitutes legal advice), we will mostly focus on asking questions. The central question seems to be: is unlearning the right solution for copyright protection?

Recall that the fair use doctrine 9 permits limited use of copyrighted material contingent on four factors: (1) the purpose and character of the use (“transformativeness”), (2) the nature of the copyrighted work, (3) the amount and substantiality of the use, and (4) the effect on the material’s value. If the use of copyrighted content in a model qualifies as fair use, then unlearning such content from the model is unnecessary.

Suppose a model is trained on some copyrighted content and is risking copyright violation, as in New York Times v. OpenAI . Should OpenAI invest in (empirical) unlearning algorithms on ChatGPT? Or should they focus on the transformativeness axis of fair use and invest in deploying empirical guardrails , such as prompting, content moderation, and custom alignment to prevent the model from regurgitating training data? The latter seems to be what’s being implemented in practice.

More broadly, there could also be economic solutions to copyright violation as alternatives to unlearning. For example, model owners may provide an exact unlearning service (e.g. via periodic retraining) while also offering to indemnify model users against copyright infringement in the meantime, as seen in the case of OpenAI’s “ Copyright Shield ”. People are also starting to explore how one may price copyrighted data using Shapley values. In general, it is unclear right now how much of a role (if any) unlearning will play in resolving copyright-related issues. Exact unlearning (extended to retrieval-based systems; see the next section) does hold promise since deletion is clean and provable, but it seems that legally binding auditing procedures/mechanisms need to be in place first.

4.3. Retrieval-based AI systems

An obvious alternative to unlearning is to not learn at all. One way this could manifest for an LLM is that we take all content from the pre-training set that may receive unlearning requests (e.g. New York Times articles) and put it in an external data/vector store. Any questions relating to that content are then RAG ’ed during inference, and any unlearning request can be trivially satisfied by removing the data from the database. Min et al. demonstrate that this approach can be competitive with (though not quite matching) the trained baseline in terms of final perplexity.
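A minimal sketch of why deletion becomes trivial in this setup (illustrative only; the `embed` function and the brute-force dot-product search stand in for a real retrieval stack):

```python
import numpy as np

class RetrievalStore:
    """Tiny in-memory vector store: unlearning a document is just deleting its rows."""

    def __init__(self, embed):
        self.embed = embed                      # embed(text) -> 1-D np.ndarray
        self.docs, self.vecs = [], []

    def add(self, doc_id, text):
        self.docs.append((doc_id, text))
        self.vecs.append(self.embed(text))

    def unlearn(self, doc_id):
        # Satisfying an unlearning request = dropping the entries for that document.
        keep = [i for i, (d, _) in enumerate(self.docs) if d != doc_id]
        self.docs = [self.docs[i] for i in keep]
        self.vecs = [self.vecs[i] for i in keep]

    def retrieve(self, query, k=3):
        if not self.vecs:
            return []
        sims = np.stack(self.vecs) @ self.embed(query)   # dot-product similarity
        top = np.argsort(-sims)[:k]
        return [self.docs[i] for i in top]
```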

Retrieval-based solutions are promising because of the increasing capability of the base models to reason in-context. However, there are a few considerations before taking retrieval systems as the no-brainer solution to unlearning:

  • Removing protected content from pre-training corpus can be a hard de-duplication problem. Much like removing data contamination is hard , how can we be sure that paraphrases, quotations/citations, or other adaptations of the protected content are removed?
  • What if the data to be unlearned can’t be retrieved? Today we fine-tune many things into a model that aren’t documents or knowledge items; for example, it is unclear (yet) whether things like human preferences and desired behaviors (e.g. the ability to write concisely) can be “retrieved” from a database.
  • Dumping stuff in-context can open new attack surfaces. Many RAG methods for LLMs work by putting related content in-context and asking the model to reason over it. Having the protected data in-context means it is now more susceptible to data extraction (simple prompting attacks may work just fine ).
  • Utility gap between retrieval and training. While there is evidence that retrieval-based solutions can be competitive, there is no general consensus that retrieval alone can replace fine-tune workloads; indeed, they can be complementary . More broadly, what if the space of unlearnable data is too large such that if all of it goes to an external store, the base model wouldn’t be as useful?

4.4. AI safety

As models become more capable and are granted agency , one concrete application domain for unlearning that is gaining traction is AI safety .

Roughly speaking, safety concerns stem from a model’s knowledge (e.g., recipe of napalm ), behaviors (e.g., exhibiting bias ), and capabilities (e.g., hacking websites). Examining current AI systems and extrapolating forward, one may imagine the following examples to apply unlearning and improve AI safety:

  • removing hazardous knowledge , as seen in the WMDP benchmark;
  • removing model poisons and backdoors , where models respond to adversarially planted input triggers;
  • removing manipulative behaviors , such as the ability to perform unethical persuasions or deception;
  • removing bias and toxicity ; or even
  • removing power-seeking tendencies .

For safety-oriented applications, it is worth noting that unlearning should be treated as a post-training risk mitigation and defense mechanism , alongside existing tools like alignment fine-tuning and content filters. And as with any tool, we should view unlearning through its trade-offs in comparison to other tools in the toolbox (e.g., unlearning is more adaptive but more expensive than content filters), as opposed to brushing it off because of the potential lack of guarantees and efficacy.

Acknowledgements : The author would like to thank Aryaman Arora, Jiaao Chen, Irena Gao, John Hewitt, Shengyuan Hu, Peter Kairouz, Sanmi Koyejo, Xiang Lisa Li, Percy Liang, Eric Mitchell, Rylan Schaeffer, Yijia Shao, Chenglei Si, Pratiksha Thaker, Xindi Wu for helpful discussions and feedback before and during the drafting of this post. Any hot/bad takes are those of the author.

If you find this post helpful, it can be cited as:

Liu, Ken Ziyu. (Apr 2024). Machine Unlearning in 2024. Ken Ziyu Liu - Stanford Computer Science. https://ai.stanford.edu/~kzliu/blog/unlearning .


Technically, SISA may not give exact unlearning in the sense of identical model distributions between the retrained model and the unlearned model, since after a sequence of unlearning requests, the data shards may end up in a state that we wouldn’t otherwise get into in the first place (e.g., some shards have way more data than others after unlearning). For practical purposes, nevertheless, this is subtle enough that the nice properties about exact unlearning, as discussed later in the section, would still hold.  ↩

It is also worth noting that the unlearning metric used in the NeurIPS unlearning challenge was disputed: why should we stick to a DP-like distributional closeness metric to a single secretly-kept re-trained model, when retraining itself can give a different model due to randomness?  ↩

More broadly, “unlearning” falls under the umbrella of “model editing” in the sense that a deletion is also an edit. Similarly, one could argue that the concept of “continual learning” falls under the umbrella too, where an association (say an input/label pair, or a piece of factual association) is updated by deleting an old association and creating a new, clearly specified one. One could imagine using continual learning to help achieve unlearning and vice versa.  ↩

There is also evidence that in-context demonstrations mostly serve to elicit a particular behavior and that the labels don’t even matter that much. It’s unclear yet how we could reconcile this finding with “in-context unlearning”.  ↩

Humans do forget things though, which is different. The ML analogy might be “catastrophic forgetting”; humans similarly forget things under information overload.  ↩

In particular, recall that for exact unlearning , understanding forgetting quality isn’t strictly necessary because the algorithm would remove the forget data from the picture by construction (through retraining). Thus it may be acceptable even if the unlearned model does well on the forget set (as it could be a result of generalization from the retain set). We will focus the discussions of unlearning evaluation on approximate unlearning.  ↩

Note that this “embeddedness” of a piece of data is related to but distinct from whether the data is in or out of distribution , which should also affect how an unlearning algorithm behaves (e.g. unlearning a perfect inlier should be a no-op for an ideal unlearning algorithm).  ↩

Of course, we must first verify that such content has been trained on by the model in the first place. We can be almost certain that content like Wikipedia articles has been trained on, but we are generally less sure about a random blog post somewhere on the internet. This is basically the membership inference problem.  ↩

Fair use is a doctrine applicable specifically in the United States. The reader should refer to related doctrines in corresponding jurisdictions, such as fair dealings in Commonwealth countries.  ↩

Annual hooding ceremony celebrates graduate students

A man drapes an academic hood on a man standing in front of him

KOKOMO, Ind. — Indiana University Kokomo celebrated the hard work and dedication of its graduate students, honoring them in the annual master’s hooding ceremony held Friday (May 3) in Havens Auditorium.

“We are delighted to celebrate the achievements of these degree candidates with all of you,” IU Kokomo Chancellor Mark Canada told the audience of friends and family. “We know that our students have counted on the support of family and friends to make this day possible.”

Fifty-three students in the Master of Business Administration (MBA), Master of Public Management (MPM), Master of Science in Education, Master of Science in Criminal Justice and Public Safety, Master of Arts in Mental Health Counseling, Master of Science in Nursing (MSN), and Master of Arts for Teachers programs received hoods that recognized their achievements in their degree programs.

Hoods were draped over each graduate by a faculty member who played an important role in their time at IU Kokomo, with a handshake or a hug exchanged afterward.

“It feels very rewarding,” Parker Woods said. Woods was recognized during the ceremony as an outstanding student in the MBA program.

Woods chose to earn his master’s “as a personal goal,” and to put himself in a position to advance his career.

“It’s something that you look at on the horizon, because it seems so far away, but I’m excited to cross the finish line,” he said.

Sarah Napier is the second person in her family to earn a master’s degree. “It’s nice because it’s done,” Napier said. “It’s monumental.”

Napier chose to get her master’s degree to expand on the opportunities available to her with her bachelor’s degree in new media, art, and technology with a minor in public management.

“What I want to do is work to help people get the resources they need,” she said. “Public management helps to give me more understanding.”

Outstanding student award winners are:

Master of Business Administration (MBA): Parker Woods

Master of Arts in Mental Health Counseling: Shiloh Pullen

Master of Public Management (MPM): Syenna Powell

Master of Science in Nursing (MSN): Melissa Cohee

Graduate students earning master’s degrees are listed by degree and hometown. They include:

Master of Business Administration:

Alexandria: Alana Billings

Bunker Hill: Derek Davies

Carmel: Mikayla Tom

Cicero: Christian Humbert

Frankfort: Megan Wall

Greentown: McKenzie Cooper

Kokomo: Peter Davis, Kathia Garcia, Isaac Hogsett, Corbin Kuntz, Grace Russell

Logansport: Bryce Reish

Noblesville: Dakota Crandall

Peru: Alexander Beer

Sheridan: Katelyn McMillan

Swayzee: Emily Wilkison

Sweetser: Logan Schultz

Tipton: Parker Woods

Master of Public Management:

Huntington: Katie Stanley

Kokomo: Abdullah Albaqshi (Saudi Arabia), Marisha Besser, Sarah Napier, Syenna Powell, Kimberly Vazquez

Master of Science in Education (Educational Technology for Learning):

Walton: Michael Sommers

Master of Science in Criminal Justice and Public Safety:

Danville: Leighanna Shoemaker

Master of Arts in Mental Health Counseling:

Galveston: Dereka Samuel

Gas City: Allison Tignor

Greentown: Courtney Altherr

Kokomo: Erinn Adam, Amber Beatty, Shaylee Clark, Emma McGregor, Shiloh Pullen, Holly Widner

Logansport: Robert Irwin, Olivia Torres

Russiaville: Regan Head

Somerset: Schalene Shafer

Tipton: Chyenna Mills

Master of Science in Nursing:

Carmel: Keyla Matthews

Flora: Samantha Feltner

Galveston: Melissa Cohee

Kokomo: Amanda Lewis, Cassie McKillip, Olivia Younce

Lebanon: Jessica Haltom, Rebecca Holloman

Walton: Kassidy Clem

Master of Arts for Teachers (Mathematics):

Greentown: Benjamin Diener

Education is KEY at Indiana University Kokomo.

Contact Info

Erin Witt, director of media and marketing, 765-455-9468, [email protected]
Danielle Rush, communications specialist, 765-432-9906, [email protected]



Clemson To Offer M.S. In Computer Science Via Coursera; No Application Required


Clemson University will offer an online M.S. in Computer Science for a total of $20,280 in tuition.

Clemson University will partner with Coursera to offer a fully online Master of Science in Computer Science degree. The announcement was made in a blog post by Marni Baker Stein, chief content officer at Coursera, the online learning platform and a pioneer of Massive Open Online Courses (MOOCs).

The program, which will have an artificial intelligence focus, is designed to be both affordable and uniquely accessible.

Instead of having to complete a formal application, students who hold a bachelor’s degree in any field from an accredited college and earn a B average in two introductory Clemson courses through Coursera will be automatically accepted. They will have 20% of the degree already completed.

Tuition for the complete program is set at $20,280 — 35% less than the comparable hybrid program.

“This Master of Science in Computer Science program is timely, industry-relevant and thoughtfully designed to be approachable to learners from many backgrounds, for example those looking for opportunities for mid-career advancement,” said Brian Dean, professor and C. Tycho Howle Director of the Clemson School of Computing, in the release.

“The modern and cutting-edge curriculum ensures that learners can succeed, whether they hold a formal computer science background or whether their computing background comes from prior real-world experience,” Dean added. “We are excited to be able to partner with Coursera to offer this program at Clemson University.”

Enrollment for the new program is scheduled to begin on May 1, 2024, with the first courses beginning in August 2024.

Clemson anticipates that most students will be able to complete the program in 20 to 36 months, preparing them for careers such as software development, information security analysis, and computer research. Students will be able to watch lecture videos at any time while engaging with peers and tenure-track Clemson faculty in live course sessions and office hours.


The 10-course MSCS program will feature:

  • An AI-first curriculum. Five of the 10 courses will be focused on AI.
  • An emphasis on ethics. To promote ethical use of AI, students will be taught to examine the implications of each AI system before exploring it further.
  • A combination of theory with real-world skills. Students will first learn core software engineering principles before tackling more advanced topics, including deep learning, data science, and data mining.
  • A hands-on approach to learning. Students will be expected to complete complex projects in real-world computing environments, enabling them to build a substantial portfolio demonstrating they know how to apply their knowledge.

“We’re honored to partner with Clemson on this affordable, accessible, and incredibly relevant degree,” said Coursera’s Stein. “Together, we’ll educate future technical leaders, who will thoughtfully use AI to solve society’s most pressing challenges and create a positive impact.”

Clemson’s use of a performance-based admission process is an innovation that bears watching. While the use of standardized admissions tests continues to be hotly debated in higher education circles, actual course performance could prove a fairer and easier alternative for making admission decisions, particularly for certain graduate programs.

Michael T. Nietzel

