Undertale Wiki


Mettaton/In Battle


  • 1.1 Soul modes
  • 1.2.1 Mettaton
  • 1.2.2 Mettaton EX
  • 1.2.3 Mettaton NEO
  • 2.1.1 Mettaton Quiz Show
  • 2.1.2 Mettaton EX
  • 2.2 Genocide Route
  • 3.1 Quiz Show
  • 3.2 Multicolor Tile Puzzle
  • 3.3 Mettaton EX
  • 3.4 Mettaton NEO
  • 4.1 Mettaton
  • 4.2 Bomb Defusal
  • 4.3 Mettaton EX
  • 4.4 Mettaton NEO
  • 5 References

Attacks

Soul modes

Mettaton's battles use the Red Soul during the quiz show and at the beginning of the battle after the Multicolor Tile Puzzle, and the Yellow Soul in all other circumstances.

Patterns

Mettaton

  • During the quiz show, if the protagonist answers incorrectly, Mettaton fires an unavoidable laser that halves the protagonist's HP.
  • After the tile puzzle, if the protagonist does not press ACT → "Yellow" right away, Mettaton drops boxes from above in a set pattern, with gaps in between.

Mettaton EX

Mettaton EX's attacks all consist of an arrangement of several objects:

  • Copies of Mettaton's legs stick out from either side of the box and scroll downwards. They either move inwards and outwards or stay still; shooting them toggles this, stopping moving legs or setting motionless legs in motion again.
  • Copies of Mettaton's legs quickly emerge from the side of the screen. An exclamation mark will appear in the area that they will appear in before this attack is used.
  • This attack is also used by Alphys as a Lost Soul during the Asriel boss fight in the True Pacifist run.
  • Bombs marked with a plus sign travel downwards and explode into plus-shaped lasers when shot with the yellow soul.
  • White squares marked with black circles travel downwards (and sometimes side to side), which are broken by a shot from the yellow soul.
  • Spaced outlines of bombs and white squares with black circles travel downward, which will eventually reverse direction.
  • Long segmented arms resembling Mettaton's travel downwards, with sliding orange-yellow boxes along their length which, when shot, cause the arms to retract.
  • Mettaton's heart emerges from the square on his waist and shoots lightning-bolt shaped projectiles either in a circular blast formation or single linear shots (which only happens when he loses all of his limbs). Firing at his heart will end Mettaton's attack earlier; not shooting his heart will still cause his limbs to fall off due to being a scripted event. Contrary to popular belief, attacking Mettaton's heart this way does not count as FIGHTing him.
  • A disco ball appearing from the top of the square projects laser beams in either blue or white, which can be toggled in color by shooting the disco ball. They rotate at a variable pace.
  • Plain white squares move downwards. These can only be destroyed by a bomb.

Mettaton NEO

Mettaton NEO uses no attacks.

Strategy

Neutral/Pacifist Route

Mettaton Quiz Show

Alphys is present during this battle and uses her hands to shape the correct answer's letter. If Alphys does not provide an answer, any answer will do. Answering a question incorrectly causes Mettaton to halve the protagonist's HP. Attempting to hurt Mettaton in this form will not work; the screen will always display "Miss."

Mettaton EX

To defeat Mettaton EX without killing him, the protagonist must survive until his arms and legs are blown off and achieve a show rating of 10,000 or more; if his limbs are not blown off, a show rating of 12,000 or more will end the battle. [1] While the protagonist waits without acting, the ratings go down, but they stop falling at a minimum of 3,001 until the protagonist does something. Mettaton's limbs fall off regardless of whether his heart is shot, but shooting it ends his turn earlier. Ratings can be boosted in several ways:

  • Getting hit gives a violence boost of 10 to 50 points.
  • Shooting anything during Mettaton's turn.
  • Using the FIGHT option to directly harm Mettaton will result in him either biting his lip or sticking his tongue out while saying "Yeah," giving an Action boost of 300 points. Further attacks will cause him to grin visibly but give fewer Action Points.
  • Eating food sold by Burgerpants will give 300-500 rating points, but the Steak in the Shape of Mettaton's Face will give 700 points instead. Eating Junk Food gives the 'Eating garbage?!' penalty of 50 points. All other consumables have no effect.
  • Equipping a different piece of armor will give 1,500 points as long as that piece has not been worn previously during the fight.
  • Using the Stick will cause it to be thrown at Mettaton. He will catch it in his mouth and ratings will boost by 700 points. Repeating this action will give 1 point. Using the stick after Mettaton has lost his arms and legs will instead give 1,400 points.
  • Using the Boast action will cause ratings to shoot up during Mettaton's turn, but taking damage will cost 100 points and stop the rating spike.
  • Using the Pose action will give 100-1000 points, inversely proportional to the amount of health the protagonist has left.
  • Using the Heel Turn action will increase the aforementioned violence boost to 100 points. This is more of a risk as it requires the protagonist to get hit.
  • During the essay question, the typed answer affects the ratings:
  • Writing "LEGS" earns 350 points, which is the highest amount, being the 'correct answer.'
  • Writing "TOBY" earns 300 points, with Mettaton saying that Toby sounds sexy.
  • Writing "DANCING" earns 250 points, with Mettaton saying that he is self-taught.
  • Writing "ARMS" earns 250 points; Mettaton comments that most people talk about his legs, but accepts the answer anyway.
  • Writing "VOICE" earns 200 points, with Mettaton commenting that he has the voice of a siren.
  • Writing "HAIR" earns 200 points, with Mettaton saying that he uses metal hair gel.
  • Writing certain words like "fabulous," "beautiful," "radiant," and "personality" makes Mettaton comment on them.
  • Writing nothing earns 80 points. Mettaton is not surprised that the protagonist is speechless.
  • Writing any swear word results in the loss of 150 points, which causes Mettaton to exclaim that the show is family-friendly.
  • Writing a word that Mettaton deems insulting will prompt him to tell the protagonist that this is an essay about him, not them.
  • Writing a lot of random gibberish and letters with no sense will result in Mettaton being impressed that the protagonist wrote so much about him, even though he does not understand what they said.
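
The end condition and a couple of the rating rules above can be summarized in a short sketch. This is only an illustration: the numbers come from the text, but the function names, the `amount` parameter, and the way armor history is tracked are assumptions made for the example, not details of the game's actual code.

```python
# Illustrative sketch only: the constants come from the text above,
# but every name here is hypothetical and not taken from the game's code.

SPARE_RATING_LIMBS_OFF = 10_000  # rating needed once the limbs are blown off
SPARE_RATING_LIMBS_ON = 12_000   # rating needed while the limbs are still attached
IDLE_RATING_FLOOR = 3_001        # waiting cannot push the ratings below this
NEW_ARMOR_BONUS = 1_500          # equipping armor not yet worn during this fight


def battle_ends(ratings: int, limbs_blown_off: bool) -> bool:
    """Return True when the ratings are high enough to end the battle."""
    threshold = SPARE_RATING_LIMBS_OFF if limbs_blown_off else SPARE_RATING_LIMBS_ON
    return ratings >= threshold


def idle_decay(ratings: int, amount: int) -> int:
    """Lower the ratings while waiting, stopping at the idle floor.

    The per-step `amount` is an assumed parameter; the text gives no rate.
    """
    if ratings <= IDLE_RATING_FLOOR:
        return ratings
    return max(IDLE_RATING_FLOOR, ratings - amount)


def equip_armor(ratings: int, armor: str, worn: set[str]) -> int:
    """Add the armor bonus only if this piece was not worn earlier in the fight."""
    if armor in worn:
        return ratings
    worn.add(armor)
    return ratings + NEW_ARMOR_BONUS
```

For example, `battle_ends(10_000, limbs_blown_off=False)` is false under this sketch, because the higher 12,000 threshold applies until the limbs are gone.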

If Mettaton EX is spared, he will appear only as his torso and head when the game returns to the overworld view. This will happen even if the protagonist ends the fight before his limbs fall off. However, if he is killed, he will appear in his original box form in the overworld view, but busted up.

For those who plan on killing Mettaton, methods that decrease ratings should be used, as the ratings may hit 10,000 before the protagonist kills Mettaton. Methods that decrease ratings include writing a curse word during the essay and eating Junk Food, so bringing a full inventory of Junk Food is advised. Waiting between turns also will steadily decrease ratings. However, it will stop decreasing after a certain point.

Genocide Route

Despite claiming to be a human eradicator, Mettaton NEO does absolutely nothing. Any non-missing attack will instantly kill him as he is scripted to take between 900,000 and 999,999 damage, no matter the strength of the attack. There is no way to spare Mettaton NEO, meaning that the point at which a Genocide Route cannot be aborted occurs before fighting him when the last random encounter is killed.

Contrary to popular belief, a weak attack does not revert to the Neutral Route. Only failing to depopulate Hotland and the CORE completely will cause the game to revert to the Neutral Route.
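
The scripted damage described above can be pictured with a small sketch. The range is taken from the text; the uniform random choice and the function name `neo_damage` are assumptions for illustration, since the text does not say how the value is picked.

```python
import random

# Hypothetical sketch: the damage range comes from the text above; drawing the
# value uniformly at random is an assumption, not a documented detail.
NEO_DAMAGE_MIN = 900_000
NEO_DAMAGE_MAX = 999_999


def neo_damage(attack_hit: bool, rng: random.Random) -> int:
    """Return the damage dealt to Mettaton NEO, ignoring the attack's strength."""
    if not attack_hit:
        return 0  # a missed attack deals nothing
    # randint includes both endpoints
    return rng.randint(NEO_DAMAGE_MIN, NEO_DAMAGE_MAX)
```

Attack strength is deliberately absent from the signature, which mirrors the point that a weak hit and a strong hit have the same outcome.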

Quiz Show

Multicolor Tile Puzzle

Flavor Text

Mettaton

  • His metal body renders him invulnerable to attack. [Check, Quiz Show]
  • His metal body STILL renders him invulnerable to attack. [Check, Dungeon, 1.001 patch]
  • Seriously, his metal body is invulnerable! [Check, CORE, 1.001 patch]
  • Mettaton attacks! [Encounter]
  • Mettaton. [Neutral]
  • Smells like Mettaton. [Neutral]
  • The quiz show continues. [Neutral during Quiz Show]
  • Screaming is against the rules. [Cry]
  • You yell... But nothing happened. [Yell]
  • This is probably what you'll do if things continue in this manner. [Burn]
  • Your phone's [ [ ACT ] ] menu is glowing. [After Alphys calls, dungeon]
  • You press the yellow button. The phone is resonating with Mettaton's presence! [Yellow]
  • Seems like a good time to turn Mettaton around. [After Alphys calls, CORE]
  • You tell Mettaton that there's a mirror behind him. [Turn]

Bomb Defusal

  • Defuse the dog! [Encounter]
  • It's blissfully unaware of its circumstances. [Check Dog]
  • The dog is still active! [Neutral, after defuse has failed]
  • Dog defused! [After defusal succeeds]
  • Defuse the basketball! [Encounter]
  • Even if you explode, you'll at least look good. [Check Basket Bomb]
  • Defuse the present! [Encounter]
  • Regardless, you'll have to write a thank-you letter. [Check Present Bomb]
  • Defuse the game! [Encounter]
  • You really should have rented it first. [Check Game Bomb]
  • Defuse the script! [Encounter]
  • Like all modern blockbusters, it's full of explosions. [Check Script Bomb]
  • Defuse the extremely agile glass of water! [Encounter]
  • All things considered, it's an extremely agile glass of water. [Check Extremely Agile Glass of Water]
  • Defuse the bomb! [Encounter]
  • Defuse failed! Aim for DEFUSE ZONE! [When defuse has failed]
  • The bomb is still active! [Neutral, after defuse has failed]
  • Bomb defused! [When bomb is defused]

Mettaton EX

  • His weak point is his heart-shaped core. [Check]
  • Mettaton EX makes his premiere! [Encounter]
  • Mettaton is saving your essay for future use. [After essay question]
  • You say you aren't going to get hit at ALL. Ratings gradually increase during Mettaton's turn. [Boast]
  • You posed dramatically. The audience nods. [Pose]
  • Despite being hurt, you posed dramatically. The audience applauds. [Pose at less than half HP]
  • Despite being wounded, you posed dramatically. The audience gasps. [Pose at less than 1/4 HP]
  • With the last of your power, you posed dramatically. The audience screams. [Pose at extremely low HP]
  • Mettaton has low HP. [Low HP]
  • You turned and scoffed at the audience. They're rooting for your destruction this turn! [Heel Turn]
  • You eat the (Item). The audience loves the brand. [Eat Glamburger or Starfait]
  • You ate the Face Steak. The audience goes nuts. [Eat Steak in the Shape of Mettaton's Face]
  • You eat the Junk Food. The audience is disgusted. [Eat Junk Food]
  • You throw the stick. Mettaton catches it in his mouth and winks. [Throw Stick]
  • You used the Mystery Key. Mettaton pretends it isn't there. [Use Mystery Key]

Mettaton NEO

  • Dr. Alphys's greatest invention. [Check]
  • Mettaton NEO blocks the way! [Encounter]
  • Stage lights are blaring. [Neutral]


Starmen.Net


What did you type on Mettaton (ex)'s essay?

And what did he reply?


PsiRockingOMeta

I typed “i’m cool”. Then he said “you deserve a gold star”, lol

Posted over 8 years ago (edited over 8 years ago )


The first time, I typed something like, “I love Mettaton’s hair, it’s so shiny and stylish” and I got a big bonus for that. And Mettaton told me what products he used on his hair.

Then when I died, I said Mettaton had really good legs, and I got an even bigger bonus...! He said something like, "That's the correct answer!"

Posted over 8 years ago

VERY COOL AND POPULAR


I typed in “everything”. He liked it.

Myzma Totes a Spy~


I typed “ur legs are so fine.” He seemed to enjoy it.

SleuthMechanism

i was caught off guard and only managed to type “the” in time the first time he praised the brevity. second time i typed “the legs”

I didn’t type anything due to a mix of being confused and being watched by a friend (who had already beaten it) and was too nervous to produce an answer.

If you type toby she responds with: Who is this guy… If you type a curse word she responds: we are talking about me, not you. Dat comeback doa!

Supah Star Warrior


My space key didn’t work when writing the essay. Was that a glitch, or…?

What is dad?


Coolgamerz: If you type toby she responds with: Who is this guy… If you type a curse word she responds: we are talking about me, not you. Dat comeback doa!

Mettaton EX is a guy, I’m pretty sure.

I think I simply told him he was pretty. Then I told him “everything.”

Anything involving his legs seems to give the best bonus, it looks like.

WastelandCoyote Jocknerd


The first time I typed “nothing, he sucks and is terrible” and I got some points for brevity.

The second time I just mashed the keyboard a bunch and got more points for typing lots, even though it was all gibberish.

The third time I didn’t have to type anything, because I finished the fight in one round… >;D

I wrote “his silky smooth voice”

Can’t remember what bonus I got

The first time I did something along the lines of “asdfadfasdgasdfdfasdffadsf”. He liked it. The second time, I went full Alphys-talking-about-mew-mew-kissy-cutie style on him. He only read the legs part. There was no third time. On my most recent time, I just said “u hav a hot butt”. I don’t actually remember what he thought of that. I love Mettaton.

roytheshort

I typed “gaster” and nothing interesting happened.

I meant to type something but the act of that turn was genuinely surprising to me that it actually left me speechless before I could react.

All I typed, plain and simple was “Agh” only with WAY more A’s, g’s, and h’s… … … …He liked it…

On my third run, I wrote BLOOD BLOOD BLOOD BLOOD BLOOD . He didn’t seem to get the implication. Maybe because monsters aren’t made of that?

I, too, would like a donut.


I typed “Nothing”, “Not a thing”, and “No”, because I kept dying. The fourth time, I didn’t type anything, and he said something like “Speechless? I don’t blame you.” (can’t remember word for word)

I also tried pressing “g” a million times, and found that the text left the text box.

earthbound4ever

I’m pretty sure I threw out every fake curse word I could think of. Apparently you lose points if you swear at him so I’m glad the game didn’t pick up anything as a swear.

AffableGiraffe

i was super flustered so i just typed “i love your legs and your hair and arms and everything sfjldsfjklaf” obv. high points for legs, he loved it.

I never, ever swear, but I did this time, and I got the penalty for it. lol

CoolSkeleton95

essay prompt undertale

There’s really no need to bump your own topic without adding anything of significance when the last post wasn’t even a day ago. Be patient!

I tried typing I love that he is a ghost and then tried don't you remember Blooky but he didn’t react to either of those!

On my Soulless Pacifist run yesterday, first time I typed in “his hair” and he told me the brand he uses, and the second time I typed in “nothing” and he gave me the same response as I got for “everything”.

What an understanding guy.

I simply typed “Mettaton” He rather liked it.

No way! I just tried “his personality” and he said “Yes, my personality is quite charming, isn’t it?” Hah!

So far, it seems like things that give a special reaction are: hair, legs, personality, everything/nothing, swearing, toby…

I wrote a couple of lines including the fact that Mettaton was pretty, and I believe he agreed that he was attractive.

artistorm

I told him he was very pretty and he responded, “Thank you, I do look nice, don’t I?”

I also mashed gibberish and he gave me a high score for it even though he said he couldn’t understand any of it.

TigerStripes

I typed in “i like your legs mettaton” because his legs are so fab.

I didn’t type anything and he seemed happy that I was speechless. I don’t remember if there was any bonus to that but I thought it was funny.

i was in the middle of a skeleton pun and he cut me off.

PSI fangirl Reminder: Don't double post!

First time I said “You’re hot.” and he responded with “You’re right. I look quite nice, don’t I?”

Second time I said “Why do you want me dead?” He gave me a gold star.

I just got to him and I haven’t beat him yet so here are my other ideas:

“Nothing.” “Nice legs” “Stop” “Why are you trying to kill a child” (if I have enough time to type that) (no typing) “Everything.” “die” “I wanna be your friend”

electroheartx

The first time I typed in how frustrated I was with him. He gave me sass. Every subsequent time (there were many deaths) I just mashed the keys angrily until I filled up the text box and got a response about writing a book, haha

Operatic Sheep

essay prompt undertale

Bad phone pics because I forgot the print screen button existed…

That time limit, though. I honestly don’t know what I was going for after s! I think I was going to write stupid, but I guess I didn’t think that far…

DragonRaiderX9

I typed in “Destruction Glaive”. He seemed to like it.

SCP173official

HMMM i wander if he recognizes racial slurs…….

I mentioned homosexuality and he seemed to be a big fan

Oh cool. im gonna try calling him the N word next time i go against him.

Flip a Dip Dip


I said “He’s so handsome!” and he replied “I do look quite nice!”

essay prompt undertale

Gemini Duality I'm just a bottle of sunshine and rainbows with a metric ton of snark

I typed spammed x and z because I thought it said to press x and z. I was speechless.

Permanent Reggie


When I did this fight, I was playing with a controller. The keyboard was across the room. I had only one letter I could type without ending it.

CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC

clear off the side of the screen for who knows how many characters

I really wonder what “future use” Mettaton might have for that.

Usually just “mettaton is cool”, though I’ve heard he especially likes it when you say he’s a good dancer.


This is a bit of a confession. I typed out “I hate you I hate you I hate you” etc until the time ran out. During my first game I honestly thought he was worse than Flowey. At least Flowey backed off after some intimidation from Toriel, but Mettaton was maliciously persistent in his attempts to kill me. And not for the sake of others like Undyne. An evil killing machine.

He was impressed with the length of my essay.

I was kind of indifferent with mettaton until the end. He was just a malfunctioning robot. If I was going to be upset with anyone, it would be Alphys, but she was helping me and I let it slide. Until I found out the truth, and was just directly angry at Alphys. Using both of us as tools for her own goals. She’s a rather messed up person, though I did soften up to her when I spent more time with her.

I was never really mad at him. In fact I feel bad that he’s missing from some group shots. He’s like the Zoidberg of the group. You only see a leg of him when all your friends get together before the boss fight in the pacifist ending, and he’s not in the group photograph you see if you don’t stay with Toriel.

The Dracula Spectacula

Floating Heddy

HALP MEH ;-;

I tried to type but it didnt let me

Posted almost 8 years ago

I was trying to beat him before round three, so i looked up what i could type for the best bonuses. I misunderstood. I thought each trigger word got the points, so i spammed all of the words i knew.

“Toby’s beautiful leg hair”

Unfortunately, i didn’t get any bonus points for that.

Zoogrrl: “Toby’s beautiful leg hair”

THE BEST PART OF UNDERTALE LADIES AND GENTLEMEN

I'M NOT GOING TO DESTROY YOU WITHOUT A LIVE TELEVISION AUDIENCE!! Mettaton

Mettaton is a major character and the fourth boss in Undertale. Mettaton is a robot with a SOUL, whose body was built by Alphys. He is the sole television star of the Underground. Mettaton poses as an entertainment robot turned human-killing robot in Hotland, though he later reveals the truth to the protagonist at the end of the CORE.

  • 1.1.1 Initial Form
  • 1.1.2 Mettaton EX
  • 1.1.3 Mettaton NEO
  • 1.2 Personality
  • 2.1.1 Quiz Show
  • 2.1.2.1 Pre-Cooking Show
  • 2.1.2.2 Cooking Show
  • 2.1.2.3 Post-Cooking Show
  • 2.1.3.1 Pre-Bomb Defusal
  • 2.1.3.2 Bomb Defusal
  • 2.1.3.3 Post-Bomb Defusal
  • 2.1.4.1 Pre-Tile Puzzle
  • 2.1.4.2 Tile Puzzle
  • 2.1.5 End of CORE Encounter
  • 2.1.6 Mettaton EX
  • 2.1.7 Mystery Key
  • 2.2.1 Epilogue
  • 2.3 Genocide Route
  • 3 In Battle
  • 4.1 The Protagonist
  • 4.3 Napstablook
  • 7 References

Initial Form

Mettaton making a "number 1" symbol

Mettaton originally appears as a gray, largely rectangular box with a 4x5 grid of rectangular lights at the top, similar to the mechanism in Snowdin Forest used for Papyrus's Tile Puzzle. The grid of lights can change colors between red, yellow, green, and blue, and Mettaton uses these color changes in place of facial expressions. He has four dials along the bottom of his body, and at the bottom, he has a single leg which ends in a wheel. He has two segmented robotic arms which end in white gloves.

Mettaton EX

After having the switch on his back flipped in the Neutral or True Pacifist Route, Mettaton transforms into Mettaton EX, a new body he specially requested Alphys make for him. In this humanoid form, he has black hair with a long fringe that seems to cover his right eye, pale "skin" and visible metal segments below and above his left eye.

He has a pink chest piece, a narrow metallic waist with a box contraption, and black shoulder pads above his segmented arms, which end in gloves. The chest piece has what appear to be a speaker and some knob or gauge, while the waist has two parts that seem to act as a locking mechanism that holds his "heart-shaped core," as they both lose white pixels during his "heart-to-heart" attack. His long black-clad legs end in pink high-heeled boots.

Mettaton NEO

After being confronted by the protagonist on the Genocide Route, Mettaton transforms into Mettaton NEO, which resembles Mettaton EX but has a more combat-oriented design. However, Mettaton NEO can be killed by a single attack and has no attacks of his own.

His right forearm is replaced by what appears to be a cannon, his shoulders are clad in pauldrons that are shaped like legs and are longer than his arms, and he has wings on his back. The soul on his waist points upwards in a more monster-like manner, and he has a heart shape engraved on his chest plate like Undyne the Undying, also mimicking the Delta Rune in the same way. His hair on the right side is spiked out, revealing an entirely black segment of his face, with a sparkle or crosshair in place of his right eye.

Personality

Mettaton is a confident, charismatic, and charming TV host who loves drama, action, and violence. He lives for his ratings and adores performing. He supposedly strikes a pose when he does something wrong and makes time on his various shows to beat up "heel-turning villains." He shows a rather shallow appreciation for existence at times. He has a strong craving for attention and seems very egotistical, as shown when he boasts about the beauty of his true form. However, despite his seemingly self-centered personality, he deeply cares about the seemingly positive impact his show has had on the inhabitants of the Underground. He has also shown soft spots for several characters, such as Alphys and Napstablook.

He also appears to be among the few characters in the Underground who have no hatred of or prejudice against humans despite knowing about the monsters' history with them: his initial attempts to kill the protagonist were just him playing along with Alphys's plan, and he outright tells the protagonist that he has no desire to hurt humans and is far more interested in simply entertaining. While he did attempt to kill the protagonist, it was to prevent Asgore from taking their SOUL and to be seen as a hero/savior to humankind, further emphasizing his lack of prejudice. On the flip side, he has also been noted to be a very demanding boss who is very unpleasant to work for, especially according to Burgerpants, who initially considered working for him a dream but has since grown to strongly despise him.

Neutral Route

When meeting Alphys in the Lab, the protagonist is warned about an old machine that she had created, Mettaton. Alphys describes it as a robot that was made to be a TV star but recently had anti-human combat features added as a way of making him more useful.

Mettaton with his arm up and displaying the letter M

Immediately after this warning, Mettaton bursts through the wall (noted to be only a few feet wide, indicating that he was lying in wait for the protagonist for some time) and forces the protagonist into a deadly quiz show.

Mettaton "laughing." Note that his microphone is missing

Mettaton asks a series of multiple-choice questions that must be answered correctly within several seconds (the number says 30, but goes down approximately two numbers per second, giving only 15 seconds to answer). If incorrectly answered or not answered within the time limit, Mettaton fires an unavoidable electric shock that halves the protagonist's health. In spite of this, it is impossible to die during this quiz show. Alphys on the top right gives answers via hand motion; realizing this, Mettaton chooses to humiliate her by quizzing the protagonist on the identity of her unrequited love interest. Regardless of the protagonist's choice, Mettaton ultimately departs, concluding that the quiz show has lost all dramatic tension.
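
The timer arithmetic and the halving penalty above can be worked through in a short sketch. The displayed count and the tick rate come from the text; the function names and the rounding rule (rounding up so HP never reaches zero) are assumptions chosen to match the statement that the quiz show cannot kill the protagonist.

```python
import math

# Sketch of the quiz-show numbers described above. Names are hypothetical,
# and the rounding in `hp_after_wrong_answers` is an assumption.
DISPLAYED_START = 30   # number shown on the timer
TICKS_PER_SECOND = 2   # the display drops roughly two numbers per second


def real_seconds(displayed: int = DISPLAYED_START,
                 ticks_per_second: int = TICKS_PER_SECOND) -> float:
    """Convert the displayed count into approximate real seconds."""
    return displayed / ticks_per_second


def hp_after_wrong_answers(hp: int, wrong_answers: int) -> int:
    """Halve HP once per wrong answer, rounding up so it never reaches zero."""
    for _ in range(wrong_answers):
        hp = math.ceil(hp / 2)
    return hp
```

With these values, `real_seconds()` gives 15.0, which matches the roughly 15 seconds the text describes.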

Cooking with a Killer Robot

As the protagonist journeys through Hotland, Mettaton entraps them in deadly pastiches of various media genres. First, he forces the protagonist to become his assistant in a cooking show where Mettaton is preparing a cake with a human SOUL as the main ingredient, and he attempts to kill the protagonist with a chainsaw to attain it. Alphys calls in as an interruption and suggests that some viewers may be vegan in an attempt to stop Mettaton from harvesting the protagonist's SOUL as an ingredient, so instead, Mettaton points towards a human SOUL-flavored substitute on top of a faraway cupboard.

Mettaton applauding the protagonist

However, the cupboard begins to shoot up from the ground rapidly, forcing the protagonist to use the jetpack feature previously installed on their phone by Alphys. They fly to the top and retrieve the substitute within a time limit of 1 minute set by Mettaton while he drops eggs, flour, and milk, all of which slow the protagonist's ascent.

The milk slows the protagonist down a whole lot.

  • If the protagonist reaches the substitute within the time limit, Mettaton states before leaving that, as on all cooking shows, he had already baked the cake ahead of time.
  • If they fail, Mettaton suddenly claims the show is on commercial break and refuses to kill them as there are currently no viewers.

Either way, he comments that he was foiled yet again by Alphys. Interacting with the substitute after this encounter reveals that the can was stuck to the table, so the protagonist could not have taken it even given the chance.

Pre-Cooking Show

  • [Narrator] Ring...
  • [Alphys] H-hey, it's kind of dark in there, isn't it?
  • [Alphys] Don't worry!
  • [Alphys] I'll hack into the light system and brighten it up!

Cooking Show

  • [Alphys] Oh no.
  • OHHHH YES!!!
  • WELCOME, BEAUTIES, TO THE UNDERGROUND'S PREMIER COOKING SHOW!!!
  • PRE-HEAT YOUR OVENS, BECAUSE WE'VE GOT A VERY SPECIAL RECIPE FOR YOU TODAY!
  • WE'RE GOING TO BE MAKING...
  • MY LOVELY ASSISTANT HERE WILL GATHER THE INGREDIENTS.
  • EVERYONE GIVE THEM A BIG HAND!!!
  • WE'LL NEED SUGAR, MILK, AND EGGS.
  • GO FOR IT, SWEETHEART!
  • Capital wenisberry. [Interact with Mettaton; unused]
  • MILK, SUGAR, AND EGGS! YOU SHOULD BE ABLE TO FIND THEM ON THE BACK COUNTER! [Interact with Mettaton]
  • THIS ISN'T A SHOW ABOUT WASHING YOUR HANDS, DARLING.
  • THAT'S ON WEDNESDAYS!
  • MILK? EGGS? IN THE FRIDGE?
  • NO WAY, DARLING! THEY'D GET COLD!!!
  • MTT-BRAND MICROWAVE! ORIGIN OF THE MTT CHALLENGE!
  • PUT YOUR FOOD IN AND SET THE MICROWAVE ON HIGH FOR FIVE MINUTES...
  • IF YOU CAN STILL RECOGNIZE YOUR MEAL, WE'LL DOUBLE YOUR MONEY BACK!!!
  • OH YES! MTT-BRAND OVENS CAN REACH TEMPERATURES UP TO NINE-THOUSAND DEGREES!
  • ROASTING! TOASTING! BURNING! CHARRING! YOU'RE EXCITED, AREN'T YOU, DARLING? (TM)
  • GREAT JOB! JUST PUT THEM IN THE MIDDLE OF THIS COUNTER! [Interact with Mettaton after getting the ingredients]
  • PERFECT! GREAT JOB, BEAUTIFUL!
  • WE'VE GOT ALL OF THE INGREDIENTS WE NEED TO BAKE THE CAKE!
  • MILK... SUGAR... EGGS...
  • ... OH MY! WAIT A MAGNIFICENT MOMENT! HOW COULD I FORGET!!!
  • WE'RE MISSING THE MOST IMPORTANT INGREDIENT!
  • A HUMAN SOUL!!!!
  • HELLO...? I'M KIND OF IN THE MIDDLE OF SOMETHING HERE.
  • [Alphys] W-wait a second!!!
  • [Alphys] Couldn't you make a...
  • [Alphys] Couldn't you use a...
  • [Alphys] Couldn't you make a substitution in the recipe?!
  • ... A SUBSTITUTION? YOU MEAN, USE A DIFFERENT, NON-HUMAN INGREDIENT?
  • [Alphys] Uhh, what if someone's...
  • [Alphys] ...
  • [Alphys] Vegan?
  • [Alphys] Uh well I mean
  • THAT'S A BRILLIANT IDEA, ALPHYS!!
  • ACTUALLY, I HAPPEN TO HAVE AN OPTION RIGHT HERE!!!
  • MTT-BRAND ALWAYS-CONVENIENT HUMAN-SOUL-FLAVOR-SUBSTITUTE!
  • A CAN OF WHICH... IS JUST OVER ON THAT COUNTER!!!
  • WELL, DARLING? WHY DON'T YOU GO GET IT?
  • WHAT'S THE MATTER? NOT A CAN FAN? THAT'S TOO BAD!
  • MTT-BRAND USES ONLY THE FRESHEST ARTIFICIAL INGREDIENTS AND CHEMICALS!
  • [Alphys] Um, is it really a good idea to be getting a snack?
  • [Alphys] Well, I guess I really shouldn't judge you...
  • [Alphys] After all, I'm the one eating potato chips in my PJs!
  • [Alphys] Uhhh, I mean... H-hey, go over to the right!
  • [Alphys] H-hey! Head over to the right!
  • STILL FIDDLING WITH THAT MICROWAVE, EH, DARLING?
  • CAN'T BLAME YOU FOR BEING TOTALLY ENAMORED WITH AN ELECTRONIC BOX.
  • BY THE WAY, OUR SHOW RUNS ON A STRICT SCHEDULE.
  • IF YOU CAN'T GET THE CAN IN THE NEXT ONE MINUTE...
  • WE'LL JUST HAVE TO GO BACK TO THE ORIGINAL PLAN!!!
  • SO... BETTER START CLIMBING, BEAUTIFUL!!!
  • [Alphys] Oh no!!! There's not enough time to climb up!
  • [Alphys] F-f-fortunately, I might have a plan!
  • [Alphys] When I was upgrading your phone, I added a few... features.
  • [Alphys] You see that huge button that says... "JETPACK"?
  • [Alphys] Watch this!
  • [Alphys] There!
  • [Alphys] You should have just enough fuel to reach the top!
  • [Alphys] Now, get up there!!!

Post-Cooking Show

  • IT SEEMS YOU'VE BESTED ME.
  • BUT ONLY BECAUSE YOU HAD THE HELP OF THE BRILLIANT DOCTOR ALPHYS!
  • OH, I LOATHE TO THINK OF WHAT WOULD HAVE HAPPENED TO YOU WITHOUT HER!!!
  • WELL, TOODLES!!
  • OH YES! ABOUT THE SUBSTITUTION...
  • HAVEN'T YOU EVER SEEN A COOKING SHOW BEFORE?
  • I ALREADY BAKED THE CAKE AHEAD OF TIME!!!!! SO FORGET IT!!!
  • [Narrator] Ring...
  • [Alphys] Wow! We... we did it!!
  • [Alphys] We... we really did it!!!
  • [Alphys] Great job out there, team!
  • [Alphys] W-well, uh, anyway, let's keep heading forward!!!
  • [Narrator] Click...
  • [Alphys] Wh-what!? Wh-why aren't you m-m-moving?
  • [Alphys] N-no...! I must not have added enough fuel!
  • [Alphys] D-darnit... I'm sorry...
  • [Alphys] Even when it's something like this, I...
  • [Alphys] I still...! I still...
  • OH NO, WOULD YOU LOOK AT THAT!
  • [Alphys] What?
  • I FORGOT! RIGHT ABOUT NOW IS WHEN WE HAVE OUR COMMERCIAL BREAK!
  • [Alphys] Wh... What are you...
  • UNFORTUNATELY, THAT MEANS NO ONE IS WATCHING THIS RIGHT NOW.
  • I'M NOT GOING TO DESTROY YOU WITHOUT A LIVE TELEVISION AUDIENCE!!
  • LOOKS LIKE YOU'VE FOILED ME ONCE AGAIN, THANKS TO THE BRILLIANT DR. ALPHYS!!!
  • UNTIL NEXT TIME, BEAUTIFUL!
  • [Alphys] U-um... I guess we... ... did it?

Secondly comes a breaking news segment, in which Mettaton asks the protagonist to report on one of several items within a room (a dog, a video game, a movie script, a basketball, a present, and a seemingly normal glass of water), all of which turn out to be bombs hidden by Mettaton. Mettaton then scatters the bombs around the area, and the protagonist is forced to find and defuse all of them with a program Alphys installed on the phone before a larger bomb in the center of the room goes off.

  • If the protagonist defuses all the bombs in time, Mettaton declares that the bomb now explodes in two seconds instead of two minutes.
  • If the protagonist has not disabled all the bombs by the time ten seconds remain before the big bomb explodes, the timer slows substantially.
  • If the protagonist has not disabled all the bombs before time expires, Mettaton states that the protagonist failed to disarm them all within three minutes. However, even factoring in the slower passage of time through the last ten "seconds" of play, the total amount of time given to defuse the bombs still amounts to under three minutes.

Regardless of whether the bombs were defused or not, Alphys hacks the big bomb and disables it.

Pre-Bomb Defusal

  • [Alphys] Okay, I'm back!
  • [Alphys] A-another dark room, huh?
  • [Alphys] M-my hacking skills have got things covered!
  • [Alphys] Are you serious?
  • OHHHHHH YESSS!!!
  • GOOD EVENING, BEAUTIES AND GENTLEBEAUTIES!
  • THIS IS METTATON, REPORTING LIVE FROM MTT NEWS!
  • AN INTERESTING SITUATION HAS ARISEN IN EASTERN HOTLAND!
  • FORTUNATELY, OUR CORRESPONDENT IS OUT THERE, REPORTING LIVE!
  • BRAVE CORRESPONDENT! PLEASE FIND SOMETHING NEWSWORTHY TO REPORT!
  • OUR TEN WONDERFUL VIEWERS ARE WAITING FOR YOU!!
  • BASKETBALL'S A BLAST, ISN'T IT, DARLING?
  • TOO BAD YOU CAN'T PLAY WITH THESE BALLS.
  • THEY'RE MTT-BRAND FASHION BASKETBALLS. FOR WEARING, NOT PLAYING.
  • YOU CAN'T GET RICH AND FAMOUS LIKE MOI WITHOUT BEAUTIFYING A FEW ORBS.
  • (REPORT THIS ONE?)
  • IT SEEMS OUR REPORTER IS DRAWN TO SPORTS LIKE MOTHS TO A FLAMING BASKETBALL HOOP.
  • ATTENTION, VIEWERS! OUR CORRESPONDENT HAS FOUND... A BASKETBALL!
  • AH. BASKETBALLS.
  • CIRCLES OF FUN. ORBS OF JOY. SPHERES OF AMUSEMENT.
  • BUT YOU SHOULDN'T PLAY WITH THIS ONE. IT'S AN MTT-BRAND FASHIONBALL.
  • PROPER MAINTENANCE IS REQUIRED TO KEEP IT LOOKING GOOD.
  • AS YOU CAN SEE, EVEN EXPOSURE TO HUMAN BODY HEAT CAUSES THE PAINT TO SLOUGH OFF.
  • WAIT A SECOND.
  • THAT'S NOT A BASKETBALL.
  • THAT'S A BOMB!!!
  • OH NO!!! THIS SPORT REVIEW...
  • IS TURNING INTO A SHORT REVIEW!
  • BECAUSE IT'LL BE OVER. AFTER YOU BLOW UP.
  • BUT DON'T GET TOO EXCITED!
  • YOU HAVEN'T EVEN SEEN THE REST OF THE ROOM YET!
  • WHAT A SENSATIONAL OPPORTUNITY FOR A STORY!
  • I CAN SEE THE HEADLINE NOW:
  • "A DOG EXISTS SOMEWHERE."
  • FRANKLY, I'M BLOWN AWAY.
  • THIS DOG... STILL EXISTS!
  • THIS STORY... JUST KEEPS GETTING BETTER AND BETTER!
  • ATTENTION, VIEWERS! OUR CORRESPONDENT HAS FOUND... A DOG!
  • (CUE AUDIENCE AWWS)
  • THAT'S RIGHT, FOLKS! IT'S THE FEEL-GOOD STORY OF THE YEAR!
  • LOOK AT ITS LITTLE EARS, TINY PAWS, FLUFFY TAIL...
  • THAT'S NOT A TAIL!
  • THAT'S RIGHT... THAT DOG...
  • IS A BOMB!!!
  • BUT DON'T PANIC!
  • YOU HAVEN'T EVEN SEEN THE REST OF THE ROOM YET!!!
  • OH MY! IT'S A PRESENT! AND IT'S ADDRESSED TO YOU, DARLING!
  • AREN'T YOU JUST BURSTING WITH EXCITEMENT?
  • WHAT COULD BE INSIDE? WELL, NO TIME LIKE THE "PRESENT" TO FIND OUT!
  • READY FOR YOUR... PRESENTATION?
  • (... LET'S CUT THAT ONE IN POST.)
  • ATTENTION, VIEWERS! OUR CORRESPONDENT HAS FOUND... A PRESENT!
  • AND IT'S TIME FOR THE UNBOXING VIDEO!!!
  • LET'S FIND OUT WHAT'S INSIDE!!
  • THAT ROUND, BLACK SHAPE... COULD IT BE???
  • LOOKS LIKE CHRISTMAS CAME EARLY THIS YEAR.
  • IF SANTA GAVE PEOPLE BOMBS INSTEAD OF PRESENTS!!
  • REALLY THOUGH. A BOMB. WHAT A THOUGHTFUL GIFT.
  • THEY EVEN DECIDED TO LIGHT IT FOR YOU!
  • OOH LA LA! THIS VIDEO GAME YOU FOUND... IS DYNAMITE!!!
  • THOUGH I DON'T MAKE AN APPEARANCE IN IT UNTIL THREE-FOURTHS IN.
  • BUT I LIKE THAT.
  • APPEARING FROM THE HEAVENS LIKE MANNA, SLAKING THE AUDIENCE'S HUNGER FOR GORGEOUS ROBOTS...
  • OOH! THAT'S METTATON!
  • AH, YOU UNDERSTAND.
  • THIS IS A GAME WHERE YOU SHOULD CHECK EVERYTHING TWICE.
  • ATTENTION, VIEWERS! OUR CORRESPONDENT HAS FOUND... A VIDEO GAME!
  • THIS ACTION-PACKED GAME IS GUARANTEED TO BLOW YOU AWAY!
  • STRANGE ENEMIES! STRANGE ALLIES! ATTRACTIVE ROBOTS!
  • FEATURING UP TO SIX ARBITRARY DIALOGUE CHOICES AT ONCE!
  • CORRESPONDENT! LET'S LOOK INSIDE THE CASE!
  • ... THOSE RED CYLINDERS WITH BURNING FUSES...
  • OH NO! THIS GAME LITERALLY IS DYNAMITE!
  • I GUESS THEY WERE RIGHT ALL ALONG!!!
  • VIDEO-GAMES DO CAUSE VIOLENCE!
  • OR AT LEAST THIS ONE'S ABOUT TO.
  • OH NO!!! THAT MOVIE SCRIPT!!! HOW'D??? THAT GET THERE???
  • IT'S A SUPER-JUICY SNEAK PREVIEW OF MY LATEST GUARANTEED-NOT-TO-BOMB FILM:
  • METTATON THE MOVIE XXVIII... STARRING METTATON!
  • I'VE HEARD THAT LIKE THE OTHER FILMS...
  • IT CONSISTS MOSTLY OF A SINGLE FOUR-HOUR SHOT OF ROSE PETALS SHOWERING ON MY RECLINING BODY.
  • OOH!!! BUT THAT'S!!! NOT CONFIRMED!!
  • YOU WOULDN'T (COUGH) SPOIL MY MOVIE FOR EVERYONE WITH A PROMOTIONAL STORY, WOULD YOU?
  • PHEW!!! THAT WAS CLOSE!! YOU ALMOST GAVE ME A BUNCH OF FREE ADVERTISEMENT!! [Look More]
  • OH! YOU'RE BACK!
  • THAT'S RIGHT, FOLKS! IT SEEMS NO ONE CAN RESIST THE ALLURE OF MY NEW FILM!
  • ATTENTION, VIEWERS! OUR CORRESPONDENT HAS FOUND... A MOVIE SCRIPT!
  • OH MY! AND IT LOOKS LIKE IT'S FOR MY LATEST FILM!
  • LET'S NOT KEEP THEM WAITING! LET'S OPEN IT UP AND GET THE SCOOP!
  • ... OH??? WHAT'S THAT INSIDE THE SCRIPT?
  • THAT TICKING SOUND... THAT LIT FUSE...
  • OH MY!!! LOOKS LIKE I WAS WRONG ABOUT THE MOVIE!
  • WE DEFINITELY HAVE A BOX OFFICE BOMB ON OUR HANDS!
  • AND IT'S ABOUT TO BLAST YOU TO BITS!
  • ... IT'S A COMPLETELY NONDESCRIPT GLASS OF WATER.
  • BUT ANYTHING CAN MAKE A GREAT STORY WITH ENOUGH SPIN!
  • I'M HONORED TO BE IN THE PRESENCE OF SUCH A HUGE LUKEWARM WATER FAN, FOLKS!
  • ATTENTION, VIEWERS! OUR CORRESPONDENT HAS FOUND... A GLASS OF WATER!
  • BUT WHAT'S ASTONISHING ABOUT THIS GLASS OF WATER...
  • IS HOW UNINTERESTING IT IS!
  • LIKE ALL GLASSES OF WATER, IT'S COMPRISED OF WATER, GLASS, NITROGLYCERIN...
  • THAT'S NOT A GLASS OF WATER!!!
  • OH NO!!! THIS NEWS REPORT...
  • IS TURNING INTO A DISASTER REPORT!!!
  • IT SEEMS EVERYTHING IN THIS AREA IS ACTUALLY A BOMB!
  • THAT DOG'S A BOMB!
  • THAT PRESENT'S A BOMB!
  • THAT BASKETBALL'S A BOMB!
  • EVEN MY WORDS ARE...!
  • BRAVE CORRESPONDENT... IF YOU DON'T DEFUSE ALL OF THE BOMBS...
  • THIS BIG BOMB WILL BLOW YOU TO SMITHEREENS IN TWO MINUTES!
  • THEN YOU WON'T BE REPORTING "LIVE" ANY LONGER!
  • HOW TERRIBLE! HOW DISTURBING!
  • OUR NINE VIEWERS ARE GOING TO LOVE WATCHING THIS!
  • GOOD LUCK, DARLING!!
  • [Alphys] D-don't worry!
  • [Alphys] I installed a bomb-defusing program on your phone!
  • [Alphys] Use the 'defuse' option when the bomb is in the DEFUSE ZONE!
  • [Alphys] N-now, go get 'em!

Bomb Defusal

  • [Alphys] Error Baby [Unused]
  • [Alphys] Error. [Unused]
  • [Alphys] Error, [Unused]
  • [Alphys] Great job! Keep heading around the room!
  • [Alphys] Try to go for the one in the bottom-left next!
  • [Alphys] Try to go for the one in the top-right next!
  • [Alphys] Great job! Head to the left next! [Present Bomb]
  • [Alphys] Great job! Head to the right next! [Game Bomb]
  • [Alphys] Great job! Head for the center! There's one left there! [Annoying Dog]
  • [Alphys] Great job! Head for the center!
  • [Alphys] I'm using, uh, EM fields to trap the glass of water there!
  • [Alphys] Great job! There's only one left in the bottom-right! [Script Bomb]
  • [Alphys] Great job! There's only one left at the top! [Basket Bomb]
  • [Alphys] Great job! There's only one left at the top-right! [Present Bomb]
  • [Alphys] Great job! There's only one left at the bottom-left! [Game Bomb]
  • [Alphys] You couldn't even get one bomb...!? [Did not defuse a single bomb and ran out of time in the battle screen]
  • [Alphys] It's... it's... [Defused four bombs but ran out of time in the battle screen on the fifth bomb]

Post-Bomb Defusal

  • WELL DONE, DARLING!
  • YOU'VE DEACTIVATED ALL OF THE BOMBS!
  • IF YOU DIDN'T DEACTIVATE THEM, THE BIG BOMB WOULD HAVE EXPLODED IN TWO MINUTES.
  • NOW IT WON'T EXPLODE IN TWO MINUTES!
  • INSTEAD IT'LL EXPLODE IN TWO SECONDS!
  • GOODBYE, DARLING!
  • IT SEEMS THE BOMB ISN'T GOING OFF.
  • [Alphys] That's b-because!!!
  • [Alphys] While you were monologuing... I...!!!
  • [Alphys] I f... fix... Um... I ch-change...
  • OH NO. YOU DEACTIVATED THE BOMB WITH YOUR HACKING SKILLS.
  • [Alphys] Yeah! That's what I did!
  • CURSES! IT SEEMS I'VE BEEN FOILED AGAIN!
  • CURSE YOU, HUMAN! CURSE YOU, DR. ALPHYS, FOR HELPING SO MUCH!
  • BUT I DON'T CURSE MY EIGHT WONDERFUL VIEWERS FOR TUNING IN!!!
  • UNTIL NEXT TIME, DARLING!
  • [Alphys] W-wow... W-we really showed him, huh?
  • [Alphys] H-hey, I know I was kind of weird at first...
  • [Alphys] But I really think I'm getting more...
  • [Alphys] Uh, more...
  • [Alphys] M-more confident about guiding you!
  • [Alphys] S-so don't worry about that b-big d-dumb robot...
  • [Alphys] I-I'll protect you from him!
  • [Alphys] A-and if it really c-came down to it, we could just t-turn...
  • [Alphys] Um, nevermind.
  • [Alphys] Later!
  • [Narrator] (Click...) [Unused]
  • TOO BAD, DARLING!
  • YOU FAILED TO DEFUSE ALL OF THE BOMBS WITHIN THREE MINUTES!
  • NOW THE BIG BOMB IS GOING TO BLOW YOU TO SMITHEREENS!
  • READY, VIEWERS? HERE COMES THE MOMENT YOU'VE ALL BEEN WAITING FOR!
  • [Alphys] B-boy... That was close, huh?
  • [Alphys] I guess a little closer than I would have liked.
  • [Alphys] I should have given you better directions...
  • [Alphys] A-and there j-just w-wasn't enough time...
  • [Alphys] W-well! That's Mettaton's fault, not mine!
  • [Alphys] I c-can't second guess myself now.
  • [Alphys] I'm f-finally starting to f-feel confident about g-guiding you.
  • [Alphys] I'll protect you from that mean old robot, n-no matter what!
  • [Alphys] If I have to, I'll even t-turn...
  • [Alphys] We're over halfway to the core!
  • [Alphys] Let's go!
  • [Narrator] (Click...)

During the news segment, a ticker at the bottom of the screen reports the following news.

  • MTT-BRAND STILL TOP-RATED    ||    SCHOOL CANCELLED OVER REACTIVATED PUZZLES    ||
  • SCIENTIST DISCOVERS HEALTH BENEFITS OF USING COMPUTER (JUST KIDDING LOL)   ||
  • LOCAL METTATON VERY RICH FAMOUS AND GORGEOUS ||
  • TINY VOLCANO MONSTER TRIES ITS BEST, RECEIVES TINY APPLAUSE ||
  • PYROPE IRONICALLY MISSES INVITATION TO THIS SCENARIO "WOULD HAVE LOVED IT"  ||
  • LOCAL PLANE CREATES HUGE LINE AT STORE BY SAYING "IT'S NOT LIKE I WANT TO BUY THESE PRODUCTS OR ANYTHING" CASHIER CONFUSED   ||
  • HOTLAND TECHNICAL MALFUNCTIONS ACCEDE AND RECEDE IN LINEAR PROGRESSION THROUGHOUT AREA  ||
  • WOSHUA CLEANS UP LOCAL CRIME, LITERALLY FINDS CRIMINALS AND DOUSES THEM IN SOAP, CRIME DOESN'T GO DOWN BUT IT SMELLS AMAZING   ||
  • HISTORIC NEWS TICKER HEADLINE SHORTAGE ||

UNDERTALE the Musical

Mettaton in a dress during the musical segment

Thirdly and finally, Mettaton traps the protagonist within a musical, in which he sings about a forbidden love between himself, a monster, and the protagonist, a human. He mentions how sad it is that the protagonist must be sent to the dungeon, before opening a trapdoor underneath them that leads into a room with a colored tile puzzle. Mettaton announces that the protagonist must pass the colored tile puzzle within a set time before a line of flame comes at them from the side and burns them to death.

  • If the protagonist manages to complete the puzzle, Mettaton deactivates the flames "knowing" that Alphys would have done so.
  • If the protagonist has stepped on a green tile, Mettaton reminds them that the tile signals a monster, which turns out to be Mettaton himself.
  • If the protagonist has not stepped on a green tile, Mettaton repeatedly says "well" before acknowledging that the protagonist never stepped on a green tile, but still fights them.

Mettaton proceeds to engage the protagonist in battle after the tile puzzle. However, Alphys has one final installment on the phone she had given to the protagonist: a yellow button that fires a projectile at Mettaton, which the protagonist can use repeatedly. Mettaton acts as though he is defeated by the projectiles and flees. If the protagonist does not activate yellow mode, Mettaton attacks with boxes until the protagonist has barely any health left.

Pre-Tile Puzzle

  • OH? THAT HUMAN...
  • COULD IT BE...?
  • ... MY ONE TRUE LOVE?
  • (YOU LOOK BORED, DARLING.)
  • (I WANT THIS TO BE A STELLAR PERFORMANCE, SO IF YOU WON'T GIVE IT YOUR ALL...)
  • (THEN I'LL SKIP AHEAD FOR THE AUDIENCE'S SAKE.)
  • (KA-SIGH...) (THE SHOW MUST GO ON!)
  • OOMPH! I AM SO OVERWHELMED WITH TRAGEDIES.
  • THE KING HAS ORDERED YOU TO WASTE AWAY IN THE CASTLE BASEMENT.
  • AND BEFORE WE EVEN HAD TIME TO SING A SWEET SONG ABOUT IT.
  • MY DEAR HEART! I CAN BARELY LOOK UPON YOU, KNOWING WHAT COMES NEXT...
  • WELL, TOODLES!
  • (UNDERSTOOD.) (LET'S KNOCK 'EM DEAD!) [Yeah]
  • Please run away
  • Monster King
  • Forbids your stay
  • Humans must
  • Live far apart
  • It breaks my heart
  • They'll put you
  • In the dungeon
  • And then you'll die a lot
  • You're gonna die
  • Cry cry cry
  • So sad it's happening.
  • (Hmmm? Getting creative?)
  • (Dance with me, darling.)
  • (Oh! The audience can feel your passion!)
  • (Show the audience your passion!)
  • (So close... How passionate...)
  • (... do you need some help?)
  • (... what ARE you doing?)
  • (Don't stop now!)
  • (Look at you, leaping around the stage...)
  • (Can't keep your hands off, huh?)
  • (Is that how humans dance?)
  • (Humans are stranger than I thought.)
  • (Oh! They're really getting into it.)
  • (Moving so far...)
  • (Who can blame you?)
  • (Hmmm, I'll have to get used to it...)
  • (Even better than I thought...)
  • (So that's what it's like.)
  • (Dancing with... A human.)
  • (What a shame...)
  • SO SAD THAT YOU ARE GOING TO THE DUNGEON.
  • OH NO! WHATEVER SHALL I DO?
  • MY LOVE HAS BEEN CAST AWAY INTO THE DUNGEON.
  • A DUNGEON WITH A PUZZLE SO DASTARDLY, MY PARAMOUR WILL SURELY PERISH!
  • O, HEAVENS HAVE MERCY! THE HORRIBLE COLORED TILE MAZE!
  • EACH COLORED TILE HAS ITS OWN SADISTIC FUNCTION.
  • FOR EXAMPLE, A GREEN TILE SOUNDS A NOISE, AND THEN YOU MUST FIGHT A MONSTER.
  • RED TILES WILL... ACTUALLY, WAIT A SECOND.
  • DIDN'T WE SEE THIS PUZZLE ABOUT A HUNDRED ROOMS AGO?
  • THAT'S RIGHT. YOU REMEMBER ALL THE RULES, DON'T YOU?
  • GREAT... THEN I WON'T WASTE YOUR TIME REPEATING THEM!!
  • OH, AND YOU'D BETTER HURRY.
  • BECAUSE IF YOU DON'T GET THROUGH IN 30 SECONDS...
  • YOU'LL BE INCINERATED BY THESE JETS OF FIRE!!
  • AHAHAHAHAHAHA! AHAHA... HA... HA!
  • MY POOR LOVE! I'M SO FILLED WITH GRIEF, I CAN'T STOP LAUGHING!
  • GOOD LUCK, DARLING!

Tile Puzzle

  • Has fallen down
  • Now in tears
  • We all will drown
  • Colored tiles
  • Make them a fool
  • If only they
  • Still knew the rules
  • Well that was
  • A sorry try
  • Now let's watch

End of CORE Encounter

After the protagonist reaches the end of the CORE, they are once more confronted by Mettaton. This time, however, Mettaton reveals that he had re-arranged the CORE and hired monsters in a genuine attempt to kill the protagonist. He states that Alphys had set up an extensive plan to insert herself into the story because she liked the protagonist so much and wanted to feel important by helping them. All of Mettaton's previous threats were entirely fake: everything was acted out and used by Alphys to ally herself further with the protagonist. Alphys planned to intervene in the fight between Mettaton and the protagonist by "deactivating" Mettaton and thus appearing heroic to them. This time, however, Mettaton has made plans to prevent Alphys's aid so that he can have an actual battle against the protagonist, and he locks the door to prevent Alphys from entering. He explains that he wants to take their SOUL so that he can leave the Underground and become a superstar on the Surface. This way, Asgore cannot destroy the Barrier and re-ignite the War of Humans and Monsters, which would take a toll on his viewership.

Ohhhh my. If you flipped my switch, that can only mean one thing. You're desperate for the premiere of my new body. How rude... Lucky for you, I've been aching to show this off for a long time. So... as thanks, I'll give you a handsome reward. I'll make your last living moments... ABSOLUTELY beautiful!

Mettaton after his transformation

Mettaton's overworld wreck sprite, if he's killed

Mettaton lacking his arms

Mettaton attacks the protagonist, but on Alphys's advice, the protagonist tricks Mettaton into turning around so that they can flip the switch on his back to make him vulnerable. This switch transforms Mettaton into Mettaton EX and begins the true game show. Once the protagonist raises the show's Ratings above 10,000 (or 12,000, if he still has legs) by taking damage, using popular brand items, or using specific ACTs, Mettaton stops fighting.
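
The Ratings condition described above can be sketched as a small check. This is only an illustration of the rule as stated on this page, not the game's actual code: the function names are hypothetical, and reading "above" as a strict comparison is an assumption.

```python
def ratings_threshold(has_legs: bool) -> int:
    """Ratings value the show must exceed, as described in the text above."""
    # 12,000 while Mettaton EX still has legs, 10,000 otherwise.
    return 12_000 if has_legs else 10_000


def mettaton_stops_fighting(ratings: int, has_legs: bool) -> bool:
    """Hypothetical check; a strict 'greater than' is assumed from the word 'above'."""
    return ratings > ratings_threshold(has_legs)
```

For example, under these assumptions a Ratings value of 11,000 would end the fight only if Mettaton EX has already lost his legs.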

To his surprise, this is the highest-rated episode he has ever had, and he begins to take call-ins from viewers. Several callers, the first of whom is Napstablook, convince Mettaton that he is highly valued in the Underground and is the primary (perhaps only) source of entertainment that many of the inhabitants have. He is moved by their passion for the show and decides that he no longer wishes to leave the Underground, also explaining that the protagonist is strong enough to defeat Asgore. He then mentions his inefficient power consumption and shuts down, leaving only his upper torso remaining. Alphys then manages to unlock the door and assess Mettaton EX's state, and she is thankful that he is simply out of power. She acts very stressed at the thought of Mettaton being truly killed. After that, Mettaton's body can be found in the Lab, undergoing repairs.

Mettaton's deactivated body after being spared

If Mettaton EX is killed, his overworld sprite reverts to his box form, though visibly severely damaged. Examining the wreck yields the description "It's completely trashed."

Mystery Key

If the protagonist purchases the Mystery Key from Bratty and Catty, they can enter the house to the right of Napstablook's. The room is primarily pink, with a bed, a star-patterned pillow, star-patterned wallpaper, a pink TV, a window, and a star-imprinted rug. Through a series of diaries, the house is shown to be owned by another ghost known as "Napstablook's cousin" within the game (though Papyrus states that his headcanon name for him was "Happstablook", and the room name of his house is room_water_hapstablook). After this ghost met Alphys, she designed a body for him to inhabit as Mettaton.

Using the Mystery Key during Mettaton EX's fight prompts him to pretend "it isn't there."

True Pacifist Route

Mettaton, as seen in the True Pacifist Ending Credits

After the protagonist has finished their date with Alphys, Mettaton's deactivated body can be found on the second floor of the Lab, atop her work table for repairs.

In the True Lab, there is a log entry stating that Alphys fears that Mettaton may not talk to her anymore after receiving his new body due to his attachment to fame.

At the end of the game, when all of the Monsters encountered as bosses have united before Flowey intervenes, Mettaton EX shows his leg from the right side of the screen. He tells Alphys and Undyne that they should kiss already since the entire crowd wants it.

After defeating Asriel, if the protagonist returns to Waterfall before exiting New Home, Mettaton is seen standing outside his old house, having recruited Napstablook as his sound mixer and Shyren as his backup singer. During the credits, he is shown on tour, able to use his original box form in conjunction with his new body's legs.

  • There you are, Frisk-darling.
  • Feast your eyes! Dr. Alphys completed my wonderful new body.
  • Oooh! And did you hear? The barrier's OPEN!
  • I can't wait to see the sun...
  • ... the greatest spotlight of all!!
  • Oh yes. I suppose I should thank you, too, darling.
  • Before fighting you, I had...
  • Forgotten how fun it was to perform with others.
  • So I've been searching for HOT TALENTS to fill up my upcoming troupe.
  • So far, Shyren's agreed to be my back-up singer.
  • And Bl... Napstablook, here, will be my sound mixer!
  • The three of us performing together...
  • It really feels overdue, doesn't it?
  • Frisk, darling. Can you help me with something?
  • What kind of merchandise do you think humans would want to buy...?
  • I've thought of a few ideas so far.
  • Buttons (with my face) Stickers (with my face) CDs (with my face)
  • Posters (with my face) T-shirts (with my face) Underwear (with my face)
  • ... and plush dolls of TORIEL.
  • But, you know. With my face instead of hers.
  • So what do you think?
  • [Narrator] (...)
  • [Narrator] (A yes or no prompt was not provided.)
  • Fabulous! I completely agree!
  • Oh, Frisk. Why don't you go see how Alphys is doing?
  • Since the flash of light she's been working hard to set everything right.
  • Ha-Ha. About time, huh?

Genocide Route

OH? HOW SASSY. YOU'RE JUST ITCHING TO GET YOUR HANDS ON ME, AREN'T YOU? WELL... T-O-O B-A-D! THIS WORLD NEEDS STARS MORE THAN IT NEEDS CORPSES! TOODLES!

Mettaton in the Lab on the Genocide Route

Mettaton shows up in Alphys's Lab and tells the protagonist that he will not battle them, knowing that he is no match for them. He later shows up at the CORE and realizes that the protagonist means to kill not only monsters but all of humanity as well. Claiming that his original functions as a human eradication robot were never fully removed, he transforms into Mettaton NEO. Despite all the buildup, he has no attacks, and the protagonist can destroy him in a single attack.

Mettaton NEO has two different variations depending on whether the player is on track for the Genocide Route. This is determined by three conditions. Mettaton NEO's Genocide Route variation requires all three of these conditions to be met:

  • Completing the Hotland/CORE kill count
  • Killing Royal Guards
  • Killing Muffet

Due to a glitch, the game always counts the Royal Guards or Muffet as killed if the previous encounter ended in a kill. This applies even if Muffet or the Royal Guards are spared, in which case Mettaton NEO's Genocide Route variation still appears (assuming the Hotland/CORE kill count is met) despite the Genocide Route having been aborted.

In Mettaton NEO's Genocide Route variation, any attack that hits is scripted to inflict anywhere from 900,000 to 999,999 damage and destroys Mettaton NEO in a single hit. Missing does not abort the Genocide Route, as the battle continues until Mettaton NEO is destroyed. Upon his death, Mettaton NEO laments that the protagonist does not want to join his fan club, and the protagonist's EXP is set to 50,000, resulting in their LOVE reaching 19.

If any of the three conditions above is not met, Mettaton NEO appears in a different variation that ultimately causes the Genocide Route to be aborted and leads to the Alphys Ending of the Neutral Route. He is still destroyed in a single attack; however, this attack deals only 30,000 to 39,999 damage. Mettaton NEO points out the protagonist's lack of killing intent, telling them that they are not "absolutely evil" before dying. In this variation, the protagonist gains only 10,000 EXP.
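
The selection between the two variations can be summarized in a short sketch. It only restates the three conditions and the numbers given above. The names (`NeoVariation`, `mettaton_neo_variation`) are hypothetical and do not come from the game's files, and the glitch described earlier is deliberately not modeled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NeoVariation:
    name: str
    damage_range: Tuple[int, int]  # inclusive bounds of the scripted damage
    exp: int                       # 50,000 is set outright; 10,000 is gained
    genocide_continues: bool


def mettaton_neo_variation(kill_count_met: bool,
                           royal_guards_killed: bool,
                           muffet_killed: bool) -> NeoVariation:
    """Pick the variation from the three conditions listed in the text."""
    if kill_count_met and royal_guards_killed and muffet_killed:
        return NeoVariation("genocide", (900_000, 999_999), 50_000, True)
    # Any unmet condition leads to the variation that aborts the route.
    return NeoVariation("aborted", (30_000, 39_999), 10_000, False)
```

For instance, `mettaton_neo_variation(True, True, False)` returns the aborted variation, matching the case where Muffet was spared and the glitch did not apply.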

Relationships

The protagonist

Initially, Mettaton actively antagonized the protagonist under the pretense that his human eradication functions, along with directive errors, gave him an intensified hatred of humans and a need to murder them. This was proven to be an act, however, as Mettaton mentions loving humanity; he nonetheless continued to oppose the protagonist on his own terms in order to take their SOUL and prevent a possible war started by Asgore. After their battle, however, he was confident that the protagonist was strong enough to prevent this themselves.

Alphys

Mettaton and Alphys initially bonded over their common interest in the culture of humanity, and he is grateful to Alphys for making his physical body. Soon after receiving his initial body, however, he often belittles her and her interests. Still, he felt he owed Alphys enough to play along with her plan and act out the role of a malfunctioning robot before eventually pursuing his own agenda with the protagonist. Despite this, in the ending where he is given the role of king, he mentions how he regrets being cruel to her before she went missing.

Napstablook

Before he got his body, he was Napstablook's cousin, helping them with the snail farm and living next door to them. They seemed very close, to the point that Mettaton originally declared that he would never leave them behind and often called them Blooky (a nickname he would keep using). Although he had since left them for stardom, he obviously cared for his cousin and looked regretful when Napstablook called into the program to express gratitude for his show. Once he had his permanent new body, he immediately recruited his cousin for his tour so they could stay together.

"Mettaton: Live from Hotland," a tour poster of Mettaton and a live studio audience sold on Fangamer

  • If "Metta", "Mett" (case insensitive) or "MTT" is typed for the fallen child's name, the response becomes "OOOOH!!! ARE YOU PROMOTING MY BRAND?" and allows the name to be used.
  • "Mettaton" is similar in pronunciation and spelling to "Metatron," the highest angel in Judeo-Islamic lore, also known as the Voice of God and Recording Angel, whose name is transliterated into Greek as MTT. This could be a reference to Shin Megami Tensei's Metatron specifically, as both have metallic appearances, have similar poses, and are very powerful late-game bosses.
  • The name could also be a combination of "metal" and "automaton" or "zettaton."
  • "Mettaton" may be a portmanteau of "Meta" and "ton," as Mettaton offers many instances of meta-commentary on the structure of Undertale. Examples include referencing the viewer if one picks "Don't Know" for Alphys's crush question during the quiz show, his words turning into bombs in the news report, changing the window name during his musical segment, mentioning room numbers before his colored tile maze, discussing Toriel plushies in the epilogue, and offering a newscast to the fourth wall while the protagonist is visible on-screen.
  • Mettaton EX has shoulder pads similar to Undyne's, both of which are longer and spikier in the Genocide Route.
  • Notably, Mettaton NEO can be encountered in an aborted Genocide Route (Alphys Ending, Near Genocide) unlike Undyne the Undying.
  • Both of their forms on the Genocide Route feature hearts on their chest pieces.
  • Both of their battles on the Genocide Route feature the Battle Against a True Hero leitmotif.
  • Undyne holds a spear in her right hand, and Mettaton NEO has an arm cannon replacing his right hand.
  • Mettaton's box form resembles the coin-operated robot Cooker from the Wallace and Gromit short "A Grand Day Out," as confirmed by the artbook.
  • Mettaton is at least partially inspired by Twitter user nerdbotmk2. [1]
  • Mettaton may be inspired by the character of TV Dinnah from the 2009 video game Little King's Story, another TV presenter boss fight who wants to cook the protagonist on a cooking show and gives them a life-or-death quiz. TV Dinnah also somewhat resembles Mettaton's box form. Little King's Story was directed and produced by Yoshiro Kimura, whom Toby Fox has praised in person during an interview for creating games that inspired Undertale. [2]
  • In his box form, Mettaton's overworld sprite has five squares under the main screen, while his battle sprite only has four.
  • Out of every character in the game, Mettaton has the most songs associated with him, at 14 songs (15 if CORE is counted).
  • Along with Flowey, Mettaton is one of the only characters in the game to have a voice clip, exclaiming "Oh, yes!" when he transforms into Mettaton EX, and "Yeah!" when attacked in his Mettaton EX form or when the protagonist steps on a green tile in the Multicolor Tile Puzzle.
  • The video game found during the MTT News scene is heavily implied to be Undertale itself. It is graphically similar to the Undertale box cover and uses the same sprite as the game behind the Mysterious Door that was programmed by the dog. Mettaton states that his appearance doesn't occur until three-fourths of the way through the game and, if the item is checked twice, that the game is one where the player should check everything twice (referencing how dialogue in the game changes if something is checked or spoken to twice).
  • In the trailer announcing the game's special edition, Mettaton's hands are shown motioning towards the boxed special edition. It was confirmed in a Twitch livestream of Super Mario RPG by Legends of Localization that the costumed hands filmed for the sequence belong to Heidi "Poe" Mandelin, wife of Clyde "Tomato" Mandelin, who was involved in the fan translation of Mother 3 and also co-wrote Legends of Localization Book 3: UNDERTALE, a book about the game's Japanese localization. Poe also stated that the sequence was filmed in reverse; the footage was then played backwards to achieve the final shot seen in the trailer.
  • Several unused sprites for Mettaton EX's head show his right eye, and what appears to be the machinery on his face. [3]
  • "Hopes&Dreams" references the track "Hopes and Dreams," among other recurring mentions of this phrase.
  • "Snips&Snails" and "Sugar&Spice" are references to the nursery rhyme "What Are Little Boys Made Of?"
  • Mettaton's box form is identical to the console that operates the colored tile maze in Snowdin Forest, and this machine may have been Mettaton himself; during his colored tile maze, Mettaton remarks that the protagonist has seen the puzzle "ABOUT A HUNDRED ROOMS AGO" and presumes that they remember the rules.
  • The song and the scenario from the musical are a reference to the "Maria and Draco" opera from Final Fantasy VI. [4]
  • Mettaton EX recognizes answers for "sexy," "foxy," and "tantalizing" in his essay question, but since the X and Z keys cannot be inputted without holding CTRL, they are only possible inputs in the PlayStation and Nintendo Switch versions. [5] [6]
  • Mettaton EX originally saved images of all the essays to the computer's hard drive. This feature was removed for being too buggy. [7]
  • According to the Undertale Kickstarter, there were originally plans for the protagonist to have a robot husband, which is presumed to be Mettaton. [8] This feature ended up being cut out of the final version of the game for unknown reasons. [9] Additional evidence that points to this event is an unused variation of the Start Menu that can be found in the game's files and explains why the full menu suddenly includes Mettaton and Napstablook.
  • In Undertale v1.001, the stats for Mettaton's box form were changed from "10 ATK 999 DEF" to "30 ATK 255 DEF." 255 is the maximum value of an unsigned 8-bit integer, and is often the maximum possible value for things in video games.
  • The Mettaton EX fight, according to Toby, was the hardest one for him to design. [10] There were many attack ideas for the Mettaton EX fight that were never used, such as more tiny Mettaton robots, and disco dancing robots that threw stars. [11] [12] [13]
  • Mettaton EX slightly resembles Bob Sparker and was inspired by the character.
  • Based on old official artwork by Temmie Chang, it is likely that Mettaton was originally intended to have claw hands. [14] [15]
  • Mettaton's line when the sink is checked during the cooking show, "THIS ISN'T A SHOW ABOUT WASHING YOUR HANDS, DARLING. THAT'S ON WEDNESDAYS!", suggests that Undertale does not take place on a Wednesday. However, the HUD during Papyrus's date [16] and the River Person [17] both mention Wednesday if the computer's date falls on one; the actual day of the week is likely intentionally ambiguous.
  • Spamton mentions Mettaton NEO's theme, Power of "NEO", by name during his battle. The melody of Power of "NEO" can also be heard in Spamton NEO's boss theme, BIG SHOT.
  • After defeating Spamton NEO, another character implies that his body was created by the ghost who would become Mettaton in Deltarune's continuity.
  • If Spamton NEO is defeated through violence in a neutral route, he attempts (and fails) to transform into "Spamton EX".
  • On a Snowgrave route, Spamton says his NEO body is "known for its high defense", an ironic reference to the fact that Undertale's Mettaton NEO is killed in one hit.
  • In the MTT News segment, the line "THIS NEWS REPORT... IS TURNING INTO A DISASTER REPORT!!!" is a reference to the 2003 PlayStation 2 game Disaster Report. [18]
  • ↑ @NerdbotMk2 @fridayafternoon I won't lie if I said I wasn't thinking about you a lil when I put him in - @FwugRadiation on Twitter, September 22, 2015. Archived on November 07, 2015.
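  • ↑ moon is only available in Japanese, so I can't play it through completely until my Japanese improves or an English version comes out, but I was impressed by the game's very concepts: that "the hero is the villain" and that "monsters are not necessarily the bad guys." - Roundtable with Toby Fox (UNDERTALE), ZUN (Touhou), and Yoshiro Kimura (Onion Games): "This is how things turned out when I followed the path where I could be happy"; a long talk about the doujin spirit and indie freedom (October 19, 2018.) Den Faminico Gamer (電ファミニコゲーマー).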
  • ↑ Sprites "spr_mettface_defeated_" 9 through 11. Imgur
  • ↑ Final Fantasy VI Adv: Oh, Maria - YouTube
  • ↑ Question about Mettaton's Essay - Reddit
  • ↑ Undertale - The Cutting Room Floor
  • ↑ Originally Mettaton actually saved images of all of your essays to the hard drive. But it was buggy so I removed it. - @FwugRadiation on Twitter, January 12, 2016. [deleted]
  • ↑ "Seriously, you can literally have a robot husband." – Toby Fox. June 24, 2013. Kickstarter.
  • ↑ "After 2.5 years... some critical features promised in the KS changed. Example: I said you could marry a robot. Actually, you can't marry a robot." – Toby Fox. June 30, 2015. Kickstarter.
  • ↑ I think the Mettaton battle was actually the hardest one for me to design. I made a lot of bullet objects/attack ideas I didn't use since the action is "shooting," the action of "aiming&shooting" had to be important. But shooting&aiming in shmups (like Touhou) is often not actually about "aim&fire!" but "hold down the shoot button and position yourself under the enemy" for the most part. - Toby Fox (@tobyfox) on Twitter, June 13, 2016.
  • ↑ I don't remember all the unused attacks but I wanted to have more tiny mettaton robots besides the umbrella ones that blow kisses - Toby Fox (@tobyfox) on Twitter, June 13, 2016.
  • ↑ There was a disco dancing one that threw stars but it wasn't fun. I think that's it. - Toby Fox (@tobyfox) on Twitter, June 13, 2016.
  • ↑ All of Mettaton EX's unused attacks restored (Undertale) - YouTube
  • ↑ hooray!! we reached 51k!! Thank you all so much for your contributions and interest!! - temmiechang on Tumblr, July 25, 2013.
  • ↑ (undertale spoilers!!!!) look at what @tuyoki made me 2 years ago for my birthday that i've been sitting on until now - @FwugRadiation on Twitter, October 11, 2015. [deleted]
  • ↑ WED - Dating HUD, on Wednesdays
  • ↑ Tra la la. Somewhere, it's Wednesday. So be careful. - River Person, on Wednesdays
  • ↑ Legends of Localization Book 3: UNDERTALE, page 200, ISBN 9781945908019

Introducing: Undertale Prompt Month!

From September 1st-30th, flex your creative muscles with a whole month of prompts in honor of Undertale's 5th anniversary!

Fanfics, comics, art, crafts, music; everything is fair game! Whether you want to use all 30 prompts or even just one, you're free to choose! There's only 2 rules.

  • Tag your creation #underprompt2020

Anyway, I hope everyone enjoys this little event I threw together lol. Thank you!

@underprompts


Essay Topic Generator

What do you do when you know what type of essay you need to write but can’t think of a proper topic? Answer: Come to our Essay Topic Generator and let us create the kind of topics you need to get started.

Need to write a controversial essay? Select that type from the drop down menu and click on the Generate Topics button!

Need to write an argumentative essay? No problem—we’ve got that covered as well!

Need to write a compare and contrast essay but don’t know what to write it about? You’ve come to the right place. Let’s see how it works!

How to Use Essay Topic Generator

Our essay topic generator is simple to use. You start by selecting the type of essay you will be writing. Options include: controversial essay, persuasive essay, personal narrative, and many more.

Once you’ve selected the type of essay you want, just click on the big blue Generate Topics button. Your first result will give you a topic idea followed by a proposed Essay Title.

For example, let’s say you have to write a controversial essay but can’t think of a topic. Select controversial essay from the drop down bar, and click the blue button. The first result you receive might be a topic on gun control. Gun control is a controversial topic in America, right? See how simple it is?

Well, say you don’t want to write on gun control—no problem. Hit that blue button again. You’ll get another result—maybe abortion as a topic. Abortion is a super controversial topic and would be a great subject for a controversial essay. But maybe you don’t want to write on that either. So smash that button again! We have hundreds of topic ideas and we’re sure you’ll find one that strikes your fancy.

Essay Topics

Okay, so what? You’ve got a topic. Now you have to explore that topic. What does that mean? It means you have to get to know the topic and understand it before you can expect to write about it. The topic is basically just a broad field for you to play in. But in order to connect with your reader, you need to narrow the field. Think of your topic as a ballpark and connecting with your reader as a series of bases that have to be tagged before you can make it home. The topic tells you which ballpark you’ll be playing in, but you still need to step up to the plate and put the ball in play for your audience to care.

So how do you put the ball in play? That’s where our suggested Essay Title can help. It might strike you as a bunt or, meh, as a single. Or you may get one that strikes you as a double or a triple, or maybe even one that looks like a home run. It doesn’t really matter because a title is about putting the ball in play. You’re narrowing the focus and just trying to reach base in most cases. Once you’re on you can think about how to get from first to second or from second to third. The end goal is to make it to home plate. But the title is where it starts. It gets you thinking in the right direction. That’s why we don’t just stop at generating a topic idea for you. We also give you a great title to think about. So keep hitting that blue button and generating more results for yourself until you find one that fits!

From Essay Topic to Essay Title

The essay topic gets you in the right ballpark, the essay title lets you put the ball in play, and now you have to round the bases, i.e., write your essay. What are the steps to doing this? How do you go from topic to title to writing? Think about the bases on the base path. What shape do they make if you trace a line from home to first to second to third and back to home plate? They make a diamond shape. That’s why it’s called the baseball diamond. Now consider the writing process as a similar shape that you need to create. You’re creating a diamond for your reader and it is basically a step by step process just like rounding the bases.

Need help coming up with a title? Try our  essay title generator.

First, you need to touch first base. Let’s say first base is where you brainstorm using the title you’ve been provided. For example, if you like the title, “Should Public Schools Allow Teachers to Carry Weapons?” you can brainstorm the pros and cons of letting teachers carry weapons in schools. What would be the benefits of having armed teachers in public schools? What would be the drawbacks? What would be your preference if you were a student in such a school? Jot down your answers to these questions. There! See? Now you’re on first base.

Let’s get you over to second. Reaching second base is about pulling those ideas together and giving them some shape. This is where you want to start creating an outline for your essay. You bring these ideas together and arrange them in a way that makes sense. Your outline should start with an introduction that tells what you’ll be looking at in the essay. Then create a section for each point you want to cover: a section for the pros, a section for the cons, and a section for your personal view. Then follow that up with a conclusion that reiterates your points. There you go—that’s an outline!

Now you have to get to third base. Easy—start writing! Follow your outline to stay in the base paths and before you know it you’ll be rounding third and heading for home. To get to home plate all you need to do is go back and edit your essay!

Additional Title Information

You’ll notice that once you hit that big blue button we don’t just give you a topic and a title. We also give you Additional Info. This is where we provide you with some more tips to think about when you go to make your outline. For example, with an essay on gun control, you might want to give both perspectives by arguing for one side and then writing a rebuttal. So pay attention to the Additional Info that we offer because it will help you round those bases.

Our Essay Topic Generator is a great way for anyone with writer’s block to get ideas on a topic. Click on the type of essay you need to write by selecting it from the drop down bar. Then click on the Generate Topics button. We’ll give you a topic that fits the type of essay you’re writing. We’ll also give you an essay title to help you get started with the brainstorming process. Finally, we’ll hit you up with some helpful additional info that you can use to flesh out your outline and round the bases towards writing your essay. Hey—no need to thank us! That’s why we’re here: we know that when it comes to writing, every little bit helps.


A Role-playing Video Game Undertale

  • Categories: Video Games

Words: 454 | Page: 1 | 3 min read

Published: Mar 1, 2019


The University of Chicago The Law School

College Essays and Diversity in the Post-Affirmative Action Era

Sonja Starr’s Latest Research Adds Data, Legal Analysis to Discussion about Race in College Admissions Essays

Editor’s Note: This story is part of an occasional series on research projects currently in the works at the Law School.

The Supreme Court’s decision in June 2023 to bar the use of affirmative action in college admissions raised many questions. One of the most significant is whether universities should consider applicants’ discussion of race in essays. The Court’s decision in Students for Fair Admissions (SFFA) v. Harvard did not require entirely race-blind admissions. Rather, the Court explicitly stated that admissions offices may weigh what students say about how race affected their lives. Yet the Court also warned that this practice may not be used to circumvent the bar on affirmative action.

Many university leaders made statements after SFFA suggesting that they take this passage seriously, and that it potentially points to a strategy for preserving diversity. But it’s not obvious how lower courts will distinguish between consideration of “race-related experience” and consideration of “race qua race.” Sonja Starr, Julius Kreeger Professor of Law & Criminology at the Law School, was intrigued by the implication of that question, calling the key passage of the Court’s opinion the “essay carveout.”

“Where is the line?” she wrote in a forthcoming article, the first of its kind to discuss this issue in depth in the post-SFFA era. “And what other potential legal pitfalls could universities encounter in evaluating essays about race?”

To inform her paper’s legal analysis, Starr conducted empirical analyses of how universities and students have included race in essays, both before and after the Court’s decision. She concluded that large numbers of applicants wrote about race, and that college essay prompts encouraged them to do so, even before SFFA.

Some thought the essay carveout made no sense. Justice Sonia Sotomayor called it “an attempt to put lipstick on a pig” in her dissent. Starr, however, disagrees. She argues that universities are on sound legal footing relying on the essay carveout, so long as they consider race-related experience in an individualized way. In her article, Starr points out reasons the essay carveout makes sense in the context of the Court’s other arguments. However, she points to the potential for future challenges—on both equal protection and First Amendment grounds—and discusses how colleges can survive them.

What the Empirical Research Showed

After SFFA, media outlets suggested that universities would add questions about race or identity in their admissions essays and that students would increasingly focus on that topic. Starr decided to investigate this speculation. She commissioned a professional survey group to recruit a nationally representative sample of recent college applicants. The firm queried 881 people about their essay content, about half of whom applied in 2022-23, before SFFA, and half of whom submitted in 2023-24.

The survey found that more than 60 percent of students in non-white groups wrote about race in at least some of their essays, as did about half of white applicants. But contrary to what the media suggested, there were no substantial changes between the pre- and post-SFFA application cycles.

Starr also reviewed essay prompts that 65 top schools have used over the last four years. She found that diversity and identity questions—as well as questions about overcoming adversity, which, for example, provide opportunities for students to discuss discrimination that they have faced—are common and have increased in frequency both before and after SFFA.

A Personally Inspired Interest

Although Starr has long written about equal protection issues, until about two years ago, she would have characterized educational admissions as a bit outside her wheelhouse. Her research has mostly focused on the criminal justice system, though race is often at the heart of it. In the past, for example, she has assessed the role of race in sentencing, the constitutionality of algorithmic risk assessment instruments in criminal justice, as well as policies to expand employment options for people with criminal records.

But a legal battle around admissions policies at Fairfax County’s Thomas Jefferson High School for Science and Technology—the high school that Starr attended—caught her attention. Starr followed the case closely and predicted that “litigation may soon be an ever-present threat for race-conscious policymaking” in a 2024 Stanford Law Review article on that and other magnet school cases.

“I got really interested in that case partly because of the personal connection,” she said. “But I ended up writing about it as an academic matter, and that got me entrenched in this world of educational admissions questions and their related implications for other areas of equal protection law.”

Implications in Education and Beyond

Starr’s forthcoming paper argues that the essay carveout provides a way for colleges to maintain diversity and stay on the right side of the Court’s decision.

“I believe there’s quite a bit of space that’s open for colleges to pursue in this area without crossing that line,” she said. “I lay out the arguments that colleges can put forth.”

Nevertheless, Starr expects future litigation targeting the essay carveout.

“I think we could see cases filed as soon as this year when the admissions numbers come out,” she said, pointing out that conservative legal organizations, such as the Pacific Legal Foundation, have warned that they’re going to be keeping a close eye on admissions numbers and looking for ways that schools are circumventing SFFA.

Starr envisions her paper being used as a resource for schools that want to obey the law while also maintaining diversity. “The preservation of diversity is not a red flag that something unconstitutional is happening,” she said. “There are lots of perfectly permissible ways that we can expect diversity to be maintained in this post-affirmative action era.”

Starr’s article, “Admissions Essays after SFFA,” is slated to be published in Indiana Law Journal in early 2025.

  • Open access
  • Published: 03 June 2024

Applying large language models for automated essay scoring for non-native Japanese

  • Wenchao Li 1 &
  • Haitao Liu 2  

Humanities and Social Sciences Communications volume 11, Article number: 723 (2024)

  • Language and linguistics

Recent advancements in artificial intelligence (AI) have led to an increased use of large language models (LLMs) for language assessment tasks such as automated essay scoring (AES), automated listening tests, and automated oral proficiency assessments. The application of LLMs for AES in the context of non-native Japanese, however, remains limited. This study explores the potential of LLM-based AES by comparing the efficiency of different models, i.e., two conventional machine learning technology-based methods (Jess and JWriter), two LLMs (GPT and BERT), and one Japanese local LLM (Open-Calm large model). To conduct the evaluation, a dataset consisting of 1400 story-writing scripts authored by learners with 12 different first languages was used. Statistical analysis revealed that GPT-4 outperforms Jess and JWriter, BERT, and the Japanese language-specific trained Open-Calm large model in terms of annotation accuracy and predicting learning levels. Furthermore, by comparing 18 different models that utilize various prompts, the study emphasized the significance of prompts in achieving accurate and reliable evaluations using LLMs.

Conventional machine learning technology in AES

AES has experienced significant growth with the advancement of machine learning technologies in recent decades. In the earlier stages of AES development, conventional machine learning-based approaches were commonly used. These approaches involved the following procedures: (a) A dataset of essays is fed to the machine learning system; it serves as the basis for training the model and establishing patterns and correlations between linguistic features and human ratings. (b) The machine learning model is trained using linguistic features that best represent human ratings and can effectively discriminate learners’ writing proficiency. These features include lexical richness (Lu, 2012; Kyle and Crossley, 2015; Kyle et al. 2021), syntactic complexity (Lu, 2010; Liu, 2008), and text cohesion (Crossley and McNamara, 2016), among others. Conventional machine learning approaches in AES require human intervention, such as manual correction and annotation of essays, which is necessary to create a labeled dataset for training the model. Several AES systems have been developed using conventional machine learning technologies. These include the Intelligent Essay Assessor (Landauer et al. 2003), the e-rater engine by Educational Testing Service (Attali and Burstein, 2006; Burstein, 2003), MyAccess with the IntelliMetric scoring engine by Vantage Learning (Elliot, 2003), and the Bayesian Essay Test Scoring system (Rudner and Liang, 2002). These systems have played a significant role in automating the essay scoring process and providing quick and consistent feedback to learners. However, as touched upon earlier, conventional machine learning approaches rely on predetermined linguistic features and often require manual intervention, making them less flexible and potentially limiting their generalizability to different contexts.
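As a rough illustration of what such predetermined features look like, the following minimal Python sketch computes two crude proxies from a pre-tokenized essay: a type-token ratio for lexical richness and a mean sentence length for syntactic complexity. The function names and the choice of these two measures are illustrative assumptions, not the indices used in the cited studies.

```python
def type_token_ratio(tokens):
    """Share of distinct tokens: a crude proxy for lexical richness."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def mean_sentence_length(sentences):
    """Average number of tokens per sentence: a crude proxy for syntactic complexity."""
    if not sentences:
        return 0.0
    return sum(len(sentence) for sentence in sentences) / len(sentences)


def extract_features(sentences):
    """Turn a list of tokenized sentences into a small feature dictionary."""
    tokens = [token.lower() for sentence in sentences for token in sentence]
    return {
        "ttr": type_token_ratio(tokens),
        "mean_len": mean_sentence_length(sentences),
    }


if __name__ == "__main__":
    essay = [["the", "cat", "sat"], ["the", "dog", "sat", "down"]]
    print(extract_features(essay))
```

A feature dictionary like this would then be paired with human ratings to train the scoring model, which is where the manual annotation effort mentioned above comes in.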

In the context of the Japanese language, conventional machine learning-incorporated AES tools include Jess (Ishioka and Kameda, 2006) and JWriter (Lee and Hasebe, 2017). Jess assesses essays by deducting points from the perfect score, utilizing the Mainichi Daily News newspaper as a database. The evaluation criteria employed by Jess encompass various aspects, such as rhetorical elements (e.g., reading comprehension, vocabulary diversity, percentage of complex words, and percentage of passive sentences), organizational structures (e.g., forward and reverse connection structures), and content analysis (e.g., latent semantic indexing). JWriter employs linear regression analysis to assign weights to various measurement indices, such as average sentence length and total number of characters. These weights are then combined to derive the overall score. A pilot study involving the Jess model was conducted on 1320 essays at different proficiency levels, including primary, intermediate, and advanced. However, the results indicated that the Jess model failed to significantly distinguish between these essay levels. Out of the 16 measures used, four measures, namely median sentence length, median clause length, median number of phrases, and maximum number of phrases, did not show statistically significant differences between the levels. Additionally, two measures exhibited between-level differences but lacked linear progression: the number of attributive declined words and the Kanji/kana ratio. On the other hand, the remaining measures, including maximum sentence length, maximum clause length, number of attributive conjugated words, maximum number of consecutive infinitive forms, maximum number of conjunctive-particle clauses, k characteristic value, percentage of big words, and percentage of passive sentences, demonstrated statistically significant between-level differences and displayed linear progression.
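The weighted combination described for JWriter can be sketched as a plain linear model. This is only a minimal sketch: the index names, the weights, and the intercept below are hypothetical placeholders, not JWriter's published values, and in a real system they would come from fitting a linear regression to human-rated essays.

```python
# Hypothetical weights and intercept, for illustration only.
WEIGHTS = {"avg_sentence_length": 0.05, "total_characters": 0.002}
INTERCEPT = 1.0


def linear_score(features, weights=WEIGHTS, intercept=INTERCEPT):
    """Combine measurement indices into one score: intercept + sum(weight * index)."""
    missing = set(weights) - set(features)
    if missing:
        raise KeyError(f"missing measurement indices: {sorted(missing)}")
    return intercept + sum(weight * features[name] for name, weight in weights.items())


if __name__ == "__main__":
    essay_indices = {"avg_sentence_length": 20.0, "total_characters": 500}
    print(linear_score(essay_indices))
```

Because the score is a fixed weighted sum, anyone who knows the weights can see exactly which indices raise it, which is one reason such models are easy to interpret.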

Both Jess and JWriter exhibit notable limitations, including the manual selection of feature parameters and weights, which can introduce biases into the scoring process. The reliance on human annotators to label non-native language essays also introduces potential noise and variability in the scoring. Furthermore, an important concern is the possibility of system manipulation and cheating by learners who are aware of the regression equation utilized by the models (Hirao et al. 2020). These limitations emphasize the need for further advancements in AES systems to address these challenges.

Deep learning technology in AES

Deep learning has emerged as one of the approaches for improving the accuracy and effectiveness of AES. Deep learning-based AES methods utilize artificial neural networks that mimic the human brain’s functioning through layered algorithms and computational units. Unlike conventional machine learning, deep learning autonomously learns from the environment and past errors without human intervention. This enables deep learning models to establish nonlinear correlations, resulting in higher accuracy. Recent advancements in deep learning have led to the development of transformers, which are particularly effective in learning text representations. Noteworthy examples include bidirectional encoder representations from transformers (BERT) (Devlin et al. 2019) and the generative pretrained transformer (GPT) (OpenAI).

BERT is a language representation model that utilizes a transformer architecture and is trained on two tasks: masked language modeling and next-sentence prediction (Hirao et al. 2020; Vaswani et al. 2017). In the context of AES, BERT follows specific procedures, as illustrated in Fig. 1: (a) the tokenized prompts and essays are taken as input; (b) special tokens, such as [CLS] and [SEP], are added to mark the beginning and separation of prompts and essays; (c) the transformer encoder processes the prompt and essay sequences, resulting in hidden layer sequences; (d) the hidden layers corresponding to the [CLS] tokens (T[CLS]) represent distributed representations of the prompts and essays; and (e) a multilayer perceptron uses these distributed representations as input to obtain the final score (Hirao et al. 2020).

Fig. 1. AES system with BERT (Hirao et al. 2020).

Training BERT on a substantial amount of sentence data through the masked language model (MLM) objective allows it to capture contextual information within its hidden layers. Consequently, BERT is expected to be capable of identifying artificial essays as invalid and assigning them lower scores (Mizumoto and Eguchi, 2023). In the context of AES for nonnative Japanese learners, Hirao et al. (2020) combined the long short-term memory (LSTM) model proposed by Hochreiter and Schmidhuber (1997) with BERT to develop a tailored automated essay scoring system. Their findings revealed that the BERT model outperformed both the conventional machine learning approach utilizing character-type features such as “kanji” and “hiragana” and the standalone LSTM model. Takeuchi et al. (2021) presented an approach to Japanese AES that eliminates the requirement for pre-scored essays by relying solely on reference texts or a model answer for the essay task. They investigated multiple similarity evaluation methods, including frequency of morphemes, idf values calculated on Wikipedia, LSI, LDA, word-embedding vectors, and document vectors produced by BERT. The experimental findings revealed that the method utilizing the frequency of morphemes with idf values exhibited the strongest correlation with human-annotated scores across different essay tasks. The utilization of BERT in AES nevertheless encounters several limitations. First, essays often exceed the model’s maximum length limit. Second, only score labels are available for training, which restricts access to additional information.

Mizumoto and Eguchi (2023) were pioneers in employing the GPT model for AES in non-native English writing. Their study evaluated the accuracy and reliability of AES using the GPT-3 text-davinci-003 model, analyzing a dataset of 12,100 essays from the corpus of nonnative written English (TOEFL11). The findings indicated that AES utilizing the GPT-3 model exhibited a certain degree of accuracy and reliability, and the authors suggest that GPT-3-based AES systems hold the potential to support human ratings. However, applying the GPT model to AES presents a unique natural language processing (NLP) task that involves considerations such as nonnative language proficiency, the influence of the learner’s first language on the output in the target language, and identifying linguistic features that best indicate writing quality in a specific language. These linguistic features may differ morphologically or syntactically from those present in the learners’ first language, as observed in (1)–(3).

(1)

我-送了-他-一本-书

Wǒ-sòngle-tā-yī běn-shū

1sg-give.past-him-one.cl-book

“I gave him a book.”

(2) Agglutinative

彼-に-本-を-あげ-まし-た

Kare-ni-hon-o-age-mashi-ta

3sg-dat-book-acc-give.honorification.past

(3) Inflectional

give, give-s, gave, given, giving

Additionally, the morphological agglutination and subject-object-verb (SOV) order in Japanese, along with its idiomatic expressions, pose additional challenges for applying language models in AES tasks (4).

足-が 棒-に なり-ました

Ashi-ga bō-ni nari-mashita

leg- nom stick- dat become- past

“My leg became like a stick (I am extremely tired).”

The example sentence demonstrates the morpho-syntactic structure of Japanese and the presence of an idiomatic expression. In this sentence, the verb “なる” (naru), meaning “to become”, appears at the end of the sentence. The verb stem “なり” (nari) is followed by morphemes indicating honorification (“ます”, masu) and tense (“た”, ta), showcasing agglutination. While the sentence can be literally translated as “my leg became like a stick”, it carries an idiomatic interpretation that implies “I am extremely tired”.

To overcome this issue, CyberAgent Inc. (2023) has developed the Open-Calm series of language models specifically designed for Japanese. Open-Calm consists of pre-trained models available in various sizes, such as Small, Medium, Large, and 7B. Figure 2 depicts the fundamental structure of the Open-Calm model. A key feature of this architecture is the incorporation of the LoRA adapter and the GPT-NeoX framework, which can enhance its language processing capabilities.

Fig. 2. GPT-NeoX Model Architecture (Okgetheng and Takeuchi 2024).

Okgetheng and Takeuchi (2024) assessed the efficacy of Open-Calm language models in grading Japanese essays. The research utilized a dataset of approximately 300 essays annotated by native Japanese educators. The findings demonstrate the considerable potential of Open-Calm language models in automated Japanese essay scoring. Among the Open-Calm family, the Open-Calm Large model (referred to as OCLL) exhibited the highest performance. However, as of the current date, the Open-Calm Large model does not offer public access to its server, so users must deploy and operate the environment for OCLL independently. Doing so requires a PC equipped with an NVIDIA GeForce RTX 3060 (8 or 12 GB VRAM).

In summary, while the potential of LLMs in automated scoring of nonnative Japanese essays has been demonstrated in two studies—BERT-driven AES (Hirao et al. 2020 ) and OCLL-based AES (Okgetheng and Takeuchi, 2024 )—the number of research efforts in this area remains limited.

Another significant challenge in applying LLMs to AES lies in prompt engineering and ensuring its reliability and effectiveness (Brown et al. 2020; Rae et al. 2021; Zhang et al. 2021). Various prompting strategies have been proposed, such as the zero-shot chain of thought (CoT) approach (Kojima et al. 2022), which involves manually crafting diverse and effective examples. However, manual efforts can lead to mistakes. To address this, Zhang et al. (2021) introduced an automatic CoT prompting method called Auto-CoT, which demonstrates matching or superior performance compared to the CoT paradigm. Another prompt framework is tree of thoughts, which enables a model to self-evaluate its progress at intermediate stages of problem-solving through deliberate reasoning (Yao et al. 2023).

Beyond linguistic studies, there has been a noticeable increase in the number of foreign workers in Japan and Japanese learners worldwide (Ministry of Health, Labor, and Welfare of Japan, 2022 ; Japan Foundation, 2021 ). However, existing assessment methods, such as the Japanese Language Proficiency Test (JLPT), J-CAT, and TTBJ Footnote 1 , primarily focus on reading, listening, vocabulary, and grammar skills, neglecting the evaluation of writing proficiency. As the number of workers and language learners continues to grow, there is a rising demand for an efficient AES system that can reduce costs and time for raters and be utilized for employment, examinations, and self-study purposes.

This study aims to explore the potential of LLM-based AES by comparing the effectiveness of five models: two LLMs (GPT Footnote 2 and BERT), one Japanese local LLM (OCLL), and two conventional machine learning-based methods (linguistic feature-based scoring tools - Jess and JWriter).

The research questions addressed in this study are as follows:

To what extent do the LLM-driven AES and linguistic feature-based AES, when used as automated tools to support human rating, accurately reflect test takers’ actual performance?

What influence does the prompt have on the accuracy and performance of LLM-based AES methods?

The subsequent sections of the manuscript cover the methodology, including the assessment measures for nonnative Japanese writing proficiency, criteria for prompts, and the dataset. The evaluation section focuses on the analysis of annotations and rating scores generated by LLM-driven and linguistic feature-based AES methods.

Methodology

The dataset utilized in this study was obtained from the International Corpus of Japanese as a Second Language (I-JAS) Footnote 3 . This corpus consisted of 1000 participants who represented 12 different first languages. For the study, the participants were given a story-writing task on a personal computer. They were required to write two stories based on the 4-panel illustrations titled “Picnic” and “The key” (see Appendix A). Background information for the participants was provided by the corpus, including their Japanese language proficiency levels assessed through two online tests: J-CAT and SPOT. These tests evaluated their reading, listening, vocabulary, and grammar abilities. The learners’ proficiency levels were categorized into six levels aligned with the Common European Framework of Reference for Languages (CEFR) and the Reference Framework for Japanese Language Education (RFJLE): A1, A2, B1, B2, C1, and C2. According to Lee et al. ( 2015 ), there is a high level of agreement (r = 0.86) between the J-CAT and SPOT assessments, indicating that the proficiency certifications provided by J-CAT are consistent with those of SPOT. However, it is important to note that the scores of J-CAT and SPOT do not have a one-to-one correspondence. In this study, the J-CAT scores were used as a benchmark to differentiate learners of different proficiency levels. A total of 1400 essays were utilized, representing the beginner (aligned with A1), A2, B1, B2, C1, and C2 levels based on the J-CAT scores. Table 1 provides information about the learners’ proficiency levels and their corresponding J-CAT and SPOT scores.

A dataset comprising a total of 1400 essays from the story writing tasks was collected. Among these, 714 essays were utilized to evaluate the reliability of the LLM-based AES method, while the remaining 686 essays were designated as development data to assess the LLM-based AES’s capability to distinguish participants with varying proficiency levels. The GPT 4 API was used in this study. A detailed explanation of the prompt-assessment criteria is provided in Section Prompt . All essays were sent to the model for measurement and scoring.

Measures of writing proficiency for nonnative Japanese

Japanese exhibits a morphologically agglutinative structure where morphemes are attached to the word stem to convey grammatical functions such as tense, aspect, voice, and honorifics, e.g. (5).

食べ-させ-られ-まし-た-か

tabe-sase-rare-mashi-ta-ka

[eat (stem)-causative-passive voice-honorification-tense. past-question marker]

Japanese employs nine case particles to indicate grammatical functions, including the nominative case particle が (ga), the accusative case particle を (o), the genitive case particle の (no), the dative case particle に (ni), the locative/instrumental case particle で (de), the ablative case particle から (kara), the directional case particle へ (e), and the comitative case particle と (to). The agglutinative nature of the language, combined with the case particle system, provides an efficient means of distinguishing between active and passive voice, either through morphemes or case particles, e.g. 食べる taberu “eat conclusive.” (active voice); 食べられる taberareru “eat conclusive.” (passive voice). In the active voice, “パン を 食べる” (pan o taberu) translates to “to eat bread”. In the passive voice, it becomes “パン が 食べられた” (pan ga taberareta), which means “(the) bread was eaten”. Additionally, different conjugations of the same lemma are counted as one type in order to ensure a comprehensive assessment of the language features. For example, 食べる taberu “eat conclusive.”, 食べている tabeteiru “eat progress.”, and 食べた tabeta “eat past.” are counted as one type.

To incorporate these features, previous research (Suzuki, 1999 ; Watanabe et al. 1988 ; Ishioka, 2001 ; Ishioka and Kameda, 2006 ; Hirao et al. 2020 ) has identified complexity, fluency, and accuracy as crucial factors for evaluating writing quality. These criteria are assessed through various aspects, including lexical richness (lexical density, diversity, and sophistication), syntactic complexity, and cohesion (Kyle et al. 2021 ; Mizumoto and Eguchi, 2023 ; Ure, 1971 ; Halliday, 1985 ; Barkaoui and Hadidi, 2020 ; Zenker and Kyle, 2021 ; Kim et al. 2018 ; Lu, 2017 ; Ortega, 2015 ). Therefore, this study proposes five scoring categories: lexical richness, syntactic complexity, cohesion, content elaboration, and grammatical accuracy. A total of 16 measures were employed to capture these categories. The calculation process and specific details of these measures can be found in Table 2 .

The T-unit, first introduced by Hunt (1966), is a measure used for evaluating speech and composition. It serves as an indicator of syntactic development and represents the shortest units into which a piece of discourse can be divided without leaving any sentence fragments. In the context of Japanese language assessment, Sakoda and Hosoi (2020) utilized the T-unit as the basic unit to assess the accuracy and complexity of Japanese learners’ speaking and storytelling. The calculation of T-units in Japanese follows these principles:

A single main clause constitutes 1 T-unit, regardless of the presence or absence of dependent clauses, e.g. (6).

ケンとマリはピクニックに行きました (main clause): 1 T-unit.

If a sentence contains a main clause along with subclauses, each subclause is considered part of the same T-unit, e.g. (7).

天気が良かった の で (subclause)、ケンとマリはピクニックに行きました (main clause): 1 T-unit.

In the case of coordinate clauses, where multiple clauses are connected, each coordinated clause is counted separately. Thus, a sentence with coordinate clauses may have 2 T-units or more, e.g. (8).

ケンは地図で場所を探して (coordinate clause)、マリはサンドイッチを作りました (coordinate clause): 2 T-units.
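The three counting principles above can be expressed as a short Python sketch. It assumes that each sentence has already been segmented into clauses labelled `"main"`, `"sub"`, or `"coordinate"`; these labels and the function name `count_t_units` are hypothetical, and the clause segmentation itself is not shown.

```python
def count_t_units(clause_labels):
    """Count T-units in one sentence from its clause labels.

    A main clause or a coordinate clause each opens one T-unit;
    a subclause belongs to the T-unit of the clause it depends on.
    """
    return sum(1 for label in clause_labels if label in ("main", "coordinate"))


example_6 = count_t_units(["main"])                      # main clause only
example_7 = count_t_units(["sub", "main"])               # subclause + main clause
example_8 = count_t_units(["coordinate", "coordinate"])  # two coordinate clauses
```

Applied to examples (6) to (8), the function returns 1, 1, and 2 T-units respectively, matching the counts given in the text.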

Lexical diversity refers to the range of words used within a text (Engber, 1995 ; Kyle et al. 2021 ) and is considered a useful measure of the breadth of vocabulary in L n production (Jarvis, 2013a , 2013b ).

The type/token ratio (TTR) is widely recognized as a straightforward measure for calculating lexical diversity and has been employed in numerous studies. These studies have demonstrated a strong correlation between TTR and other methods of measuring lexical diversity (e.g., Bentz et al. 2016 ; Čech and Miroslav, 2018 ; Çöltekin and Taraka, 2018 ). TTR is computed by considering both the number of unique words (types) and the total number of words (tokens) in a given text. Given that the length of learners’ writing texts can vary, this study employs the moving average type-token ratio (MATTR) to mitigate the influence of text length. MATTR is calculated using a 50-word moving window. Initially, a TTR is determined for words 1–50 in an essay, followed by words 2–51, 3–52, and so on until the end of the essay is reached (Díez-Ortega and Kyle, 2023 ). The final MATTR scores were obtained by averaging the TTR scores for all 50-word windows. The following formula was employed to derive MATTR:

\({\rm{MATTR}}({\rm{W}})=\frac{{\sum }_{{\rm{i}}=1}^{{\rm{N}}-{\rm{W}}+1}{{\rm{F}}}_{{\rm{i}}}}{{\rm{W}}({\rm{N}}-{\rm{W}}+1)}\)

Here, N refers to the number of tokens in the text, W is the chosen window size in tokens (W < N), set to 50 in this study, and \({F}_{i}\) is the number of types in the i-th window. \({\rm{MATTR}}({\rm{W}})\) is the mean of the type-token ratios (TTRs), based on word forms, over all windows. It is expected that individuals with higher language proficiency will produce texts with greater lexical diversity, as indicated by higher MATTR scores.
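The moving-window calculation can be sketched in a few lines of Python. This is a minimal illustration rather than the study's implementation: it assumes the essay is already tokenized, and the fallback to plain TTR for texts no longer than the window is an assumption, since the formula itself presupposes W < N.

```python
def mattr(tokens, window=50):
    """Moving average type-token ratio over all windows of `window` tokens."""
    n = len(tokens)
    if n == 0:
        raise ValueError("tokens must not be empty")
    if n <= window:
        # Assumed fallback: plain TTR when the text is not longer than the window.
        return len(set(tokens)) / n
    # One TTR per window: tokens 1..W, 2..W+1, ..., N-W+1..N.
    ttrs = [len(set(tokens[i:i + window])) / window for i in range(n - window + 1)]
    return sum(ttrs) / len(ttrs)


# Toy example with a window of 2 tokens: windows are [a, a] and [a, b].
toy_score = mattr(["a", "a", "b"], window=2)
```

In the toy example the two windows have TTRs of 0.5 and 1.0, so the MATTR is their mean, 0.75.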

Lexical density was captured by the ratio of the number of lexical words to the total number of words (Lu, 2012). Lexical sophistication refers to the utilization of advanced vocabulary, often evaluated through word frequency indices (Crossley et al. 2013; Haberman, 2008; Kyle and Crossley, 2015; Laufer and Nation, 1995; Lu, 2012; Read, 2000). In writing, lexical sophistication can be interpreted as vocabulary breadth, which entails the appropriate usage of vocabulary items across various lexico-grammatical contexts and registers (Garner et al. 2019; Kim et al. 2018; Kyle et al. 2018). In Japanese specifically, words are considered lexically sophisticated if they are not included in the “Japanese Education Vocabulary List Ver 1.0”. Footnote 4 Consequently, lexical sophistication was calculated as the number of sophisticated word types relative to the total number of words per essay. Furthermore, it has been suggested that, in Japanese writing, sentences should ideally have a length of no more than 40 to 50 characters, as this promotes readability. Therefore, the median and maximum sentence length can be considered useful indices for assessment (Ishioka and Kameda, 2006).
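The two ratios can be sketched as follows. The part-of-speech labels treated as lexical words and the tiny `basic_vocab` set are assumptions made for illustration only; in the study the reference list is the “Japanese Education Vocabulary List Ver 1.0”, which is not reproduced here.

```python
# Assumed set of content-word tags (Universal Dependencies-style labels).
LEXICAL_POS = frozenset({"NOUN", "VERB", "ADJ", "ADV"})


def lexical_density(pos_tags):
    """Number of lexical (content) words divided by the total number of words."""
    return sum(tag in LEXICAL_POS for tag in pos_tags) / len(pos_tags)


def lexical_sophistication(tokens, basic_vocab):
    """Number of word types absent from the basic list, divided by total words."""
    sophisticated_types = {tok for tok in tokens if tok not in basic_vocab}
    return len(sophisticated_types) / len(tokens)


# Toy sentence fragment with hypothetical tags and a hypothetical basic list.
tokens = ["犬", "が", "バスケット", "に", "入る"]
pos_tags = ["NOUN", "ADP", "NOUN", "ADP", "VERB"]
basic_vocab = {"犬", "が", "に", "入る"}

density = lexical_density(pos_tags)
sophistication = lexical_sophistication(tokens, basic_vocab)
```

Here three of the five tokens are content words, and one word type falls outside the toy basic list.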

Syntactic complexity was assessed based on several measures, including the mean length of clauses, verb phrases per T-unit, clauses per T-unit, dependent clauses per T-unit, complex nominals per clause, adverbial clauses per clause, coordinate phrases per clause, and mean dependency distance (MDD). The MDD reflects the distance between the governor and dependent positions in a sentence. A larger dependency distance indicates a higher cognitive load and greater complexity in syntactic processing (Liu, 2008; Liu et al. 2017). The MDD has been established as an efficient metric for measuring syntactic complexity (Jiang, Ouyang, and Liu, 2019; Li and Yan, 2021). To calculate the dependency distance, the position number of the dependent is subtracted from that of the governor, assuming that words in a sentence are assigned position numbers in linear order, such as W1 … Wi … Wn. In any dependency relationship between words Wa and Wb, Wa is the governor and Wb is the dependent. The MDD of the entire sentence is obtained by averaging the absolute values of these governor–dependent differences:

MDD = \(\frac{1}{n}{\sum }_{i=1}^{n}|{\rm{D}}{{\rm{D}}}_{i}|\)

In this formula, \(n\) represents the number of dependency relations in the sentence, and \({DD}_{i}\) is the dependency distance of the \({i}^{{th}}\) dependency relationship of the sentence. For example, the sentence ‘Mary-ga-John-ni-keshigomu-o-watashita’ is annotated as [Mary-nom-John-dat-eraser-acc-give-past], and its MDD would be 2. Table 3 provides the CSV file as a prompt for GPT 4.
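The MDD of the example sentence can be reproduced with a short Python sketch. Two assumptions are made here: the units are phrase-level chunks (each noun together with its case particle), and all three argument phrases depend directly on the verb. Averaging over the dependency relations, rather than over all units, is likewise an assumption, chosen because it yields the value of 2 reported for this sentence.

```python
def mean_dependency_distance(heads):
    """Mean absolute governor-dependent distance for one sentence.

    heads[i] is the 1-based position of the governor of unit i + 1;
    0 marks the root, which has no governor and is skipped.
    """
    distances = [
        abs(head - position)
        for position, head in enumerate(heads, start=1)
        if head != 0
    ]
    if not distances:
        raise ValueError("sentence has no dependency relations")
    return sum(distances) / len(distances)


# Units: 1 Mary-ga, 2 John-ni, 3 keshigomu-o, 4 watashita (root).
# Assumed analysis: units 1-3 all depend on the verb at position 4.
example_mdd = mean_dependency_distance([4, 4, 4, 0])
```

The three distances are 3, 2, and 1, so their mean is 2.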

Cohesion (semantic similarity) and content elaboration aim to capture the ideas presented in test takers’ essays. Cohesion was assessed using three measures: synonym overlap/paragraph (topic), synonym overlap/paragraph (keywords), and word2vec cosine similarity. Content elaboration and development were measured as the number of metadiscourse markers (type)/number of words. To capture content closely, this study proposed a novel distance-based representation, encoding the cosine distance between the i-vectors of the learner’s essay and of the essay task (topic and keyword). The learner’s essay is decoded into a word sequence and aligned to the essay task’s topic and keyword for log-likelihood measurement. The cosine distance reveals the content elaboration score of the learner’s essay. The mathematical equation of cosine similarity between target-reference vectors is shown in (11), assuming that \(({L}_{1},\ldots ,{L}_{n})\) and \(({N}_{1},\ldots ,{N}_{n})\) are the vectors representing the learner’s essay and the task’s topic and keyword, respectively. The content elaboration distance between L and N was calculated as follows:

\(\cos \left(\theta \right)=\frac{{\rm{L}}\,\cdot\, {\rm{N}}}{\left|{\rm{L}}\right|{\rm{|N|}}}=\frac{\mathop{\sum }\nolimits_{i=1}^{n}{L}_{i}{N}_{i}}{\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{L}_{i}^{2}}\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{N}_{i}^{2}}}\)

A high similarity value indicates a low difference between the two recognition outcomes, which in turn suggests a high level of proficiency in content elaboration.
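As a minimal illustration of equation (11), the following Python function computes the cosine similarity of two equal-length vectors. The toy vectors are hypothetical stand-ins; how the essay and task vectors are actually obtained is outside the scope of this sketch.

```python
import math


def cosine_similarity(l_vec, n_vec):
    """Cosine of the angle between the learner vector L and the task vector N."""
    if len(l_vec) != len(n_vec):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(l_vec, n_vec))
    norm_l = math.sqrt(sum(a * a for a in l_vec))
    norm_n = math.sqrt(sum(b * b for b in n_vec))
    if norm_l == 0 or norm_n == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_l * norm_n)


# Hypothetical vectors: parallel vectors score near 1, orthogonal ones score 0.
parallel = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```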

To evaluate the effectiveness of the proposed measures in distinguishing different proficiency levels among nonnative Japanese speakers’ writing, we conducted a multi-faceted Rasch measurement analysis (Linacre, 1994 ). This approach applies measurement models to thoroughly analyze various factors that can influence test outcomes, including test takers’ proficiency, item difficulty, and rater severity, among others. The underlying principles and functionality of multi-faceted Rasch measurement are illustrated in (12).

\(\log \left(\frac{{P}_{{nijk}}}{{P}_{{nij}(k-1)}}\right)={B}_{n}-{D}_{i}-{C}_{j}-{F}_{k}\)

Equation (12) defines the logarithm of the probability ratio \({P}_{nijk}/{P}_{nij(k-1)}\) as a function of multiple parameters. Here, n represents the test taker, i denotes a writing proficiency measure, j corresponds to the human rater, and k represents the proficiency score. The parameter \({B}_{n}\) signifies the proficiency level of test taker n (where n ranges from 1 to N). \({D}_{i}\) represents the difficulty parameter of test item i (where i ranges from 1 to L), while \({C}_{j}\) represents the severity of rater j (where j ranges from 1 to J). Additionally, \({F}_{k}\) represents the step difficulty for a test taker to move from score k−1 to k. \({P}_{nijk}\) refers to the probability of rater j assigning score k to test taker n for test item i, and \({P}_{nij(k-1)}\) represents the probability of test taker n being assigned score k−1 by rater j for test item i. Each facet within the test is treated as an independent parameter and estimated within the same reference framework. To evaluate the consistency of scores obtained through both human and computer analysis, we utilized the Infit mean-square statistic. This statistic is a chi-square measure divided by the degrees of freedom and is weighted with information. It demonstrates higher sensitivity to unexpected patterns in responses to items near a person’s proficiency level (Linacre, 2002). Fit statistics are assessed based on predefined thresholds for acceptable fit. For the Infit MNSQ, which has a mean of 1.00, different thresholds have been suggested. Some propose stricter thresholds ranging from 0.7 to 1.3 (Bond et al. 2021), while others suggest more lenient thresholds ranging from 0.5 to 1.5 (Eckes, 2009). In this study, we adopted the criterion of 0.70–1.30 for the Infit MNSQ.
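To make equation (12) concrete, the sketch below turns one set of facet parameters into score-category probabilities. It only illustrates the model equation, not the estimation procedure; all parameter values are hypothetical, and the lowest category is assumed to be the reference with a log-weight of 0.

```python
import math


def category_probabilities(ability, difficulty, severity, steps):
    """Probabilities P_0..P_K implied by log(P_k / P_(k-1)) = B - D - C - F_k.

    `steps` holds the step difficulties F_1..F_K.
    """
    log_weights = [0.0]  # reference category (assumed log-weight 0)
    for f_k in steps:
        log_weights.append(log_weights[-1] + (ability - difficulty - severity - f_k))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_weights)
    weights = [math.exp(w - peak) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical values: B = 1.0, D = 0.2, C = 0.1, F = (-0.5, 0.0, 0.5).
probs = category_probabilities(1.0, 0.2, 0.1, [-0.5, 0.0, 0.5])
```

By construction, the log-ratio of any two adjacent probabilities equals B − D − C − F_k, so a more able test taker, an easier measure, or a more lenient rater all shift probability toward higher scores.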

We next assess the effectiveness of the 16 proposed measures, grouped under five criteria, in distinguishing levels of writing proficiency among non-native Japanese speakers. For this evaluation, we used the development dataset from the I-JAS corpus, as described in Section Dataset. Table 4 provides a measurement report that presents the performance details of the 14 metrics under consideration. The measure separation was found to be 4.02, indicating a clear differentiation among the measures. The reliability index for the measure separation was 0.891, suggesting consistency in the measurement. Similarly, the person separation reliability index was 0.802, indicating the accuracy of the assessment in distinguishing between individuals. All 16 measures demonstrated Infit mean squares within a reasonable range, from 0.76 to 1.28. The Synonym overlap/paragraph (topic) measure exhibited a relatively high Outfit mean square of 1.46, although its Infit mean square falls within the acceptable range. The standard error for the measures ranged from 0.13 to 0.28, indicating the precision of the estimates.

Table 5 further illustrates the weights assigned to different linguistic measures for score prediction, with higher weights indicating stronger correlations between those measures and higher scores. The following measures exhibited higher weights than the others: moving average type-token ratio per essay (0.0391); mean dependency distance (0.0388); complex nominals per T-unit, calculated by dividing the number of complex nominals by the number of T-units (0.0379); mean length of clause, calculated by dividing the number of words by the number of clauses (0.0374); coordinate phrases rate, calculated by dividing the number of coordinate phrases by the number of clauses (0.0325); and grammatical error rate, representing the number of errors per essay (0.0322).

Criteria (output indicator)

The criteria used to evaluate the writing ability in this study were based on CEFR, which follows a six-point scale ranging from A1 to C2. To assess the quality of Japanese writing, the scoring criteria from Table 6 were utilized. These criteria were derived from the IELTS writing standards and served as assessment guidelines and prompts for the written output.

A prompt is a question or detailed instruction that is provided to the model to obtain a proper response. After several pilot experiments, we decided to provide the measures (Section Measures of writing proficiency for nonnative Japanese) as the input prompt and use the criteria (Section Criteria (output indicator)) as the output indicator. Regarding the prompt language, given that the LLM was tasked with rating Japanese essays, would a prompt in Japanese work better (Footnote 5)? We conducted experiments comparing the performance of GPT-4 using both English and Japanese prompts. Additionally, we utilized the Japanese local model OCLL with Japanese prompts. Multiple trials were conducted using the same sample. Regardless of the prompt language used, we consistently obtained the same grading result with GPT-4, which assigned a grade of B1 to the writing sample. This suggested that GPT-4 is reliable and capable of producing consistent ratings regardless of the prompt language. In contrast, when we used Japanese prompts with OCLL, we encountered inconsistent grading results. Out of 10 attempts with OCLL, only 6 yielded the same grade (B1), while the remaining 4 showed different outcomes, including A1 and B2 grades. These findings indicated that the language of the prompt was not the determining factor for reliable AES. Instead, the size of the training data and the number of model parameters played crucial roles in achieving consistent and reliable AES results.

The following is the utilized prompt, which details all measures and requires the LLM to score the essays using holistic and trait scores.

Please evaluate Japanese essays written by Japanese learners and assign a score to each essay on a six-point scale, ranging from A1, A2, B1, B2, C1 to C2. Additionally, please provide trait scores and display the calculation process for each trait score. The scoring should be based on the following criteria:

Moving average type-token ratio.

Number of lexical words (token) divided by the total number of words per essay.

Number of sophisticated word types divided by the total number of words per essay.

Mean length of clause.

Verb phrases per T-unit.

Clauses per T-unit.

Dependent clauses per T-unit.

Complex nominals per clause.

Adverbial clauses per clause.

Coordinate phrases per clause.

Mean dependency distance.

Synonym overlap paragraph (topic and keywords).

Word2vec cosine similarity.

Connectives per essay.

Conjunctions per essay.

Number of metadiscourse markers (types) divided by the total number of words.

Number of errors per essay.

Japanese essay text

出かける前に二人が地図を見ている間に、サンドイッチを入れたバスケットに犬が入ってしまいました。それに気づかずに二人は楽しそうに出かけて行きました。やがて突然犬がバスケットから飛び出し、二人は驚きました。バスケット の 中を見ると、食べ物はすべて犬に食べられていて、二人は困ってしまいました。(ID_JJJ01_SW1)

The score of the example above was B1. Figure 3 provides an example of holistic and trait scores provided by GPT-4 (with a prompt indicating all measures) via Bing Footnote 6 .

Fig. 3. Example of GPT-4 AES and feedback (with a prompt indicating all measures).
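The prompt described in this section combines an instruction, a list of measures, and the essay text. A minimal Python sketch of how such a prompt string could be assembled is given here; the helper `build_prompt`, the numbering of the measures, and the abbreviated instruction and criteria list are assumptions for illustration, and no model API call is shown.

```python
# Abbreviated instruction and criteria (the full wording appears in the prompt above).
INSTRUCTION = (
    "Please evaluate Japanese essays written by Japanese learners and assign "
    "a score to each essay on a six-point scale, ranging from A1, A2, B1, B2, "
    "C1 to C2."
)
CRITERIA = [
    "Moving average type-token ratio.",
    "Mean dependency distance.",
    "Number of errors per essay.",
]


def build_prompt(instruction, criteria, essay):
    """Join the instruction, a numbered criteria list, and the essay text."""
    numbered = "\n".join(f"{k}. {c}" for k, c in enumerate(criteria, start=1))
    return f"{instruction}\n\n{numbered}\n\nJapanese essay text\n{essay}"


prompt = build_prompt(INSTRUCTION, CRITERIA, "それに気づかずに二人は楽しそうに出かけて行きました。")
```

Keeping the measures in a list makes it straightforward to compare prompts that include all measures with prompts that include only a subset.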

Statistical analysis

The aim of this study is to investigate the potential use of LLM for nonnative Japanese AES. It seeks to compare the scoring outcomes obtained from feature-based AES tools, which rely on conventional machine learning technology (i.e. Jess, JWriter), with those generated by AI-driven AES tools utilizing deep learning technology (BERT, GPT, OCLL). To assess the reliability of a computer-assisted annotation tool, the study initially established human-human agreement as the benchmark measure. Subsequently, the performance of the LLM-based method was evaluated by comparing it to human-human agreement.

To assess annotation agreement, the study employed standard measures such as precision, recall, and F-score (Brants 2000; Lu 2010), along with the quadratically weighted kappa (QWK). Let A and B denote the two human annotators whose annotations are compared. Precision, recall, and F-score are defined in equations (13) to (15).

\({\rm{Recall}}(A,B)=\frac{{\rm{Number}}\,{\rm{of}}\,{\rm{identical}}\,{\rm{nodes}}\,{\rm{in}}\,A\,{\rm{and}}\,B}{{\rm{Number}}\,{\rm{of}}\,{\rm{nodes}}\,{\rm{in}}\,A}\)

\({\rm{Precision}}(A,\,B)=\frac{{\rm{Number}}\,{\rm{of}}\,{\rm{identical}}\,{\rm{nodes}}\,{\rm{in}}\,A\,{\rm{and}}\,B}{{\rm{Number}}\,{\rm{of}}\,{\rm{nodes}}\,{\rm{in}}\,B}\)

The F-score is the harmonic mean of recall and precision:

\({\rm{F}}-{\rm{score}}=\frac{2* ({\rm{Precision}}* {\rm{Recall}})}{{\rm{Precision}}+{\rm{Recall}}}\)

The highest possible value of an F-score is 1.0, indicating perfect precision and recall, and the lowest possible value is 0, which occurs if either precision or recall is zero.
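Equations (13) to (15) can be illustrated with a small Python sketch that treats each annotator's output as a set of annotated nodes. The node identifiers in the example are hypothetical, and returning 0 when a set is empty is an assumed convention.

```python
def agreement_scores(nodes_a, nodes_b):
    """Precision, recall, and F-score of annotator B measured against annotator A."""
    a, b = set(nodes_a), set(nodes_b)
    identical = len(a & b)  # nodes annotated identically by A and B
    recall = identical / len(a) if a else 0.0
    precision = identical / len(b) if b else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical node ids: A marks four nodes, B marks three, two are shared.
precision, recall, f_score = agreement_scores({1, 2, 3, 4}, {3, 4, 5})
```

With two shared nodes, recall is 2/4, precision is 2/3, and the F-score is their harmonic mean, 4/7.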

In accordance with Taghipour and Ng ( 2016 ), the calculation of QWK involves two steps:

Step 1: Construct a weight matrix W as follows:

\({W}_{{ij}}=\frac{{(i-j)}^{2}}{{(N-1)}^{2}}\)

Here, i represents the annotation made by the tool, while j represents the annotation made by a human rater. N denotes the total number of possible annotations. Matrix O is subsequently computed, where \({O}_{i,j}\) represents the count of data annotated as i by the tool and as j by the human annotator. E refers to the expected count matrix, computed as the outer product of the two annotators’ annotation histograms and normalized so that the sum of elements in E matches the sum of elements in O.

Step 2: With matrices O and E, the QWK is obtained as follows:

\(K=1-\frac{{\sum }_{i,j}{W}_{i,j}\,{O}_{i,j}}{{\sum }_{i,j}{W}_{i,j}\,{E}_{i,j}}\)
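The two steps can be sketched in plain Python as follows. The sketch assumes that annotations are coded as integers from 0 to N − 1 (for example, A1 = 0 through C2 = 5, a hypothetical coding) and builds E from the two marginal histograms so that it sums to the same total as O.

```python
def quadratic_weighted_kappa(tool, human, num_categories):
    """QWK between tool and human annotations coded as integers 0..N-1."""
    n = num_categories
    if len(tool) != len(human) or not tool:
        raise ValueError("need two non-empty annotation lists of equal length")
    # Observed matrix O: rows index the tool's label, columns the human's label.
    observed = [[0] * n for _ in range(n)]
    for i, j in zip(tool, human):
        observed[i][j] += 1
    total = len(tool)
    tool_hist = [sum(row) for row in observed]
    human_hist = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            # Expected count: outer product of histograms, scaled to sum to `total`.
            expected = tool_hist[i] * human_hist[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected
    if denominator == 0:
        raise ValueError("kappa is undefined when all labels fall in one category")
    return 1.0 - numerator / denominator


perfect = quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], num_categories=4)
reversed_order = quadratic_weighted_kappa([0, 1, 2], [2, 1, 0], num_categories=3)
```

Identical annotations give a kappa of 1, whereas the fully reversed toy ratings give −1.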

The value of the quadratic weighted kappa increases as the level of agreement improves. Further, to assess the accuracy of LLM scoring, the proportional reduction in mean squared error (PRMSE) was employed. The PRMSE approach takes into account the variability observed in human ratings to estimate the rater error, which is then subtracted from the variance of the human labels. This calculation provides an overall measure of agreement between the automated scores and true scores (Haberman et al. 2015; Loukina et al. 2020; Taghipour and Ng, 2016). The computation of PRMSE involves the following steps:

Step 1: Calculate the mean squared errors (MSEs) for the scoring outcomes of the computer-assisted tool (MSE tool) and the human scoring outcomes (MSE human).

Step 2: Determine the PRMSE by comparing the MSE of the computer-assisted tool (MSE tool) with the MSE from human raters (MSE human), using the following formula:

\({\rm{PRMSE}}=1-\frac{({\rm{MSE}}\,{\rm{tool}})}{({\rm{MSE}}\,{\rm{human}})}=1-\frac{{\sum }_{i=1}^{n}{({{\rm{y}}}_{i}-{\hat{{\rm{y}}}}_{i})}^{2}}{{\sum }_{i=1}^{n}{({{\rm{y}}}_{i}-\hat{{\rm{y}}})}^{2}}\)

In the numerator, \({\hat{y}}_{i}\) represents the scoring outcome predicted by a specific LLM-driven AES system for a given sample. The term \({y}_{i}-{\hat{y}}_{i}\) represents the difference between this predicted outcome and the mean value of all LLM-driven AES systems’ scoring outcomes. It quantifies the deviation of the specific LLM-driven AES system’s prediction from the average prediction of all LLM-driven AES systems. In the denominator, \({y}_{i}-\hat{y}\) represents the difference between the scoring outcome provided by a specific human rater for a given sample and the mean value of all human raters’ scoring outcomes. It measures the discrepancy between the specific human rater’s score and the average score given by all human raters. The PRMSE is then calculated by subtracting the ratio of the MSE tool to the MSE human from 1. PRMSE falls within the range of 0 to 1, with larger values indicating reduced errors in the LLM’s scoring compared to those of human raters. In other words, a higher PRMSE implies that the LLM’s scoring demonstrates greater accuracy in predicting the true scores (Loukina et al. 2020).

The interpretation of kappa values, which range from −1 to 1, is based on the work of Landis and Koch (1977): −1 indicates complete inconsistency, 0 indicates random agreement, 0.0–0.20 indicates an extremely low level of agreement (slight), 0.21–0.40 a low level of agreement (fair), 0.41–0.60 a medium level of agreement (moderate), 0.61–0.80 a high level of agreement (substantial), and 0.81–1 an almost perfect level of agreement. All statistical analyses were executed using Python scripts.
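The following Python sketch illustrates the PRMSE ratio as printed in the formula, together with a helper that maps a kappa value to the agreement bands listed in the text. It is a simplified illustration under stated assumptions: it does not include the rater-error correction described by Loukina et al. (2020), the score values are hypothetical, and the label "poor" for negative kappa follows Landis and Koch (1977).

```python
def prmse(human_scores, tool_scores):
    """1 - (squared error of the tool) / (squared deviation of human scores)."""
    if len(human_scores) != len(tool_scores) or not human_scores:
        raise ValueError("need two non-empty score lists of equal length")
    mean_human = sum(human_scores) / len(human_scores)
    sse_tool = sum((y - y_hat) ** 2 for y, y_hat in zip(human_scores, tool_scores))
    sse_human = sum((y - mean_human) ** 2 for y in human_scores)
    if sse_human == 0:
        raise ValueError("human scores have zero variance")
    return 1.0 - sse_tool / sse_human


def agreement_band(kappa):
    """Landis and Koch (1977) label for a kappa value."""
    if kappa < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical scores on a numeric scale.
example_prmse = prmse([1, 2, 3, 4], [2, 2, 3, 3])
```

In the toy example the tool's squared error is 2 and the human scores' squared deviation from their mean is 5, giving a PRMSE of 0.6.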

Results and discussion

Annotation reliability of the LLM

This section focuses on assessing the reliability of the LLM’s annotation and scoring capabilities. To evaluate the reliability, several tests were conducted simultaneously, aiming to achieve the following objectives:

Assess the LLM’s ability to differentiate between test takers with varying levels of oral proficiency.

Determine the level of agreement between the annotations and scoring performed by the LLM and those done by human raters.

The evaluation of the results encompassed several metrics, including: precision, recall, F-Score, quadratically-weighted kappa, proportional reduction of mean squared error, Pearson correlation, and multi-faceted Rasch measurement.

Inter-annotator agreement (human–human annotator agreement)

We began with an agreement test between the two human annotators. Two trained annotators were recruited to determine the writing task data measures. A total of 714 scripts were used as the test data. Each analysis lasted 300–360 min. Inter-annotator agreement was evaluated using the standard measures of precision, recall, F-score, and QWK. Table 7 presents the inter-annotator agreement for the various indicators. As shown, the inter-annotator agreement was fairly high, with F-scores ranging from 1.0 for sentence and word number to 0.666 for grammatical errors.
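As an illustration of how precision, recall, and F-score can be computed between two annotators, the sketch below treats one annotator as the reference and the other as the candidate. The representation of annotations as sets of (start, end) spans and the exact-match criterion are assumptions for the example; the text does not state how annotated items were matched.

```python
def precision_recall_f1(reference, candidate):
    """Compare two collections of annotated items by exact match.

    reference : items marked by the annotator treated as the reference
    candidate : items marked by the other annotator
    Returns (precision, recall, f1).
    """
    ref, cand = set(reference), set(candidate)
    true_positives = len(ref & cand)
    precision = true_positives / len(cand) if cand else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical error spans marked by two annotators in one script.
annotator_a = {(0, 3), (5, 9), (12, 15)}
annotator_b = {(0, 3), (5, 9), (20, 22), (30, 31)}

p, r, f = precision_recall_f1(annotator_a, annotator_b)
print(p, round(r, 3), round(f, 3))
```

Swapping the two arguments swaps precision and recall but leaves the F-score unchanged, which is one reason the F-score is a convenient single summary when neither annotator is a gold standard.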

The findings from the QWK analysis provided further confirmation of the inter-annotator agreement. The QWK values ranged from 0.950 (p = 0.000) for sentence and word number to 0.695 (p = 0.001) for synonym overlap number (keyword) and grammatical errors.
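For readers unfamiliar with QWK, the following is a minimal pure-Python sketch of quadratically weighted kappa for two raters on an integer scale, together with a helper that maps a kappa value to the Landis and Koch (1977) bands. The function names, the 1–5 rating scale, and the toy ratings are assumptions for illustration; the study's own implementation is not shown in the text.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Quadratically weighted kappa for two lists of integer ratings."""
    n_cat = max_rating - min_rating + 1
    n = len(rater_a)
    if n == 0 or n != len(rater_b) or n_cat < 2:
        raise ValueError("need equally long, non-empty ratings and >= 2 categories")

    # Observed confusion matrix between the two raters.
    observed = [[0] * n_cat for _ in range(n_cat)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1

    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_cat)) for j in range(n_cat)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_cat):
        for j in range(n_cat):
            weight = (i - j) ** 2 / (n_cat - 1) ** 2
            expected = hist_a[i] * hist_b[j] / n  # agreement expected by chance
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - weighted_observed / weighted_expected


def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch (1977) agreement bands."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Toy ratings on a hypothetical 1-5 scale.
rater_1 = [1, 2, 3, 4, 5]
rater_2 = [1, 2, 3, 4, 4]
kappa = quadratic_weighted_kappa(rater_1, rater_2, 1, 5)
print(round(kappa, 4), landis_koch_label(kappa))
```

The quadratic weight penalizes a disagreement by the squared distance between categories, so a one-point difference on a five-point scale costs far less than a four-point difference.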

Agreement of annotation outcomes between human and LLM

To evaluate the consistency between human annotators and LLM annotators (BERT, GPT, OCLL) across the indices, the same test was conducted. The results of the inter-annotator agreement (F-score) between LLM and human annotation are provided in Appendix B-D. The F-scores ranged from 0.706 for grammatical error number (OCLL-human) to a perfect 1.000 for sentences, clauses, T-units, and words (GPT-human). These findings were further supported by the QWK analysis, which showed agreement levels ranging from 0.807 (p = 0.001) for metadiscourse markers (OCLL-human) to 0.962 (p = 0.000) for words (GPT-human). The findings demonstrated that the LLM annotation achieved a significant level of accuracy in identifying measurement units and counts.

Reliability of LLM-driven AES’s scoring and discriminating proficiency levels

This section examines the reliability of the LLM-driven AES scoring through a comparison of the scoring outcomes produced by human raters and the LLM (Reliability of LLM-driven AES scoring). It also assesses the effectiveness of the LLM-based AES system in differentiating participants with varying proficiency levels (Reliability of LLM-driven AES discriminating proficiency levels).

Reliability of LLM-driven AES scoring

Table 8 summarizes the QWK coefficient analysis between the scores computed by the human raters and GPT-4 for the individual essays from I-JAS (Footnote 7). As shown, the QWK of all measures ranged from k = 0.819 for lexical density (number of lexical words (tokens)/number of words per essay) to k = 0.644 for word2vec cosine similarity. Table 9 further presents the Pearson correlations between the 16 writing proficiency measures scored by human raters and GPT-4 for the individual essays. The correlations ranged from 0.672 for syntactic complexity to 0.734 for grammatical accuracy. The correlations between the writing proficiency scores assigned by human raters and the BERT-based AES system ranged from 0.661 for syntactic complexity to 0.713 for grammatical accuracy. The correlations between the writing proficiency scores given by human raters and the OCLL-based AES system ranged from 0.654 for cohesion to 0.721 for grammatical accuracy. These findings indicated an alignment between the assessments made by human raters and both the BERT-based and OCLL-based AES systems in terms of various aspects of writing proficiency.
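The Pearson correlation used for this comparison can be sketched as follows. The function name and the toy scores are hypothetical and serve only to show the calculation; they are not data from Table 9.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long lists."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two equally long lists with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five essays on one proficiency measure.
human_scores = [2, 3, 4, 5, 3]
model_scores = [2, 3, 5, 5, 2]

r_value = pearson_r(human_scores, model_scores)
print(round(r_value, 3))
```

Unlike QWK, Pearson's r is insensitive to a constant offset between raters, so a system that scores every essay one point too high can still correlate perfectly with human raters; reporting both statistics guards against that blind spot.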

Reliability of LLM-driven AES discriminating proficiency levels

After validating the reliability of the LLM’s annotation and scoring, the subsequent objective was to evaluate its ability to distinguish between various proficiency levels. For this analysis, a dataset of 686 individual essays was utilized. Table 10 presents a sample of the results, summarizing the means, standard deviations, and the outcomes of the one-way ANOVAs based on the measures assessed by the GPT-4 model. A post hoc multiple comparison test, specifically the Bonferroni test, was conducted to identify any potential differences between pairs of levels.
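The one-way ANOVA and the Bonferroni adjustment mentioned above can be sketched as below. This is an illustration under stated assumptions: the group labels, the toy values, and the function names are hypothetical, and the sketch stops at the F statistic and the adjusted significance threshold (obtaining a p-value additionally requires the F distribution, which is not implemented here).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more values than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total

    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical values of one measure at three proficiency levels.
primary = [1.0, 2.0, 3.0]
intermediate = [2.0, 3.0, 4.0]
advanced = [5.0, 6.0, 7.0]

f_stat, df_b, df_w = one_way_anova_f([primary, intermediate, advanced])
# Three levels give three pairwise comparisons.
alpha_per_pair = bonferroni_alpha(0.05, 3)
print(f_stat, df_b, df_w, round(alpha_per_pair, 4))
```

Dividing the significance level by the number of pairwise comparisons keeps the family-wise error rate at or below the nominal level, at the cost of some statistical power.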

As the results reveal, seven measures presented linear upward or downward progress across the three proficiency levels. These were marked in bold in Table 10 and comprise one measure of lexical richness, i.e. MATTR (lexical diversity); four measures of syntactic complexity, i.e. MDD (mean dependency distance), MLC (mean length of clause), CNT (complex nominals per T-unit), and CPC (coordinate phrases rate); one cohesion measure, i.e. word2vec cosine similarity; and GER (grammatical error rate). Regarding the ability of the sixteen measures to distinguish adjacent proficiency levels, the Bonferroni tests indicated that statistically significant differences exist between the primary level and the intermediate level for MLC and GER. One measure of lexical richness, namely LD, along with three measures of syntactic complexity (VPT, CT, DCT, ACC), two measures of cohesion (SOPT, SOPK), and one measure of content elaboration (IMM), exhibited statistically significant differences between proficiency levels. However, these differences did not demonstrate a linear progression between adjacent proficiency levels. No significant difference was observed in lexical sophistication between proficiency levels.

To summarize, our study aimed to evaluate the reliability and differentiation capabilities of the LLM-driven AES method. For the first objective, we assessed the LLM’s ability to differentiate between test takers with varying levels of oral proficiency using precision, recall, F-Score, and quadratically-weighted kappa. Regarding the second objective, we compared the scoring outcomes generated by human raters and the LLM to determine the level of agreement. We employed quadratically-weighted kappa and Pearson correlations to compare the 16 writing proficiency measures for the individual essays. The results confirmed the feasibility of using the LLM for annotation and scoring in AES for nonnative Japanese. As a result, Research Question 1 has been addressed.

Comparison of BERT-, GPT-, OCLL-based AES, and linguistic-feature-based computation methods

This section compares the effectiveness of five AES methods for nonnative Japanese writing, i.e. LLM-driven approaches utilizing BERT, GPT, and OCLL, and linguistic feature-based approaches using Jess and JWriter. The comparison was conducted by comparing the ratings obtained from each approach with human ratings. All ratings were derived from the dataset introduced in the Dataset section. To facilitate the comparison, the agreement between the automated methods and human ratings was assessed using QWK and PRMSE. The performance of each approach is summarized in Table 11.

The QWK coefficient values indicate that LLMs (GPT, BERT, OCLL) and human rating outcomes demonstrated higher agreement compared to feature-based AES methods (Jess and JWriter) in assessing writing proficiency criteria, including lexical richness, syntactic complexity, content, and grammatical accuracy. Among the LLMs, the GPT-4 driven AES and human rating outcomes showed the highest agreement in all criteria, except for syntactic complexity. The PRMSE values suggest that the GPT-based method outperformed linguistic feature-based methods and other LLM-based approaches. Moreover, an interesting finding emerged during the study: the agreement coefficient between GPT-4 and human scoring was even higher than the agreement between different human raters themselves. This discovery highlights the advantage of GPT-based AES over human rating. Ratings involve a series of processes, including reading the learners’ writing, evaluating the content and language, and assigning scores. Within this chain of processes, various biases can be introduced, stemming from factors such as rater biases, test design, and rating scales. These biases can impact the consistency and objectivity of human ratings. GPT-based AES may benefit from its ability to apply consistent and objective evaluation criteria. By prompting the GPT model with detailed writing scoring rubrics and linguistic features, potential biases in human ratings can be mitigated. The model follows a predefined set of guidelines and does not possess the same subjective biases that human raters may exhibit. This standardization in the evaluation process contributes to the higher agreement observed between GPT-4 and human scoring. Section Prompt strategy of the study delves further into the role of prompts in the application of LLMs to AES. It explores how the choice and implementation of prompts can impact the performance and reliability of LLM-based AES methods. 
Furthermore, it is important to acknowledge the strengths of the local model, i.e. the Japanese local model OCLL, which excels in processing certain idiomatic expressions. Nevertheless, our analysis indicated that GPT-4 surpasses local models in AES. This superior performance can be attributed to the larger parameter size of GPT-4, estimated to be between 500 billion and 1 trillion, which exceeds the sizes of both BERT and the local model OCLL.

Prompt strategy

In the context of prompt strategy, Mizumoto and Eguchi (2023) applied the GPT-3 model to automatically score English essays in the TOEFL test. They found that the accuracy of the GPT model alone was moderate to fair. However, when they incorporated linguistic measures such as cohesion, syntactic complexity, and lexical features alongside the GPT model, the accuracy significantly improved. This highlights the importance of prompt engineering and providing the model with specific instructions to enhance its performance. In this study, a similar approach was taken to optimize the performance of LLMs. GPT-4, which outperformed BERT and OCLL, was selected as the candidate model. Model 1 was used as the baseline, representing GPT-4 without any additional prompting. Model 2, on the other hand, involved GPT-4 prompted with 16 measures that included scoring criteria, efficient linguistic features for writing assessment, and detailed measurement units and calculation formulas. The remaining models (Models 3 to 18) utilized GPT-4 prompted with individual measures. The performance of these 18 different models was assessed using the output indicators described in the section Criteria (output indicator). By comparing the performances of these models, the study aimed to understand the impact of prompt engineering on the accuracy and effectiveness of GPT-4 in AES tasks.
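The 18-model design described above (one baseline, one model prompted with all measures, and sixteen single-measure models) can be laid out as a small configuration sketch. Everything in it is hypothetical: the prompt wording, the function names, and the ordering of measures are assumptions, and the measure labels are simply the abbreviations mentioned in this article rather than the authors' actual prompt text, which also included scoring criteria and calculation formulas not reproduced here. No model is called.

```python
# Placeholder labels for the 16 measures (assumed list and ordering).
MEASURES = [
    "MATTR", "LD", "lexical sophistication",
    "MDD", "MLC", "CNT", "CPC", "VPT", "CT", "DCT", "ACC",
    "SOPT", "SOPK", "word2vec cosine similarity",
    "IMM", "GER",
]

# Hypothetical base instruction shared by every model.
BASE_INSTRUCTION = "Score the following essay written by a learner of Japanese."


def model_configs():
    """Return a dict mapping model name to the measures included in its prompt."""
    configs = {
        "Model 1": [],               # baseline: no additional prompting
        "Model 2": list(MEASURES),   # prompted with all 16 measures
    }
    # Models 3 to 18: one measure each.
    for number, measure in enumerate(MEASURES, start=3):
        configs[f"Model {number}"] = [measure]
    return configs


def build_prompt(essay, measures):
    """Assemble a plain-text prompt for one essay and one model configuration."""
    lines = [BASE_INSTRUCTION]
    if measures:
        lines.append("Consider these measures when scoring:")
        lines.extend(f"- {m}" for m in measures)
    lines.append("Essay:")
    lines.append(essay)
    return "\n".join(lines)


configs = model_configs()
example_prompt = build_prompt("(essay text here)", configs["Model 3"])
print(len(configs))
print(example_prompt)
```

Keeping the configurations in one mapping makes it straightforward to score the same essays under every variant and then compare the variants with a single metric such as PRMSE.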

Based on the PRMSE scores presented in Fig. 4, it was observed that Model 1, representing GPT-4 without any additional prompting, achieved a fair level of performance. However, Model 2, which utilized GPT-4 prompted with all measures, outperformed all other models in terms of PRMSE score, achieving a score of 0.681. These results indicate that the inclusion of specific measures and prompts significantly enhanced the performance of GPT-4 in AES. Among the measures, syntactic complexity was found to play a particularly significant role in improving the accuracy of GPT-4 in assessing writing quality. Following that, lexical diversity emerged as another important factor contributing to the model’s effectiveness. The study suggests that a well-prompted GPT-4 can serve as a valuable tool to support human assessors in evaluating writing quality. By utilizing GPT-4 as an automated scoring tool, the evaluation biases associated with human raters can be minimized. This has the potential to empower teachers by allowing them to focus on designing writing tasks and guiding writing strategies, while leveraging the capabilities of GPT-4 for efficient and reliable scoring.

Fig. 4: PRMSE scores of the 18 AES models.

This study aimed to investigate two main research questions: the feasibility of utilizing LLMs for AES and the impact of prompt engineering on the application of LLMs in AES.

To address the first objective, the study compared the effectiveness of five different models: GPT, BERT, the Japanese local LLM (OCLL), and two conventional machine learning-based AES tools (Jess and JWriter). The PRMSE values indicated that the GPT-4-based method outperformed other LLMs (BERT, OCLL) and linguistic feature-based computational methods (Jess and JWriter) across various writing proficiency criteria. Furthermore, the agreement coefficient between GPT-4 and human scoring surpassed the agreement among human raters themselves, highlighting the potential of using the GPT-4 tool to enhance AES by reducing biases and subjectivity, saving time, labor, and cost, and providing valuable feedback for self-study. Regarding the second goal, the role of prompt design was investigated by comparing 18 models, including a baseline model, a model prompted with all measures, and 16 models prompted with one measure at a time. GPT-4, which outperformed BERT and OCLL, was selected as the candidate model. The PRMSE scores of the models showed that GPT-4 prompted with all measures achieved the best performance, surpassing the baseline and other models.

In conclusion, this study has demonstrated the potential of LLMs in supporting human rating in assessments. By incorporating automation, we can save time and resources while reducing biases and subjectivity inherent in human rating processes. Automated language assessments offer the advantage of accessibility, providing equal opportunities and economic feasibility for individuals who lack access to traditional assessment centers or necessary resources. LLM-based language assessments provide valuable feedback and support to learners, aiding in the enhancement of their language proficiency and the achievement of their goals. This personalized feedback can cater to individual learner needs, facilitating a more tailored and effective language-learning experience.

There are three important areas that merit further exploration. First, prompt engineering requires attention to ensure optimal performance of LLM-based AES across different language types. This study revealed that GPT-4, when prompted with all measures, outperformed models prompted with fewer measures. Therefore, investigating and refining prompt strategies can enhance the effectiveness of LLMs in automated language assessments. Second, it is crucial to explore the application of LLMs in second-language assessment and learning for oral proficiency, as well as their potential in under-resourced languages. Recent advancements in self-supervised machine learning techniques have significantly improved automatic speech recognition (ASR) systems, opening up new possibilities for creating reliable ASR systems, particularly for under-resourced languages with limited data. However, challenges persist in the field of ASR. First, ASR assumes correct word pronunciation for automatic pronunciation evaluation, which proves challenging for learners in the early stages of language acquisition due to diverse accents influenced by their native languages. Accurately segmenting short words becomes problematic in such cases. Second, developing precise audio-text transcriptions for languages with non-native accented speech poses a formidable task. Last, assessing oral proficiency levels involves capturing various linguistic features, including fluency, pronunciation, accuracy, and complexity, which are not easily captured by current NLP technology.

Data availability

The dataset utilized was obtained from the International Corpus of Japanese as a Second Language (I-JAS). Data URL: https://www2.ninjal.ac.jp/jll/lsaj/ihome2.html

J-CAT and TTBJ are two computerized adaptive tests used to assess Japanese language proficiency.

SPOT is a specific component of the TTBJ test.

J-CAT: https://www.j-cat2.org/html/ja/pages/interpret.html

SPOT: https://ttbj.cegloc.tsukuba.ac.jp/p1.html#SPOT

The study utilized a prompt-based GPT-4 model, developed by OpenAI, which has an impressive architecture with 1.8 trillion parameters across 120 layers. GPT-4 was trained on a vast dataset of 13 trillion tokens, using two stages: initial training on internet text datasets to predict the next token, and subsequent fine-tuning through reinforcement learning from human feedback.

https://www2.ninjal.ac.jp/jll/lsaj/ihome2-en.html

http://jhlee.sakura.ne.jp/JEV/ by Japanese Learning Dictionary Support Group 2015.

We express our sincere gratitude to the reviewer for bringing this matter to our attention.

On February 7, 2023, Microsoft began rolling out a major overhaul to Bing that included a new chatbot feature based on OpenAI’s GPT-4 (Bing.com).

Appendix E-F present the analysis results of the QWK coefficient between the scores computed by the human raters and the BERT, OCLL models.

Attali Y, Burstein J (2006) Automated essay scoring with e-rater® V.2. J. Technol., Learn. Assess., 4

Barkaoui K, Hadidi A (2020) Assessing Change in English Second Language Writing Performance (1st ed.). Routledge, New York. https://doi.org/10.4324/9781003092346

Bentz C, Tatyana R, Koplenig A, Tanja S (2016) A comparison between morphological complexity measures: Typological data vs. language corpora. In Proceedings of the workshop on computational linguistics for linguistic complexity (CL4LC), 142–153. Osaka, Japan: The COLING 2016 Organizing Committee

Bond TG, Yan Z, Heene M (2021) Applying the Rasch model: Fundamental measurement in the human sciences (4th ed). Routledge

Brants T (2000) Inter-annotator agreement for a German newspaper corpus. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00), Athens, Greece, 31 May-2 June, European Language Resources Association

Brown TB, Mann B, Ryder N, et al. (2020) Language models are few-shot learners. Advances in Neural Information Processing Systems, Online, 6–12 December, Curran Associates, Inc., Red Hook, NY

Burstein J (2003) The E-rater scoring engine: Automated essay scoring with natural language processing. In Shermis MD and Burstein JC (ed) Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Čech R, Miroslav K (2018) Morphological richness of text. In Masako F, Václav C (ed) Taming the corpus: From inflection and lexis to interpretation, 63–77. Cham, Switzerland: Springer Nature

Çöltekin Ç, Taraka, R (2018) Exploiting Universal Dependencies treebanks for measuring morphosyntactic complexity. In Aleksandrs B, Christian B (ed), Proceedings of first workshop on measuring language complexity, 1–7. Torun, Poland

Crossley SA, Cobb T, McNamara DS (2013) Comparing count-based and band-based indices of word frequency: Implications for active vocabulary research and pedagogical applications. System 41:965–981. https://doi.org/10.1016/j.system.2013.08.002

Crossley SA, McNamara DS (2016) Say more and be more coherent: How text elaboration and cohesion can increase writing quality. J. Writ. Res. 7:351–370

CyberAgent Inc (2023) Open-Calm series of Japanese language models. Retrieved from: https://www.cyberagent.co.jp/news/detail/id=28817

Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, Minnesota, 2–7 June, pp. 4171–4186. Association for Computational Linguistics

Diez-Ortega M, Kyle K (2023) Measuring the development of lexical richness of L2 Spanish: a longitudinal learner corpus study. Studies in Second Language Acquisition 1-31

Eckes T (2009) On common ground? How raters perceive scoring criteria in oral proficiency testing. In Brown A, Hill K (ed) Language testing and evaluation 13: Tasks and criteria in performance assessment (pp. 43–73). Peter Lang Publishing

Elliot S (2003) IntelliMetric: from here to validity. In: Shermis MD, Burstein JC (ed) Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Engber CA (1995) The relationship of lexical proficiency to the quality of ESL compositions. J. Second Lang. Writ. 4:139–155

Garner J, Crossley SA, Kyle K (2019) N-gram measures and L2 writing proficiency. System 80:176–187. https://doi.org/10.1016/j.system.2018.12.001

Haberman SJ (2008) When can subscores have value? J. Educat. Behav. Stat., 33:204–229

Haberman SJ, Yao L, Sinharay S (2015) Prediction of true test scores from observed item scores and ancillary data. Brit. J. Math. Stat. Psychol. 68:363–385

Halliday MAK (1985) Spoken and Written Language. Deakin University Press, Melbourne, Australia

Hirao R, Arai M, Shimanaka H et al. (2020) Automated essay scoring system for nonnative Japanese learners. Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), pp. 1250–1257. European Language Resources Association

Hunt KW (1966) Recent Measures in Syntactic Development. Elementary English, 43(7), 732–739. http://www.jstor.org/stable/41386067

Ishioka T (2001) About e-rater, a computer-based automatic scoring system for essays [Konpyūta ni yoru essei no jidō saiten shisutemu e − rater ni tsuite]. University Entrance Examination. Forum [Daigaku nyūshi fōramu] 24:71–76

Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput. 9(8):1735–1780

Ishioka T, Kameda M (2006) Automated Japanese essay scoring system based on articles written by experts. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia, 17–18 July 2006, pp. 233-240. Association for Computational Linguistics, USA

Japan Foundation (2021) Retrieved from: https://www.jpf.gp.jp/j/project/japanese/survey/result/dl/survey2021/all.pdf

Jarvis S (2013a) Defining and measuring lexical diversity. In Jarvis S, Daller M (ed) Vocabulary knowledge: Human ratings and automated measures (Vol. 47, pp. 13–44). John Benjamins. https://doi.org/10.1075/sibil.47.03ch1

Jarvis S (2013b) Capturing the diversity in lexical diversity. Lang. Learn. 63:87–106. https://doi.org/10.1111/j.1467-9922.2012.00739.x

Jiang J, Quyang J, Liu H (2019) Interlanguage: A perspective of quantitative linguistic typology. Lang. Sci. 74:85–97

Kim M, Crossley SA, Kyle K (2018) Lexical sophistication as a multidimensional phenomenon: Relations to second language lexical proficiency, development, and writing quality. Mod. Lang. J. 102(1):120–141. https://doi.org/10.1111/modl.12447

Kojima T, Gu S, Reid M et al. (2022) Large language models are zero-shot reasoners. Advances in Neural Information Processing Systems, New Orleans, LA, 29 November-1 December, Curran Associates, Inc., Red Hook, NY

Kyle K, Crossley SA (2015) Automatically assessing lexical sophistication: Indices, tools, findings, and application. TESOL Q 49:757–786

Kyle K, Crossley SA, Berger CM (2018) The tool for the automatic analysis of lexical sophistication (TAALES): Version 2.0. Behav. Res. Methods 50:1030–1046. https://doi.org/10.3758/s13428-017-0924-4

Kyle K, Crossley SA, Jarvis S (2021) Assessing the validity of lexical diversity using direct judgements. Lang. Assess. Q. 18:154–170. https://doi.org/10.1080/15434303.2020.1844205

Landauer TK, Laham D, Foltz PW (2003) Automated essay scoring and annotation of essays with the Intelligent Essay Assessor. In Shermis MD, Burstein JC (ed), Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 159–174

Laufer B, Nation P (1995) Vocabulary size and use: Lexical richness in L2 written production. Appl. Linguist. 16:307–322. https://doi.org/10.1093/applin/16.3.307

Lee J, Hasebe Y (2017) jWriter Learner Text Evaluator, URL: https://jreadability.net/jwriter/

Lee J, Kobayashi N, Sakai T, Sakota K (2015) A Comparison of SPOT and J-CAT Based on Test Analysis [Tesuto bunseki ni motozuku ‘SPOT’ to ‘J-CAT’ no hikaku]. Research on the Acquisition of Second Language Japanese [Dainigengo to shite no nihongo no shūtoku kenkyū] (18) 53–69

Li W, Yan J (2021) Probability distribution of dependency distance based on a Treebank of Japanese EFL Learners’ Interlanguage. J. Quant. Linguist. 28(2):172–186. https://doi.org/10.1080/09296174.2020.1754611

Linacre JM (2002) Optimizing rating scale category effectiveness. J. Appl. Meas. 3(1):85–106

Linacre JM (1994) Constructing measurement with a Many-Facet Rasch Model. In Wilson M (ed) Objective measurement: Theory into practice, Volume 2 (pp. 129–144). Norwood, NJ: Ablex

Liu H (2008) Dependency distance as a metric of language comprehension difficulty. J. Cognitive Sci. 9:159–191

Liu H, Xu C, Liang J (2017) Dependency distance: A new perspective on syntactic patterns in natural languages. Phys. Life Rev. 21. https://doi.org/10.1016/j.plrev.2017.03.002

Loukina A, Madnani N, Cahill A, et al. (2020) Using PRMSE to evaluate automated scoring systems in the presence of label noise. Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, Seattle, WA, USA → Online, 10 July, pp. 18–29. Association for Computational Linguistics

Lu X (2010) Automatic analysis of syntactic complexity in second language writing. Int. J. Corpus Linguist. 15:474–496

Lu X (2012) The relationship of lexical richness to the quality of ESL learners’ oral narratives. Mod. Lang. J. 96:190–208

Lu X (2017) Automated measurement of syntactic complexity in corpus-based L2 writing research and implications for writing assessment. Lang. Test. 34:493–511

Lu X, Hu R (2022) Sense-aware lexical sophistication indices and their relationship to second language writing quality. Behav. Res. Method. 54:1444–1460. https://doi.org/10.3758/s13428-021-01675-6

Ministry of Health, Labor, and Welfare of Japan (2022) Retrieved from: https://www.mhlw.go.jp/stf/newpage_30367.html

Mizumoto A, Eguchi M (2023) Exploring the potential of using an AI language model for automated essay scoring. Res. Methods Appl. Linguist. 3:100050

Okgetheng B, Takeuchi K (2024) Estimating Japanese Essay Grading Scores with Large Language Models. Proceedings of 30th Annual Conference of the Language Processing Society in Japan, March 2024

Ortega L (2015) Second language learning explained? SLA across 10 contemporary theories. In VanPatten B, Williams J (ed) Theories in Second Language Acquisition: An Introduction

Rae JW, Borgeaud S, Cai T, et al. (2021) Scaling Language Models: Methods, Analysis & Insights from Training Gopher. ArXiv, abs/2112.11446

Read J (2000) Assessing vocabulary. Cambridge University Press. https://doi.org/10.1017/CBO9780511732942

Rudner LM, Liang T (2002) Automated Essay Scoring Using Bayes’ Theorem. J. Technol., Learning and Assessment, 1 (2)

Sakoda K, Hosoi Y (2020) Accuracy and complexity of Japanese Language usage by SLA learners in different learning environments based on the analysis of I-JAS, a learners’ corpus of Japanese as L2. Math. Linguist. 32(7):403–418. https://doi.org/10.24701/mathling.32.7_403

Suzuki N (1999) Summary of survey results regarding comprehensive essay questions. Final report of “Joint Research on Comprehensive Examinations for the Aim of Evaluating Applicability to Each Specialized Field of Universities” for 1996-2000 [shōronbun sōgō mondai ni kansuru chōsa kekka no gaiyō. Heisei 8 - Heisei 12-nendo daigaku no kaku senmon bun’ya e no tekisei no hyōka o mokuteki to suru sōgō shiken no arikata ni kansuru kyōdō kenkyū’ saishū hōkoku-sho]. University Entrance Examination Section Center Research and Development Department [Daigaku nyūshi sentā kenkyū kaihatsubu], 21–32

Taghipour K, Ng HT (2016) A neural approach to automated essay scoring. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas, 1–5 November, pp. 1882–1891. Association for Computational Linguistics

Takeuchi K, Ohno M, Motojin K, Taguchi M, Inada Y, Iizuka M, Abo T, Ueda H (2021) Development of essay scoring methods based on reference texts with construction of research-available Japanese essay data. In IPSJ J 62(9):1586–1604

Ure J (1971) Lexical density: A computational technique and some findings. In Coultard M (ed) Talking about Text. English Language Research, University of Birmingham, Birmingham, England

Vaswani A, Shazeer N, Parmar N, et al. (2017) Attention is all you need. In Advances in Neural Information Processing Systems, Long Beach, CA, 4–7 December, pp. 5998–6008, Curran Associates, Inc., Red Hook, NY

Watanabe H, Taira Y, Inoue Y (1988) Analysis of essay evaluation data [Shōronbun hyōka dēta no kaiseki]. Bulletin of the Faculty of Education, University of Tokyo [Tōkyōdaigaku kyōiku gakubu kiyō], Vol. 28, 143–164

Yao S, Yu D, Zhao J, et al. (2023) Tree of thoughts: Deliberate problem solving with large language models. Advances in Neural Information Processing Systems, 36

Zenker F, Kyle K (2021) Investigating minimum text lengths for lexical diversity indices. Assess. Writ. 47:100505. https://doi.org/10.1016/j.asw.2020.100505

Zhang Y, Warstadt A, Li X, et al. (2021) When do you need billions of words of pretraining data? Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Online, pp. 1112-1125. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.acl-long.90

This research was funded by National Foundation of Social Sciences (22BYY186) to Wenchao Li.

Author information

Authors and affiliations

Department of Japanese Studies, Zhejiang University, Hangzhou, China

Department of Linguistics and Applied Linguistics, Zhejiang University, Hangzhou, China

Contributions

Wenchao Li is in charge of conceptualization, validation, formal analysis, investigation, data curation, visualization and writing the draft. Haitao Liu is in charge of supervision.

Corresponding author

Correspondence to Wenchao Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

Ethical approval was not required as the study did not involve human participants.

Informed consent

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplemental material file #1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article

Li, W., Liu, H. Applying large language models for automated essay scoring for non-native Japanese. Humanit Soc Sci Commun 11, 723 (2024). https://doi.org/10.1057/s41599-024-03209-9

Received: 02 February 2024

Accepted: 16 May 2024

Published: 03 June 2024

DOI: https://doi.org/10.1057/s41599-024-03209-9

essay prompt undertale

IMAGES

  1. Undertale

  2. undertale mettaton essay responses

  3. Mettaton EX's Pop Quiz Essay Responses

  4. The Ultimate Analysis Of Undertale

  5. All Mettaton Essay Responses : r/Undertale

  6. Essay question (Luaudrey) : r/Undertale


VIDEO

  1. Mettaton EX's Pop Quiz Essay Responses

  2. The Ultimate Analysis Of Undertale

  3. Undertale

  4. Why I Never Did A Zero Punctuation on Undertale

  5. Game Theory: Who is W.D. Gaster? (Undertale)

  6. The Chilling Brilliance of True Lab

COMMENTS

  1. All Mettaton Essay Responses : r/Undertale

    UNDERTALE is an indie RPG created by developer Toby Fox about a child who falls into an underworld filled with monsters. Their only weapon is their DETERMINATION as they try to FIGHT or ACT their way out. Will you show monsters standing in your way MERCY ...

  2. r/Undertale on Reddit: (possible spoilers) Long-ass list of words

    Ratings. The game processes these words in a certain order, which means that the last one processed takes priority. Only one instance of each word is counted, so spamming "leg" will unfortunately result in only +350. Swearing or "talking about yourself" takes priority and negates all other ratings.
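The rule described in this snippet (fixed processing order, each word counted once, last match wins) can be sketched roughly as follows. This is a minimal illustration, not the game's actual code: the processing order is an assumption, the point values are the ones quoted on this page, and "rude" is a hypothetical stand-in for whichever words the game treats as insulting.

```python
from typing import Optional

# Keywords in an ASSUMED processing order; point values are taken from the
# snippets on this page. "rude" is a hypothetical placeholder for an insult.
ORDERED_RATINGS = [
    ("dancing", 250),
    ("toby", 300),
    ("leg", 350),
    ("rude", -200),  # processed last, so it overrides every other match
]


def rate_essay(text: str) -> Optional[int]:
    """Return the rating of the last-processed keyword found in the essay."""
    lowered = text.lower()
    result = None
    for keyword, points in ORDERED_RATINGS:
        if keyword in lowered:  # presence check only, so repeats add nothing
            result = points     # a later keyword overwrites an earlier one
    return result
```

Under these assumptions, `rate_essay("leg leg leg")` yields 350 once rather than three times, and an essay containing both "legs" and the placeholder insult yields -200.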

  3. Mettaton/In Battle

    Writing a word that Mettaton deems insulting prompts him to tell the protagonist that this is an essay about him, not them. This costs 200 points. Writing "LEG" earns 350 points, which is the highest amount, being the 'correct answer.' Writing "DANCING" earns 250 points, with Mettaton saying that he is self-taught.

  4. Mettaton EX's Pop Quiz Essay Responses

    These are the Mettaton responses if you choose a specific essay answer. Sentence Length: 0, 1, 2-12, 13-49, 50-89, 90-139, 140+. List of Compliments: beauty hot pretty han...

  5. Undertale

    ESSAY PROMPT: What do you love most about Mettaton? (No X or Z) My Instagram: http://gg.gg/onenoobins Subscribe to me: http://...

  6. The Ultimate Analysis Of Undertale

    Underneath the surface of Undertale, there are many interesting secrets… Full spoilers for Undertale. sans. Permanently Exhausted Theory: https://www.reddit.com/r...

  7. Undertale Walkthrough, Part Four: Hotland Guide

    Main Walkthrough. Hotland. - After crossing a bridge from Waterfall you'll wind up in hotty hotty Hotland. There's a water cooler on the next screen; you can, if you wish, systematically drain the thing by dumping water onto the ground. It takes a while, though, and there's seemingly no point... unless you got Undyne to chase you here.

  8. [Spoilers] Mettaton EX's essay replies

    Dec 30, 15 at 12:46am (PST). So in the battle of Mettaton EX, you'll get to face some essay prompts you'll have to ...

  9. Mettaton/In Battle

    Early in the battle, Mettaton will ask the protagonist to write an essay about what they like most about him. Writing "LEGS" earns 350 points, which is the highest amount, being the 'correct answer.' Writing "TOBY" earns 300 points, with Mettaton saying that Toby sounds sexy. Writing "DANCING" earns 250 points, with Mettaton saying that he is self-taught.
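The three point values quoted here can be collected into a small lookup table. The sketch below is only an illustration of those figures: the function names are hypothetical, and returning 0 for an unlisted answer is an assumption, since the snippet does not say what other answers earn.

```python
from typing import List

# Point values as quoted in the snippet above.
ESSAY_POINTS = {"LEGS": 350, "TOBY": 300, "DANCING": 250}


def points_for(answer: str) -> int:
    """Look up a one-word answer, ignoring case and surrounding spaces.

    Returning 0 for unlisted answers is an assumption made for this sketch.
    """
    return ESSAY_POINTS.get(answer.strip().upper(), 0)


def ranked_answers() -> List[str]:
    """List the known answers from highest to lowest point value."""
    return sorted(ESSAY_POINTS, key=ESSAY_POINTS.get, reverse=True)
```

Sorting the table this way puts "LEGS" first, matching the snippet's description of it as the highest-scoring answer.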

  10. Help with Mettaton's essay :: Undertale General Discussions

    Help with Mettaton's essay. At the part where you're supposed to write an essay for Mettaton, the spacebar doesn't work for me for some reason. All the other keys work, and I've seen screenshots of other people writing it and there are spaces between their words, so I don't know what's wrong. It really isn't that important. Just type Legs or Toby.

  11. ESSAY PROMPT: What do you love most about Mettaton?

    I'm still shocked that it's been that long since I first played Undertale (about a month or two after it came out) that could affect so much of my life and brought me not only so much joy being invested in the story and characters, but allowed me to meet so many of my current friends, so many new people in the cosplay community and bring an ...

  12. What did you type on Mettaton (ex)'s essay? « Undertale « Forum

    kokoronis. The first time, I typed something like, "I love Mettaton's hair, it's so shiny and stylish" and I got a big bonus for that. And Mettaton told me what products he used on his hair. Then when I died, I said Mettaton had really good legs, and I got an even bigger bonus...! He said something like, "That's the correct answer!"

  13. @underprompts on Tumblr

    underprompts. Toriel and company go grocery shopping on the surface. #toriel #writing prompts #fanfic #fanfiction #undertale prompts #undertale. Undertale/Deltarune prompts featuring all your faves! Credit not necessary but encouraged. Feel free to send me a link to your works. I'd love to see what you do with my prompts! Icon by @funsizemini.

  14. Mettaton

    Mettaton is a major character and the fourth boss in Undertale. Mettaton is a robot with a SOUL, whose body was built by Alphys. He is the sole television star of the Underground. Mettaton poses as an entertainment robot turned human-killing robot in Hotland, though he later reveals the truth to the protagonist at the end of the CORE.

  15. Introducing: Undertale Prompt Month!

    Introducing: Undertale Prompt Month! From September 1st-30th, flex your creative muscles with a whole month of prompts in honor of Undertale's 5th anniversary! Fanfics, comics, art, crafts, music; everything is fair game! Whether you want to use all 30 prompts or even just one, you're free to choose! There's only 2 rules. Tag your creation # ...

  16. Essay Topic Generator

    Conclusion. Our Essay Topic Generator is a great way for anyone with writer's block to get ideas on a topic. Click on the type of essay you need to write by selecting it from the drop down bar. Then click on the Generate Topics button. We'll give you a topic that fits the type of essay you're writing. We'll also give you an essay title ...

  17. A role-playing video game Undertale: [Essay Example], 454 words

    Undertale is a role-playing video game created by American indie developer and composer Toby Fox. Games of this type are usually not very popular, but Undertale is an exception. It has a very interesting and really complicated plot. In the game, players control a human child who has fallen into the Underground, a large, secluded region ...

  18. Undertale Essay Prompt

    Undertale Essay Prompt, My Home Garden English Essay, Popular Dissertation Introduction Writing Sites Gb, Essay On Plastic Boon Or Bane, Fractions Of Numbers Homework Year 3, College Essay Application Review Service My, Cheap Biography Writing For Hire Us ...

  19. Question: so what are yall answering when you get the essay ...

    The correct answer is "legs" and I'm pretty sure there's no way to stack bonuses with it, so I just type "legs" and wait for the essay box to time out.

  20. College Essays and Diversity in the Post-Affirmative Action Era

    Editor's Note: This story is part of an occasional series on research projects currently in the works at the Law School. The Supreme Court's decision in June 2023 to bar the use of affirmative action in college admissions raised many questions. One of the most significant is whether universities should consider applicants' discussion of race in essays. The Court's decision in Students ...

  21. 5 Strategies To Unlock Your Winning College Essay

    The best essays have clear, coherent language and are free of errors. The story is clearly and specifically told. After drafting, take the time to revise and polish your writing. Seek feedback ...

  22. I'm going to be writing about Undertale for my graduating ...

    The font is called 8bitoperator and is the exact font used in most of Undertale. I googled "undertale font" and added it to MS Word back in the day when I was a loser in high school writing fake Undertale dialogue during class. And thanks!

  23. Applying large language models for automated essay scoring for non

    The GPT-4 API was used in this study. A detailed explanation of the prompt-assessment criteria is provided in Section Prompt. All essays were sent to the model for measurement and scoring.

  24. r/Undertale on Reddit: Writing an essay on Undertale, would love some

    Writing an essay on Undertale, would love some feedback on the intro. I'm writing a paper on Undertale for my Game Analysis class and I've written the first 4-5 pages as an introduction to Undertale's meta narrative. The class' formatting/style requirements are really loose so I've gone a little wild, trying to tell a story ala Undertale ...