ML Engineer · Krafton AI · Seoul

Suyoung Lee.

I build AI agents that play games — currently fine-tuning vision-language-action models and running large-scale distributed RL on PUBG at Krafton's Gameplay AI Team.

Before Krafton, I built LLM agent frameworks at Samsung Research. PhD in EE from KAIST, advised by Prof. Youngchul Sung & Prof. Sae-Young Chung.

Suyoung Lee
Updated · 2026
About

Building agents
that play.

My research interests span AI agents (agent harnessing, automated research, planning), reinforcement learning (meta-RL, generalization, sample efficiency, offline RL), and vision-language-action models for gameplay and screen-control agents.

Now · Krafton

Distributed RL & VLA on PUBG

Large-scale distributed reinforcement learning and vision-language-action model training on PUBG at the Gameplay AI Team.

Recent · Open Source

Prompt2Policy

Released April 2026 — an LLM-powered system that turns natural-language behavior descriptions into trained RL policies.

Selected Publications

Research.

AIM-DA
ICML Workshop 2022

Adaptive Intrinsic Motivation with Decision Awareness

Suyoung Lee, Sae-Young Chung
Decision Awareness in RL Workshop @ ICML · 2022

A decision-aware scheme that adapts the intrinsic-reward coefficient online to maximize extrinsic return.

Awards

Honors.

2024

Outstanding Ph.D. Dissertation Award

Meta-Reinforcement Learning with Imaginary Tasks · KAIST EE

2018

Qualcomm-KAIST Innovation Awards

Paper competition for graduate students · Qualcomm

2017

Un Chong-Kwan Scholarship

Excellence in 2017 entrance examination · KAIST EE

Education

Path.

2022 – 2024
Ph.D., Electrical Engineering
KAIST · Adv. Prof. Youngchul Sung
2019 – 2022
Ph.D. Candidate, Electrical Engineering
KAIST · Adv. Prof. Sae-Young Chung
2017 – 2019
M.S., Electrical Engineering
KAIST · Adv. Prof. Sae-Young Chung
2012 – 2017
B.S., Electrical Engineering
KAIST
2010 – 2012
Hansung Science High School
Seoul, Republic of Korea
2007 – 2009
Tashkent International School
Tashkent, Uzbekistan
Coda

How I try to live.

Meta-RL view of life

I view life as a meta-reinforcement learning task — reminiscent of MuJoCo's Ant-direction, where an agent learns to walk toward a hidden goal direction it can never observe directly.

Everyone has their own optimal life direction T: unique, often obscured. The objective is to maximize the cumulative reward

r = M · T = |M| |T| cos θ,

the dot product of M — how we choose to live — and the unseen true direction T. Two things matter: the angle θ between them, and the magnitude |M|.

I was fortunate to have guidance from two advisors who instilled in me the importance of both — minimizing |θ| by choosing the right direction, and maximizing |M| by moving with conviction.

Aside

Off the clock — vibe-coding games.

The same agent harnessing I do at work, I do for fun on weekends — pair-programming with LLMs to ship small games end-to-end.

This one has a back-story. I picked up Go as a kid and stuck with it long enough to get reasonably serious. During my PhD I worked on Bayes-adaptive policy optimization — agents that infer hidden task structure from observation. Stonecode is what falls out when those two things meet.

Side Project · 1v1 Web Game · Live

Stonecode

A 1v1 game played on a Go board, but the rule space is enormous: each match has a different hidden scoring rule, and players must infer it Bayes-adaptively from the score feedback after every move and adapt their play accordingly. Built end-to-end with an agentic dev workflow — feature branches, parallel sub-agents for client / server / shared layers, and an automated PR-review loop.
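In spirit, the inference loop looks something like this — an illustrative sketch with made-up candidate rules, not Stonecode's actual rule space or implementation. A Bayes-adaptive player keeps a posterior over candidate scoring rules and reweights it after every move's score feedback:

```python
# Hypothetical candidate scoring rules: each maps a move's effects to a score.
def capture_rule(move):
    return move["captures"]   # score = stones captured by the move

def territory_rule(move):
    return move["territory"]  # score = territory gained by the move

def update_posterior(posterior, rules, move, observed_score, noise=1e-3):
    """One Bayesian update: rules whose predicted score matches the
    observation keep their mass; mismatching rules are down-weighted."""
    likelihoods = {
        name: (1.0 if rule(move) == observed_score else noise)
        for name, rule in rules.items()
    }
    unnormalized = {name: posterior[name] * likelihoods[name] for name in rules}
    z = sum(unnormalized.values())
    return {name: w / z for name, w in unnormalized.items()}

rules = {"capture": capture_rule, "territory": territory_rule}
posterior = {"capture": 0.5, "territory": 0.5}

# Hidden rule this match happens to be "territory": the observed score
# matches the territory gained, so mass shifts toward that hypothesis.
move = {"captures": 2, "territory": 5}
posterior = update_posterior(posterior, rules, move, observed_score=5)
```

One consistent observation is enough to concentrate the posterior; in the real game the rule space is far larger and the feedback noisier, which is what keeps the inference interesting.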

Stonecode — discover the hidden rules through play
Connect

Get in touch.

© Suyoung Lee · MMXXVI · Built with care · Seoul, KR