Jongmin Lee

I'm a postdoc at UC Berkeley, advised by Pieter Abbeel. I received my PhD from KAIST, where I was fortunate to be advised by Kee-Eung Kim.

[Google Scholar]

Education

2017. 03. - 2022. 02.: PhD, School of Computing, KAIST, Korea (Advisor: Kee-Eung Kim)

2015. 03. - 2017. 02.: MS, School of Computing, KAIST, Korea (Advisor: Kee-Eung Kim)

2009. 03. - 2014. 02.: BS, Department of Computer Science and Engineering, Seoul National University, Korea

Experience

Publications

International

[C22] Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies [paper]

[C21] AlberDICE: Addressing Out-Of-Distribution Joint Actions in Offline Multi-Agent RL via Alternating Stationary Distribution Correction Estimation [paper] [code]

[C20] SafeDICE: Offline Safe Imitation Learning with Non-Preferred Demonstrations [paper]

[C19] Tempo Adaptation in Non-stationary Reinforcement Learning [paper]

[C18] LobsDICE: Offline Imitation Learning from Observation via Stationary Distribution Correction Estimation [paper]

[C17] Local Metric Learning for Off-Policy Evaluation in Contextual Bandits with Continuous Actions

[C16] COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation [paper] [code]

[C15] DemoDICE: Offline Imitation Learning with Supplementary Imperfect Demonstrations [paper] [code]

[C14] GPT-Critic: Offline Reinforcement Learning for End-to-End Task-Oriented Dialogue Systems [paper]

[C13,W4] OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation [paper] [code]

[C12] Representation Balancing Offline Model-based Reinforcement Learning [paper] [code]

[C11] Monte-Carlo Planning and Learning with Language Action Value Estimates [paper] [code]

[C10] Reinforcement Learning for Control with Multiple Frequencies [paper] [code]

[C9] Batch Reinforcement Learning with Hyperparameter Gradients [paper] [code]

[C8] Monte-Carlo Tree Search in Continuous Action Spaces with Value Gradients [paper]

[C7,W4] Bayes-Adaptive Monte-Carlo Planning and Learning for Goal-Oriented Dialogues [paper]

[C6] Trust Region Sequential Variational Inference [paper]

[C5] PyOpenDial: A Python-based Domain-Independent Toolkit for Developing Spoken Dialogue Systems with Probabilistic Rules [paper] [code]

[C4] Monte-Carlo Tree Search for Constrained POMDPs [paper] [code]

[W3] Monte-Carlo Tree Search for Constrained MDPs [paper]

[J1] Layered Behavior Modeling via Combining Descriptive and Prescriptive Approaches: a Case Study of Infantry Company Engagement [paper]

[C3,W2] Constrained Bayesian Reinforcement Learning via Approximate Linear Programming [paper]

[C2] Hierarchically-partitioned Gaussian Process Approximation [paper]

[W1] Multi-View Automatic Lip-Reading using Neural Network [paper]

[C1] Bayesian Reinforcement Learning with Behavioral Feedback [paper]

Domestic

A Study on Efficient Multi-Task Offline Model-based Reinforcement Learning

A Study on Application of Efficient Lifelong Learning Algorithm to Model-based Reinforcement Learning

A Study on Monte-Carlo Tree Search in Continuous Action Spaces

Case Studies on Planning and Learning for Large-Scale CGFs with POMDPs through Counterfire and Mechanized Infantry Scenarios

A Case Study on Planning and Learning for Large-Scale CGFs with POMDPs

Awards and Honors

Reviewer

Teaching Experiences

Extracurricular Activities