{"id":4345,"date":"2025-10-10T10:29:16","date_gmt":"2025-10-10T14:29:16","guid":{"rendered":"https:\/\/oge.mit.edu\/msrp\/?post_type=profiles&#038;p=4345"},"modified":"2025-12-09T11:28:41","modified_gmt":"2025-12-09T16:28:41","slug":"andre-braga","status":"publish","type":"profiles","link":"https:\/\/oge.mit.edu\/msrp\/profiles\/andre-braga\/","title":{"rendered":"Andr\u00e9 Braga"},"content":{"rendered":"<div class=\"wp-block-image\">\n<figure class=\"alignleft size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"2560\" height=\"2560\" src=\"https:\/\/oge.mit.edu\/msrp\/wp-content\/uploads\/sites\/2\/2025\/10\/BragaAndre-edited-scaled.jpg\" alt=\"\" class=\"wp-image-4346\" style=\"width:200px;height:auto\" srcset=\"https:\/\/oge.mit.edu\/msrp\/wp-content\/uploads\/sites\/2\/2025\/10\/BragaAndre-edited-scaled.jpg 2560w, https:\/\/oge.mit.edu\/msrp\/wp-content\/uploads\/sites\/2\/2025\/10\/BragaAndre-edited-300x300.jpg 300w, https:\/\/oge.mit.edu\/msrp\/wp-content\/uploads\/sites\/2\/2025\/10\/BragaAndre-edited-1024x1024.jpg 1024w, https:\/\/oge.mit.edu\/msrp\/wp-content\/uploads\/sites\/2\/2025\/10\/BragaAndre-edited-150x150.jpg 150w, https:\/\/oge.mit.edu\/msrp\/wp-content\/uploads\/sites\/2\/2025\/10\/BragaAndre-edited-768x768.jpg 768w, https:\/\/oge.mit.edu\/msrp\/wp-content\/uploads\/sites\/2\/2025\/10\/BragaAndre-edited-1536x1536.jpg 1536w, https:\/\/oge.mit.edu\/msrp\/wp-content\/uploads\/sites\/2\/2025\/10\/BragaAndre-edited-2048x2048.jpg 2048w\" sizes=\"auto, (max-width: 2560px) 100vw, 2560px\" \/><\/figure>\n<\/div>\n\n\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<p><strong>MIT Department: <\/strong>Electrical Engineering and Computer Science<br><strong>Faculty Mentor<\/strong>: Prof. 
Jacob Andreas<br><strong>Research Supervisor: <\/strong>Mehul Damani<br><strong>Undergraduate Institution:<\/strong> University of California, Santa Barbara<br><strong>Website<\/strong>:<\/p>\n<\/div><\/div>\n\n\n\n<div style=\"height:0px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Biography<\/strong><\/h4>\n\n\n\n<p>Andr\u00e9 Braga is a rising third-year Statistics &amp; Data Science major at the University of California, Santa Barbara, and a 2025 MIT MSRP intern in the Department of Electrical Engineering and Computer Science. Working with Professor Jacob Andreas, he uses reinforcement learning to make large language models more trustworthy, with an emphasis on uncertainty estimation. At UCSB he conducts research with Professors Xifeng Yan and Mingsong Yan, investigating attention-scaling techniques that enhance long-context reasoning and retrieval-augmented generation in transformer models, with the aim of making large language systems faster and more interpretable. He also contributed to an AI-driven financial forecasting project with Professor Yan that explored feedback loops between search and prediction agents. Beyond academia, Andr\u00e9 co-founded Shofo, a startup where he designs fine-tuning workflows and user-facing machine-learning infrastructure for large-scale social-media analytics. He plans to pursue a Ph.D. 
in machine learning, developing novel architectures for trustworthy, interpretable continuous learning.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Abstract<\/strong><\/h4>\n\n\n\n<p class=\"has-text-align-center\"><strong>Training ChatGPT to Double-Check Itself with a Separate &#8216;Judge&#8217; Model<\/strong><\/p>\n\n\n\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-group is-vertical is-content-justification-center is-nowrap is-layout-flex wp-container-core-group-is-layout-73832be3 wp-block-group-is-layout-flex\">\n<p class=\"has-text-align-center\"><strong>Andre Braga<sup>1<\/sup>, Mehul Damani<sup>2<\/sup>, and Jacob Andreas<sup>2<\/sup><\/strong><\/p>\n\n\n\n<div class=\"wp-block-group is-vertical is-content-justification-center is-layout-flex wp-container-core-group-is-layout-4b2eccd6 wp-block-group-is-layout-flex\">\n<p><sup>1<\/sup>Department of Statistics, University of California, Santa Barbara<\/p>\n\n\n\n<p><sup>2<\/sup>Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology<\/p>\n<\/div>\n<\/div>\n<\/div><\/div>\n<\/div><\/div>\n\n\n\n<p class=\"has-text-align-center\"><\/p>\n\n\n\n<p>Large language models (LLMs) often sound confident even when they are wrong, limiting their reliability in domains such as medicine or law, where it is vital to know when the model may not be certain of its answer. Existing methods for teaching these models to be more cautious typically judge only whether the final answer is correct or incorrect, but do little to explore the validity of how the model arrived at its answer. 
Our work proposes a feedback-based training method, reinforcement learning, that leverages an outcome reward model (ORM) as a separate \u2018judge\u2019 scoring both the answer and the reasoning process behind it. We then use this reward to train the LLM to adjust its behavior, learning not only to be correct but also to show caution when necessary, even saying \u201cI don\u2019t know\u201d when it is unsure of its answer. By training language models with feedback from a reasoning-aware reward model, our approach addresses a gap in existing confidence-improvement methods, which overlook how answers are derived. The resulting models are not only more accurate but also more cautious, able to express uncertainty appropriately, and ultimately better suited to real-world decision-making.\n<\/p>\n\n\n\n<p><\/p>\n","protected":false},"featured_media":4346,"template":"","profile_category":[23],"class_list":["post-4345","profiles","type-profiles","status-publish","has-post-thumbnail","hentry","profile_category-2025-interns"],"acf":[],"_links":{"self":[{"href":"https:\/\/oge.mit.edu\/msrp\/wp-json\/wp\/v2\/profiles\/4345","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/oge.mit.edu\/msrp\/wp-json\/wp\/v2\/profiles"}],"about":[{"href":"https:\/\/oge.mit.edu\/msrp\/wp-json\/wp\/v2\/types\/profiles"}],"version-history":[{"count":3,"href":"https:\/\/oge.mit.edu\/msrp\/wp-json\/wp\/v2\/profiles\/4345\/revisions"}],"predecessor-version":[{"id":4786,"href":"https:\/\/oge.mit.edu\/msrp\/wp-json\/wp\/v2\/profiles\/4345\/revisions\/4786"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/oge.mit.edu\/msrp\/wp-json\/wp\/v2\/media\/4346"}],"wp:attachment":[{"href":"https:\/\/oge.mit.edu\/msrp\/wp-json\/wp\/v2\/media?parent=4345"}],"wp:term":[{"taxonomy":"profile_category","embeddable":true,"href":"https:\/\/oge.mit.edu\/msrp\/wp-json\/wp\/v2\/profile_category?post=4345"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\
/{rel}","templated":true}]}}