François Lanusse
National Center for Scientific Research (CNRS)
Polymathic AI
Methodologies and practices that will transfer across domains
Transposing these methodologies to scientific data and problems brings unique challenges
Credit: Melchior et al. 2021
Credit: DESI collaboration/DESI Legacy Imaging Surveys/LBNL/DOE & KPNO/CTIO/NOIRLab/NSF/AURA/unWISE
Collaborative project with about 30 contributors
Presented at NeurIPS 2024 Datasets & Benchmarks Track
Ground-based imaging from Legacy Survey
Space-based imaging from JWST
Most General
Most Specific
Single model capable of processing all types of data
Independent models for all types of data
Bytes Are All You Need (Horton et al. 2023)
AstroCLIP (Parker et al. 2024)
Early Fusion Multimodal Models
Flamingo: a Visual Language Model for Few-Shot Learning (Alayrac et al. 2022)
Chameleon: Mixed-Modal Early-Fusion Foundation Models (Chameleon team, 2024)
Project led by: François Lanusse, Liam Parker, Jeff Shen, Tom Hehir, Ollie Liu, Lucas Meyer, Leopoldo Sarra, Sebastian Wagner-Carena, Helen Qu, and Micah Bowles, with extensive support from the rest of the team.
(Blanco Telescope and Dark Energy Camera. Credit: Reidar Hahn/Fermi National Accelerator Laboratory)
(Subaru Telescope and Hyper Suprime Cam. Credit: NAOJ)
(Dark Energy Spectroscopic Instrument)
(Sloan Digital Sky Survey. Credit: SDSS)
(Gaia Satellite. Credit: ESA/ATG)
Field Embedding Strategy Developed for Multiple Physics Pretraining (McCabe et al. 2023)
DES g
DES r
DES i
DES z
HSC g
HSC r
HSC i
HSC z
HSC y
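A minimal sketch of how a per-band field embedding could work, assuming each band owns one learnable vector and a pixel is represented by the flux-weighted sum of the vectors of whichever bands were observed. The names `band_embeddings`, `embed_pixel`, and `EMBED_DIM` are hypothetical, the vectors are random rather than trained, and this is not presented as the actual implementation of McCabe et al. 2023.

```python
import random

# Band vocabulary taken from the labels above; EMBED_DIM is an illustrative choice.
BANDS = ["DES g", "DES r", "DES i", "DES z",
         "HSC g", "HSC r", "HSC i", "HSC z", "HSC y"]
EMBED_DIM = 4

random.seed(0)
# One vector per band. These would be learned in practice; random here.
band_embeddings = {
    band: [random.gauss(0.0, 1.0) for _ in range(EMBED_DIM)] for band in BANDS
}


def embed_pixel(fluxes):
    """Map {band name: pixel value} to a single vector of length EMBED_DIM.

    Each observed band contributes its embedding vector scaled by the pixel
    value; bands that were not observed simply contribute nothing.
    """
    token = [0.0] * EMBED_DIM
    for band, value in fluxes.items():
        vector = band_embeddings[band]  # raises KeyError for an unknown band
        for j in range(EMBED_DIM):
            token[j] += value * vector[j]
    return token


# A 4-band DES pixel and a 5-band HSC pixel land in the same space.
des_pixel = embed_pixel({"DES g": 1.2, "DES r": 0.8, "DES i": 0.5, "DES z": 0.3})
hsc_pixel = embed_pixel(
    {"HSC g": 1.1, "HSC r": 0.9, "HSC i": 0.6, "HSC z": 0.4, "HSC y": 0.2}
)
print(len(des_pixel), len(hsc_pixel))
```

The point of the design is that surveys with different numbers of bands produce inputs of the same dimensionality, so a single model can consume either.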
Survey translation
Spectrum super-resolution
astro-ph abstracts mentioning Deep Learning, CNN, or Neural Networks
The vast majority of these results have relied on supervised learning and networks trained from scratch.
François's first Deep Learning Paper in Astro (with Barnabas Poczos)
Conventional scientific workflow with deep learning
Conventional researchers @ CMU
Circa 2016
CMU DeepLens (Lanusse et al. 2017)
Foundation Model-based Scientific Workflow
Already taken care of
=> Let's discuss embedding-based adaptation
Adaptation at low cost
with simple strategies:
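A minimal sketch of one strategy of this kind, assuming it is k-nearest-neighbour regression on frozen embeddings: the pretrained model is never updated, and a property is predicted from the labels of the closest labelled examples in embedding space. The function `knn_regress` and the synthetic embeddings are hypothetical stand-ins, not the pipeline used in the talk.

```python
import math
import random


def knn_regress(train_emb, train_y, query, k=5):
    """Predict a scalar as the mean label of the k nearest training embeddings."""
    # Euclidean distance from the query to every labelled embedding.
    ranked = sorted(
        (math.dist(emb, query), label) for emb, label in zip(train_emb, train_y)
    )
    nearest = ranked[:k]
    return sum(label for _, label in nearest) / len(nearest)


# Synthetic stand-in for embeddings produced by a frozen pretrained model.
random.seed(0)
train_emb = [[random.gauss(0.0, 1.0) for _ in range(16)] for _ in range(200)]
train_y = [2.0 * emb[0] for emb in train_emb]  # toy target tied to one dimension

query = [random.gauss(0.0, 1.0) for _ in range(16)]
prediction = knn_regress(train_emb, train_y, query, k=5)
print(prediction)
```

The only adaptation cost here is computing embeddings once and storing a small labelled set; no gradient step touches the pretrained model.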
Trained on ->
Eval on ->
Inputs: measured fluxes
Inputs: measured fluxes + image
Segmenting central bar and spiral arms in galaxy images based on Galaxy Zoo 3D
Polymathic's recipe for developing Multimodal Scientific Models
Engagement with Scientific Communities
Data Curation And Aggregation
Dedicated ML R&D
Thank you for listening!