Introspecting your machine with osquery

DevOpsDays Austin 2025

Ian Littman / @ian@phpc.social / @ian.im / @iansltx

Slides at ian.im/doda25

osquery: query systems with SQL(ite)

installable via choco/brew/apt/yum

and lets you do stuff like check if Apple Intelligence is enabled on a computer

...or get the sha1 of all open files <= 100KiB

SELECT path, sha1 FROM hash WHERE path IN (
  SELECT path FROM process_open_files
  JOIN file USING (path) WHERE file.size <= 1024 * 100
);

Why and when?

Why am I talking about this here?

  • I work for Fleet Device Management, a cross-platform open-core tool
    for managing devices, vulnerabilities, software and more
  • Fleet started as an osquery results aggregator
    • One of our cofounders (@zwass) co-created osquery and is on its TSC
    • Using Fleet for this still works, including on the MIT-licensed free version
  • osquery is still generally how we read Windows/Mac/Linux device data
    • We generally contribute multiple fixes/adds to osquery per release,
      across multiple teams
    • Caveat: those contributors are coworkers rather than me personally
  • Examples may be pulled from queries Fleet uses, but this isn't a vendor pitch

Who are y'all?

...but back to osquery...

We'll talk about

  1. What its goals are
  2. What information it can pull
  3. Nuances and caveats
  4. What to do with results once you get them
  5. How to use osquery to detect when something's sus

Key Tenets

  • macOS + Linux first, Windows later
  • Write SQL, not procedural code
    • osquery pulls and parses requisite data
    • SQLite (embedded in osquery) queries data via virtual tables
    • If you know how to write SQL, you can figure out how to use osquery
  • Reliable and performant enough to run on corporate machines and in prod
    • Worker process does the heavy lifting
    • Watchdog process kills/respawns the worker if resource limits are exceeded
    • Heavy queries are denylisted
    • Some tables require WHERE clauses because e.g. hashing files is expensive
  • Dual-licensed open-source GPLv2 + Apache 2.0

270+ tables built in

Hardware

  • acpi_tables
  • battery
  • block_devices
  • chassis_info
  • connected_displays
  • cpu_info
  • cpuid
  • device_firmware
  • fan_speed_sensors
  • hardware_events
  • ibridge_info
  • intel_me_info
  • memory_devices
  • nvram
  • pci_devices
  • power_sensors
  • temperature_sensors
  • usb_devices
  • video_info
  • drivers (Windows)

software*

  • apps (macOS)
  • apt_sources
  • chocolatey_packages
  • deb_packages
  • gatekeeper
  • gatekeeper_approved_apps
  • homebrew_packages
  • package_bom
  • package_install_history
  • package_receipts
  • python_packages
  • yum_sources
  • authenticode (Windows)
  • startup_items
  • launchd
  • chrome_extensions
  • firefox_addons
  • safari_extensions
  • patches (Windows)
  • npm_packages
  • programs (Windows)
  • rpm_packages
  • rpm_package_files
  • windows_update_history

Auth and certs

  • authorized_keys
  • certificates
  • groups
  • keychain_items
  • known_hosts
  • last (logins + logouts)
  • logged_in_users
  • logon_sessions (Windows)
  • password_policy (macOS)
  • security_profile_info (Windows)
  • selinux_*
  • shadow
  • ssh_configs
  • sudoers
  • suid_bin
  • user_events
  • user_groups
  • user_ssh_keys
  • users

Disk encryption

  • disk_encryption
  • bitlocker_info

Networking

  • connectivity
  • dns_cache (Windows)
  • dns_resolvers
  • etc_hosts
  • interface_addresses
  • interface_details
  • interface_ipv6
  • listening_ports
  • process_open_sockets
  • routes
  • wifi_*
  • curl

Firewalls

  • alf
  • alf_exceptions
  • alf_explicit_auths
  • windows_firewall_rules
  • windows_security_center
  • iptables

What's running?

  • crontab
  • scheduled_tasks (Windows)
  • startup_items
  • processes
  • process_*
  • running_apps
  • systemd_units

Logging

  • windows_event_logs
  • syslog_events
  • unified_log
  • asl
  • shell_history

Files (including file integrity monitoring)

  • carves
  • disk_*
  • file
  • file_events
  • hash
  • md_* (software RAID)
  • mdls / mdfind (Spotlight)
  • mounts
  • nfs_shares
  • ntfs_*
  • office_mru
  • process_file_events
  • signature

Containers

  • docker_*
  • lxd_*
sudo ln -s $HOME/.docker/run/docker.sock /var/run/docker.sock

Cloud

  • azure_instance_metadata
  • azure_instance_tags
  • ec2_instance_metadata
  • ec2_instance_tags

configuration

  • plist
  • registry
  • kernel_*

Time

  • time
  • uptime
  • load_average

But wait, with ATC, there's more!

if you need sqlite in your sqlite

But wait, with extensions, there's more!

for example, fleetd adds 30+ tables

Comprehensiveness over consistency.

you might want an abstraction layer on top.

...at least likely join keys tend to be named the same (pid, uid, path)

Nuance #1: Cross joins for users

  • osquery queries as the current user
  • osqueryd generally runs as root
  • Retired: SELECT * FROM chrome_extensions
  • Inspired: SELECT cx.* FROM users CROSS JOIN chrome_extensions cx USING (uid)
  • Some tables work without the cross join, but give incomplete results

 

Bonus tip: Check out the SQLite docs on SELECT and the SQLite tutorial

Nuance #2: Computation effort

  • ORDER/LIMIT tend not to reduce computation effort
  • WHERE, depending on table, does
  • Some tables (e.g. hash) require a WHERE condition
  • For file paths, % == glob *; use %% for recursive subdirectory scans
  • Sometimes you can get a hash more cheaply (e.g. signature table)
  • Sometimes you need subqueries rather than JOINs

Displaying results

  • Provide query as an osqueryi argument to output and quit

  • At the CLI: --json, --csv

  • Inside the shell: .mode line, .mode column

osqueryd: JSON query log to disk

can also push externally over TLS with host identification

Can run scheduled over time with either the entire result or diffs

...right, you were promised suspect systems...

Threat Hunting && Indicators of Compromise

  • SELECT * FROM processes
    WHERE on_disk = 0
  • SELECT COUNT(*) > 0 FROM files
    WHERE path = "/my/malware"
  • SELECT * FROM process_open_sockets
    WHERE remote_address = ""
  • YARA allows for more file contents searching
  • Use diff logging to catch anomalies

Threat Hunting && Indicators of Compromise

Thanks!

Introspecting your machine with osquery - DevOpsDays Austin 2025

By Ian Littman

Introspecting your machine with osquery - DevOpsDays Austin 2025

Somewhere in your organization, maybe on your desk, maybe in your data center, is infrastructure that isn't immutable, and it's helpful to know what's going on with that infrastructure. osquery lets you do just that with SQL, joins and all. We'll take a look at the sorts of data you can pull with osquery, and the use cases that data fits into, ranging from "is this machine set up correctly?" to "is this machine infected?"

  • 499