Databricks Apps Development
FIRST: Use the parent databricks-core skill for CLI basics, authentication, and profile selection.
For data UI design (required for any data-displaying app): if the app shows ANY data — a dashboard, KPI/overview page, report, chart, table, query results, OR a conversational / chat / Genie natural-language assistant — you MUST use the databricks-app-design skill (alongside this one) to decide layout, charts, KPIs, semantic color, required states, and AI-result trust, and map them to AppKit components. This includes chat/Genie apps, not just dashboards — if in doubt, use it.
Build apps that deploy to the Databricks Apps platform.
Required Reading by Phase
| Phase | READ BEFORE proceeding |
|---|---|
| Scaffolding | ⚠️ STOP — review the State Storage Guidance and complete the Data Access Decision Gate below before scaffolding. Parent databricks-core skill (auth, warehouse discovery); then run databricks apps manifest + databricks apps init with --features and --set (see AppKit section below) |
| Writing SQL queries | SQL Queries Guide |
| Writing UI components | Frontend Guide |
| Using useAnalyticsQuery | AppKit SDK |
| Adding API endpoints | Custom Endpoints Guide |
| Using Lakebase (OLTP database) | Lakebase Guide |
| Adding Genie chat / Genie-powered apps | Genie Guide — follow the Genie agent workflow below |
| Using Model Serving (ML inference) | Model Serving Guide |
| Typed data contracts (proto-first design) | Proto-First Guide and Plugin Contracts |
| Managing files in UC Volumes | Files Guide |
| Triggering / monitoring Lakeflow Jobs from the app | Jobs Guide |
| Platform rules (permissions, deployment, limits) | Platform Guide — READ for ALL apps including AppKit |
| Non-AppKit app (Streamlit, FastAPI, Flask, Gradio, Next.js, etc.) | Other Frameworks |
Generic Guidelines
- App name: ≤26 characters, lowercase letters/numbers/hyphens only (no underscores). dev- prefix adds 4 chars, max 30 total.
- Validation: databricks apps validate --profile <PROFILE> before deploying.
- Smoke tests (AppKit only): ALWAYS update tests/smoke.spec.ts selectors BEFORE running validation. The default template checks for a "Minimal Databricks App" heading and "hello world" text; these WILL fail in your custom app. See testing guide.
- Smoke test selectors: use only Playwright locator APIs: getByRole, getByText, getByPlaceholder, getByLabel. getByLabelText does not exist in Playwright (it is a React Testing Library method) and throws TypeError at runtime. See testing guide or npx playwright codegen.
- Smoke test data: keep result sets under the 1 MB analytics-event payload cap. Queries returning thousands of rows cause INVALID_REQUEST: Event exceeds max size of 1048576 bytes and net::ERR_ABORTED, leaving every asserted UI element absent. Use LIMIT or an aggregated query (e.g. COUNT(*) GROUP BY status), never raw row dumps.
- AppKit version: never override the @databricks/appkit or @databricks/appkit-ui version in package.json; databricks apps init sets the correct version. Do not run npm install @databricks/appkit@<version> unless explicitly asked by the user. If you need a different version, re-scaffold with databricks apps init --version <version>.
- Authentication: covered by parent databricks-core skill.
- AppKit API surface: before writing code that calls AppKit APIs (createApp, plugin shapes, useAnalyticsQuery, etc.), run npx @databricks/appkit docs <section> and use the actual signature. Training data has stale shapes; a single invented signature fails tsc --noEmit during validate. The docs ship with the installed AppKit and are the authoritative source.
- TypeScript casts: never use as unknown as <T> double-assertions; appkit lint enforces no-double-type-assertion, and one violation fails the entire validate step. Instead: narrow with Zod (z.infer<typeof schema>), use a runtime type guard, or write a typed mapper function. If a query result needs reshaping, type the row schema via queryKey types rather than casting.
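The runtime type guard alternative to a double assertion might look like the sketch below. The StatusRow shape and the isStatusRow and toStatusRows names are hypothetical, chosen only for illustration; they are not AppKit types or APIs, and the sketch assumes TypeScript 4.9 or later for `in` narrowing.

```typescript
// Hypothetical row shape for illustration; not an AppKit type.
interface StatusRow {
  status: string;
  count: number;
}

// Runtime type guard: narrows `unknown` to StatusRow without any cast.
function isStatusRow(value: unknown): value is StatusRow {
  if (typeof value !== "object" || value === null) return false;
  return (
    "status" in value &&
    typeof value.status === "string" &&
    "count" in value &&
    typeof value.count === "number"
  );
}

// Typed mapper: keeps rows that pass the guard and drops everything else.
function toStatusRows(rows: unknown[]): StatusRow[] {
  return rows.filter(isStatusRow);
}
```

Because the guard is a type predicate, filter returns StatusRow[] directly, so no assertion is needed at the call site.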
Project Structure (after databricks apps init --features analytics)
- client/src/App.tsx: main React component (start here)
- config/queries/*.sql: SQL query files (queryKey = filename without .sql)
- server/server.ts: backend entry (onPluginsReady + Express routes)
- tests/smoke.spec.ts: smoke test (⚠️ MUST UPDATE selectors for your app)
- client/src/appKitTypes.d.ts: auto-generated types (npm run typegen)
Project Structure (after databricks apps init --features lakebase)
- server/server.ts: backend with Lakebase pool + Express routes
- client/src/App.tsx: React frontend
- app.yaml: manifest with database resource declaration
- package.json: includes @databricks/lakebase dependency
- Note: there is no config/queries/ directory. Lakebase apps use appkit.lakebase.query() in Express routes, not SQL files
Data Discovery
Before writing any SQL, use the parent databricks-core skill for data exploration — search information_schema by keyword, then batch discover-schema for the tables you need. Do NOT skip this step.
State Storage Guidance (evaluate BEFORE the Decision Gate):
If the user's app description involves storing or persisting data — forms, CRUD operations, user submissions, orders, todos, or other user-generated content — the app likely needs a Lakebase database.
- Ask the user whether the app needs persistent storage (Lakebase) before scaffolding. Do not silently add Lakebase.
- If confirmed, use the databricks-lakebase skill to create a Lakebase project and obtain the branch and database resource names.
- Scaffold with --features lakebase and pass --set lakebase.postgres.branch=<BRANCH_NAME> --set lakebase.postgres.database=<DATABASE_NAME>.
- If the app also reads from Unity Catalog tables, proceed to the Data Access Decision Gate below to determine whether to add --features analytics or use Lakebase synced tables.
Do NOT add Lakebase to analytics, dashboard, or visualization apps unless the user explicitly requests persistent write-back storage. Read-only data display, filters, and preferences do not require a database.
Development Workflow (FOLLOW THIS ORDER)
Data Access Decision Gate (REQUIRED before scaffolding):
If the app reads from Unity Catalog / lakehouse tables, you MUST show the comparison below to the user and ask them to choose. Do not skip this. Do not choose for them.
| | (A) Lakebase synced tables | (B) Analytics |
|--|---|---|
| Speed | Sub-second responses | Takes a few seconds |
| Best for | Full-text search, typeahead, autocomplete, real-time lookups, operational apps | Dashboards, charts, aggregations, KPIs, filtered queries, browsing |
| How it works | Data synced from Delta into Lakebase Postgres | Queries run on SQL warehouse at read time |
After showing the table, add a brief recommendation. Default to recommending Analytics (B) for most read-only apps — dashboards, charts, filtered queries, browsing, and aggregations. Recommend Lakebase synced tables (A) only when the app needs sub-second latency for full-text search, typeahead/autocomplete, real-time lookups by ID, or operational data serving. Note: "search" or "filter" in a prompt usually means SQL WHERE clauses (Analytics), not full-text search (Lakebase). Always let the user make the final call.
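The recommendation rule above can be sketched as a small function. The AccessNeeds flags and the recommendDataAccess name are hypothetical and exist only to make the default explicit; the result is a suggestion to present to the user, not a decision made on their behalf.

```typescript
// Hypothetical flags describing what the app needs; names are illustrative only.
interface AccessNeeds {
  fullTextSearch?: boolean;
  typeahead?: boolean;
  realTimeLookupById?: boolean;
  operationalServing?: boolean;
}

type Recommendation = "lakebase-synced-tables" | "analytics";

// Default to Analytics (B). Suggest Lakebase synced tables (A) only when a
// sub-second access pattern is present. The user still makes the final call.
function recommendDataAccess(needs: AccessNeeds): Recommendation {
  const needsSubSecond =
    needs.fullTextSearch === true ||
    needs.typeahead === true ||
    needs.realTimeLookupById === true ||
    needs.operationalServing === true;
  return needsSubSecond ? "lakebase-synced-tables" : "analytics";
}
```

A prompt that only mentions filtering or searching by column values sets none of these flags, so it falls through to Analytics.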
After the user chooses:
- (A) Lakebase synced tables → scaffold with --features lakebase. See Lakebase Guide for full workflow.
- (B) Analytics → scaffold with --features analytics.
- Both → scaffold with --features analytics,lakebase if the app needs both patterns.
- If the app does NOT read Unity Catalog data (pure CRUD, Genie, Model Serving), skip this gate and scaffold with the appropriate --features flag.
Analytics apps (--features analytics):
- Create SQL files in config/queries/
- Run npm run typegen and verify all queries show ✓
- Read client/src/appKitTypes.d.ts to see generated types
- THEN write App.tsx using the generated types
- Update tests/smoke.spec.ts selectors
- Run databricks apps validate --profile <PROFILE>
DO NOT write UI code before running typegen — types won't exist and you'll waste time on compilation errors.
Lakebase apps (--features lakebase): No SQL files or typegen. See Lakebase Guide for the onPluginsReady pattern: initialize schema at startup, register Express routes in server/server.ts, then build the React frontend.
When to Use What
After completing the decision gate above, use this routing table:
- Read analytics data → display in chart/table: Use visualization components with queryKey prop
- Read analytics data → custom display (KPIs, cards): Use useAnalyticsQuery hook
- Read analytics data → need computation before display: Still use useAnalyticsQuery, transform client-side
- Read lakehouse data at low latency (lookups, search, catalogs): Use Lakebase synced tables (see Lakebase Guide)
- Read/write persistent data (users, orders, CRUD state): Use Lakebase via Express routes in onPluginsReady (see Lakebase Guide)
- Natural language query interface over tables (Genie): Use genie() plugin (see Genie Guide)
- Call ML model endpoint: Use serving() plugin (see Model Serving Guide)
- Trigger or monitor a Lakeflow Job from the app: Use the jobs() plugin (see Jobs Guide)
- ⚠️ NEVER add custom endpoints to run SELECT queries against the warehouse; always use SQL files in config/queries/
- ⚠️ NEVER use useAnalyticsQuery for Lakebase data; it queries the SQL warehouse only
Frameworks
AppKit (Recommended)
TypeScript/React framework with type-safe SQL queries and built-in components.
Official Documentation — the source of truth for all API details:
npx @databricks/appkit docs # ← ALWAYS start here to see available pages
npx @databricks/appkit docs <query> # view a section by name or doc path
npx @databricks/appkit docs --full # full index with all API entries
npx @databricks/appkit docs "appkit-ui API reference" # example: section by name
npx @databricks/appkit docs ./docs/plugins/analytics.md # example: specific doc file
DO NOT guess doc paths. Run without args first, pick from the index. The <query> argument accepts both section names (from the index) and file paths. Docs are the authority on component props, hook signatures, and server APIs — skill files only cover anti-patterns and gotchas.
App Manifest and Scaffolding
Agent workflow for scaffolding: get the manifest first, then build the init command.
- Get the manifest (JSON schema describing plugins and their resources):
databricks apps manifest --profile <PROFILE>
# See plugins available in a specific AppKit version:
databricks apps manifest --version <VERSION> --profile <PROFILE>
# Custom template:
databricks apps manifest --template <GIT_URL> --profile <PROFILE>
The output defines:
- Plugins: each has a key (plugin ID for --features), plus requiredByTemplate, and resources.
- requiredByTemplate: If true, that plugin is mandatory for this template. Do not add it to --features (it is included automatically); you must still supply all of its required resources via --set. If false or absent, the plugin is optional: add it to --features only when the user's prompt indicates they want that capability (e.g. analytics/SQL), and then supply its required resources via --set.
- Resources: Each plugin has resources.required and resources.optional (arrays). Each item has resourceKey and fields (object: field name → description/env). Use --set <plugin>.<resourceKey>.<field>=<value> for each required resource field of every plugin you include.
- Scaffold (DO NOT use npx; use the CLI only):
databricks apps init --name <NAME> --features <plugin1>,<plugin2> \
--set <plugin1>.<resourceKey>.<field>=<value> \
--set <plugin2>.<resourceKey>.<field>=<value> \
--description "<DESC>" --run none --profile <PROFILE>
# --run none: skip auto-run after scaffolding (review code first)
# With custom template:
databricks apps init --template <GIT_URL> --name <NAME> --features ... --set ... --profile <PROFILE>
Optionally use --version <VERSION> to target a specific AppKit version.
- Required: --name, --profile. Name: ≤26 chars, lowercase letters/numbers/hyphens only. Use --features only for optional plugins the user wants (plugins with requiredByTemplate: false or absent); mandatory plugins must not be listed in --features.
- Resources: Pass --set for every required resource (each field in resources.required) for (1) all plugins with requiredByTemplate: true, and (2) any optional plugins you added to --features. Add --set for resources.optional only when the user requests them.
- Discovery: Use the parent databricks-core skill to resolve IDs (e.g. warehouse: databricks warehouses list --profile <PROFILE> or databricks experimental aitools tools get-default-warehouse --profile <PROFILE>).
DO NOT guess plugin names, resource keys, or property names — always derive them from databricks apps manifest output. Example: if the manifest shows plugin analytics with a required resource resourceKey: "sql-warehouse" and fields: { "id": ... }, include --set analytics.sql-warehouse.id=<ID>.
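One way to derive the list of required --set keys from parsed manifest JSON is sketched below. The ManifestPlugin and ManifestResource interfaces are an assumed, simplified shape based only on the fields named in this section (key, requiredByTemplate, resources.required, resourceKey, fields); the real manifest may nest or name things differently, so check actual databricks apps manifest output first. The requiredSetKeys helper and the sample data are hypothetical.

```typescript
// Assumed, simplified manifest shape; verify against real
// `databricks apps manifest` output before relying on it.
interface ManifestResource {
  resourceKey: string;
  fields: Record<string, unknown>;
}

interface ManifestPlugin {
  key: string;
  requiredByTemplate?: boolean;
  resources: { required: ManifestResource[]; optional: ManifestResource[] };
}

// Lists every `<plugin>.<resourceKey>.<field>` that needs a --set value:
// required resources of mandatory plugins plus those of the chosen features.
function requiredSetKeys(plugins: ManifestPlugin[], features: string[]): string[] {
  return plugins
    .filter((p) => p.requiredByTemplate === true || features.includes(p.key))
    .flatMap((p) =>
      p.resources.required.flatMap((r) =>
        Object.keys(r.fields).map((field) => `${p.key}.${r.resourceKey}.${field}`),
      ),
    );
}

// Hypothetical manifest fragment mirroring the analytics example above.
const samplePlugins: ManifestPlugin[] = [
  {
    key: "analytics",
    resources: {
      required: [{ resourceKey: "sql-warehouse", fields: { id: {} } }],
      optional: [],
    },
  },
];

console.log(requiredSetKeys(samplePlugins, ["analytics"]));
```

Each returned key still needs a real value, resolved through discovery rather than guessed.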
Scaffolding Rules Protocol — databricks apps manifest may emit scaffolding.rules at the template level (top-level scaffolding.rules) and on individual plugins (plugins[].scaffolding.rules). Each block has must / should / never arrays of short directive strings. Consume them as follows:
- Gather: for every plugin in your final --features list AND every plugin with requiredByTemplate: true, read plugins[].scaffolding.rules. Union those with the top-level template scaffolding.rules into one working set, tagged by source (template vs <plugin>).
- Precedence: manifest rules override the directives baked into this skill. Where the manifest is silent on a topic, this skill's content is the floor.
- Phase ordering: rules whose text begins with Before init MUST be executed before databricks apps init. Rules beginning with After init MUST be executed after init completes (e.g. migrations, typegen, connectivity checks). Rules without a phase prefix apply throughout the scaffold/develop loop.
- Conflict detection: if a plugin must rule contradicts a template never rule on the same target (or vice versa), STOP and ask the user which to follow before proceeding. Do not silently pick one. Treat must vs never on the same action as a conflict; should is advisory and does not block.
- Reporting: before running databricks apps init, surface the merged working set to the user grouped by phase (Before init / After init / Always) and by severity (must / should / never), so the active guardrails are explicit.
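The gather and phase-ordering steps above can be sketched as follows. The RuleBlock shape (optional must / should / never string arrays) follows the description in this section, but the exact manifest JSON is an assumption, and the TaggedRule, phaseOf, and gatherRules names are hypothetical. Conflict detection is deliberately left out because deciding whether two directives hit the same target needs human judgment.

```typescript
type Severity = "must" | "should" | "never";
type Phase = "Before init" | "After init" | "Always";

// Assumed shape of one scaffolding.rules block: arrays of directive strings.
type RuleBlock = Partial<Record<Severity, string[]>>;

interface TaggedRule {
  source: string; // "template" or a plugin key
  severity: Severity;
  phase: Phase;
  text: string;
}

// Phase is inferred from the directive's leading words.
function phaseOf(text: string): Phase {
  if (text.startsWith("Before init")) return "Before init";
  if (text.startsWith("After init")) return "After init";
  return "Always";
}

// Unions template-level and per-plugin rule blocks into one tagged working set.
function gatherRules(blocks: Record<string, RuleBlock>): TaggedRule[] {
  const severities: Severity[] = ["must", "should", "never"];
  return Object.entries(blocks).flatMap(([source, block]) =>
    severities.flatMap((severity) =>
      (block[severity] ?? []).map((text) => ({
        source,
        severity,
        phase: phaseOf(text),
        text,
      })),
    ),
  );
}
```

Grouping the returned array by phase and then by severity gives the report to show the user before running init.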
READ AppKit Overview for project structure, workflow, and pre-implementation checklist.
Genie Agent Workflow — when the user wants a Genie-powered app, do not start by asking for a Genie Space ID. Instead:
- Ask which Unity Catalog tables the app should query (fully qualified: catalog.schema.table).
- Ask whether to reuse an existing Genie space or create a new one.
- If creating: discover the warehouse, then create the space with databricks genie create-space (see Genie Guide for syntax and serialized space format).
- If reusing: discover existing spaces with databricks genie list-spaces --profile <PROFILE> and let the user pick.
- Scaffold or wire the space ID into the app; derive --set keys from databricks apps manifest.
Read the Genie Guide for configuration, SSE endpoints, and frontend integration.
Common Scaffolding Mistakes
# ❌ WRONG: name is NOT a positional argument
databricks apps init --features analytics my-app-name
# → "unknown command" error
# ✅ CORRECT: use --name flag
databricks apps init --name my-app-name --features analytics --set "..." --profile <PROFILE>
Directory Naming
databricks apps init creates directories in kebab-case matching the app name. App names must use only lowercase letters, numbers, and hyphens (≤26 chars).
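A quick pre-check of a candidate name against the stated rule could look like this sketch. The APP_NAME_PATTERN constant and isValidAppName helper are hypothetical; they encode only the constraints named in this document (at most 26 characters; lowercase letters, numbers, and hyphens), and the platform may enforce further rules that are not checked here.

```typescript
// Mirrors the stated rule only: 1 to 26 characters drawn from lowercase
// letters, digits, and hyphens. Other platform constraints are not checked.
const APP_NAME_PATTERN = /^[a-z0-9-]{1,26}$/;

function isValidAppName(name: string): boolean {
  return APP_NAME_PATTERN.test(name);
}
```

Running the check before databricks apps init avoids a failed scaffold caused by an underscore or an overlong name.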
Other Frameworks (Streamlit, FastAPI, Flask, Gradio, Dash, Next.js, etc.)
Databricks Apps supports any framework that runs as an HTTP server. LLMs already know these frameworks — the challenge is Databricks platform integration.
READ Other Frameworks Guide BEFORE building any non-AppKit app. It covers port/host configuration, app.yaml and databricks.yml setup, dependency management, networking, and framework-specific gotchas.
Post-Deploy Verification
After deploying, verify the app is running:
databricks apps get <app-name> --profile <PROFILE> -o json # Check app_status.state: RUNNING
databricks apps logs <app-name> --follow --profile <PROFILE> # Stream live logs (Ctrl+C to stop)
Note: databricks apps logs requires OAuth authentication and does not work with PAT. Use databricks apps get for status checks if using PAT auth.
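For scripted status checks, the JSON from databricks apps get can be inspected as in the sketch below. It relies only on the app_status.state field mentioned above and ignores the rest of the payload; the isAppRunning helper name is hypothetical, and the sketch assumes TypeScript 4.9 or later for `in` narrowing.

```typescript
// Reads app_status.state from the JSON printed by `databricks apps get -o json`.
// Only that one field is assumed; everything else in the payload is ignored.
function isAppRunning(getOutputJson: string): boolean {
  const parsed: unknown = JSON.parse(getOutputJson);
  if (typeof parsed !== "object" || parsed === null) return false;
  if (!("app_status" in parsed)) return false;
  const status = parsed.app_status;
  if (typeof status !== "object" || status === null) return false;
  return "state" in status && status.state === "RUNNING";
}
```

Anything other than RUNNING, including a missing app_status object, is treated as not ready, which is the conservative choice for a deploy gate.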

