
September Databricks Updates

Clean Room Monitoring, VS Code Extension, and System Tables GA

Introduction

As organizations continue to scale their data and AI operations, effective monitoring, streamlined development, and transparent billing have become crucial for maintaining efficiency. This September, Databricks has introduced several powerful updates designed to address these needs. From the ability to track clean room usage in billing data to the general availability of system tables and a Visual Studio Code extension, these enhancements offer greater control and flexibility. In this post, we’ll dive into the key announcements and explore how they can improve your Databricks workflows.

1: Monitoring Clean Room Usage in the Billable Usage Table

One of the standout updates this month is the addition of clean room cost tracking in the Databricks system.billing.usage table. This enhancement introduces the usage_metadata.central_clean_room_id field, allowing users to monitor costs specifically tied to clean room usage.

Key Benefits:

  • Granular Cost Visibility: With the new field, organizations can track the exact costs incurred from clean room operations, providing more transparency and control over billing.
  • Improved Cost Allocation: This feature is especially useful for teams working with sensitive or regulated data in shared environments, where tracking clean room costs separately can be critical for budgeting and financial reporting.

Example Use Case:
Imagine a scenario where a company operates multiple clean rooms for different departments or clients. By querying the system.billing.usage table, they can easily filter by central_clean_room_id to isolate costs per clean room, enabling more accurate cost allocation. This can be crucial for cross-departmental budgeting or client billing.
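On Databricks itself this would be a SQL query against system.billing.usage, grouping on usage_metadata.central_clean_room_id. The sketch below mimics that aggregation locally with sqlite3 and a deliberately simplified, hypothetical schema (the real table nests the ID inside a usage_metadata struct and records usage per SKU), so the shape of the query is illustrative rather than a copy of the real table:

```python
import sqlite3

# Hypothetical, simplified stand-in for system.billing.usage: the real table
# stores the clean room ID inside a usage_metadata struct; here it is
# flattened into a plain column so the example runs locally.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE usage (
        central_clean_room_id TEXT,
        usage_quantity REAL
    )
""")
conn.executemany(
    "INSERT INTO usage VALUES (?, ?)",
    [
        ("room-finance", 12.5),
        ("room-finance", 7.5),
        ("room-marketing", 4.0),
        (None, 30.0),  # usage unrelated to any clean room
    ],
)

# Aggregate billable usage per clean room, ignoring non-clean-room rows.
rows = conn.execute("""
    SELECT central_clean_room_id, SUM(usage_quantity) AS total_dbus
    FROM usage
    WHERE central_clean_room_id IS NOT NULL
    GROUP BY central_clean_room_id
    ORDER BY total_dbus DESC
""").fetchall()

for room_id, total in rows:
    print(f"{room_id}: {total} DBUs")
```

The same GROUP BY pattern, run against the real system table, is what makes per-clean-room chargeback reports a single query rather than a manual reconciliation exercise.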

For a deeper dive into how you can query and interpret these usage details, check out the Billable usage system table reference.

2: Databricks Extension for Visual Studio Code Reaches General Availability

Databricks developers can now benefit from a more integrated development experience with the general availability of the Databricks extension for Visual Studio Code (VS Code). This extension brings Databricks’ powerful capabilities directly into your favorite IDE, enabling a smoother, more efficient development workflow.

Key Features:

  • Remote Workspace Integration: Seamlessly connect to your Databricks workspaces from VS Code, making it easier to manage and interact with your Databricks environment.
  • Develop and Deploy Asset Bundles: Define, deploy, and run Databricks Asset Bundles right from the IDE, allowing for a more organized approach to managing jobs and clusters.
  • Notebook Debugging and Job Execution: Debug Databricks notebooks and run them as jobs directly within VS Code, improving the testing and deployment process.
  • Code Synchronization: Automatically sync your local code with your Databricks workspace, keeping everything up-to-date without the need for manual uploads.
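The Asset Bundles mentioned above are defined declaratively in a databricks.yml file that the extension can deploy for you. As a rough illustration (the bundle, job, and notebook names here are placeholders, not part of the announcement), a minimal bundle definition might look like:

```yaml
# databricks.yml -- minimal, illustrative bundle definition.
# Names and paths are hypothetical examples.
bundle:
  name: monthly_reports

targets:
  dev:
    mode: development
    default: true

resources:
  jobs:
    nightly_report:
      name: nightly-report
      tasks:
        - task_key: build_report
          notebook_task:
            notebook_path: ./notebooks/report
```

With a file like this in your project, the extension's deploy and run actions operate on the bundle as a unit, keeping job definitions versioned alongside the code they run.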

Why It Matters: The general availability of this extension offers significant productivity gains, especially for developers who work across multiple environments. Instead of switching between Databricks and your local development tools, you can now handle everything in one place – speeding up development, debugging, and deployment workflows.

Getting Started: Setting up the Databricks extension in VS Code is straightforward. You can install the extension from the Visual Studio Code Marketplace and connect to your Databricks workspace in just a few steps. Check out the official guide to get started quickly and unlock these new capabilities.

3: Foundation Model APIs Pay-Per-Token Now Available in Europe

Databricks is continuing to expand the availability of its Foundation Model APIs, with pay-per-token billing now available in the eu-central-1 and eu-west-1 regions. This offering brings flexibility and cost-efficiency to European customers leveraging large language models (LLMs) and other generative AI capabilities.

Key Benefits:

  • Flexible Pricing: The pay-per-token model means you only pay for the tokens you consume, making it easier to scale usage based on specific needs, whether for low-volume experiments or high-demand production workloads.
  • Availability in Key Regions: With access in eu-central-1 (Frankfurt) and eu-west-1 (Ireland), European businesses can now leverage these APIs with reduced latency, making model serving faster and more efficient.

Use Cases: For organizations experimenting with LLMs or other foundation models, this pricing structure is particularly beneficial. Teams that are running smaller-scale or pilot projects can take advantage of the pay-per-token model to test and fine-tune models without the upfront commitment of paying for full capacity. Additionally, enterprises with large-scale, production-grade AI workloads can better manage and predict costs as they scale.
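To see why per-token billing suits pilot projects, it helps to put toy numbers on it. The sketch below is a back-of-the-envelope cost estimator; the per-token rates are hypothetical placeholders, not real Databricks prices, so always check the current pricing page for your region and model:

```python
# Toy cost estimator for pay-per-token billing. The rates below are
# hypothetical placeholders, NOT actual Databricks prices.
RATES_PER_1K_TOKENS = {
    "input": 0.0005,   # hypothetical $/1K input tokens
    "output": 0.0015,  # hypothetical $/1K output tokens
}

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for one request."""
    cost = (input_tokens / 1000) * RATES_PER_1K_TOKENS["input"]
    cost += (output_tokens / 1000) * RATES_PER_1K_TOKENS["output"]
    return round(cost, 6)

# A small pilot: 1,000 requests averaging 500 input / 200 output tokens.
per_request = estimate_cost(500, 200)
print(f"per request: ${per_request}, 1,000 requests: ${per_request * 1000:.2f}")
```

At rates like these, a thousand-request pilot costs well under a dollar, which is exactly the low-commitment entry point that provisioned capacity cannot offer.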

Expanding Global AI Capabilities: As the demand for AI-driven solutions grows globally, this expanded availability of Foundation Model APIs brings powerful model-serving capabilities closer to European businesses. Organizations can now deploy models more efficiently and effectively to serve their European customers.

See Model serving feature availability.

4: System Tables Platform Now Generally Available

With the general availability of the Databricks system tables platform, users now have a powerful toolset for tracking and managing their Databricks usage and billing data in a structured and queryable format. This includes the highly anticipated system.billing.usage and system.billing.list_prices tables, which offer deeper insights into resource consumption and costs.

Key Benefits:

  • Centralized Billing Data: The system.billing.usage table provides a centralized view of your Databricks usage, enabling better tracking and analysis of where your resources are being consumed.
  • Transparent Pricing Information: The system.billing.list_prices table allows users to directly query the list prices for various Databricks services, making it easier to predict and manage costs over time.
  • Real-time Insights: These system tables are designed to be updated in near real-time, offering up-to-date information for better decision-making around resource allocation and financial planning.

Real-World Applications: Consider a scenario where a data engineering team is managing multiple clusters and workloads across projects. By querying the system.billing.usage table, they can drill down into specific costs, identifying which projects or departments are using the most resources. In combination with the system.billing.list_prices table, this insight allows for more informed budgeting, resource optimization, and even discussions with stakeholders about cost management.

How to Use System Tables: Getting started with system tables is straightforward. You can access these tables using SQL queries from within your Databricks workspace. For example, running a simple query on system.billing.usage can give you immediate insights into the costs associated with each cluster, job, or even user.
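The usage-joined-to-price pattern described above can be sketched locally. The snippet below mocks drastically simplified versions of the usage and list-price tables in sqlite3 (the real system tables have many more columns, such as workspace_id, usage_date, and nested pricing structs, and the SKU names and prices here are hypothetical), purely to show the shape of the join:

```python
import sqlite3

# Simplified, hypothetical stand-ins for system.billing.usage and
# system.billing.list_prices; real schemas are much richer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE usage (sku_name TEXT, usage_quantity REAL);
    CREATE TABLE list_prices (sku_name TEXT, price_per_unit REAL);
    INSERT INTO usage VALUES
        ('JOBS_COMPUTE', 100.0),
        ('SQL_COMPUTE', 40.0),
        ('JOBS_COMPUTE', 25.0);
    -- Placeholder prices, not real Databricks rates.
    INSERT INTO list_prices VALUES
        ('JOBS_COMPUTE', 0.25),
        ('SQL_COMPUTE', 0.5);
""")

# Join usage with list prices to estimate spend per SKU.
rows = conn.execute("""
    SELECT u.sku_name, SUM(u.usage_quantity * p.price_per_unit) AS est_cost
    FROM usage u
    JOIN list_prices p ON u.sku_name = p.sku_name
    GROUP BY u.sku_name
    ORDER BY est_cost DESC
""").fetchall()

for sku, cost in rows:
    print(f"{sku}: ${cost:.2f}")
```

Run against the real system tables, the same join turns raw consumption records into an estimated spend breakdown that can anchor budgeting conversations.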

See Monitor usage with system tables.

Conclusion

The recent updates from Databricks reflect their continued commitment to improving developer workflows, cost transparency, and AI capabilities. With the introduction of clean room cost tracking in billing, the general availability of system tables, and the new pay-per-token model for Foundation Model APIs in Europe, teams have more tools than ever to monitor usage, manage costs, and deploy AI at scale. The launch of the Databricks extension for Visual Studio Code further streamlines development, offering a powerful, integrated environment for managing Databricks resources.

Whether you’re a data engineer looking to optimize resource allocation, a developer seeking a more efficient coding environment, or a team experimenting with AI models, these updates offer something for everyone. Be sure to explore these new features to take full advantage of the enhanced capabilities Databricks provides for modern data and AI operations.