Contacts

1815 W 14th St, Houston, TX 77008

281-817-6190

BLOG

Insights into Data Excellence

New Math Data Blog: Expertise from Our Engineers

The Winning Mindset

Embracing Total Commitment. Introduction: Success in both engineering and consulting is deeply tied to the mindset of total commitment and…

The Evolution of Demand Response with OpenADR 2.0b

Introduction: The rising demand for energy, fueled by factors such as the adoption of electric vehicles, the constant construction of…

Harnessing the Power of PySpark in DataBricks Delta Live Tables

Explore the integration of PySpark with DataBricks Delta Live Tables. Introduction: Welcome to our exploration of how PySpark integrates with…

Risk Management in Cloud Data Projects: Strategies for Success

Introduction to Risk Management in Cloud Data Projects: Risk management is a critical component of any cloud data project. As…

Effective Project Planning for Cloud Data Engineering

Introduction to Project Planning in Cloud Data Engineering: Project planning is a critical component of successful data engineering projects, especially…

Cultivating a Data-Driven Culture: Leadership Strategies for Success

The Imperative of a Data-Driven Culture in Modern Organizations: In today’s rapidly evolving business landscape, the imperative of fostering a…

Demystifying The Artificial in Artificial Intelligence

The field of Natural Language Processing (NLP) has exploded in popularity in recent years, largely because accessibility has become so…

Transforming Financial Services: The Impact of Generative AI

Generative AI is rapidly transforming the FinTech industry, offering groundbreaking solutions that enhance efficiency, bolster security, and elevate customer engagement.…

Developing Future Leaders in Cloud Data Engineering

Exploring the Path to Leadership in the Evolving Field of Cloud Data Engineering. Introduction: The field of cloud data engineering…

Navigating the Ethical Landscape: Generative AI in Data Engineering

Exploring the integration of generative AI in data engineering and its ethical implications. Introduction: The integration of generative AI into…

Exploring the Future: Generative AI Webinars by Industry Vertical

Introduction to Generative AI and Its Impact Across Different Industries: Generative AI is revolutionizing industries by automating creative processes and…

Augment Your Retrieval: LLMs with Python LangChain and AWS OpenSearch VectorSearch database

Introduction: In this blog post we will introduce vector databases and some of the algorithms used for indexing and show…

Streamlining Talent Acquisition with HireBot, your AI Powered Recruiter

Introduction: HireBot is a chatbot designed to streamline the initial screening process of candidate resumes and profiles for recruiters and…

Data Quality Monitoring in AWS SageMaker

First things first, what is data quality monitoring? Data quality monitoring for machine learning can generally be thought of from…

Accurate by Design: Advanced Data Quality on AWS

Introduction: Data pipelines in AWS orchestrate the movement and transformation of data across various AWS services. The core objective of…

The Problem of Overfitting in Machine Learning

By Lena Qian. Introduction: Machine learning stands as a pivotal element in contemporary data science, fundamentally altering the landscape of…

Unleashing Potential: High-Performing Cloud Data Engineering Teams

Introduction: The ability to efficiently process, store, and analyze vast amounts of data in real-time is not just a competitive…

The Art of Collaboration in Distributed Teams

Introduction: In today’s rapidly evolving technological landscape, cloud data engineering projects are at the forefront of innovation, driving businesses towards…

Unshakable Cloud Foundations: Elevate Your AWS with Integration Testing

The final installment of our blog series on AWS testing methodologies focuses on integration testing. This crucial phase ensures that…

Turbocharge your functional tests with LocalStack for AWS

A deep dive into functional testing for AWS development. Introduction: In our exploration of advanced testing techniques for AWS development,…

Advanced Unit Testing in AWS

Leveraging Moto and Pytest. Introduction: In the world of AWS development, ensuring the reliability, efficiency, and correctness of your cloud-based…

Reliability by design: Implementing Test Driven Development Strategies in Python Data Engineering

Introduction: In the rapidly evolving field of data engineering, maintaining high-quality, reliable, and efficient data pipelines is crucial for businesses…

LLMs and chatbots: a brief update

Generally and historically, data engineering, analytics, and science efforts focused on progressing from data to knowledge/wisdom. The emergence of LLMs…

Agile ‘thin slice’ technique: Explained

Introduction: In today’s fast-paced development environment, the Agile methodology stands out for its emphasis on delivering functional features to users…

Data Modeling for Developers

An Introduction to Data Modeling and Why it Matters for Development Teams: Data modeling is a critical yet often underrated…

Mounting EFS Volume to Batch Jobs in AWS

Introduction: In the realm of distributed computing and batch processing, operational challenges frequently arise that necessitate innovative solutions. A particular…

Green with Envy: Improving Python Performance with a Sprinkling of Feature Envy

Python Performance: Issue 2 – Feature Envy. Previous Issue Recap: In the previous issue we discussed the differences between the…

Clean Code’s Hidden Impact: Unraveling the Python Performance Paradox

Python Performance: Issue 1 – The Polymorphism Rule. Welcome to Python Performance: Welcome to the Python Performance blog series. In…

Revolutionizing Data Management in AWS: The Case for Apache Iceberg Over Traditional Table Formats

Introduction: In the digital era, where data is king, the choice of table format for data storage and processing is…

Serverless, Fan-out Architecture Using SNS, SQS, and Lambda

Case Study: AWS re:Invent 2023 featured a lab session on building out a serverless architecture that used SNS, SQS, and Lambda.…

Data Engineering Methodology: From requirements to hand-off

Introduction: Joining or starting data projects in large enterprise environments with many stakeholders can be stressful, not to mention a…

Remote Development in Sagemaker Studio with VS Code

Disclaimer about Changes to Sagemaker Studio: As of Nov. 30, 2023, there have been major changes to Sagemaker Studio. Existing…

The Data Journey

Many organizations share similar challenges with growing their operational capabilities with data. I have given several talks on data lake…

Boost AI Fairness and Explainability with Amazon SageMaker Clarify

From hiring decisions to loan approvals and even healthcare recommendations, machine learning (ML) impacts our lives daily. Fairness and explainability…

DBT and Databricks part 3: Loading noSQL data (from MongoDB) into Databricks

This series of blog posts will illustrate how to use DBT with Azure Databricks: set up a connection profile, work…

DBT and Databricks Part 2: Working with python models

This series of blog posts will illustrate how to use DBT with Azure Databricks: set up a connection profile, work…

DBT and Databricks Part 1: Setting up DBT profile for connecting to Azure Databricks using…

This series of blog posts will illustrate how to use DBT with Azure Databricks: set up a connection profile, work…

Running DBT on Databricks while using dbt_external_tables package to utilize Snowflake Tables

This article highlights a specific use case where one might need to run dbt on Databricks while utilizing tables in…

External Knowledge Base for LLMs: Leveraging Retrieval Augmented Generation Framework with AWS…

Introduction: In the realm of artificial intelligence and language models, the pursuit of enhancing their capabilities is a constant endeavor.…

Use AWS Bedrock language models with a Slack-powered chatbot

Co-authors: Meghana Venkataswamy, Sean Cahill, Salman Ahmed Mian. Architecture. What is a Foundational Model? How Do we customize the FM…

Harnessing the Power of Large Language Models for Knowledge Graph Creation

· The role of large language models in creating knowledge graphs from unstructured data. · Comparison of Top Models ·…

Migrating Apache Cassandra to AWS Keyspaces

Apache Cassandra is an open-source, NoSQL database with a distributed architecture to maximize availability and reliability. Due to the relative…

Creating Custom Hadoop Events in EventBridge on AWS

Introduction: Event driven architecture as defined by Amazon is a system whose architecture uses events to trigger and communicate between…