Description
Astronomer Cosmos Version
1.9
dbt-core version
1.9
Versions of dbt adapters
dbt-core==1.9.4
PyYAML==6.0.2
dbt-databricks==1.9.1
protobuf>=5.0,<6.0
pandas==2.1.4
numpy>=1.24.0,<2.0.0
LoadMode
DBT_MANIFEST
ExecutionMode
VIRTUALENV
InvocationMode
None
airflow version
2.10.4
Operating System
Debian GNU/Linux 12 (bookworm)
If you think it's a UI issue, what browsers are you seeing the problem on?
No response
Deployment
Docker-Compose
Deployment details
This is not the actual deployment environment, but one I use for local development with a similar configuration. The error happens both in this environment and in prod.
It works in 1.7, but it does not work in releases from 1.8 onward.
What happened?
If I upgrade Cosmos from 1.7 to 1.8, the DAG stops working.
Relevant log output
[2025-08-21, 18:17:40 UTC] {subprocess.py:69} INFO - Running command: ['/tmp/cosmos-venva7vve1ev/bin/dbt', 'deps', '--project-dir', '/tmp/tmp_ci86hfh', '--profiles-dir', '/tmp/cosmos/profile/8b9ccb06a9d87f13f6aef626b29de43b249ab511fe2b4e4ec1b2f6dc2bd7f236', '--profile', 'dataplatform', '--target', 'databricks']
[2025-08-21, 18:17:40 UTC] {taskinstance.py:3311} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 762, in _execute_task
result = _execute_callable(context=context, **execute_callable_kwargs)
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/taskinstance.py", line 733, in _execute_callable
return ExecutionCallableRunner(
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/utils/operator_helpers.py", line 252, in run
return self.func(*args, **kwargs)
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/baseoperator.py", line 422, in wrapper
return func(self, *args, **kwargs)
File "/home/airflow/.local/lib/python3.10/site-packages/cosmos/operators/virtualenv.py", line 133, in execute
output = super().execute(context)
File "/home/airflow/.local/lib/python3.10/site-packages/airflow/models/baseoperator.py", line 422, in wrapper
return func(self, *args, **kwargs)
File "/home/airflow/.local/lib/python3.10/site-packages/cosmos/operators/base.py", line 268, in execute
self.build_and_run_cmd(context=context, cmd_flags=self.add_cmd_flags())
File "/home/airflow/.local/lib/python3.10/site-packages/cosmos/operators/local.py", line 654, in build_and_run_cmd
result = self.run_command(cmd=dbt_cmd, env=env, context=context)
File "/home/airflow/.local/lib/python3.10/site-packages/cosmos/operators/virtualenv.py", line 106, in run_command
return super().run_command(cmd, env, context)
File "/home/airflow/.local/lib/python3.10/site-packages/cosmos/operators/local.py", line 470, in run_command
self.invoke_dbt(
File "/home/airflow/.local/lib/python3.10/site-packages/cosmos/operators/virtualenv.py", line 92, in run_subprocess
return super().run_subprocess(command, env, cwd)
File "/home/airflow/.local/lib/python3.10/site-packages/cosmos/operators/local.py", line 385, in run_subprocess
subprocess_result: FullOutputSubprocessResult = self.subprocess_hook.run_command(
File "/home/airflow/.local/lib/python3.10/site-packages/cosmos/hooks/subprocess.py", line 71, in run_command
self.sub_process = Popen(
File "/usr/local/lib/python3.10/subprocess.py", line 971, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/usr/local/lib/python3.10/subprocess.py", line 1863, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/cosmos-venva7vve1ev/bin/dbt'
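For context, the final exception is raised by Python's subprocess module itself: Popen fails before dbt ever runs because nothing exists at the resolved executable path. A minimal sketch of that failure mode (the temp venv path is copied from the log above; Cosmos generates a fresh directory per run, so the exact path is illustrative only):

import os
from subprocess import Popen

dbt_path = "/tmp/cosmos-venva7vve1ev/bin/dbt"  # path taken from the log above

if not os.path.isfile(dbt_path):
    # This appears to be the state the task ends up in on 1.8+: the temp
    # venv is created, but the dbt entrypoint never lands inside it.
    print(f"missing executable: {dbt_path}")

# Popen raises FileNotFoundError: [Errno 2] when argv[0] does not exist,
# matching the traceback above.
Popen([dbt_path, "deps"])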
How to reproduce
Triggering the following DAG produces the error.
DAG code
from airflow.decorators import dag
from datetime import datetime
from pathlib import Path

from cosmos import (
    DbtTaskGroup,
    LoadMode,
    ProfileConfig,
    ProjectConfig,
    ExecutionConfig,
    ExecutionMode,
    RenderConfig,
)
from cosmos.profiles import DatabricksTokenProfileMapping

AIRFLOW_HOME = Path('/opt/airflow/')
DBT_EXECUTABLE_PATH = AIRFLOW_HOME / 'dbt_venv' / 'bin' / 'dbt'
DBT_PROJECT_ROOT_FOLDER = AIRFLOW_HOME / 'dags' / 'dags_data_transformation' / 'dbt_project'
MANIFEST_PATH = DBT_PROJECT_ROOT_FOLDER / "manifest.json"


@dag(
    dag_id="example_virtualenv_dbt_taskgroup",
    schedule="@daily",
    start_date=datetime(2023, 1, 1),
    catchup=False,
)
def example_virtualenv() -> None:
    DbtTaskGroup(
        group_id="tmp-venv-group",
        project_config=ProjectConfig(
            dbt_project_path=DBT_PROJECT_ROOT_FOLDER,
            manifest_path=MANIFEST_PATH,
            env_vars={'DATABRICKS_ENV': 'poc'},
        ),
        profile_config=ProfileConfig(
            profile_name="dataplatform",
            target_name="databricks",
            profile_mapping=DatabricksTokenProfileMapping(
                conn_id='databricks_sp',
                profile_args={
                    "http_path": "Path/to/cluster",
                    "catalog": "main",
                    "schema": "default",
                },
            ),
        ),
        render_config=RenderConfig(
            load_method=LoadMode.DBT_MANIFEST,
            select=["tag:hands_on"],
        ),
        execution_config=ExecutionConfig(
            execution_mode=ExecutionMode.VIRTUALENV,
            dbt_executable_path=DBT_EXECUTABLE_PATH,
        ),
        operator_args={
            "install_deps": True,
        },
    )


example_virtualenv()
I tried to keep the DAG code as simple as possible. I used a Databricks adapter, but I believe the issue would occur with any adapter.
While debugging, I found a workaround that made it work, although it's not suitable for production use. Still, it helps illustrate what's actually breaking:
- I added virtualenv_dir=Path("/tmp/persistent-venv3") to the ExecutionConfig constructor.
- I triggered the DAG once. It failed, but it created the virtual environment at /tmp/persistent-venv3.
- Then I ran: cp /opt/airflow/dbt_venv/bin/dbt /tmp/persistent-venv3/bin.
- After that, the DAGs started working.
This approach is fine for local testing, but since it relies on a single persistent virtual environment, it prevents task parallelism. A sketch of the workaround follows below.
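For reference, here is a minimal sketch of that workaround (virtualenv_dir is an existing ExecutionConfig option; the binary copy is the manual step described above):

from pathlib import Path
from cosmos import ExecutionConfig, ExecutionMode

DBT_EXECUTABLE_PATH = Path("/opt/airflow/dbt_venv/bin/dbt")  # same path as in the DAG above

# Local-debugging workaround only: pin the virtualenv to a fixed directory
# so the dbt binary can be copied into it by hand after the first run.
execution_config = ExecutionConfig(
    execution_mode=ExecutionMode.VIRTUALENV,
    dbt_executable_path=DBT_EXECUTABLE_PATH,
    virtualenv_dir=Path("/tmp/persistent-venv3"),
)

# After the first (failing) run creates /tmp/persistent-venv3, copy the
# working dbt binary in manually:
#   cp /opt/airflow/dbt_venv/bin/dbt /tmp/persistent-venv3/bin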
Anything else :)?
No response
Are you willing to submit PR?
- Yes, I am willing to submit a PR!
Contact Details
No response