
'pyarrow._hdfs.HadoopFileSystem' object has no attribute 'host' #1870


Description

@marberi

I tried connecting to HDFS storage through the default configuration
(core-site.xml). Connecting, plus writing and reading a dataframe, worked
fine (not shown). However, when attempting to run the following code:

"""
import dask.array as da

N = 10_000
rng = da.random.default_rng()
x = rng.random((N, N), chunks=(2000, 2000))
x.to_zarr("hdfs:///user/eriksen/test2.zarr")
"""

it fails with the following error, which seems to be fsspec-related:

```pytb
File /data/aai/scratch_ssd/eriksen/miniforge3/envs/dask/lib/python3.13/functools.py:1026, in cached_property.__get__(self, instance, owner)
   1024 val = cache.get(self.attrname, _NOT_FOUND)
   1025 if val is _NOT_FOUND:
-> 1026     val = self.func(instance)
   1027     try:
   1028         cache[self.attrname] = val

File /data/aai/scratch_ssd/eriksen/miniforge3/envs/dask/lib/python3.13/site-packages/fsspec/implementations/arrow.py:63, in ArrowFSWrapper.fsid(self)
     61 @cached_property
     62 def fsid(self):
---> 63     return "hdfs_" + tokenize(self.fs.host, self.fs.port)

AttributeError: 'pyarrow._hdfs.HadoopFileSystem' object has no attribute 'host'
```
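
If it helps, the failure seems reproducible without dask or zarr by wrapping the Arrow filesystem the same way fsspec does. This is only a sketch of what I believe is happening, assuming a working libhdfs/Hadoop client setup; the "default" host just makes pyarrow read core-site.xml like in my case:

```python
import pyarrow.fs as pafs
from fsspec.implementations.arrow import ArrowFSWrapper

# "default" lets pyarrow pick up the Hadoop configuration (core-site.xml)
hdfs = pafs.HadoopFileSystem("default")

# fsspec wraps the Arrow filesystem in ArrowFSWrapper; accessing .fsid runs
# tokenize(self.fs.host, self.fs.port), but pyarrow._hdfs.HadoopFileSystem
# does not expose .host or .port attributes
wrapped = ArrowFSWrapper(hdfs)
wrapped.fsid  # AttributeError: ... object has no attribute 'host'
```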

I installed the environment today with Python 3.13 and the following packages:
```
cloudpickle==3.1.1
dask==2025.5.1
distributed==2025.5.1
fsspec==2025.5.1
pyarrow==20.0.0
toolz==1.0.0
zarr==3.0.8
zict==3.0.0
```
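
In case it is useful for anyone hitting the same thing while this is open: a crude monkeypatch along the following lines should sidestep the attribute access. This is only a sketch, not a proposed fix; a constant id per filesystem class presumably defeats the purpose of fsid for caching and will not distinguish different clusters.

```python
from fsspec.implementations.arrow import ArrowFSWrapper

def _fsid(self):
    # crude stand-in for tokenize(self.fs.host, self.fs.port); does not
    # distinguish between different HDFS clusters
    return "hdfs_" + type(self.fs).__name__

# replace the cached_property that raises AttributeError
ArrowFSWrapper.fsid = property(_fsid)
```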

Please let me know if you need any other information, or if I should report this
issue elsewhere.
