Functions like fsspec.open_files or FileSystem.ls return list-like objects when run on directories or with glob patterns. This has two main drawbacks:
1. The functions only return once the entire directory has been listed. When listing cloud buckets with millions of entries, this can take many minutes. This leads to:
   - Higher failure risk due to long runtimes.
   - No way to give the user feedback in the meantime, e.g. a progress bar.
   - No way to start processing the first files found while the rest are still being listed.
2. All OpenFile objects or addresses are kept in memory at once.
Is it possible to get a generator or iterator instead of a list? I'm particularly interested in support for local, s3fs, and gcsfs.
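
To illustrate the behavior I have in mind, here is a minimal sketch for the local case only. It uses `os.scandir` from the Python standard library rather than any fsspec API, and `iter_files` is a hypothetical name, not an existing function. An s3fs or gcsfs equivalent would presumably have to yield results page by page from the bucket listing, which this sketch does not attempt.

```python
import os
from typing import Iterator


def iter_files(root: str) -> Iterator[str]:
    """Lazily yield file paths under `root`, one at a time.

    Hypothetical helper for illustration; not part of fsspec.
    """
    # Directories still to be visited (depth-first traversal).
    stack = [root]
    while stack:
        current = stack.pop()
        # os.scandir returns an iterator, so entries are read incrementally
        # instead of being collected into a full list first.
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file():
                    # Hand each path to the caller as soon as it is found.
                    yield entry.path


if __name__ == "__main__":
    # Processing can begin with the first path, and a running count can
    # drive user feedback while the listing is still in progress.
    for count, path in enumerate(iter_files("."), start=1):
        print(count, path)
```

With an interface like this, the caller controls how many entries are held in memory, and an interrupted run has already processed everything yielded so far.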