uproot.dask
Defined in uproot._dask on line 35.
- uproot._dask.dask(files, *, filter_name=<function no_filter>, filter_typename=<function no_filter>, filter_branch=<function no_filter>, recursive=True, full_paths=False, step_size=uproot._util.unset, steps_per_file=uproot._util.unset, library='ak', ak_add_doc=False, custom_classes=None, allow_missing=False, open_files=True, form_mapping=None, allow_read_errors_with_report=False, known_base_form=None, decompression_executor=None, interpretation_executor=None, **options)
- Parameters:
- files – See below. 
- filter_name (None, glob string, regex string in "/pattern/i" syntax, function of str → bool, or iterable of the above) – A filter to select TBranches by name.
- filter_typename (None, glob string, regex string in "/pattern/i" syntax, function of str → bool, or iterable of the above) – A filter to select TBranches by type.
- filter_branch (None or function of uproot.TBranch → bool, uproot.interpretation.Interpretation, or None) – A filter to select TBranches using the full uproot.TBranch object. If the function returns False or None, the TBranch is excluded; if the function returns True, it is included with its standard interpretation; if it returns an uproot.interpretation.Interpretation, this interpretation overrules the standard one.
- recursive (bool) – If True, include all subbranches of branches as separate fields; otherwise, only search one level deep. 
- full_paths (bool) – If True, include the full path to each subbranch with slashes (/); otherwise, use the descendant’s name as the field name.
- step_size (int or str) – If an integer, the maximum number of entries to include in each chunk/partition; if a string, the maximum memory_size to include in each chunk/partition. The string must be a number followed by a memory unit, such as "100 MB". Mutually incompatible with steps_per_file: only set step_size or steps_per_file, not both. Cannot be used with open_files=False.
- steps_per_file (int, default 1) – Subdivide files into the specified number of chunks/partitions. Mutually incompatible with step_size: only set step_size or steps_per_file, not both. If both step_size and steps_per_file are unset, steps_per_file’s default value of 1 (whole file per chunk/partition) is used, regardless of open_files.
- library (str or uproot.interpretation.library.Library) – The library that is used to represent arrays. If library='np', it returns a dict of dask arrays, and if library='ak', it returns a single dask-awkward array. library='pd' has not been implemented yet and will raise a NotImplementedError.
- ak_add_doc (bool | dict) – If True and library="ak", add the TBranch title to the Awkward __doc__ parameter of the array. If a dict {key: value} and library="ak", add the TBranch value to the Awkward key parameter of the array.
- custom_classes (None or dict) – If a dict, override the classes from the uproot.ReadOnlyFile or uproot.classes.
- allow_missing (bool) – If True, skip over any files that do not contain the specified TTree.
- open_files (bool) – If True (default), the function will open the files to read file metadata, i.e. only the main data read is delayed till the compute call on the dask collections. If False, the opening of the files and reading the metadata is also delayed till the compute call. In this case, branch-names are inferred by opening only the first file. 
- form_mapping (Callable[awkward.forms.Form] -> awkward.forms.Form | None) – If not None and library="ak", apply this remapping function to the awkward form of the input data. The form keys of the desired form should be available data in the input form.
- allow_read_errors_with_report (bool or tuple of exceptions) – If True, catch OSError exceptions and return an empty array for these nodes in the task graph. If a tuple, catch any of those exceptions and return empty arrays for those nodes. In either case, the return value of this function becomes a two-element tuple: the first element is the dask-awkward collection of interest and the second is a report dask-awkward collection.
- known_base_form (awkward.forms.Form | None) – If not None, use this form instead of opening one file to determine the dataset’s form. Only available with open_files=False.
- decompression_executor (None or Executor with a submit method) – The executor that is used to decompress TBaskets; if None, a uproot.TrivialExecutor is created. Executors attached to a file are shutdown when the file is closed.
- interpretation_executor (None or Executor with a submit method) – The executor that is used to interpret uncompressed TBasket data as arrays; if None, a uproot.TrivialExecutor is created. Executors attached to a file are shutdown when the file is closed.
- options – See below. 
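As a rough illustration of how several of these parameters combine, the sketch below passes a plain function as filter_name, together with steps_per_file and allow_read_errors_with_report. The "Muon_" branch prefix, the helper names keep_muon_branches and lazy_muons, and the file pattern in the usage note are hypothetical. uproot is imported inside the helper so the filter itself can be tried without any ROOT files.

```python
def keep_muon_branches(name):
    """Hypothetical filter_name function (str -> bool): keep 'Muon_*' branches."""
    return name.startswith("Muon_")


def lazy_muons(files):
    """Return (events, report) as lazy dask-awkward collections.

    Assumes uproot, dask and dask-awkward are installed. Branch data is
    only read when .compute() is called on the returned collections.
    """
    import uproot  # imported here so the filter above has no dependencies

    events, report = uproot.dask(
        files,
        filter_name=keep_muon_branches,      # function of str -> bool
        steps_per_file=2,                    # must not be combined with step_size
        allow_read_errors_with_report=True,  # return value becomes a 2-tuple
    )
    return events, report
```

A call might look like `events, report = lazy_muons("data/*.root:Events")`, where both the path and the tree name are placeholders.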
 
 - Returns dask equivalents of the backends supported by uproot. If library='np', the function returns a Python dict of dask arrays. If library='ak', the function returns a single dask-awkward array.
 - For example:
   >>> uproot.dask(root_file)
   dask.awkward<from-uproot, npartitions=1>
   >>> uproot.dask(root_file, library='np')
   {'Type': dask.array<Type-from-uproot, shape=(2304,), dtype=object, chunksize=(2304,), chunktype=numpy.ndarray>, ...}
 - Allowed types for the files parameter:
- str/bytes: relative or absolute filesystem path or URL, without any colons other than Windows drive letter or URL schema. Examples: "rel/file.root", "C:\abs\file.root", "http://where/what.root"
- str/bytes: same with an object-within-ROOT path, separated by a colon. Example: "rel/file.root:tdirectory/ttree"
- pathlib.Path: always interpreted as a filesystem path or URL only (no object-within-ROOT path), regardless of whether there are any colons. Examples: Path("rel:/file.root"), Path("/abs/path:stuff.root")
- glob syntax in str/bytes and pathlib.Path. Examples: Path("rel/*.root"), "/abs/*.root:tdirectory/ttree"
- dict: keys are filesystem paths, values are objects-within-ROOT paths. Example: {"/data_v1/*.root": "ttree_v1", "/data_v2/*.root": "ttree_v2"}
- dict: keys are filesystem paths, values are dicts containing objects-within-ROOT and steps (chunks/partitions) as a list of starts and stops, or steps as a list of offsets. Example:
  {"/data_v1/tree1.root": {"object_path": "ttree_v1", "steps": [[0, 10000], [15000, 20000], ...]},
   "/data_v1/tree2.root": {"object_path": "ttree_v1", "steps": [0, 10000, 20000, ...]}}
  (This files pattern is incompatible with step_size and steps_per_file.)
- already-open TTree objects. 
- iterables of the above. 
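The dict form with explicit steps can be built programmatically. The following is a minimal sketch, assuming the number of entries per file is already known. The helper names entry_steps and files_with_steps and the paths in the usage note are hypothetical and not part of uproot.

```python
def entry_steps(num_entries, step):
    """Split [0, num_entries) into [start, stop] pairs of at most `step` entries."""
    if step <= 0:
        raise ValueError("step must be a positive integer")
    return [
        [start, min(start + step, num_entries)]
        for start in range(0, num_entries, step)
    ]


def files_with_steps(entries_per_file, object_path, step):
    """Build a `files` dict in the documented {"object_path": ..., "steps": ...} shape.

    `entries_per_file` maps each filesystem path to its (assumed known)
    number of entries.
    """
    return {
        path: {"object_path": object_path, "steps": entry_steps(n, step)}
        for path, n in entries_per_file.items()
    }
```

For instance, `files_with_steps({"/data_v1/tree1.root": 25000}, "ttree_v1", 10000)` yields steps `[[0, 10000], [10000, 20000], [20000, 25000]]` for that file. The resulting dict would be passed as files without setting step_size or steps_per_file.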
 - Options (type; default):
- handler (uproot.source.chunk.Source class; None)
- timeout (float for HTTP, int for XRootD; 30) 
- max_num_elements (None or int; None) 
- num_workers (int; 1) 
- use_threads (bool; False on the emscripten platform (i.e. in a web browser), else True) 
- num_fallback_workers (int; 10) 
- begin_chunk_size (memory_size; 403, the smallest a ROOT file can be) 
- minimal_ttree_metadata (bool; True) 
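Because these options are passed as free-form keyword arguments, a misspelled name is easy to miss. One possible guard, sketched below, checks option names against the list above before forwarding them. The names DOCUMENTED_OPTIONS, checked_options and lazy_remote are hypothetical, and the timeout and num_workers values are arbitrary.

```python
# Option names copied from the list above; this set is part of the sketch,
# not something exported by uproot.
DOCUMENTED_OPTIONS = frozenset({
    "handler",
    "timeout",
    "max_num_elements",
    "num_workers",
    "use_threads",
    "num_fallback_workers",
    "begin_chunk_size",
    "minimal_ttree_metadata",
})


def checked_options(**options):
    """Return `options` unchanged, or raise TypeError for an undocumented name."""
    unknown = sorted(set(options) - DOCUMENTED_OPTIONS)
    if unknown:
        raise TypeError(f"unrecognized uproot.dask options: {unknown}")
    return options


def lazy_remote(files):
    """Call uproot.dask with a longer timeout and more workers (arbitrary values)."""
    import uproot  # assumes uproot and its dask dependencies are installed

    return uproot.dask(files, **checked_options(timeout=60, num_workers=4))
```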
 - Other file entry points:
- uproot.open: opens one file to read any of its objects.
- uproot.iterate: iterates through chunks of contiguous entries in TTrees.
- uproot.concatenate: returns a single concatenated array from TTrees.
- uproot.dask (this function): returns an unevaluated Dask array from TTrees.