Simulator experiment runner #102
Closed
1ntEgr8 wants to merge 149 commits into erdos-project:main from gatech-sysml:simulator-experiment-runner
Conversation
…ling-simulator into tpch-loader
…it in tpch_loader
We found that task graph deadlines weren't consistent between the simulator and Spark even when the same RNG seed was used. The cause is that EventTime keeps a single global RNG for all of its fuzzing, and both deadlines and runtime variances are fuzzed from it. In simulator runs, all task deadlines are computed up front and runtime variances later; in Spark runs, deadlines and runtime variances are computed throughout the experiment lifecycle. The two modes therefore draw from the RNG in different orders and end up with different deadline variances for the same experiment. Our fix is to pass the fuzzer a dedicated RNG used only for deadline variances. For now this RNG is hardcoded with a seed of 42, which is fine for experiments, but it should probably be driven by the random_seed command line flag.
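The idea is easier to see with a small sketch. This is illustrative only: EventTime's real fuzzing helpers have different names and signatures, and `fuzz`, `fuzz_deadline`, and `fuzz_runtime` below are stand-ins.

```python
import random

# A dedicated RNG reserved for deadline variances, as described above
# (seed hardcoded to 42; ideally taken from the --random_seed flag instead).
DEADLINE_RNG = random.Random(42)

_GLOBAL_RNG = random.Random()  # stands in for EventTime's global fuzzing RNG


def fuzz(base_us: int, variance_pct: tuple, rng: random.Random) -> int:
    """Scale base_us by a percentage drawn uniformly from variance_pct."""
    low, high = variance_pct
    return int(base_us * (100 + rng.uniform(low, high)) / 100)


def fuzz_deadline(runtime_us: int, variance_pct=(0, 20)) -> int:
    # Deadline fuzzing always draws from the dedicated RNG, so the order in
    # which runtime variances are fuzzed cannot perturb the deadline values.
    return fuzz(runtime_us, variance_pct, DEADLINE_RNG)


def fuzz_runtime(runtime_us: int, variance_pct=(0, 10)) -> int:
    # Runtime variances keep drawing from the global RNG as before.
    return fuzz(runtime_us, variance_pct, _GLOBAL_RNG)
```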
Originally, the service named job graphs in the form Q<query>[<spark-app-id>], where Spark sets the app id to app-<timestamp>-<index>, while the TPC-H data loader named job graphs in the form Q<query>[<index>]. This commit changes how the service names job graphs: the index is passed as an argument to the TpchQuery Spark application, which forwards it to the Servicer through RegisterTaskGraph as part of the query name, and RegisterTaskGraph then uses the index to name the job graph. This ensures that job graph names are always identical between a Spark run and a simulator run, irrespective of when the task graphs are actually released during a Spark run (which can be nondeterministic). The intent is to use these names to generate deadlines for the task graphs, so that deadlines are always consistent between Spark and simulator runs. This change requires a corresponding change to tpch-spark to forward the index to the Servicer.
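Together with the later "Deterministic task graph deadlines based on task graph names" commit, the naming scheme makes deadline generation a pure function of the name. The sketch below is a hypothetical illustration of that idea; `deadline_variance_for` and its parameters are not the repo's actual API.

```python
import hashlib
import random


def job_graph_name(query: int, index: int) -> str:
    # The form shared by the service and the TPC-H loader: Q<query>[<index>].
    return f"Q{query}[{index}]"


def deadline_variance_for(name: str, base_runtime_us: int, variance_pct=(0, 20)) -> int:
    # Hypothetical helper: seed an RNG from the job graph name, so a Spark run
    # and a simulator run derive the same deadline for, say, "Q4[12]" no matter
    # when that task graph is actually released.
    seed = int.from_bytes(hashlib.sha256(name.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    low, high = variance_pct
    return int(base_runtime_us * (100 + rng.uniform(low, high)) / 100)


print(deadline_variance_for(job_graph_name(4, 12), 30_000_000))
```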
To avoid needless reruns, TetriSchedScheduler skips a scheduling cycle unless there is at least one task that is unscheduled, belongs to a task graph that has not been previously considered, and belongs to a task graph that has not been cancelled. We remove the second condition (the "not previously considered" check) to handle situations in which a task graph has been considered and its tasks scheduled, but the tasks failed to be placed (for instance, because another task on the same worker finished late and was still holding the resources). In such cases the task graph is not cancelled and may still be completable, so the scheduler needs to run again to try to place the tasks that could not be placed before.
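A small before/after sketch of the skip predicate may help; the class, attribute, and variable names below are assumptions for illustration, not TetriSchedScheduler's actual API.

```python
from dataclasses import dataclass

@dataclass
class TaskInfo:
    name: str
    task_graph: str
    is_scheduled: bool


def should_run_scheduler(tasks, previously_considered: set, cancelled: set) -> bool:
    def triggers_old(t: TaskInfo) -> bool:
        # Old rule (shown for comparison): only an unscheduled task from a
        # never-considered, non-cancelled task graph forces a scheduler run.
        return (
            not t.is_scheduled
            and t.task_graph not in previously_considered
            and t.task_graph not in cancelled
        )

    def triggers_new(t: TaskInfo) -> bool:
        # New rule: drop the "never considered" check, so a task whose earlier
        # placement failed (e.g. a late-finishing neighbor still held the
        # worker's resources) triggers another scheduling attempt.
        return not t.is_scheduled and t.task_graph not in cancelled

    return any(triggers_new(t) for t in tasks)
```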
This patch implements a data loader for TPC-H workloads. The loader is modeled after `data/alibaba_loader.py`.
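For orientation, here is a rough sketch of the shape such a profile-driven loader might take, assuming an alibaba_loader-style interface; every class and field name below is illustrative, not the repo's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class TpchQueryProfile:
    query: int                  # TPC-H query number, 1..22
    stage_runtimes_us: list     # profiled runtime per Spark stage
    stage_dependencies: list    # (parent, child) stage index pairs


@dataclass
class TpchLoaderSketch:
    profiles: dict = field(default_factory=dict)  # query number -> TpchQueryProfile

    def make_job_graph(self, query: int, index: int):
        """Return a (name, runtimes, edges) description for instance `index` of `query`."""
        prof = self.profiles[query]
        name = f"Q{query}[{index}]"  # matches the naming fixed later in this PR
        return name, prof.stage_runtimes_us, prof.stage_dependencies
```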
Adjustments to metric calculation
* Implement TPC-H data loader
* Bug fix: convert job graph to task graph
* Make loop_timeout configurable
* Make profile path configurable
* release time handling on workload
* Wrap up tpch loader implementation
* scale runtime based on max number of tasks
* fix bug in runtime calc
* rename optimization_passes flag to opt_passes
* add cloudlab support, fix runtime rounding bug, make rng gen match service
* restore tpch_utils to main version
* split tpch_replay config files
* remove opt_passes flag
* update tpch_utils.py
* setup new service.py
* update rpc proto dir hierarchy to resolve module import issue with sys.path hack
* checkout dhruv's version of service.py
* make workload_loader optional, make step and get_time_until_next_event public
* implement register/deregister framework
* refactor sim time calculation
* implement RegisterWorker
* factor out __get_worker_pool
* refactor tpch loader
* implement register task graph
* add testing for service
* implement register environment ready
* init impl for get placements, readme with spark-erdos setup
* WIP: service changes to handle first tpch taskgraph
* Fix the tick() function to dequeue all events up to n, pass runtime_unit in tpch_loader
* refactor runtime unit setting
* remove microsecond granularity for now, will add support later
* rename sim_time to stime to make it less confusing
* refactor tick
* update rpc service to invoke new tick, update test
* add comment explaining scheduler runtime zero setting
* add missing comment on update workload
* refactor naming in tpch loader
* fix documentation error in workload/tasks.py
* add support for returning current task graph placements from simulator
* fix logging
* construct and return placements response
* do not return placements if task graph is complete
* fix placement time bug in simulator
* implement orchestrated mode
* remove redundant checks
* enqueue scheduler start event in register task graph
* bug fixes
* add a lock (not yet applied everywhere)
* Add tests for notify task completion
* add cancelled field to proto
* populate cancelled field in rpc response
* Updates to spark erdos service documentation
* implement register driver
* quick fixes in register and de-register driver
* Updates to spark erdos documentation
* register driver bug fix
* Add override_worker_cpu_count flag
* More documentation: tpch-spark and spark-mirror related
* add delay in test_service for register env ready
* doc update for tpch spark
* Add test to invoke getPlacements before task registration
* create task graph after environment is ready
* update test
* Separate out internal state mgmt for driver and application
* re-add impl for override_worker_cpu_count
* fix: correct enqueue of task_finished and sched_start in notifyTaskCompletion
* enable task cancellations to be sent back to backend
* update test script to include cancel_task scenario (needs to be sped up)
* allow service to use different schedulers based on args
* format service
* add enforce deadlines flag
* step workers during a tick
* update placement time only if it is in the past
* refactor file
* document refactored simulate_f
* handle case where task graph cannot be accommodated in worker pool
* Potpourri of bug fixes and improvements: undo stepping of workers every tick, which led to some errors pertaining to task eviction (investigation punted for later); fix bug associated with mapping task ids to spark stage ids; add a flag to control the minimum amount by which a placement can be pushed into the future
* Add log stats method to simulator
* check for worker registration
* update documentation in service
* update the service to operate in microseconds (US) instead of seconds (S)
* misc improvements
* override flags in service
* fix workload release bug
* update docs about sched time discretization
* add comments about setting deadlines in generate_task_graph
* updates to test script (verified across edf, fifo, dsched)
* update documentation for the service
* correctly handle task cancellations
* simulator bug fixes and improvements: update placement time of pushed placement; dedup scheduler events with the same timestamp
* add launch script
* add shutdown rpc method
* add service experiment runner
* sleep for a bit before launching queries
* remove extraneous sleep
* set start_time of task to its placement time
* add testcase to verify new taskgraph termination approach
* nit documentation in erdos-spark setup
* more explainable msgs from GetPlacements
* reorder event queue priority to process scheduler events before task placement
* improvements to experiment runner
* fix profile path in tpch loader
* add spark-master-ip flag
* reinstate previous eventQueue priority order (task_placement before scheduler)
* [simulator] Unschedule subtree rooted at task if task is unable to run at timestep
* [service] log line to track tasks that get delayed in execution
* sleep for some time before signalling shutdown
* hack analyze pipeline to work with tpch output
* [simulator] check task state before invoking unschedule on it
* add support for tpch query partitioning
* run_service_experiments: Log service stdout/stderr
* run_service_experiments.py: Fix --dry_run
* run_service_experiments: Timestamp results folder
* run_service_experiments: Fix hang on exception. Fixes an issue where, if start-master.sh or start-worker.sh exits with a nonzero code, or more generally if an exception happens in Service.__enter__(), run_service_experiments.py hangs and does not report the exception.
* Spark service: Correctly log stats on shutdown. When the last application is deregistered from the spark service, execute all remaining events from the simulator. This allows the final LOG_STATS event to be processed so we can calculate the SLO attainment. Unlike normal runs of the simulator, a SIMULATOR_END event is not inserted, as some tasks might not have finished in the simulator and it is unclear when they will finish. The simulator is patched to allow an empty event queue in Simulator.simulate().
* Set correct completion time for finishing Tasks. On a TASK_FINISH event, set the task completion time to the time of the event rather than the last time the task was stepped. Resolves a bug in the service where tasks that finish later than the simulator's profiled runtime predicts get assigned the wrong completion time.
* remove extraneous print
* fix non-determinism in deadlines
* Revert "fix non-determinism in deadlines". This reverts commit a22e406.
* Ensure consistent deadlines between simulator and Spark (hack)
* run_service_experiments: print args to service and launcher
* Fix some minor errors with deadline fuzzer
* Spark service: Make job graph names the same as in TpchLoader
* Deterministic task graph deadlines based on task graph names
* TetriSchedScheduler: Reconsider tasks that couldn't be placed
* Get rid of old service
* address comments
* Add TPCHSpark as a submodule
* Move tpch-spark submodule to rpc/

Co-authored-by: Elton Leander Pinto <eltonp3103@gmail.com>
Co-authored-by: Dhruv Garg <dhruvsgarg@hotmail.com>
Co-authored-by: Rohan Bafna <130247393+rohanbafna@users.noreply.github.com>
Setup for config searches with Ray Tune
* Update tpch-spark to include changes from pass-deadlines branch
* Read workload spec json in launch_tpch_queries.py. Changes the interface of launch_tpch_queries.py to accept a JSON file that contains the specification of the workload to launch. This avoids complications with simulator/service parity by making the workload fully deterministic.
* specify deadline when creating task graph
* Create script to generate a workload spec
* modify loader to read from spec file
* remove comment about metadata field
* Spark service: accept deadlines from TPC-H query info
* Update ray search script to use workload specs
* bug fixes
* add release time to deadline
* corrected setup for config search 3/5
* fix dumb bug in ray search script
* more dumb search script fixes; also only apply the dsched slo penalty calculation in the metric if dsched's slo < 80%
* Fix bug in service when deadline is set. The deadline needs to be converted to an EventTime; otherwise the call to get_next_task_graph fails and the task graph is never created.
* Spark service experiment setup for 3/06
* Experiment setup for 03/08: E/M/H mixture; extra datapoints for DSched and EDF
* make tpch_plot work again
* update ray search
* add graphene config
* tpch plot graphene + tetrisched baselines

Co-authored-by: Rohan Bafna <130247393+rohanbafna@users.noreply.github.com>
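For readers unfamiliar with Ray Tune, here is a hedged sketch of what a scheduler-config search could look like with Ray Tune's `tune.run`/`tune.report` API. `run_simulator_experiment` and the parameter names in the search space are placeholders, not the actual search script's flags, and the trainable fabricates a score so the sketch runs end to end.

```python
from ray import tune

def run_simulator_experiment(config):
    # Stand-in for launching the service/simulator with `config` and parsing
    # the SLO attainment out of its logs; here we fabricate a score so the
    # sketch is self-contained.
    slo_attainment = 1.0 / (1 + config["scheduler_time_discretization"])
    tune.report(slo_attainment=slo_attainment)

search_space = {
    "scheduler": tune.grid_search(["EDF", "FIFO", "TetriSched"]),
    "deadline_variance": tune.grid_search(["0,20", "0,50"]),
    "scheduler_time_discretization": tune.grid_search([1, 5, 10]),
}

analysis = tune.run(
    run_simulator_experiment,
    config=search_space,
    metric="slo_attainment",
    mode="max",
)
print(analysis.best_config)
```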
Helper script to spawn simulator experiments.
Experiments are specified as matrices in a YAML config file.
In a future PR, I will add support for running partial matrices, including a utility to split an experiment run into several subcommands that can be run on separate machines.
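The runner's YAML schema is not shown here, so the following is only a hedged sketch, under invented key and flag names, of how such an experiment matrix could be expanded into per-experiment invocations.

```python
import itertools
import yaml  # PyYAML

# Illustrative schema only: the keys and flags below are invented for this
# sketch and are not the runner's actual config format.
SPEC = """
matrix:
  scheduler: [EDF, TetriSched]
  tpch_query: [4, 8, 19]
  deadline_variance: ["0,20", "0,50"]
"""

def expand(matrix: dict) -> list:
    """Cartesian product of all matrix axes, one dict per experiment."""
    keys = list(matrix)
    return [dict(zip(keys, combo)) for combo in itertools.product(*matrix.values())]

spec = yaml.safe_load(SPEC)
for cfg in expand(spec["matrix"]):
    flags = [f"--{key}={value}" for key, value in cfg.items()]
    # A real runner would subprocess.run() the simulator here; we just print.
    print("python main.py " + " ".join(flags))
```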