#+TITLE: Lab book StarPU+Simgrid
#+AUTHOR: Luka Stanisic
#+LANGUAGE: en
#+TAGS: IMPORTANT(I) TEST(T) DEPRECATED(D)
#+TAGS: @PAUL(P) @LUKA(L)
#+TAGS: _WINNETOU(W) _PAUL-BDX(P) _ATTILA(A) _HANNIBAL(H) _CONAN(C) _FROGGY(F) _MIRAGE(M) _FOURMI(O)

* README:                                                            :@LUKA:
** General:
   - This file is a lab book, like the ones kept by biologists, chemists, etc.
   - It explains how things are organized, the workflow for doing experiments, the changes made to the code (as accurately as possible) and the observed behavior in the "* Data" section. Note that this last section is available only in the /data/ branch, as we want to keep the /master/ branch clean of any results: it contains only source code and the analysis (R code)
** Experiments workflow:
   1) Create a new branch
   2) Make sure everything is committed
   3) Run the /run_bench_StarPU/ script with the desired parameters
   4) Run the /run_inject_StarPUSG/ script with the desired parameters
   5) Do the analysis
   6) Add an entry for the results to the "* Data" section of this file, using the template described below
   7) Commit/push the results of this separate branch
   8) Merge this new branch with the remote "data" branch
** Paul help:                                                        :@PAUL:
   git branch branch_name
   git checkout branch_name : switch to the corresponding branch
   ./run_bench_StarPU.sh
      -t : testing. Puts nothing into git
      -d data_dir_name : adds the .org to git
   git add data/data_dir_name
   git commit -am "blabla"
   git push -u origin branch_name : sends the data to the remote. First time.
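The steps of the experiments workflow can be sketched as a small shell function. This is only a sketch under assumptions: the names =run= and =workflow=, the =DRY_RUN= variable, the "one data directory per branch" layout and the input file name =SoloStarpuData0.org= are hypothetical and not part of the repository; only the script names and the =-d= / =-n= options are taken from this lab book.

#+begin_src sh
# Hypothetical wrapper (not part of the repository).
# With DRY_RUN=1 (the default) commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

workflow() {
    branch=$1                 # e.g. dataNew
    datadir=data/$branch      # assumed layout: one data directory per branch

    # Step 1: create a new branch
    run git checkout -b "$branch" || return 1

    # Step 2: make sure everything is committed (checked only on a real run)
    if [ "${DRY_RUN:-1}" != 1 ] && [ -n "$(git status --porcelain)" ]; then
        echo "uncommitted changes, aborting" >&2
        return 1
    fi

    # Steps 3 and 4: benchmarking, then simulation (STARPU_* variables are
    # expected to be set by the caller)
    run ./run_bench_StarPU.sh -d "$datadir" || return 1
    run ./run_inject_StarPUSG.sh -n "$datadir/SoloStarpuData0.org" -d "$datadir" || return 1

    # Steps 5 and 6 (analysis, lab book entry) are done by hand.

    # Step 7: commit/push the results of this branch
    run git add "$datadir" || return 1
    run git commit -am "Adding $branch results" || return 1
    run git push -u origin "$branch" || return 1

    # Step 8 (merging into the "data" branch) is left out on purpose.
}
#+end_src

Calling =DRY_RUN=1 workflow dataNew= only prints the six commands, which is a cheap way to check them before running with =DRY_RUN=0=.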
** Adding histograms to all data files:
   for i in {1..90} ; do ./R_add_hist.sh data/dataBord2/SoloStarpuData${i}.org scott && echo $i/90 ; done
** Plotting all histograms of a certain file:
   ./Rplot_hist.sh data/testing/SoloStarpuData39.org scott
** Extracting all GFlops as .csv and plotting results:
   ./R_allgflops.sh data/testing
** Example for using run_bench_StarPU.sh:                            :@LUKA:
   STARPU_NCPU=0 STARPU_NCUDA=3 STARPU_NOPECL=0 STARPU_SIZE=72000 STARPU_BLK=75 STARPU_SCHED=dmda STARPU_CALIBRATE=1 STARPU_PROGRAM=cholesky ./run_bench_StarPU.sh -d data/dataNew -c -f -v
** Example for using run_inject_StarPUSG.sh:                         :@LUKA:
   STARPU_SCHED=dmda STARPU_HOSTNAME=attila STARPU_PROGRAM=cholesky ./run_inject_StarPUSG.sh -n data/dataNew/SoloStarpuData0.org -d data/dataNew -c -f -v
** Example for running benchmarking experiments in a for loop:
   for i in {1..3} ; do STARPU_NCPU=0 STARPU_NCUDA=3 STARPU_NOPECL=0 STARPU_SIZE=$(($i*960)) STARPU_BLK=$i STARPU_SCHED=dmda STARPU_CALIBRATE=1 STARPU_PROGRAM=lu ./run_bench_StarPU.sh -t -f -v && echo $i/3 ; done
** Example for running simulation experiments in a for loop:
   for i in {0..71} ; do STARPU_SCHED=dmda STARPU_HOSTNAME=attila STARPU_PROGRAM=lu ./run_inject_StarPUSG.sh -n data/dataNew/SoloStarpuData${i}.org -d data/dataNew -f -v && echo $i/71 ; done
** Running qrm benchmarking:
   STARPU_NCPU=1 STARPU_CALIBRATE=1 STARPU_PROGRAM=qrm ./run_bench_StarPU.sh -t -c -f -v
** Running qrm simulation:
   STARPU_HOSTNAME=fourmi STARPU_PROGRAM=qrm ./run_inject_StarPUSG.sh -n data/testing/SoloStarpuData0.org -t -c -f -v
* Template for data entry:
#####################
** data#
*** git:
#+begin_src sh
git log -1
#+end_src
*** Notes:
######################
* Organization of git
** remote/origin/master branch:
   - Has all the source, analysis, scripts
** remote/origin/data# branches:
   - Have all the data connected to specific experiments
   - Also some important (not all) .pdf files
** remote/origin/data branch:
   - Merging all the data and source branches
   - It is cloned only on my local machine, never clone it on a remote one
* Git TAGs:
** Stable versions:
*** stable1.0:
    - First stable version
*** stable2.0:
    - Version at the end of the SUD (SUS) in Lyon in June. Should be stable
*** stable3.0:
    - Stable version before going to Bordeaux in November 2013
*** stable4.1:
    - Stable version after going to Bordeaux in November 2013
*** stable4.2
    - Stable version end of February, before going for the new version of Simgrid and StarPU
*** stable5.0
    - Stable version after Europar article and moving to new StarPU (rev12302 trunk branch) and Simgrid 3.11 (devel 2014-02-26 10:52:32)
*** stable6.0
    StarPU version: trunk r12421
    Simgrid: 6683debe0efeb6371e0cbafb4f0d461325dd59a5
*** stable7.0
    StarPU version: trunk r12433
    Simgrid: 6683debe0efeb6371e0cbafb4f0d461325dd59a5
*** stable8.0
    StarPU version: trunk r12463
    Simgrid: 6683debe0efeb6371e0cbafb4f0d461325dd59a5
*** stable9.0
    StarPU version: trunk r12463
    Simgrid: 9a83532b45e512e41210c7784e5ad22adbc7e442
** starpu_bench:
   - Our changes to StarPU code to have more sophisticated benchmarking
* Organization of code
** scripts:
*** DONE run_bench_StarPU.sh [4/4]:                                  :@LUKA:
    - Runs benchmarking of StarPU without Simgrid
    - [X] Write a usage/help part, add environment variables
    - [X] Upgrade for interactive mode
    - [X] Change verbose
    - [X] Add frequency scaling only if the file exists, otherwise write "unknown"
*** DONE run_inject_StarPUSG.sh [2/2]:                               :@LUKA:
    - [X] Make it work
    - [X] Execute R_add_hist automatically, if needed
*** Rhist.R: Rscript that produces histogram                         :DEPRECATED:
*** flags scripts:
    - flags_starpu.sh: Setup flags for running StarPU (run_bench_StarPU)
    - flags_clean.sh: Reset all flags
    - flags_simgrid_starpu.sh: Setup flags for running StarPU with Simgrid (run_inject_StarPUSG.sh)
*** R_add_hist.sh:
    - Script (calls R script) used for SoloStarpuData files whose HISTOGRAM was not computed because the machine on which the experiment was executed did not have R installed
    - Needs to be used by hand (for now; to be changed)
    - Example
      of usage: ./R_add_hist.sh data/dataNew/SoloStarpuData0.org
*** Rplots_hist.sh:
    - Script for showing a .pdf file with plots of all injection histograms for a certain .org file
*** R_allgflops.sh: Script for extracting Gflop values from all data files in the folder into .csv and making R figure
*** Extract_calibration.sh:
    - Script that reads the calibration part of SoloStarpuData.org and creates new files from it. These calibration files are used later for the simulation
    - It is used automatically by /run_inject_StarPUSG.sh/
    - Creates files:
      .starpu/sampling/codelets/*.bench_hostname
      .starpu/sampling/bus/bench_hostname.*
    - Example of usage: ./Extract_calibration.sh data/dataNew/SoloStarpuData.org
*** get_trace.sh:
    - Script related to paje traces. It produces a .csv from .org files, more precisely from the "* PAJE" part of the .org file
    - The .csv is later used by Knitr/Org-babel for the analysis
    - Example of usage:
      ./get_trace.sh data/dataNew/SoloStarpuData0.org analysis/folder/paje_native.csv
      ./get_trace.sh data/dataNew/SoloStarpuData0.org analysis/folder/paje_simgrid.csv
*** valgrind:
    - Script to launch execution with valgrind
** src/
*** simgrid
**** contrib/benchmarking_code_block
*** StarPU
**** build-native
**** build-simgrid
** analysis/
*** hist_scripts:
    - R code in charge of computing histograms for injecting pseudo-random values in the StarPU+Simgrid version of execution
    - R code in charge of plotting all histograms of a certain experiment
*** arnaud_traces:
    - Two .Rnw files by Arnaud with the analysis of traces generated by StarPU and StarPU+Simgrid
    - Good examples for future analysis files
*** comparisonBabel:
    - Knitr and Org-babel examples for analysis and comparison of paje traces generated by native (solo starpu) and simulation (starpu+simgrid)
    - It does not do much that is useful yet; it is more of a template for the development of a "real" analysis file
*** makespan:
    - One of the first analyses for comparing makespans
*** makespan2:
    - Comparing makespans and paje traces
*** newpaje:
    - Different babel analysis files, NewPaje.org being the biggest, all for comparing makespans and Paje traces of native and simgrid for different matrix sizes
*** compare_simulations:
    - For comparing only different simulations
*** lu_makespans:
    - Analysis based on /newpaje/, only this time for LU application instead of cholesky
*** MSG_sleep:
    - Analysis investigating the use of different MSG_process_sleep values just after the transfer of data in StarPU when using Simgrid
*** used_size:
    - Investigating /used_size/ of memory when using different CUDA limits on both native and simgrid
*** bord1:
    - Analysis used in Bordeaux during the visit in November 2013 to finally obtain the matching results for large matrix size
*** hist_analysis:
    - Analysis of different types of histograms
** .starpu/
   - Folder where all StarPU calibration is stored
** backup/                                                           :DEPRECATED:
   - Backup of the version that is working
* Additional feature:
* Changes:
** 2013-02-18
   - Created project and added this file
   - Setting up remote repository
   - Fixing remote branch
   - Adding scripts
   - Creating folders
   - Adding global "gitignore" file
   - Adding branches
** 2013-02-19
   - Adding StarPU and Simgrid initial files
   - Adding Paul's and my changes to the files
   - Add local repository exclude list
   - Installing StarPU
   - Installing Simgrid
** 2013-03-11
   - Reorganization of code and git repository
** 2013-03-12                                                        :@PAUL:
   - Fixing git repository
** 2013-03-13::2013-03-15
   - Fixing scripts, flags and workflow
** 2013-03-18
   - Updating LabBook according to the last week's changes
** 2013-05-27
   - Adding new analysis both with knitr and with org+babel
** 2013-06-05::2013-06-07
*** SUS-Simgrid User Sprint:                                         :@PAUL:@LUKA:
    - The R script for computing HISTOGRAM in run_bench_StarPU.sh is now run only if R is installed on that machine. If not, it can be executed later, on a local machine, before doing the simulation
    - Adding HOSTNAME information (org header) to .org data results.
      Also information about the input file (SoloStarpuData.org) for simulation output (SimgridStarpuData*.org)
    - Now simulation (run_inject_StarPUSG.sh) is using calibration and command line parameters from SoloStarpuData*.org. For calibration it creates new files and puts them into .starpu/sampling/
    - Some modification of LU example code from src/starpu in order for that code to be able to work with Simgrid
    - Fixed (hoping now everything will work fine) flags_* scripts, .gitignore and excluded files list
    - Being able to run LU and Cholesky using the same script, by using STARPU_PROGRAM="". Add this before ./run_bench_StarPU.sh
    - Changed small things about data.org files (verbose, frequency scaling etc.)
** 2014-02-25
   - Going for the new versions of StarPU and Simgrid
* Data:
** data1:                                                            :TEST:@LUKA:
*** git:
    #Generated by hand: e3191df890057163e12ae9c632c835e04b0500dc
*** Notes:
    - Testing the git workflow
** data1403                                                          :TEST:@PAUL:
*** git:
    #Generated by hand: 21da58106ce43756a987b8fbad736cc88c345fac
*** Notes:
    - Testing the workflow with Paul
** dataOld                                                           :IMPORTANT:@PAUL:
*** git:
    #Generated by hand: 43166ee83eccf566e736360abc80a383ac650e96
*** Notes:
    - Added by hand into /data/ branch directly
    - Pdfs are already generated from this data and can be found in /pdfs/ directory
    - Both results from Samuel (a few months ago) and Paul (more recent)
    - Analysis and explanation of these results is done by Arnaud and can be found in:
      1) Analysis of results obtained by Samuel: [[git:~/Repository/git_gforge/StarPU_Simgrid/data/dataOld/analyze.Rnw::][analyze.Rnw]]
      2) Analysis of results obtained by Paul: [[git:~/Repository/git_gforge/StarPU_Simgrid/data/dataOld/analyze2.Rnw::][analyze2.Rnw]]
** dataBabeltest                                                     :TEST:@LUKA:
*** git:
#+begin_src sh
git log -1
#+end_src

#+RESULTS:
| commit | dc979ffa4da96065828344bfee50b649fa6f8e8c | | | | | |
| Author: | Luka | Stanisic | | | | |
| Date: | Fri | May | 17 | 15:20:28 | 2013 | +0200 |
| | | | | | | |
| Adding | comparison | analysis | with | org+babel |
| | *** Notes: - Testing the whole workflow and generating new data with paje traces - Making an example of org+babel analysis, that should be expanded for some future use ** dataLUSUS :TEST:@PAUL:_ATTILA: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: 8ce72c7c3b27997a0d40dd224902f9883d12ff32 *** Notes: - Testing, upgrading the code and doing some unimportant measurements during SUS in Lyon ** dataCholeskySUS :TEST:@PAUL:_ATTILA: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: c274cdc225247b6f7712749e28ee550178b2ef7a *** Notes: - Testing, upgrading the code and doing some unimportant measurements during SUS in Lyon ** dataLUHannibal :TEST:@PAUL:_HANNIBAL: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: c890758cca5d9527c77263a2dfdcec0bdbfbc852 *** Notes: - Testing, upgrading the code and doing some unimportant measurements during SUS in Lyon ** dataLUHannibalok :TEST:@PAUL:_HANNIBAL: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: 257a5d6517131dc632b2b5898f57cbbaf1a133fd *** Notes: - Testing, upgrading the code and doing some unimportant measurements during SUS in Lyon ** dataCholwscripts :TEST:@PAUL:_ATTILA: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: 000528effe19f6d5bd69e99644c0b721e684ebeb *** Notes: - Testing, upgrading the code and doing some unimportant measurements during SUS in Lyon ** dataChol1306 :TEST:@PAUL:_HANNIBAL: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: 73328f40e817064bc3d2eb3f34304f69b5aa1c66 *** Notes: - Testing, upgrading the code and doing some unimportant measurements during SUS in Lyon ** dataCompMeanHisto :TEST:@PAUL:_HANNIBAL: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: 3eefbfd9c6cc7c2bc4d056c0fe351f2af23824ce *** Notes: - Testing, upgrading the code and doing some unimportant measurements during SUS in Lyon ** dataChol2006 :@LUKA:_HANNIBAL: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: | commit | dea94c7d9ebe919cff6ea121ea7efa2e244a49b3 | | | | | | | Merge: | 
24cef0f | b14f9fa | | | | | | Author: | Luka | Stanisic | | | | | | Date: | Mon | Sep | 9 | 14:24:19 | 2013 | +0200 | | | | | | | | | | Merging | with | dataChol2006 | branch | | | | *** Notes: - Trying to reproduce the experiments performed by Paul (and earlier by Samuel) - First problems with the results - There is some nice analysis code here, but it was later reused in larger and more complex reports ** dataCholPaje1 :IMPORTANT:@LUKA:_HANNIBAL: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: | commit | 45cd4347a10c1ff8ac4e3d9ccc477eed1e0dc014 | | | | | | | Merge: | a6dfff3 | 90f74c0 | | | | | | Author: | Luka | Stanisic | | | | | | Date: | Mon | Sep | 9 | 15:48:29 | 2013 | +0200 | | | | | | | | | | Merging | with | dataCholPaje1 | branch | | | | *** Notes: - Important measurements that show in a good way the problems with different makespans between native and simgrid, DriverCopy and FetchingInput issue etc. - All details can be found in the .pdfs: 1) /NewPaje.pdf/ full report. 
       Very long but contains all information
    2) /Short.pdf/ shorter version, sent to Arnaud
    3) /Report_Samuel.pdf/ another shorter version, sent to Samuel
    - Bear in mind that the execution of some chunks can take a very long time
    - There is another analysis file that emphasizes the issue in [[git:analysis/newpaje/ZoomDriverCopy.org::1ff65ab920d04a14712a4bc87121df790564f9b8][Zoom]]
** dataTest3007                                                      :TEST:@LUKA:_HANNIBAL:
*** git:
#+begin_src sh
git log -1
#+end_src

#+RESULTS:
| commit | d30d7c35ebe902b04f769007d4f85a5deb66dee0 | | | | | |
| Merge: | 44961f9 | 1d79d0b | | | | |
| Author: | Luka | Stanisic | | | | |
| Date: | Mon | Sep | 9 | 16:09:52 | 2013 | +0200 |
| | | | | | | |
| Merging | with | dataTest3007 | branch | | | |
*** Notes:
    - Doing some tests by adding printf counters in order to better understand the way communication works
** dataLU2808                                                        :@LUKA:_HANNIBAL:
*** git:
#+begin_src sh
git log -1
#+end_src

#+RESULTS:
| commit | 4a821379e3f93d4f494b2beb099c968e04147823 | | | | | |
| Merge: | a4449a7 | 8f4d922 | | | | |
| Author: | Luka | Stanisic | | | | |
| Date: | Mon | Sep | 9 | 16:41:18 | 2013 | +0200 |
| | | | | | | |
| Merging | with | dataLU2808 | branch | | | |
| | | | | | | |
| Conflicts: | | | | | | |
| | .starpu/sampling/codelets/chol_model_11.winnetou | | | | | |
| | .starpu/sampling/codelets/chol_model_21.winnetou | | | | | |
| | .starpu/sampling/codelets/chol_model_22.winnetou | | | | | |
*** Notes:
    - Experiments similar to /dataCholPaje1/, only this time using another application: LU decomposition
    - The results resemble the ones obtained with the /cholesky/ application.
      For more details look at the analysis file /LU_makespans.pdf/
** dataCUDAfix                                                       :@LUKA:_HANNIBAL:
*** git:
#+begin_src sh
git log -1
#+end_src

#+RESULTS:
| commit | 4cb99bf7b0d445aedeb98f3dff297927a68f384c | | | | | |
| Merge: | e0103f1 | 0d37b99 | | | | |
| Author: | Luka | Stanisic | | | | |
| Date: | Mon | Sep | 9 | 17:08:44 | 2013 | +0200 |
| | | | | | | |
| Merging | with | dataCUDAfix | branch | | | |
*** Notes:
    - Trying to add MSG_process_sleep() after transfer, in order to get a more accurate simulation of communications. Unfortunately, this didn't solve the problem; it may even have introduced some new ones
    - More details in /MSG_sleep.pdf/
** dataSleep1                                                        :IMPORTANT:@LUKA:_WINNETOU:
*** git:
#+begin_src sh
git log -1
#+end_src

#+RESULTS:
| commit | 1ff65ab920d04a14712a4bc87121df790564f9b8 | | | | | |
| Merge: | 5ac653d | fb24231 | | | | |
| Author: | Luka | Stanisic | | | | |
| Date: | Thu | Sep | 19 | 16:24:27 | 2013 | +0200 |
| | | | | | | |
| Merging | with | dataSleep1 | branch | | | |
*** Notes:
    - These were just simulations.
We wanted to investigate why there is a deadlock(livelock) when we add one MSG_process_sleep(x) after transfer_submit, where x>=1.25ms - There are some callgrind files that show this issue, also the analysis files [[git:analysis/compare_simulations/CompSimSleep.org::1ff65ab920d04a14712a4bc87121df790564f9b8][Comparation]] - Much more details can be found in journal ** dataSleep2 :TEST:@LUKA:_WINNETOU: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: | commit | c6e5aec6e4640e4490ecf23ceb357fdcd58667fe | | | | | | | Merge: | 42d68c3 | 49ae077 | | | | | | Author: | Luka | Stanisic | | | | | | Date: | Thu | Sep | 26 | 15:59:50 | 2013 | +0200 | | | | | | | | | | Merging | with | dataSleep2 | branch | | | | *** Notes: - Testing TIMEDWAIT, trying to find best value for sleep - In this branch we also deleted deprecated version of StarPU from src code ** dataSleep3 :TEST:@LUKA: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: | commit | 8cb1d83dbf9a62fad5fb412c2946c0476b474d6d | | | | | | | Merge: | 0758b58 | 9f0b5c3 | | | | | | Author: | Luka | Stanisic | | | | | | Date: | Fri | Oct | 25 | 17:20:33 | 2013 | +0200 | | | | | | | | | | Merging | with | dataSleep3 | branch | | | | *** Notes: - Testing different sleeps for MSG_process_sleep after transfer submit ** dataT2 :TEST:@LUKA:_HANNIBAL: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: | commit | 2f1a4485a180d8f678edff1601d6e031e0270f09 | | | | | | | Merge: | 97ade70 | 327d1d2 | | | | | | Author: | Luka | Stanisic | | | | | | Date: | Mon | Oct | 28 | 10:54:03 | 2013 | +0100 | | | | | | | | | | Merging | with | dataT2 | branch | | | | *** Notes: - Trying to execute simgrid with newer starpu code ** dataUsedSize :IMPORTANT:@LUKA:_HANNIBAL: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: | commit | 6621d356a3a75427800300ac462057f4dab0738e | | | | | | | Merge: | 645166c | 50bc038 | | | | | | Author: | Luka | Stanisic | | | | | | Date: | Mon | Oct | 28 | 13:56:48 | 2013 | +0100 | | | | | | | | | | 
Merging | with | dataUsedSize | branch | | | | | | | | | | | | | Conflicts: | | | | | | | *** Notes: - Experiments that show how there is nothing strange with the evolution of used size for CUDA memory - Look for more details in the [[git:data/dataUsedSize/UsedSize.pdf::6621d356a3a75427800300ac462057f4dab0738e][.pdf]] ** dataBord1 :IMPORTANT:@LUKA:_ATTILA:_HANNIBAL: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: | commit | e17e70f5972860d2715d2e5e1e6a28aec3c0f0a0 | | | | | | | Merge: | c4b232b | 700a0d2 | | | | | | Author: | Luka | Stanisic | | | | | | Date: | Fri | Nov | 22 | 15:02:23 | 2013 | +0100 | | | | | | | | | | Merging | with | dataBord1 | branch | | | | | | | | | | | | | Conflicts: | | | | | | | | | src/starpu/AUTHORS | | | | | | | | src/starpu/ChangeLog | | | | | | | | src/starpu/Makefile.in | | | | | | | | src/starpu/doc/Makefile.in | | | | | | | | src/starpu/doc/chapters/.svn/text-base/perf-optimization.texi.svn-base | | | | | | | | src/starpu/doc/doxygen/chapters/advanced_examples.doxy | | | | | | | | src/starpu/doc/doxygen/chapters/api/insert_task.doxy | | | | | | | | src/starpu/doc/doxygen/chapters/environment_variables.doxy | | | | | | | | src/starpu/doc/doxygen/chapters/tips_and_tricks.doxy | | | | | | | | src/starpu/include/starpu_task_list.h | | | | | | | | src/starpu/include/starpu_thread.h | | | | | | | | src/starpu/include/starpu_thread_util.h | | | | | | | | src/starpu/include/starpu_util.h | | | | | | | | src/starpu/src/Makefile.in | | | | | | | | src/starpu/src/common/thread.c | | | | | | | | src/starpu/src/common/utils.h | | | | | | | | src/starpu/src/core/dependencies/tags.c | | | | | | | | src/starpu/src/core/perfmodel/perfmodel_bus.c | | | | | | | | src/starpu/src/core/progress_hook.c | | | | | | | | src/starpu/src/core/sched_ctx.c | | | | | | | | src/starpu/src/core/sched_ctx.h | | | | | | | | src/starpu/src/core/sched_policy.c | | | | | | | | src/starpu/src/core/task.c | | | | | | | | src/starpu/src/core/task.h | | | | | | | 
| src/starpu/src/core/workers.c | | | | | | | | src/starpu/src/datawizard/data_request.c | | | | | | | | src/starpu/src/datawizard/malloc.c | | | | | | | | src/starpu/src/datawizard/memalloc.c | | | | | | | | src/starpu/src/debug/traces/starpu_fxt.c | | | | | | | | src/starpu/src/debug/traces/starpu_paje.c | | | | | | | | src/starpu/src/drivers/cpu/driver_cpu.c | | | | | | | | src/starpu/src/drivers/cuda/driver_cuda.c | | | | | | | | src/starpu/src/drivers/driver_common/driver_common.c | | | | | | | | src/starpu/src/drivers/opencl/driver_opencl.c | | | | | | | | src/starpu/src/sched_policies/deque_modeling_policy_data_aware.c | | | | | | | | src/starpu/src/sched_policies/eager_central_policy.c | | | | | | | | src/starpu/src/sched_policies/eager_central_priority_policy.c | | | | | | | | src/starpu/src/sched_policies/parallel_eager.c | | | | | | | | src/starpu/src/sched_policies/stack_queues.c | | | | | | | | src/starpu/src/sched_policies/work_stealing_policy.c | | | | | | | | src/starpu/src/util/starpu_insert_task_utils.c | | | | | | | | src/starpu/src/util/starpu_task_list_inline.h | | | | | | | | src/starpu/tests/datawizard/allocate.c | | | | | | | | src/starpu/tests/perfmodels/value_nan.c | | | | | | *** Notes: - These are the experiments we performed when I was in Bordeaux in November 2013 - Finally we got native and simgrid matching for large matrix size - Everything is very well explained in [[git:data/dataBord1/Bord1Paje.pdf::e17e70f5972860d2715d2e5e1e6a28aec3c0f0a0][Report]] ** dataBord2 :IMPORTANT:@LUKA:_ATTILA:_CONAN: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: | commit | be434a975e3834d66c215a91fa02c99b43caa05a | | | | | | | Merge: | 6236c24 | 3b20ffc | | | | | | Author: | Luka | Stanisic | | | | | | Date: | Tue | Dec | 3 | 17:25:59 | 2013 | +0100 | | | | | | | | | | Merging | with | dataBord2 | branch | | | | | | | | | | | | | Conflicts: | | | | | | | | | src/starpu/doc/Makefile.am | | | | | | | | src/starpu/tests/main/driver_api/run_driver.c 
| | | | | | *** Notes: - Measurement of makespans that finally match even for big matrix size for /attila/ and /conan/ machines ** dataBord3 :IMPORTANT:@LUKA:_HANNIBAL: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: | commit | 7a72ace614401acd5dce76f4213a8e4ac5c809ed | | | | | | | Merge: | c8f20b4 | 6ed6a6e | | | | | | Author: | Luka | Stanisic | | | | | | Date: | Fri | Dec | 6 | 17:13:45 | 2013 | +0100 | | | | | | | | | | Merging | with | dataBord3 | branch | | | | | | | | | | | | | Conflicts: | | | | | | | | | .starpu/sampling/bus/paul-bdx.platform.xml | | | | | | *** Notes: - Adding new measurements of /hannibal/ and simulations with modified platform.xml(by hand) ** dataLU0912* :IMPORTANT:@LUKA:_ATTILA:_HANNIBAL:_CONAN: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: | commit | 38b4cf3d2729b13c10243024773da1ec165d2d1e | | | | | | | Merge: | b1598dd | 32b8727 | | | | | | Author: | Luka | Stanisic | | | | | | Date: | Thu | Dec | 12 | 11:24:19 | 2013 | +0100 | | | | | | | | | | Merging | with | dataLU0912 | branch | | | | *** Notes: - Measurements of makespans for LU algorithm on 3 machines from Bordeaux - Results from /hannibal/ and /attila/ are great, but the ones from /conan/ are strange (bogus), maybe due to some external effects ** dataCUDAbench :IMPORTANT:@LUKA:_ATTILA:_HANNIBAL:_CONAN:_FROGGY: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: | commit | 8bb880efb6478b76520a63c0190d636a502900f5 | | | | | | | Merge: | d9743a3 | f79c89e | | | | | | Author: | Luka | Stanisic | | | | | | Date: | Fri | Dec | 20 | 15:50:10 | 2013 | +0100 | | | | | | | | | | Merging | with | master | branch | | | | *** Notes: - Measurements done by Samuel's /cudabench.c/ script to see different effects of function /cudaMemCpy2D()/ on different GPUs ** art** :IMPORTANT:@LUKA:_ATTILA:_HANNIBAL:_CONAN:_FROGGY: *** git: #+begin_src sh git log -1 #+end_src #+RESULTS: | commit | ba88cbb8a443b412c3b7996508c03d1c8c4b93d2 | | | | | | | | | Author: | Luka | Stanisic | 
| | | | | |
| Date: | Mon | Feb | 24 | 17:20:49 | 2014 | +0100 | | |
| | | | | | | | | |
| Adding | data | for | experiments | on | froggy | using | CPU | only |
*** Notes:
    - Various measurements done for the Europar article in order to verify the results and potentially get nicer plots
** dataK20*                                                          :IMPORTANT:@LUKA:_FROGGY:
*** git:
#+begin_src sh
git log -1
#+end_src

#+RESULTS:
| commit | ba88cbb8a443b412c3b7996508c03d1c8c4b93d2 | | | | | | | |
| Author: | Luka | Stanisic | | | | | | |
| Date: | Mon | Feb | 24 | 17:20:49 | 2014 | +0100 | | |
| | | | | | | | | |
| Adding | data | for | experiments | on | froggy | using | CPU | only |
*** Notes:
    - Various measurements on the K20 froggy machine for both LU and cholesky in order to get some results for the Europar article
** dataFrogcpu                                                       :IMPORTANT:@LUKA:_FROGGY:
*** git:
#+begin_src sh
git log -1
#+end_src

#+RESULTS:
| commit | ba88cbb8a443b412c3b7996508c03d1c8c4b93d2 | | | | | | | |
| Author: | Luka | Stanisic | | | | | | |
| Date: | Mon | Feb | 24 | 17:20:49 | 2014 | +0100 | | |
| | | | | | | | | |
| Adding | data | for | experiments | on | froggy | using | CPU | only |
*** Notes:
    - Testing cholesky application on froggy machine with CPU only. Results are quite good, Simgrid is matching native execution perfectly (less than 1% difference)
** dataTestN                                                         :TEST:@LUKA:_ATTILA:
*** git:
#+begin_src sh
git log -1
#+end_src

#+RESULTS:
| commit | 363f56e02f1038b41776964576a30d9b835d2c12 | | | | | |
| Merge: | 513e2ef | c880903 | | | | |
| Author: | Luka | Stanisic | | | | |
| Date: | Fri | Mar | 7 | 15:28:50 | 2014 | +0100 |
| | | | | | | |
| Merging | with | dataTestN | branch | | | |
*** Notes:
    - Testing version changes of StarPU and Simgrid, code upgrades, new scripts etc.
** dataMirT                                                          :TEST:@LUKA:_MIRAGE:
*** git:
#+begin_src sh
git log -1
#+end_src

#+RESULTS:
| commit | 934e0dcf9df058ef724c108bde551dd755a95e57 | | | | | |
| Merge: | 1edac91 | 00e5784 | | | | |
| Author: | Luka | Stanisic | | | | |
| Date: | Wed | Mar | 12 | 16:10:18 | 2014 | +0100 |
| | | | | | | |
| Merging | with | dataMirT | branch | | | |
*** Notes:
    - Exploratory measurements on the mirage cluster. Everything works fine with only 3 GPUs, but with 12 CPUs + 3 GPUs there is a difference of ~13% between native and simgrid
    - Actually, when using only 8 CPUs things are much better. Explanation from my email:
      As we discussed with Arnaud, the schedule (and problem) of using all 12 CPUs is the following:
      1) 3 GPUs are doing only computation. Most of the computation is done by them.
      2) 8 CPUs are doing only computation. However, they are not as powerful as GPUs.
      3) 1 CPU is doing computation and scheduling of the whole program. Therefore, both computation and scheduling are slightly perturbed (and this is something we are not modelling in the simulation).
      4) 3 CPUs are doing both computation and managing GPUs. Therefore, both of these things are perturbed (and this is something we are not modelling in the simulation).
      We are not sure which of these issues has the biggest impact on the overall results. I guess it would be the fact that 3 CPUs are not completely dedicated to managing GPUs, so as a consequence GPUs are slowed down. Or it is a question of the StarPU scheduler not taking optimal decisions (I have only tested with "dmda")... Anyway, I am not sure if it is worth investing time in modelling these issues, as the results with "less CPU workers" are much better and we simulate them correctly.
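The ~13% figure in the dataMirT notes is a relative difference between the native and the simulated makespan. A minimal sketch of that arithmetic follows; the helper name =rel_diff= and the sample values are hypothetical (they are not measurements from this branch), and the analysis files may compute the difference in another way.

#+begin_src sh
# Hypothetical helper (not part of the repository): relative difference,
# in percent, of a simulated makespan with respect to the native one.
rel_diff() {
    # $1: native makespan, $2: simgrid makespan (same unit for both)
    LC_ALL=C awk -v n="$1" -v s="$2" 'BEGIN {
        d = s - n
        if (d < 0) d = -d              # absolute difference
        printf "%.1f\n", 100 * d / n   # relative to the native run
    }'
}

# Illustrative values only, not real results:
rel_diff 1000 1130    # prints 13.0
#+end_src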
** dataMirT2                                                         :IMPORTANT:@LUKA:_MIRAGE:
*** git:
#+begin_src sh
git log -1
#+end_src

#+RESULTS:
| commit | fc652dd4d7e04357ed504e7979aa0ce5f917db0a | | | | | |
| Merge: | f89541d | d582753 | | | | |
| Author: | Luka | Stanisic | | | | |
| Date: | Fri | Mar | 21 | 14:24:56 | 2014 | +0100 |
| | | | | | | |
| Merging | with | dataMirT2 | branch | | | |
*** Notes:
    - Measurements on the /mirage/ cluster with STARPU_NCPU=8 STARPU_NCUDA=3, that is to say a real hybrid machine. Executions are matching quite nicely.
** dataQt1                                                           :TEST:@LUKA:_FOURMI:
*** git:
#+begin_src sh
git log -1
#+end_src

#+RESULTS:
| commit | 142893227b7e16cef187bbd2ca89fc97e0d10428 | | | | | |
| Merge: | df21a7e | c6e9585 | | | | |
| Author: | Luka | Stanisic | | | | |
| Date: | Wed | Apr | 2 | 14:43:53 | 2014 | +0200 |
| | | | | | | |
| Merging | with | dataQt1 | branch | | | |
*** Notes:
    - Initial measurements of /qrm_starpu/, which is producing fine results for 8 threads on fourmi, but need to compare traces to see it