CTWatch
November 2006 B
High Productivity Computing Systems and the Path Towards Usable Petascale Computing
Andrew Funk, MIT Lincoln Laboratory
John R. Gilbert, UC Santa Barbara
David Mizell, Cray Inc.
Viral Shah, UC Santa Barbara

3
3. Timed Markov models of programmer workflows

Figure 2

Figure 2. Lone programmer workflow.

In our earlier work,9 we hypothesize a simple timed Markov model of a researcher developing a new application. It represents our assumption that the researcher begins by formulating a new algorithm for solving the problem of interest and then writes a program implementing the algorithm. Following that, the researcher enters a correctness–debugging loop, around which the process cycles until the program is deemed to be free of programming errors. Next is a performance–tuning loop, which cycles until the program has been tuned enough that it gets adequate performance for large problems to be run on an HPC system. This is the workflow shown in Figure 2. In this workflow:

  • Tf represents the time taken to formulate the new algorithmic approach.
  • Tp is the time necessary to implement the new algorithm in a program.
  • Tc is the compile time.
  • Tt is the time necessary to run a test case during the debugging phase.
  • Td is the time the programmer takes to diagnose and correct the bug.
  • Tr is the execution time for the performance tuning runs. This is the most obvious candidate
    of a constant that should be replaced by a random variable.
  • To is the time the programmer takes to identify the performance bottleneck and program an
    intended improvement.
  • Pp is the probability that debugging reveals a necessity to redesign the program.
  • Pd is the probability that more debugging is necessary.
  • Po is the probability that more performance optimization is necessary.
  • qp, qd, and qo are 1 - Pp, 1 - Pd, and 1 - Po, respectively.

This model can also be used to describe the workflow of graduate students programming homework in a parallel computing class. We instrumented their programs, collected workflow data and fit it to such a timed Markov model in our 2004 classroom experiment.9 While fitting the experimental data to the model, we discovered that in addition to the transitions described above, there were two more transitions. A programmer may introduce or discover a bug while attempting to optimize the program. As a result, there is a transition from the Run state to the Debug state. Sometimes, the last run may not be successful, perhaps because of a failed attempt to optimize. In such a case, an earlier correct run is treated as the final program – all the data we present is from programs that were eventually correct. Hence, there is another transition from Test to Finish. Figure 3 shows the result of our analysis of the 2004 classroom experiment.

Figure 3

Figure 3. Fitting data from the 2004 classroom experiment to a timed Markov model.

Pages: 1 2 3 4 5 6 7 8

Reference this article
"Modelling Programmer Workflows with Timed Markov Models ," CTWatch Quarterly, Volume 2, Number 4B, November 2006 B. http://www.ctwatch.org/quarterly/articles/2006/11/modelling-programmer-workflows-with-timed-markov-models/

Any opinions expressed on this site belong to their respective authors and are not necessarily shared by the sponsoring institutions or the National Science Foundation (NSF).

Any trademarks or trade names, registered or otherwise, that appear on this site are the property of their respective owners and, unless noted, do not represent endorsement by the editors, publishers, sponsoring institutions, the National Science Foundation, or any other member of the CTWatch team.

No guarantee is granted by CTWatch that information appearing in articles published by the Quarterly or appearing in the Blog is complete or accurate. Information on this site is not intended for commercial purposes.