    <rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:content="http://purl.org/rss/1.0/modules/content/">
     <channel>
        <title>ACCU  :: I_mean_something_to_somebody</title>
        <link>https://members.accu.org/index.php/articles/1252</link>
        <description>Professionalism in Programming</description>
        <dc:language>en-us</dc:language> 
        <dc:creator>Administrator</dc:creator> 
        <admin:generatorAgent rdf:resource="http://www.xaraya.org" /> 
        <admin:errorReportsTo rdf:resource="mailto:webeditor@accu.org" />
       <sy:updatePeriod>hourly</sy:updatePeriod>
       <sy:updateFrequency>1</sy:updateFrequency>
       <docs>http://backend.userland.com/rss</docs>




<div class="xar-mod-head"><span class="xar-mod-title">Design of applications and programs + CVu Journal Vol 15, #6 - Dec 2003</span></div>





<div class="xar-norm xar-standard-box-padding">
   <h1><strong>Title:</strong>&nbsp;I_mean_something_to_somebody</h1>
<p><strong>Author:</strong>&nbsp;</p>
<p>
<strong>Date:</strong> Wed, 03 December 2003 13:16:02 +00:00</p>
<p><strong>Summary:</strong>&nbsp;<p>This article reports on an experimental study, performed during the 2003 ACCU conference, that attempted to measure one particular aspect of developer identifier meaning assignment behavior. The study investigated the extent to which belief in the applicable application domain affects the meaning assigned to identifier names.</p></p>
<p><strong>Body:</strong>&nbsp;<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e20" id="d0e20"></a></h2>
</div>
<p>Software developers are constantly exhorted to use <span class=
"emphasis"><em>meaningful</em></span> identifier names. Unless they
are chosen at random, all identifiers are likely to have some
meaning, at least to the person who created them. The implicit
assumptions in the exhortation are that the names are meaningful to
subsequent readers of the source code and that all readers of
the source code containing them agree on what that meaning is.</p>
<p>Surprisingly there have been no studies investigating whether
developers agree on the meaning assigned to uses of particular
identifier names, although there have been studies that have
investigated related issues. These studies include the meanings
assigned to (human language) words, and words/phrases invented by
people to describe something.</p>
<p>This article reports on an experimental study, performed during
the 2003 ACCU conference, that attempted to measure one particular
aspect of developer identifier meaning assignment behavior. The
study investigated the extent to which belief in the applicable
application domain affects the meaning assigned to identifier
names.</p>
<p>The experiment is discussed in two articles. The first, this
one, discusses the background to the experiment and some of the
applicable characteristics of the subjects taking part; the second
provides a review of other studies that have investigated meaning
assignment and discusses the results of the ACCU experiment.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e33" id="d0e33"></a>Why is an
experiment necessary?</h2>
</div>
<p>Developers are often unable to give a coherent answer to how
they assign a meaning to an identifier. This kind of human behavior
(knowing something without being able to state what it is) has been
duplicated in many studies.</p>
<p>A study by Reber and Kassin [1] compared implicit and explicit
pattern detection. Subjects were asked to memorise sets of words
containing the letters <span class="emphasis"><em>P</em></span>,
<span class="emphasis"><em>S</em></span>, <span class=
"emphasis"><em>T</em></span>, <span class=
"emphasis"><em>V</em></span>, or <span class=
"emphasis"><em>X</em></span>. Most of these words had been
generated using a finite state grammar. However, some of the sets
contained words that had not been generated according to the rules
of this grammar. One group of subjects thought they were taking
part in a purely memory-based experiment; the other group were also
told to memorise the words, but were additionally told that there was
a pattern to the letter sequences and that deducing this pattern
would help them in the task. The performance of the
group that had not been told about the presence of a pattern almost
exactly mirrored that of the group who had been told on all sets of
words (pattern words only, pattern plus non-pattern words,
non-pattern words only). Without being told to do so, subjects had
used patterns in the words to help perform the memorisation
task.</p>
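<p>A finite state grammar of the kind described above can be sketched in a few lines of Python. The grammar below is hypothetical: it is not the one used by Reber and Kassin (which is not reproduced in this article) and only shares the same five letters. The names GRAMMAR, generate and is_grammatical are invented for this illustration.</p>

```python
import random

# Hypothetical grammar (an assumption, not the one used in the study).
# Each state maps to a list of (letter, next state); None ends the word.
GRAMMAR = {
    0: [("T", 1), ("P", 2)],
    1: [("S", 1), ("X", 3)],
    2: [("V", 2), ("X", 3)],
    3: [("S", None), ("V", None)],
}


def generate(rng):
    """Produce one word by a random walk through the grammar."""
    state, letters = 0, []
    while state is not None:
        letter, state = rng.choice(GRAMMAR[state])
        letters.append(letter)
    return "".join(letters)


def is_grammatical(word):
    """Return True if the word can be produced by the grammar."""
    state = 0
    for letter in word:
        if state is None:  # the word continues after the grammar has ended
            return False
        moves = dict(GRAMMAR[state])
        if letter not in moves:
            return False
        state = moves[letter]
    return state is None  # the word must stop at an end transition


rng = random.Random(2003)
pattern_words = [generate(rng) for _ in range(5)]
print(pattern_words, is_grammatical("XTS"))
```

<p>Words produced by generate follow the pattern, while a string such as "XTS" does not; a set of stimuli mixing the two kinds of word could be built along these lines.</p>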
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e55" id="d0e55"></a>How do
developers assign a meaning to an identifier?</h2>
</div>
<p>This issue is discussed in some detail in part 2 of this
article. For the time being we assume that the meaning assigned to
an identifier by a developer is created using a repertoire of
previously learned techniques operating on a substantial body of
knowledge they have in their head. The significant techniques and
knowledge are assumed to be:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>experience of human and computer languages,</p>
</li>
<li>
<p>knowledge of particular domains (e.g., software engineering
concepts, or knowledge of an application domain such as
accounting),</p>
</li>
<li>
<p>experience in reading and writing source code,</p>
</li>
<li>
<p>information obtained from the context in which the identifier
occurs. This issue is discussed further in part 2.</p>
</li>
</ul>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e73" id="d0e73"></a>Prior
experience</h2>
</div>
<p>Studies have found that nearly every task that exhibits a
practice effect follows the power law of learning. This law has the
form:</p>
<pre class="programlisting">
RT = a + bN<sup>-c</sup>
</pre>
<p>where <tt class="varname">RT</tt> is the response time,
<tt class="varname">N</tt> is the number of times the task has been
performed (not the amount of time spent performing the task), and
<tt class="constant">a</tt>, <tt class="constant">b</tt>, and
<tt class="constant">c</tt> are constants.</p>
<p>The value of <tt class="constant">c</tt> is usually greater than
1, which means that ever larger amounts of practice are needed to
obtain the same increase in performance.</p>
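<p>The diminishing return from practice can be seen by evaluating the formula for a doubling number of repetitions. The following is a minimal sketch in Python; the values chosen for a, b and c are made up for illustration and are not taken from any study. The shrinking gains appear for any positive value of c.</p>

```python
def response_time(n, a=0.5, b=2.0, c=0.5):
    """Power law of learning: RT = a + b * N**(-c).

    The default constants are illustrative assumptions only.
    """
    return a + b * n ** (-c)


# Number of times the task has been performed, doubling each step.
trials = [1, 2, 4, 8, 16, 32]
times = [response_time(n) for n in trials]

# Reduction in response time obtained from each doubling of practice.
gains = [earlier - later for earlier, later in zip(times, times[1:])]

for n, rt in zip(trials, times):
    print(f"N={n:3d}  RT={rt:.3f}")
print("gain per doubling:", [round(g, 3) for g in gains])
```

<p>Each doubling of N buys a smaller reduction in response time than the previous one, and the response time never drops below a.</p>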
<p>Studies have found that practice effects exist in people's use of
language. For instance, a large number of studies have verified
that a <span class="emphasis"><em>word frequency effect</em></span>
exists. The number of times a person has been exposed to a (natural
language) word has a significant effect on their recognition of and
performance in handling that word.</p>
<p>Education within specific knowledge domains has also been found
to affect word handling performance. For instance, a study by
Gardner, Rothkopf, Lapan, and Lafferty [2] used 10 engineering, 10
nursing, and 10 law students as subjects. These subjects were asked
to indicate whether a letter sequence was a word or a nonword. The
words were drawn from a sample of high frequency words (more than
100 per million), medium frequency (10-99 per million), low
frequency (less than 10 per million), and occupationally related
engineering or medical words. The nonwords were created by
rearranging letters of existing words, while maintaining English
rules of pronounceability and orthography.</p>
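<p>The frequency bands used in the study can be expressed as a small classification function. This Python sketch is an illustration only: the function name, the corpus size argument and the treatment of a word occurring exactly 100 times per million (placed in the high band here, since the description above leaves that boundary unspecified) are assumptions.</p>

```python
from bisect import bisect_right

BANDS = ["low", "medium", "high"]
# Band boundaries in occurrences per million words, as described above.
# Assumption: a rate of exactly 100 per million counts as "high".
CUT_POINTS = [10, 100]


def frequency_band(occurrences, corpus_words):
    """Classify a word by how often it occurs per million words."""
    per_million = occurrences * 1_000_000 / corpus_words
    return BANDS[bisect_right(CUT_POINTS, per_million)]


# Hypothetical counts in a hypothetical corpus of two million words.
for count in (6, 40, 500):
    print(count, frequency_band(count, 2_000_000))
```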
<p>The results showed engineering subjects could more quickly and
accurately identify the words related to engineering (but not
medicine). The nursing subjects could more quickly and accurately
identify the words related to medicine (but not engineering). The
law students showed no response differences for either group of
occupationally related words. There were no response differences on
identifying nonwords. The performance of the engineering and
nursing students on their respective occupational words was almost
as good as their performance on the medium frequency words.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e113" id="d0e113"></a>Domain
knowledge</h2>
</div>
<p>Given the likelihood that subjects taking part in the experiment
would have a wide variety of software related backgrounds it was
decided to attempt to restrict the domains they considered when
answering experimental questions. Two domains that subjects were
likely to be broadly familiar with were chosen. These two domains
were operating systems (the Linux kernel) and computer games (Doom,
a product of id Software).</p>
<p>It was hoped that specifying a particular domain would provide a
global context within which it would be possible for subjects to
provide a few answers that they considered to be the likely
meanings. For instance, the word &quot;bank&quot; has a number of possible
meanings. Being told that it occurs in a financial context is
likely to cause people to associate a meaning with it that is
different from the one they would choose had they been told it
occurred in a discussion about rivers. It is also possible for a
localised context to
override a global context. For instance, discussing the watery view
from an accountancy company's offices might suggest a river bank,
while a discussion of the cost of restocking a river with fish
might trigger finance related thoughts.</p>
<p>A number of subjects indicated (both after taking part in the
experiment and through writing on their response sheet) that they
were unable to provide a meaning to some identifiers because
insufficient context was available to them.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e122" id="d0e122"></a>Source code
experience</h2>
</div>
<p>Given that a subject's performance is driven by the amount of
time they have spent performing the task, we need some way of
measuring the amount of time spent working directly with source
code.</p>
<p>Traditionally, developer experience is measured in number of
years of employment (performing some software related activity). It
is relatively easy for a person to calculate the amount of time
they have been employed in a software development related role.
However, the extent to which the amount of time spent in software
related employment correlates with experience working on source
code is not known (there are many software related employment
activities that do not involve a person working at the source code
level).</p>
<p>The quantity of source code (measured in lines, not time spent)
read and written by a developer (developer interaction with source
code overwhelmingly occurs in its written, rather than spoken,
form) is a more direct measure of experience. Interaction with
source code is rarely a social activity (one situation where it
does occur socially is during code reviews) and the time spent on
these activities may be small enough to ignore. The problem with
this measure is that it is very difficult to obtain reliable
estimates of the amount of source read and written by a developer.
The problems include:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>readers don't always read code on a line by line basis. For
instance, they may scan the source looking for some construct, or
may only read part of a line,</p>
</li>
<li>
<p>the same code is often read several times. Does each instance of
reading have the same learning effect?</p>
</li>
<li>
<p>code may be written and then rewritten or deleted (existing
productivity measures are based on final number of lines of
debugged code),</p>
</li>
<li>
<p>few developers regularly measure the amount of code they write.
This means they are very unlikely to be able to make an informed
estimate of the total amount of code they have read or written.</p>
</li>
</ul>
</div>
<p>Although there appear to be significant problems in
obtaining reliable answers from developers on the amount of source
they have read and written, your author believed something could be
learned from the subjects' responses.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e146" id="d0e146"></a>Experimental
setup</h2>
</div>
<p>The experiment was run by your author during two 30 minute
sessions (on different days) of the 2003 ACCU conference held in
Oxford, UK. Approximately 250 people attended the conference, 46 of
whom took part in the experiment (45 of them, 18% of attendees,
handed in response sheets). Subjects were given a
brief introduction to the experiment, during which they filled out
background information about themselves, and they then spent 15
minutes working on the identifier list. All subjects (31 on the
first day, 15 on the second) volunteered their time and were
anonymous.</p>
<p>The requested subject background information was as follows:</p>
<div class="itemizedlist">
<ul type="disc">
<li>
<p>What is your native language?</p>
</li>
<li>
<p>Please list any human languages that you speak fluently:</p>
</li>
<li>
<p>Please list any of these languages that you use at least once a
week:</p>
</li>
<li>
<p>Please list the computer languages that you have spent a
significant amount of time reading and writing (at least 100 hours,
i.e., 3 work weeks) over the last two years:</p>
</li>
<li>
<p>How many lines of code would you estimate you have <span class=
"emphasis"><em>written</em></span>, in total, over your career:</p>
<div class="itemizedlist">
<ul type="circle">
<li>
<p>10,000</p>
</li>
<li>
<p>25,000</p>
</li>
<li>
<p>50,000</p>
</li>
<li>
<p>75,000</p>
</li>
<li>
<p>100,000</p>
</li>
<li>
<p>150,000+</p>
</li>
</ul>
</div>
</li>
<li>
<p>How many lines of code would you estimate you have <span class=
"emphasis"><em>read</em></span>, in total, over your career:</p>
<div class="itemizedlist">
<ul type="circle">
<li>
<p>10,000</p>
</li>
<li>
<p>25,000</p>
</li>
<li>
<p>50,000</p>
</li>
<li>
<p>75,000</p>
</li>
<li>
<p>100,000</p>
</li>
<li>
<p>150,000+</p>
</li>
</ul>
</div>
</li>
<li>
<p>How many years have you been writing software
professionally?</p>
</li>
</ul>
</div>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e219" id="d0e219"></a>Subjects'
background</h2>
</div>
<p>On the first day 30 subjects handed in their completed response
sheets (one subject was not happy with their performance and it was
agreed that they could keep the sheet containing their responses),
on the second day 15 response sheets were handed in.</p>
<p>Most of the subjects were native speakers of English:</p>
<div class="table"><a name="d0e226" id="d0e226"></a>
<p class="title c2">Table 1. Number of subjects having the given
language as their native language</p>
<table summary=
"Number of subjects having the given language as their native language"
border="1">
<tr>
<th>Native Language</th>
<th align="center">Number of subjects</th>
</tr>
<tr>
<td>English</td>
<td align="center">35</td>
</tr>
<tr>
<td>French</td>
<td align="center">2</td>
</tr>
<tr>
<td>German</td>
<td align="center">2</td>
</tr>
<tr>
<td>Italian</td>
<td align="center">1</td>
</tr>
<tr>
<td>NL<sup>[<a name="d0e260" href="#ftn.d0e260" id=
"d0e260">a</a>]</sup></td>
<td align="center">1</td>
</tr>
<tr>
<td>Russian</td>
<td align="center">2</td>
</tr>
<tr>
<td>Slovenian</td>
<td align="center">1</td>
</tr>
<tr>
<td>Swiss German</td>
<td align="center">1</td>
</tr>
<tr>
<td colspan="2">
<div class="footnote">
<p><sup>[<a name="ftn.d0e260" href="#d0e260" id=
"ftn.d0e260">a</a>]</sup> The response &quot;NL&quot; is assumed to refer to
the country code for the Netherlands.</p>
</div>
</td>
</tr>
</table>
</div>
<p>The subjects regularly used a wide range of computer languages,
see table 2. The most commonly used language was C++, followed
some way down the list by C and Java (the ACCU is the Association
of C and C++ Users, and a Python conference was also taking place
in the same place at the same time).</p>
<div class="table"><a name="d0e282" id="d0e282"></a>
<p class="title c2">Table 2. Number of subjects regularly using a
particular computer language.</p>
<table summary=
"Number of subjects regularly using a particular computer language."
border="1">
<tr>
<th>Computer Language</th>
<th align="center">Number of subjects</th>
<th>Computer Language</th>
<th align="center">Number of subjects</th>
</tr>
<tr>
<td>Assembler</td>
<td align="center">1</td>
<td>ML</td>
<td align="center">1</td>
</tr>
<tr>
<td>Basic</td>
<td align="center">1</td>
<td>OLC</td>
<td align="center">1</td>
</tr>
<tr>
<td>C</td>
<td align="center">20</td>
<td>Pascal</td>
<td align="center">2</td>
</tr>
<tr>
<td>C#</td>
<td align="center">2</td>
<td>Perl</td>
<td align="center">9</td>
</tr>
<tr>
<td>C++</td>
<td align="center">41</td>
<td>PHP</td>
<td align="center">3</td>
</tr>
<tr>
<td>Cobol</td>
<td align="center">1</td>
<td>Python</td>
<td align="center">6</td>
</tr>
<tr>
<td>Fortran</td>
<td align="center">3</td>
<td>Rebol</td>
<td align="center">1</td>
</tr>
<tr>
<td>Haskell</td>
<td align="center">1</td>
<td>shell<sup>[<a name="d0e367" href="#ftn.d0e367" id=
"d0e367">a</a>]</sup></td>
<td align="center">6</td>
</tr>
<tr>
<td>HTML</td>
<td align="center">3</td>
<td>SQL</td>
<td align="center">3</td>
</tr>
<tr>
<td>IDL</td>
<td align="center">1</td>
<td>TCL</td>
<td align="center">1</td>
</tr>
<tr>
<td>Java</td>
<td align="center">12</td>
<td>VB</td>
<td align="center">7</td>
</tr>
<tr>
<td>Javascript</td>
<td align="center">4</td>
<td>VBA</td>
<td align="center">1</td>
</tr>
<tr>
<td>Lisp</td>
<td align="center">1</td>
<td>VBScript</td>
<td align="center">2</td>
</tr>
<tr>
<td>make</td>
<td align="center">1</td>
<td>XML</td>
<td align="center">3</td>
</tr>
<tr>
<td colspan="4">
<div class="footnote">
<p><sup>[<a name="ftn.d0e367" href="#d0e367" id=
"ftn.d0e367">a</a>]</sup> &quot;shell&quot; was the generic term used for a
variety of command line shells.</p>
</div>
</td>
</tr>
</table>
</div>
<p>Over 50% of the subjects had more than 11 years of experience in
software development (see figure 1). Whether this figure is
representative of the general population of developers (or even of
those attending the conference; there were other events taking
place throughout the extended lunch break during which the
experiments were run) is not known.</p>
<div class="figure"><a name="d0e428" id="d0e428"></a>
<p class="title c2">Figure 1. Number of years experience.</p>
<div class="blockquote">
<blockquote class="blockquote">
<p>Cumulative percentage (line) of subjects who have been writing
software professionally for the given number of years. The crosses
represent the relative number of subjects having the given number
of years employment. Four subjects did not provide an answer to
this question.</p>
</blockquote>
</div>
<div class="mediaobject"><img src="/var/uploads/journals/resources/jones-figure-1.png"
alt="Number of years experience."></div>
</div>
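<p>The cumulative percentage line in figure 1 can be computed along the following lines. This Python sketch uses invented answers (they are not the responses collected in the experiment) and assumes, as the caption suggests, that subjects who gave no answer are left out of the total.</p>

```python
from collections import Counter
from itertools import accumulate


def cumulative_percentages(years):
    """Return (years, subjects, cumulative %) rows, skipping missing answers."""
    answered = [y for y in years if y is not None]
    counts = Counter(answered)
    ordered = sorted(counts)
    running = accumulate(counts[y] for y in ordered)
    total = len(answered)
    return [(y, counts[y], 100.0 * r / total) for y, r in zip(ordered, running)]


# Hypothetical answers (NOT the study's data); None marks a missing answer.
sample = [2, 5, 5, 8, 12, 12, 12, 20, None, None]
for year, count, cumulative in cumulative_percentages(sample):
    print(f"{year:2d} years: {count} subject(s), {cumulative:5.1f}% cumulative")
```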
<p>As figure 2 shows, your author clearly
underestimated the number of lines of code that subjects believe
they have read and written (he also has to admit to thinking that
the average number of years of professional software development
would be lower).</p>
<div class="figure"><a name="d0e441" id="d0e441"></a>
<p class="title c2">Figure 2. Lines of code read and written.</p>
<div class="blockquote">
<blockquote class="blockquote">
<p>The graph on the left depicts the number of lines of code read
against the number of years of writing software professionally. The
graph on the right depicts the number of lines of code read against
the number of lines of code written for each subject. The size of
the circle indicates the number of subjects specifying the given
values. In those cases where subjects listed a range of values
(e.g., 50,000-75,000) the median of that range was used.</p>
</blockquote>
</div>
<div class="mediaobject"><img src="/var/uploads/journals/resources/jones-figure-2.png"
alt="Lines of code read and written."></div>
</div>
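<p>Turning a written answer into a single number, as described in the caption of figure 2, might look like the following Python sketch. The function name is invented, and treating an answer such as "150,000+" as exactly 150,000 is an assumption made for this illustration rather than something stated in the article.</p>

```python
def parse_loc_response(text):
    """Convert an answer such as '50,000-75,000' into a single number."""
    # Assumption: a trailing '+' is dropped, so '150,000+' becomes 150000.
    cleaned = text.replace(",", "").replace("+", "").strip()
    bounds = [int(part) for part in cleaned.split("-")]
    # For a range given by two end points, the median is their midpoint.
    return sum(bounds) / len(bounds)


for answer in ("25,000", "50,000-75,000", "150,000+"):
    print(answer, parse_loc_response(answer))
```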
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e450" id=
"d0e450"></a>Conclusions</h2>
</div>
<p>Based on years of employment the majority of the subjects have a
significant amount of software development experience. The
measurements based on lines of code read are likely to suffer from
incorrect self-calibration on the part of subjects and a ceiling
effect caused by overly restrictive response options.</p>
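<p>The ceiling effect can be illustrated with a short Python sketch. The response options are those listed in the questionnaire; the "true" values are invented, and the assumption that a subject picks the option nearest to their true value is a simplification made for this example.</p>

```python
from statistics import mean

# Response options offered on the questionnaire (150,000 stands for "150,000+").
OPTIONS = [10_000, 25_000, 50_000, 75_000, 100_000, 150_000]


def nearest_option(true_value):
    """Assumed behaviour: choose the option closest to the true value."""
    return min(OPTIONS, key=lambda option: abs(option - true_value))


# Hypothetical true line counts, NOT data from the study.
true_values = [40_000, 120_000, 300_000, 900_000]
reported = [nearest_option(value) for value in true_values]

print("reported:", reported)
print("mean of true values:", mean(true_values))
print("mean of reported values:", mean(reported))
```

<p>Subjects whose true values are 300,000 and 900,000 both end up in the top category, so differences between experienced subjects are lost and averages based on the reported values are pulled downwards.</p>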
<p>Further conclusions will be given in part 2 of this article.</p>
</div>
<div class="sect1" lang="en">
<div class="titlepage">
<h2><a name="d0e457" id=
"d0e457"></a>Acknowledgements</h2>
</div>
<p>The author wishes to thank everybody who volunteered their time
to take part in the experiment, and ACCU for making conference
slots available to run it.</p>
</div>
<div class="bibliography">
<div class="titlepage">
<h2><a name="d0e462" id="d0e462"></a>References and
further reading</h2>
</div>
<div class="bibliomixed">
<p class="bibliomixed">[1] Reber and Kassin</p>
</div>
<div class="bibliomixed">
<p class="bibliomixed">[2] Gardner et al</p>
</div>
<div class="bibliomixed">
<p class="bibliomixed">The <span class="citetitle"><i class=
"citetitle">Psychology of Language</i></span> by Trevor Harley is
an undergraduate level introduction to the subject. Those wanting a
lighter read might like to try <span class="citetitle"><i class=
"citetitle">Words in the Mind</i></span> by Jean Aitchison.</p>
</div>
<div class="bibliomixed">
<p class="bibliomixed"><span class="citetitle"><i class=
"citetitle">Learning and Memory</i></span> by John Anderson is an
undergraduate level introduction to the subject. Those wanting a
lighter read might like to try <span class="citetitle"><i class=
"citetitle">Essentials of Human Memory</i></span> by Alan
Baddeley.</p>
</div>
</div>
</p>
</div>
</channel>
</rss>
