/ EXERCISE, PROGRAMMING, STYLE

Introducing Exercises in Programming Style

Recently, my colleague Bertrand lent me a copy of the book Exercises in Programming Style by Cristina Videira Lopes. Among all the books that sit on my reading pile, I decided to put it on top, and started reading right away.

The concept behind the book is pretty simple, but very interesting: there is a problem to solve with code - search for the 25 most common words in a text file. But here’s a twist: there’s a dedicated set of constraints that the code needs to conform to, each chapter having a different set of such constraints.

TIL

This technique is inspired by French author Raymond Queneau’s book Exercises in Style. Who said computer programming-related books and general culture are two separate things?

A chapter introduces:

  • A set of constraints, and puts them in regard to a historical and technical context
  • The Python code conforming to those constraints
  • Comments about the code, and how it conforms
  • Additional assignments, some depending on the specific chapter, some generic. For example, one of the generic assignments is to re-write the code using another programming language.

My goal is to read through each chapter, do some of the assignments - at least write the code in Kotlin - and write about what I learned.

In this post, I’ll go through the first chapter, called "Good Old Times". The constraints of this chapter revolve around the same theme - memory is limited:

  1. There’s only one single variable available, of type array. It’s a huge constraint, but better than just having a single string variable whose value is comma-separated…​
  2. Word-count pairs are first to be written on the filesystem: the later is used as a second-level storage system.

Also, in the original code, there are two additional usages of variables - in loops. One of the additional assignments is to completely remove them.

This is the 1st post in the Exercises in Programming Style focus series.Other posts include:

  1. Introducing Exercises in Programming Style (this post)
  2. Exercises in Programming Style, stacking things up
  3. Exercises in Programming Style, Kwisatz Haderach-style
  4. Exercises in Programming Style, recursion
  5. Exercises in Programming Style with higher-order functions
  6. Composing Exercises in Programming Style
  7. Exercises in Programming Style, back to Object-Oriented Programming
  8. Exercises in Programming Style: maps are objects too
  9. Exercises in Programming Style: Event-Driven Programming
  10. Exercises in Programming Style and the Event Bus
  11. Reflecting over Exercises in Programming Style
  12. Exercises in Aspect-Oriented Programming Style
  13. Exercises in Programming Style: FP & I/O
  14. Exercises in Relational Database Style
  15. Exercises in Programming Style: spreadsheets
  16. Exercises in Concurrent Programming Style
  17. Exercises in Programming Style: sharing data among threads
  18. Exercises in Programming Style with Hazelcast
  19. Exercises in MapReduce Style
  20. Conclusion of Exercises in Programming Style

Without further ado, here are the TIL in this chapter.

Random file access

This item has nothing to do with the limited memory constraint. In my career, I sometimes add to write files, but in a very limited capacity: a new file had to be written, or appended to, but there was no need not make changes to the content of an existing file.

In the context of the assignments, the file’s content sometimes needs to be updated. There are different cases to handle:

  1. The count is 1, hence the file doesn’t contain the word. The file can just be appended with the word and its frequency. This is the easiest case.
  2. The count is higher than 1. The word frequency has already been written in the file: one needs to find it, and update its count - replace all characters from the line.
  3. A special case occurs when the new count has a more characters than the previous one e.g. from 9 to 10, from 99 to 100, etc. In that case, one additional character needs to be written. That means that if the word-frequency pair is not on the last line - a very likely likehood, one should shift the rest of the content by 1.

Standard File and Path objects allow to write-replace or append to an existing file. They don’t provide any capability to seek a specific position inside the file and start writing from it. This requires a class that exists, but that I never used before: RandomFileAccess - available since JDK 1.0.

binary-streams is a Kotlin wrapper that makes using RandomFileAccess much easier in a Kotlin codebase. An embryo of documentation can be found here.

Statically-typed vs dynamically-typed

I’m a big fan of statically-typed languages: I believe that everything that can be checked by the compiler should be. That translates into less time I need to do it myself.

However, given the constraint of this specific exercice - a single array variable, types are nothing but a hindrance. Because data of various types need to be stored in the same array, the array type can only be of type Any. Getting from that array yields objects of type Any, which need to be cast to the correct type before using type-dependent functions on it.

Consider the following snippets, the first one is the original in Python, the second one is my Kotlin version. Callouts allow to compare the relevant code lines:

if len(data[5]) >= 2 and data[5] not in data[0]:                            (1)
  while True:
    data[6] = str(word_freqs.readline().strip(), 'utf-8')
    if data[6] == '':
      break;
    data[7] = int(data[6].split(',')[1])                                    (2)
    data[6] = data[6].split(',')[0].strip()                                 (3)
    if data[5] == data[6]:
      data[7] += 1
      data[4] = True
      break
  if not data[4]:
    word_freqs.seek(0, 1) # Needed in Windows
    word_freqs.write(bytes("%20s,%04d\n" % (data[5], 1), 'utf-8'))          (4)
  else:
    word_freqs.seek(-26, 1)
    word_freqs.write(bytes("%20s,%04d\n" % (data[5], data[7]), 'utf-8'))    (5)
  word_freqs.seek(0,0)
if ((data[5] as String).length >= 2 && !(data[0] as List<*>).contains(data[5])) {(1)
  val g = read("word_freqs_$millis").iterator()
  while (g.hasNext()) {
    data[6] = g.next()
    data[7] = (data[6] as String).split(',')[1].toInt()                     (2)
    data[6] = (data[6] as String).split(',')[0].trim()                      (3)
    if (data[5] == data[6]) {
      data[7] = (data[7] as Int) + 1
      data[4] = true
      break
    }
  }
  if (!(data[4] as Boolean))
    updateCount("word_freqs_$millis", data[5] as String)                    (4)
  else
    updateCount("word_freqs_$millis", data[5] as String, data[7] as Int)    (5)
}

As seen from the above snippets, the statically-typed variant doesn’t bring anything to the table. On the opposite, it makes the code harder to read.

Removing variables in loops

One of the assignments of this chapter is to remove variables that are used in loops. For example, the following sample uses i as a loop counter:

for (i in 0..24) {                                           (1)
  if ((data[i] as Array<*>).isEmpty() ||
     ((data[i] as Array<*>)[1] as Int) < data[26] as Int) {
    data.add(i, arrayOf(data[25], data[26]))
  }
}
1 i is an "extra" variable, in addition to data It needs to be removed to comply with the unique array variable constraint.

The variable needs to be removed: to achieve that, let’s replace it with an additional item in the single array variable. As a corollary, the if loop also needs to be replaced with a while loop. Finally, every time the variable value is used, it needs to be cast to the proper Int type.

data[27] = 0                                                                 (1)
while ((data[27] as Int) < 25) {                                             (2)
  if ((data[(data[27] as Int)] as Array<*>).isEmpty() ||                     (2)
     ((data[(data[27] as Int)] as Array<*>)[1] as Int) < data[26] as Int) {  (2)
    data.add((data[27] as Int), arrayOf(data[25], data[26]))                 (2)
    data.removeAt(25)
    break
  }
  data[27] = (data[27] as Int) + 1                                           (3)
}
1 Declare the additional item, and replace if with while
2 Every time the value is used, it needs to be cast to the proper Int type
3 Increment the value manually - the previous for construct had it handled for us

Conclusion

This chapter taught me:

  • The existence of the RandomFileAccess class. Though it was probably not in the part of what the author had in mind, it’s an important class.
  • Obviously, statically-typed languages are more a hindrance than a help in a context where there are no types - a single array variable.

This was an interesting assignment, I hope the next chapters will be as entertaining.

The complete source code for this post can be found on Github.
Nicolas Fränkel

Nicolas Fränkel

Nicolas Fränkel is a Developer Advocate with 15+ years experience consulting for many different customers, in a wide range of contexts (such as telecoms, banking, insurances, large retail and public sector). Usually working on Java/Java EE and Spring technologies, but with focused interests like Rich Internet Applications, Testing, CI/CD and DevOps. Currently working for Hazelcast. Also double as a teacher in universities and higher education schools, a trainer and triples as a book author.

Read More