Word frequencies and using collections
Hi! I found one of the examples of Processing which analyzes two books (Dracula and Frankenstein), gets the word frequencies, then draws the words moving at different speeds and font sizes depending on those frequencies. It only shows words which appear more than 5 times in one of the books while not appearing at all in the other book.
I like how this can be expressed in a concise way in Kotlin.
Note this is not a literal port of the Processing program to OPENRNDR, but more a reinterpretation with a similar output and simple code.
Processing / Java
The code is split into two files. I suggest opening them in new browser windows to see them side by side with the Kotlin code.
OPENRNDR / Kotlin
Imports
import org.openrndr.application
import org.openrndr.color.ColorRGBa
import org.openrndr.draw.loadFont
import org.openrndr.extra.noise.uniform
import org.openrndr.math.Vector2
import org.openrndr.math.map
import org.openrndr.shape.Rectangle
import java.io.File

fun main() = application {
    // Set the window size before the program starts, so `width` and
    // `height` already have these values inside `program`
    configure {
        width = 640
        height = 360
    }
    program {
        // Ten versions of the same font, with sizes from 4.0 to 49.0
        val fonts = List(10) {
            loadFont("data/fonts/SourceCodePro-Regular.ttf", 4.0 + 5 * it)
        }

        // Make spawnArea tall so it's not too crowded with words
        val spawnArea = Rectangle(-50.0, 0.0, width * 1.0, height * 3.0)

        // Render area is larger than the window so appearing objects don't pop at the top
        // but start rendering above the edge and then slide in
        val renderArea = drawer.bounds.offsetEdges(30.0)

        class Word(val word: String, count: Int, val color: ColorRGBa) {
            // Pick initial random position inside spawnArea
            private var position = Vector2.uniform(spawnArea)

            // Calculate speed based on word count
            private var speed = Vector2(
                0.0, count.toDouble().map(5.0, 25.0, 0.1, 5.0, true)
            )

            // Pick font (size) based on word count
            private var font = fonts[count.toDouble().map(
                5.0, 25.0, 0.0, fonts.size - 1.0, true
            ).toInt()]

            fun display() {
                // Move Word down
                position += speed

                // If too far down, bring back up
                if (position.y > spawnArea.height) {
                    position -= Vector2(0.0, spawnArea.height)
                }

                // If `position` inside `renderArea`, draw it
                if (renderArea.contains(position)) {
                    drawer.fill = color
                    drawer.fontMap = font
                    drawer.text(word, position)
                }
            }
        }

        // Create a `Map<String, Int>` with words and their counts
        val freqsDracula = File("data/texts/dracula.txt")
            .readText().lowercase().split(Regex("\\W+"))
            .groupingBy { it }.eachCount().filter { it.value >= 5 }

        val freqsFranken = File("data/texts/frankenstein.txt")
            .readText().lowercase().split(Regex("\\W+"))
            .groupingBy { it }.eachCount().filter { it.value >= 5 }

        // Make sure no words appear in both Maps. Unique words only.
        val uniqueDracula = freqsDracula - freqsFranken.keys
        val uniqueFranken = freqsFranken - freqsDracula.keys

        // Finally create a List<Word> to be displayed
        val words = uniqueDracula.map { (word, count) ->
            Word(word, count, ColorRGBa.WHITE)
        } + uniqueFranken.map { (word, count) ->
            Word(word, count, ColorRGBa.BLACK)
        }

        extend {
            drawer.clear(ColorRGBa.GRAY)
            words.forEach { it.display() }
        }
    }
}
Notice how I changed the approach, making the Word class simpler. It now has just one method, called display().
In the original example the Word class was used to keep various counts: how many times it appeared in book A, in book B, and in both together, with the goal of later filtering those words that appear 5 times or more only in one of the books but not in the other.
Let’s take a look at how the Kotlin version works by jumping to the line that starts with val freqsDracula =. It reads like this: take a txt file, read its content, make it lower case, split it into words wherever a run of non-word characters occurs (so punctuation is dropped and repeated words are kept), group by word, count the words in each group, and finally discard words with fewer than 5 repetitions.
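To try the chain without the book files, here is a minimal sketch that applies the same steps to an in-memory string. The function name wordFrequencies and the sample sentence are made up for illustration; the extra isNotEmpty filter is also an addition, there because splitting on a regex can leave empty strings at the start or end of the list.

```kotlin
// Hypothetical helper, not part of the original program: the same pipeline
// as `freqsDracula`, but reading from a String instead of a File.
fun wordFrequencies(text: String, minCount: Int): Map<String, Int> =
    text.lowercase()
        .split(Regex("\\W+"))        // split on runs of non-word characters
        .filter { it.isNotEmpty() }  // drop empty tokens caused by leading/trailing punctuation
        .groupingBy { it }           // group equal words together
        .eachCount()                 // Map<String, Int>: word -> number of occurrences
        .filter { it.value >= minCount }

fun main() {
    val sample = "The night was dark. The castle was darker; the Count was darkest."
    println(wordFrequencies(sample, 2)) // prints {the=3, was=3}
}
```

With minCount set to 5 this is the rule used for both books in the program above.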
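At this point we have one Map object per book linking String (the word) to Int (its count). Now, following the original example, I want to remove words that are present in the other book, which I do just by subtracting the keys (words) of the other book’s Map. Because both Maps were already filtered, this gives me the words appearing 5 or more times in one book that do not appear 5 or more times in the other; a word that is frequent in one book and shows up only a few times in the other still passes. To exclude every word that appears in the other book at all, as the original does, the subtraction would have to happen before the filter. All of this in 4 lines of code.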
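To see what subtracting keys from a Map does, and how the order of filtering and subtracting changes the result, here is a small sketch. The two function names and all the counts are invented for illustration; only the minus operator on Map and the filter call come from the program above.

```kotlin
// Same order as the program above: keep frequent words first, then remove
// the words that are also frequent in the other book.
fun filterThenSubtract(a: Map<String, Int>, b: Map<String, Int>, minCount: Int): Map<String, Int> =
    a.filter { it.value >= minCount } - b.filter { it.value >= minCount }.keys

// Stricter variant: remove every word that appears in the other book at all,
// then keep the frequent ones.
fun subtractThenFilter(a: Map<String, Int>, b: Map<String, Int>, minCount: Int): Map<String, Int> =
    (a - b.keys).filter { it.value >= minCount }

fun main() {
    // Made-up counts, not taken from the books
    val countsA = mapOf("blood" to 9, "night" to 7, "letter" to 6, "castle" to 2)
    val countsB = mapOf("night" to 8, "letter" to 1, "creature" to 5)

    println(filterThenSubtract(countsA, countsB, 5)) // prints {blood=9, letter=6}
    println(subtractThenFilter(countsA, countsB, 5)) // prints {blood=9}
}
```

"letter" survives the first variant because its single occurrence in the second map was discarded before the subtraction took place.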
As a last step I transform the two Maps of unique words into one list of Word objects, which I called words. This way I don’t need to execute qualify() on each word to figure out if it is worth displaying: every item in words is displayable and has a speed, a color and a font size pre-calculated. The original version calculates those three properties each time move or display is called.
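The idea of computing everything once, while turning Map entries into objects, can be sketched without OPENRNDR. In this sketch SimpleWord, toWords and remap are hypothetical names; remap is a hand-written stand-in for the clamped map call used in the program, and the counts are made up.

```kotlin
// Hand-written stand-in for the clamped linear mapping used in the program,
// so that this sketch runs with the Kotlin standard library alone.
fun remap(v: Double, inLo: Double, inHi: Double, outLo: Double, outHi: Double): Double {
    val t = ((v - inLo) / (inHi - inLo)).coerceIn(0.0, 1.0)
    return outLo + t * (outHi - outLo)
}

// Hypothetical simplified Word: every display property is fixed at construction.
data class SimpleWord(val text: String, val speed: Double, val fontIndex: Int, val book: String)

// Destructure each Map entry into (word, count) and derive the properties once.
fun toWords(counts: Map<String, Int>, book: String, fontCount: Int = 10): List<SimpleWord> =
    counts.map { (word, count) ->
        SimpleWord(
            text = word,
            speed = remap(count.toDouble(), 5.0, 25.0, 0.1, 5.0),
            fontIndex = remap(count.toDouble(), 5.0, 25.0, 0.0, fontCount - 1.0).toInt(),
            book = book
        )
    }

fun main() {
    // `+` concatenates the two lists, as in the program above
    val words = toWords(mapOf("blood" to 25, "wolves" to 5), "Dracula") +
        toWords(mapOf("creature" to 15), "Frankenstein")
    words.forEach(::println)
}
```

A word counted 5 times gets the slowest speed and the smallest font, one counted 25 times or more gets the fastest speed and the largest font, and nothing has to be recalculated while drawing.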
Manipulating collections this way is one of my favorite aspects of Kotlin. I hope you find it as concise and readable as I do.
Text size and text centering
One difference I want to point out is that one does not specify the text size when drawing text in OPENRNDR; the size is set when loading the font. Therefore I created a list of fonts with different sizes. An alternative would have been to load just one font size and call drawer.scale() to scale the text down.
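As a rough illustration of the arithmetic behind both options, the sketch below lists the ten sizes produced by the expression used when loading the fonts, and computes the factor by which a single large font would need to be scaled to imitate a smaller one. scaleFor is a hypothetical helper, and passing its result to drawer.scale() is an assumption about how the alternative might be wired up, not something the program above does.

```kotlin
// Sizes of the ten fonts loaded in the program: 4.0, 9.0, 14.0, ... 49.0
val fontSizes = List(10) { 4.0 + 5 * it }

// Hypothetical helper: the scale factor needed to imitate `targetSize`
// when only one font, of size `loadedSize`, has been loaded.
fun scaleFor(targetSize: Double, loadedSize: Double = fontSizes.last()): Double =
    targetSize / loadedSize

fun main() {
    println(fontSizes)
    // Factor for each size relative to the largest font
    fontSizes.forEach { size -> println("$size -> ${scaleFor(size)}") }
}
```

Loading several sizes costs a little more setup but lets each size be rendered at its native resolution, whereas scaling one font down keeps the setup to a single loadFont call.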
Another difference regarding text is that there isn’t a direct equivalent to Processing’s textAlign, so the left edge of the spawnArea is at -50 to make sure the left margin of the window is not empty.
About IDEs
The Kotlin code on this page may be a bit harder to grasp than when viewed inside a good IDE, because the IDE shows type hints and tooltips describing what is under the mouse:
[Screenshot of the IDE: thick lines under type hints; a thin orange line shows a tooltip for freqsDracula]
Share your questions and comments below. Find other OPENRNDR & Processing posts.