This so-called R Markdown file accompanies a 3-hour Reproducible Open Coding Kit (ROCK) workshop developed by Szilvia Zörgő & Gjalt-Jorn Peters. More details are available below in section Links and resources.

Getting started

Posit Cloud

The easiest way to get started is to copy this file to a Posit Cloud project of your own. To do that, first visit the shared Posit Cloud project at https://posit.cloud/content/6434221. Note that to view it, you will need to be logged in with a Posit Cloud account, so create that first if you don’t have one yet.

Once it has loaded, click “Save a permanent copy” at the top:

Figure 1: A screenshot of Posit Cloud

This will store the project in your account’s workspace, so that your changes are preserved and you can always return to it. If you do not save a permanent copy, you will be ejected from the temporary project after a while and will have to start over.

Alternative (more advanced): Local RStudio Desktop

If you are already familiar with R, RStudio, and Git, you can also download this project and use your local RStudio Desktop installation. For the URL to the Git repository, see the Appendix.

Understanding R Markdown

The script below contains R commands (in the gray sections called “chunks”), which can be run individually by pressing the green “play button” in the chunk’s upper right corner. Note that you will only see this option if you open the script in Posit Cloud or RStudio.

Exercises

Basic setup

Run this chunk every time you start a session!

The chunk below will install all R packages needed to run the commands in the script. It also contains default options for {rock} and paths to subdirectories. Run it by clicking on the green play button in the top right corner of the chunk.

### package installs and updates
packagesToCheck <- c("rock", "here", "knitr", "writexl");
for (currentPkg in packagesToCheck) {
  if (!requireNamespace(currentPkg, quietly = TRUE)) {
    install.packages(currentPkg, repos="http://cran.rstudio.com");
  }
}

knitr::opts_chunk$set(
  echo = TRUE,
  comment = ""
);

rock::opts$set(
  silent = TRUE,
  idRegexes = list(
    cid = "\\[\\[cid[=:]([a-zA-Z][a-zA-Z0-9_]*)\\]\\]",
    coderId = "\\[\\[coderid[=:]([a-zA-Z][a-zA-Z0-9_]*)\\]\\]"
  ),
  persistentIds = c("cid", "coderId")
);

### Set paths for later
basePath <- here::here();
dataPath <- file.path(basePath, "data");
scriptsPath <- file.path(basePath, "scripts");
resultsPath <- file.path(basePath, "results");
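To get a feel for what the identifier patterns in the setup chunk do, the following minimal sketch applies the cid pattern to a few example lines using base R only. The identifiers "alice", "bob_2", and "9lives" are made up for illustration; the pattern itself is copied from the chunk above.

```r
# The cid pattern from the setup chunk above, copied verbatim
cidRegex <- "\\[\\[cid[=:]([a-zA-Z][a-zA-Z0-9_]*)\\]\\]"

# Hypothetical example lines; these identifiers are made up
exampleLines <- c(
  "[[cid=alice]]",
  "[[cid:bob_2]]",
  "[[cid=9lives]]"
)

# regexec() finds the full match and the captured group for each line
matches <- regmatches(exampleLines, regexec(cidRegex, exampleLines))

# Keep the captured identifier, or NA when a line does not match
extractedIds <- vapply(
  matches,
  function(m) if (length(m) >= 2) m[2] else NA_character_,
  character(1)
)

print(extractedIds)
```

The third line yields NA because the pattern requires an identifier to start with a letter.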

Exercise 1: Data preparation

Examining preadded data

Three plain text files containing data (i.e., “sources”) have been placed into the “010---raw-sources” subdirectory located within the data directory. In addition, some attributes of the mock data providers are listed in the file called “attributes.rock”.

Cleaning data

The cleaning command places each of the sentences in your data on a new line. The {rock} package enables you to code data line-by-line, and recognizes newline characters as indicators of this lowest level of segmentation. The chunk below will write the cleaned sources found in “010---raw-sources” into the subdirectory “020---cleaned-sources”.

rock::clean_sources(
  input = file.path(dataPath, "010---raw-sources"),
  output = file.path(dataPath, "020---cleaned-sources")
);
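As a rough illustration of what “one sentence per line” means, the sketch below splits a made-up source at sentence-ending punctuation using base R. This is a simplified stand-in under our own assumptions, not the procedure that rock::clean_sources() actually follows.

```r
# One hypothetical raw source: several sentences on a single line
rawText <- "Ut sed mi purus. Pellentesque ut blandit diam. Donec a mollis ipsum."

# Replace the whitespace after sentence-ending punctuation by a newline
cleanedText <- gsub("([.!?])[[:space:]]+", "\\1\n", rawText)

# Show the result: one sentence (utterance) per line
cat(cleanedText, "\n", sep = "")

# Split into a character vector to count the utterances
utterances <- strsplit(cleanedText, "\n", fixed = TRUE)[[1]]
length(utterances)
```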

Adding unique utterance identifiers

If it makes sense for your project, you may choose to add a unique identifier to each line of data (i.e., “utterances”). This is helpful, for example, if you want to merge different versions of the coded sources into a source that contains all codes applied by multiple researchers. The chunk below will write the sources with uids into the subdirectory “030---sources-with-uids”.

rock::prepend_ids_to_sources(
  input = file.path(dataPath, "020---cleaned-sources"),
  output = file.path(dataPath, "030---sources-with-uids")
);
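The sketch below shows what a source looks like once every line carries an identifier in the [[uid=...]] format. The identifiers here are simple counters invented for this example; the {rock} package generates its own identifiers, so treat this only as an illustration of the resulting layout.

```r
# Three hypothetical utterances, one per line
utterances <- c(
  "Ut sed mi purus.",
  "Pellentesque ut blandit diam.",
  "Donec a mollis ipsum."
)

# Hypothetical identifiers: zero-padded counters, NOT the scheme {rock} uses
hypotheticalUids <- sprintf("utt%03d", seq_along(utterances))

# Prepend each identifier in the [[uid=...]] format
withUids <- paste0("[[uid=", hypotheticalUids, "]] ", utterances)

cat(withUids, sep = "\n")
```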

Exercise 2: initial coding with iROCK

Manual coding

Please visit the rudimentary graphical user interface, iROCK (available at https://i.rock.science). This interface allows you to upload your sources, as well as codes and section breaks (for higher levels of segmentation), then drag and drop those into the data.

Figure 2: A screenshot of a fresh instance of iROCK

Click the ‘Sources’ button at the top to load a source. It will show you a dialogue similar to that shown in Figure 3. To load the example source, copy-paste the following URL into the field as shown in Figure 3 and press [ENTER].

Then repeat that to load the example codes and section breaks, this time copy-pasting these two URLs:

Figure 3: A screenshot of loading a source into iROCK

When you have loaded all three files into the right places, you should see something similar to what is shown in Figure 4:

Figure 4: A screenshot of iROCK with the example source, codes, and breaks loaded

You can now start coding and segmenting. To use one of the codes or section breaks you loaded, drag them from the right-hand panel and drop them where you want them in the source. If you make a mistake, simply click the section break or code to delete it again.

When you are done coding, you can download the coded source by clicking Download. Normally, it is vital to not forget that, but in this workshop, you will be working with pre-added coded sources.

Parse sources

Run this chunk every session during which you want to employ the functionality below (e.g., inspecting fragments, code frequencies, heatmaps)!

This command will assemble all your coded sources and attributes into an R object that can be employed to run analyses and other commands below. Note, coded sources and attributes have been pre-added for your convenience.

dat <-
  rock::parse_sources(
    dataPath,
    regex = "_coded|attributes"
  );
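The regex argument determines which files are read. The following base R sketch shows which file names the pattern "_coded|attributes" would select. The first two file names occur in this workshop; the last two are hypothetical and only serve to show what gets skipped.

```r
fileNames <- c(
  "001_Source_cleaned_withUIDs_coded.rock",  # a coded source
  "attributes.rock",                         # the attributes file
  "001_Source_cleaned.rock",                 # hypothetical uncoded file
  "001_Source_cleaned_withUIDs.rock"         # hypothetical uncoded file
)

# TRUE for every file name that contains "_coded" or "attributes"
selected <- grepl("_coded|attributes", fileNames)

fileNames[selected]
```

Only the coded source and the attributes file match, so intermediate versions of a source are left out.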

Exercise 3: inspect coded fragments for specific code(s)

This command allows you to collect and inspect coded fragments for certain codes. To use it, change the code labels “CodeA” and “CodeB” to the codes you’d like to inspect. You can modify the amount of context shown around each coded utterance by changing “2” to any other number.

rock::inspect_coded_sources(
  path = here::here("data", "040---coded-sources"),
  fragments_args = list(
    codes = "CodeA|CodeB",
    context = 2
  )
);

Collected coded fragments for codes ‘CodeA’ & ‘CodeB’ with 4 lines of context

CodeA (path: codes>CodeA)


Source: 001_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb2bb]] Pellentesque ut blandit diam.
[[uid=7q0xb2bc]] Sed placerat velit arcu, nec rutrum elit varius et.
[[uid=7q0xb2bd]] Pellentesque vehicula purus sit amet velit laoreet ultricies. [[CodeA]] [[CodeD]]
[[uid=7q0xb2bf]] Donec a mollis ipsum.
[[uid=7q0xb2bg]] Aliquam eget metus vel ante porttitor fringilla in quis purus. [[CodeB]]

Source: 002_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb2df]] Aliquam eget metus vel ante porttitor fringilla in quis purus.
[[uid=7q0xb2dg]] Nunc placerat semper ultrices.
[[uid=7q0xb2dh]] Aliquam ac congue nunc. [[CodeA]]
[[uid=7q0xb2dj]] Morbi sit amet tempus turpis, quis tempus massa.
[[uid=7q0xb2dk]] Duis interdum diam enim.

Source: 003_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb2g7]] Ut sed mi purus.
[[uid=7q0xb2g8]] Pellentesque ut blandit diam.
[[uid=7q0xb2g9]] Sed placerat velit arcu, nec rutrum elit varius et. [[CodeA]] [[CodeB]] [[CodeC]]
[[uid=7q0xb2gb]] Pellentesque vehicula purus sit amet velit laoreet ultricies.
[[uid=7q0xb2gc]] Donec a mollis ipsum.

Source: 003_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb2gq]] Cras consequat lacus in augue pretium, et egestas lectus pellentesque.
[[uid=7q0xb2gr]] Mauris faucibus nec metus vel convallis.
[[uid=7q0xb2gs]] Fusce vulputate, orci nec sodales fringilla, turpis diam varius tellus, rhoncus pulvinar tellus odio egestas turpis. [[CodeA]]
[[uid=7q0xb2gt]] Praesent luctus sapien luctus odio bibendum, eu suscipit sem convallis. [[CodeD]]
[[uid=7q0xb2gw]] Vivamus tincidunt ante quis finibus feugiat.

Source: 003_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb2gx]] In tincidunt lorem vel dolor eleifend dictum.
[[uid=7q0xb2gy]]
[[uid=7q0xb2gz]] Fusce sit amet elit et justo facilisis finibus ut id dui. [[CodeC]] [[CodeA]]
[[uid=7q0xb2h0]] Vestibulum dolor diam, commodo non eros non, posuere gravida orci.
[[uid=7q0xb2h1]] Nam id efficitur metus, ut eleifend nulla. [[CodeA]]

Source: 003_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb2gz]] Fusce sit amet elit et justo facilisis finibus ut id dui. [[CodeC]] [[CodeA]]
[[uid=7q0xb2h0]] Vestibulum dolor diam, commodo non eros non, posuere gravida orci.
[[uid=7q0xb2h1]] Nam id efficitur metus, ut eleifend nulla. [[CodeA]]
[[uid=7q0xb2h2]] Vestibulum id porta erat. [[CodeD]]
[[uid=7q0xb2h3]] Curabitur bibendum varius auctor.

CodeB (path: codes>CodeB)


Source: 001_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb29g]] Nam facilisis id magna non facilisis.
[[uid=7q0xb29h]] Nulla facilisi.
[[uid=7q0xb29j]] Vivamus ullamcorper ligula magna, non blandit augue imperdiet in. [[CodeD]] [[CodeB]]
[[uid=7q0xb29k]] Sed felis sem, euismod et tincidunt quis, venenatis vel tellus.
[[uid=7q0xb29l]] Fusce non nisl tristique, ultricies augue eget, aliquet nisi.

Source: 001_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb29k]] Sed felis sem, euismod et tincidunt quis, venenatis vel tellus.
[[uid=7q0xb29l]] Fusce non nisl tristique, ultricies augue eget, aliquet nisi.
[[uid=7q0xb29m]] Sed elementum turpis eu tempus tincidunt. [[CodeB]] [[CodeD]]
[[uid=7q0xb29n]] Donec vel quam vel eros elementum ullamcorper non sed velit.
[[uid=7q0xb29p]] Suspendisse in finibus tortor, vehicula volutpat nulla. [[CodeC]]

Source: 001_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb2bd]] Pellentesque vehicula purus sit amet velit laoreet ultricies. [[CodeA]] [[CodeD]]
[[uid=7q0xb2bf]] Donec a mollis ipsum.
[[uid=7q0xb2bg]] Aliquam eget metus vel ante porttitor fringilla in quis purus. [[CodeB]]
[[uid=7q0xb2bh]] Nunc placerat semper ultrices.
[[uid=7q0xb2bj]] Aliquam ac congue nunc.

Source: 001_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb2bp]] Morbi et viverra libero, vitae convallis est.
[[uid=7q0xb2bq]] Orci varius natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.
[[uid=7q0xb2br]] In quis leo vel risus convallis fringilla id vitae odio. [[CodeB]]
[[uid=7q0xb2bs]] Cras consequat lacus in augue pretium, et egestas lectus pellentesque.
[[uid=7q0xb2bt]] Mauris faucibus nec metus vel convallis.

Source: 002_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb2d8]] Ut sed mi purus.
[[uid=7q0xb2d9]] Pellentesque ut blandit diam.
[[uid=7q0xb2db]] Sed placerat velit arcu, nec rutrum elit varius et. [[CodeB]] [[CodeD]]
[[uid=7q0xb2dc]] Pellentesque vehicula purus sit amet velit laoreet ultricies. [[CodeD]]
[[uid=7q0xb2dd]] Donec a mollis ipsum.

Source: 002_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb2ds]] Mauris faucibus nec metus vel convallis.
[[uid=7q0xb2dt]] Fusce vulputate, orci nec sodales fringilla, turpis diam varius tellus, rhoncus pulvinar tellus odio egestas turpis.
[[uid=7q0xb2dw]] Praesent luctus sapien luctus odio bibendum, eu suscipit sem convallis. [[CodeB]]
[[uid=7q0xb2dx]] Vivamus tincidunt ante quis finibus feugiat.
[[uid=7q0xb2dy]] In tincidunt lorem vel dolor eleifend dictum.

Source: 003_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb2g7]] Ut sed mi purus.
[[uid=7q0xb2g8]] Pellentesque ut blandit diam.
[[uid=7q0xb2g9]] Sed placerat velit arcu, nec rutrum elit varius et. [[CodeA]] [[CodeB]] [[CodeC]]
[[uid=7q0xb2gb]] Pellentesque vehicula purus sit amet velit laoreet ultricies.
[[uid=7q0xb2gc]] Donec a mollis ipsum.

Source: 003_Source_cleaned_withUIDs_coded.rock

[[uid=7q0xb2gm]] Morbi et viverra libero, vitae convallis est.
[[uid=7q0xb2gn]] Orci varius natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.
[[uid=7q0xb2gp]] In quis leo vel risus convallis fringilla id vitae odio. [[CodeB]]
[[uid=7q0xb2gq]] Cras consequat lacus in augue pretium, et egestas lectus pellentesque.
[[uid=7q0xb2gr]] Mauris faucibus nec metus vel convallis.
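Each fragment above shows the coded utterance together with the two lines before and the two lines after it. The base R sketch below mimics that selection on a small made-up source; it is an illustration of the idea, not the code {rock} runs internally.

```r
# A small hypothetical coded source
sourceLines <- c(
  "Ut sed mi purus.",
  "Pellentesque ut blandit diam.",
  "Sed placerat velit arcu. [[CodeA]]",
  "Donec a mollis ipsum.",
  "Nunc placerat semper ultrices.",
  "Aliquam ac congue nunc."
)

context <- 2

# Line numbers of the utterances coded with CodeA
hits <- grep("[[CodeA]]", sourceLines, fixed = TRUE)

# For every hit, take `context` lines before and after,
# without running past the start or end of the source
fragments <- lapply(hits, function(i) {
  from <- max(1, i - context)
  to <- min(length(sourceLines), i + context)
  sourceLines[from:to]
})

print(fragments)
```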

Exercise 4: view code structure

With this command, the {rock} package creates a code tree, which can be flat or hierarchical depending on the employed codes. In this workshop, we use a flat code structure.

rock::show_fullyMergedCodeTrees(dat)

Exercise 5: inspect code frequencies

This command will allow you to see a bar chart of the code frequencies in the various sources to which the codes were applied. The command also produces a legend at the bottom of the visual to help identify the sources based on color.

rock::code_freq_hist(
  dat
);
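To make clear what a code frequency is, this sketch counts how often each code appears in a few made-up coded lines using base R. It assumes codes of the form [[CodeA]] to [[CodeZ]] and ignores the per-source breakdown that the bar chart shows.

```r
# Hypothetical coded utterances
codedLines <- c(
  "Pellentesque vehicula purus. [[CodeA]] [[CodeD]]",
  "Donec a mollis ipsum.",
  "Aliquam eget metus. [[CodeB]]",
  "Aliquam ac congue nunc. [[CodeA]]"
)

# Extract every [[Code?]] occurrence from every line
codeMatches <- regmatches(
  codedLines,
  gregexpr("\\[\\[Code[A-Z]\\]\\]", codedLines)
)

# Strip the double square brackets and tabulate the codes
codes <- gsub("\\[\\[|\\]\\]", "", unlist(codeMatches))
codeFrequencies <- table(codes)

print(codeFrequencies)
```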

Exercise 6: inspect code co-occurrences (heatmap)

Code co-occurrences can be visualized with a heatmap. This representation will use colors to indicate the code co-occurrence frequencies. Co-occurrences are defined as two or more codes occurring on the same line of data (utterance).

rock::create_cooccurrence_matrix(
  dat,
  plotHeatmap = TRUE
);

      CodeA CodeB CodeC CodeD
CodeA     6     1     2     1
CodeB     1     8     1     3
CodeC     2     1     4     0
CodeD     1     3     0    11
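A co-occurrence matrix like the one above can be derived from a table that records, for each utterance, whether each code was applied. The sketch below does this in base R with made-up codings; it illustrates the arithmetic and is not necessarily how {rock} computes it.

```r
codeNames <- c("CodeA", "CodeB", "CodeC")

# Hypothetical codings: rows are utterances, columns are codes,
# 1 means the code was applied to that utterance
codings <- matrix(
  c(1, 1, 0,
    1, 0, 0,
    0, 1, 1,
    1, 1, 1),
  ncol = 3,
  byrow = TRUE,
  dimnames = list(NULL, codeNames)
)

# t(codings) %*% codings: each cell counts the utterances
# on which both the row code and the column code occur
cooccurrences <- crossprod(codings)

print(cooccurrences)
```

In this sketch, the diagonal holds how often each code was applied on its own account, and the matrix is symmetric, as in the output above.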

Exercise 7: export qualitative data table (excel)

This command exports a tabular version of your dataset, which can, for example, be used to process your data further with software such as Epistemic Network Analysis (https://www.epistemicnetwork.org), or “merely” to represent your coded data in a single file. In this dataset, rows are constituted by utterances, columns by variables and data. The file will be an Excel file called “mergedSourceDf.xlsx” located in the results subdirectory.

Beware: when re-generating the qualitative data table, the {rock} default is to prevent overwriting, so either allow overwriting within the script, or delete the old Excel file before you run this chunk. (The Posit Cloud version of this script allows overwriting.)

rock::export_mergedSourceDf_to_xlsx(
  dat,
  file.path(resultsPath,
            "mergedSourceDf.xlsx")
)
Warning in export_mergedSourceDf_to_file(x = x, file = file, exportArgs =
exportArgs, : The file you specified to export to
(C:/pC/git/quarry/rock-workshop-3hr/results/mergedSourceDf.xlsx) already
exists, and `preventOverwriting` is set to `TRUE`, so I'm not writing to disk.
To override this, pass `preventOverwriting=FALSE`.
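The warning above appears when the export file already exists. One way around it is to remove the old file first, as sketched below in base R. The sketch uses a temporary directory as a stand-in for the results directory, so it does not touch your real files; the alternative mentioned in the warning itself is shown as a comment and is not run here.

```r
# Stand-in for the results directory (an assumption for this sketch)
exampleResultsPath <- tempdir()
exportFile <- file.path(exampleResultsPath, "mergedSourceDf.xlsx")

# Simulate an export left over from an earlier run
file.create(exportFile)

# Option 1: delete the old file before exporting again
if (file.exists(exportFile)) {
  file.remove(exportFile)
}

# The old file is gone now, so a new export would not be blocked
file.exists(exportFile)

# Option 2 (not run here): allow overwriting, as the warning suggests
# rock::export_mergedSourceDf_to_xlsx(
#   dat,
#   exportFile,
#   preventOverwriting = FALSE
# )
```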

Exercise 8: merge coded sources

If multiple coders are applying different codes or coding schemes to the same dataset, or if a single coder is applying different codes in different rounds of coding, then merging coded sources may be useful. Merging means that you combine different coded versions of the same source into a “master” source that contains all applied codes. Merging is made possible via unique utterance identifiers (uids).

Some pre-coded versions of the data have been added to the subdirectory “041---coded-sources-for-merging”. A good practice is to create a “slug” for each coded version of the sources, for example, “_coder1” and “_coder2”, which you will see for the mock data. You need to choose one version of the coded source to be the foundation onto which the other versions are merged (indicated by “primarySourcesRegex” in the code below). For example, the command below says that all versions of each source should be “collapsed” onto the version with the slug “_coder1”. The command will write the merged sources into the same directory as where it found them, resulting in a merged version for each source that you placed into that directory.

rock::merge_sources(
  input = here::here(
    "data",
    "041---coded-sources-for-merging"
  ),
  output = "same",
  primarySourcesPath = here::here(
    "data",
    "041---coded-sources-for-merging"
  ),
  primarySourcesRegex = "_coder1\\.rock"
);
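To illustrate why unique utterance identifiers make merging possible, the base R sketch below combines two made-up coded versions of the same two utterances by matching on the uid. It assumes both versions contain the same uids and codes of the form [[CodeA]] to [[CodeZ]]; the helper functions getUid() and getCodes() are hypothetical, and rock::merge_sources() may well work differently.

```r
# Two hypothetical coded versions of the same source
coder1 <- c(
  "[[uid=7q0xb2g7]] Ut sed mi purus.",
  "[[uid=7q0xb2g8]] Pellentesque ut blandit diam. [[CodeA]]"
)
coder2 <- c(
  "[[uid=7q0xb2g7]] Ut sed mi purus. [[CodeB]]",
  "[[uid=7q0xb2g8]] Pellentesque ut blandit diam. [[CodeC]]"
)

# Hypothetical helpers: extract the uid and the codes from each line
getUid <- function(x) {
  sub("^\\[\\[uid=([a-z0-9]+)\\]\\].*$", "\\1", x)
}
getCodes <- function(x) {
  regmatches(x, gregexpr("\\[\\[Code[A-Z]\\]\\]", x))
}

codes1 <- getCodes(coder1)
codes2 <- getCodes(coder2)
names(codes2) <- getUid(coder2)

# Use coder1 as the primary version and append the codes that
# only coder2 applied to the utterance with the same uid
merged <- vapply(seq_along(coder1), function(i) {
  uid <- getUid(coder1[i])
  extraCodes <- setdiff(codes2[[uid]], codes1[[i]])
  paste(c(coder1[i], extraCodes), collapse = " ")
}, character(1))

cat(merged, sep = "\n")
```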

Appendix

Commanding the rock package

To command the rock package (or use other R functionality), you usually use functions. Functions are small programs that do things for you. For them to know what to do, you have to pass so-called arguments or parameters when you call them. If you get everything right, the function will do its job and return its result to you. You will usually want to store that result, so you can do other things with it.

To illustrate this, let us create a simple source using a function. The following command creates a character vector with two elements:

firstSourceBit <-
  c("this is the first element",
    "this is the second element");

We have now called a function named c() to combine two elements into a vector (a list of elements). We passed two arguments to this function (the two text strings), and the function returned the result to us (the vector), which we stored in a variable called firstSourceBit with the assignment operator, <-.

If you are viewing the source code of this R Markdown file in RStudio (either Desktop, on your PC, or Posit Cloud, in a web browser), you can select the three lines with R commands above and copy-paste them into the console in the bottom-left corner to try it out. If instead you are reading the rendered version of this R Markdown file, why not use the link above to open the associated project in Posit Cloud so you can play along?

We can check that this worked by telling R to display the contents of the firstSourceBit object, simply by specifying its name in the console:

firstSourceBit

R then shows its contents, and should show:

[1] "this is the first element"  "this is the second element"

We can now use another function to combine these two elements into a single character value again, using a so-called ‘newline character’, \n, as separator. The R function to paste several character strings together is called paste(). You usually call it the same way we called the c() function above, by specifying all character strings as separate arguments, but we can also pass them all in a vector: then we pass another argument called collapse to tell paste() the separator it should use when collapsing the vector into a single string:

firstSourceBit_collapsed <-
  paste(
    firstSourceBit,
    collapse = "\n"
  );

If you let R print the contents of this new object firstSourceBit_collapsed, you see:

[1] "this is the first element\nthis is the second element"

Here, R shows the newline character as an escape sequence (\n), instead of as the line break it represents. To force R to display the newline character as a new line, use the cat() command:

cat(firstSourceBit_collapsed);

Which should show:

this is the first element
this is the second element

You have now successfully used your first three functions. As you use the rock package, you will use more functions, usually with more arguments.
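The two ways of calling paste() described above can be compared directly. In this small sketch, the strings are passed once as separate arguments (with sep) and once as a single vector (with collapse); strsplit() then reverses the operation.

```r
# Separate arguments: `sep` is placed between the arguments
viaArguments <- paste(
  "this is the first element",
  "this is the second element",
  sep = "\n"
)

# One vector: `collapse` is placed between the vector's elements
viaVector <- paste(
  c("this is the first element",
    "this is the second element"),
  collapse = "\n"
)

# Both approaches produce the same single character value
identical(viaArguments, viaVector)

# strsplit() splits the string at the newline again
strsplit(viaVector, "\n", fixed = TRUE)[[1]]
```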

R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button, a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. R chunks always start with a line containing three backticks (`) and two curly braces ({ and }), with the chunk’s language (usually r), an optional chunk label, and the chunk options in between the curly braces. When knitting the R Markdown document, the R chunks are executed and the results are inserted into the final rendered HTML (or PDF, or Word) file.

Installing the rock package

If you want to use the rock R package on your own computer, you will first have to download and install it in R. The following R chunk contains commands you can use to install the rock package.

### Note the `eval=FALSE` chunk option on the line above; this tells the `knitr`
### package to *not* execute the R code in this chunk. This has been added
### because you normally will not want to reinstall that package *every time*
### you run this script.

### To install the version on R's CRAN repository network, use:
install.packages('rock');

### The next two commands require the `remotes` package to be installed;
### if you don't have that yet, you can install it with:
install.packages('remotes');

### To install the latest version of the package (at your own risk), use:
remotes::install_gitlab("r-packages/rock");

### To install the cutting edge version (at even more of your own risk), use:
remotes::install_gitlab("r-packages/rock@dev");

### Note that because the `install_gitlab()` function comes from a package,
### we tell R from which package to get it using the `::` operator.
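After installing, you may want to check whether the package is available in your R installation. The following sketch uses base R functions only and runs whether or not rock is installed; the variable name rockInstalled is our own.

```r
# TRUE if the rock package can be loaded, FALSE otherwise
rockInstalled <- requireNamespace("rock", quietly = TRUE)

if (rockInstalled) {
  # Report the installed version
  message(
    "rock version ",
    as.character(utils::packageVersion("rock")),
    " is installed."
  )
} else {
  message("rock is not installed yet.")
}
```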

Terms used in the workshop

Code: Representation of a construct of interest in a qualitative study
Code ID: Machine-readable code identifier
Coding structure: Type of coding scheme, e.g., flat or hierarchical
Code label: Human-readable name of code
Coding scheme: Group of codes to be applied to qualitative data
Section break: Indicator of the end of a meaningful chunk of data (higher-level segmentation)
Segmentation: Dividing the data into meaningful chunks (for further analysis)
Unique utterance identifier: Identifies a single line in the dataset
Utterance: Smallest meaningful fragment of data (segmentation level where coding is performed)

For more on ROCK terminology, see: https://sci-ops.gitlab.io/rockbook/vocab.html.

Citation and licensing

The Reproducible Open Coding Kit (ROCK) standard is licensed under CC0 1.0 Universal. The {rock} R package is licensed under a GNU General Public License; for more see: https://rock.science.

ROCK citation: Gjalt-Jorn Ygram Peters and Szilvia Zörgő (2023). rock: Reproducible Open Coding Kit. R package version 0.7.1. https://rock.opens.science

For more on ROCK materials licensing and citation, please see: https://rock.opens.science/authors.html#citation.

Feedback

Thank you for considering using ‘rock’ for your qualitative project. If you have any questions or would like to make suggestions on how to improve ‘rock’, feel free to write to: info@rock.science.