A Median Chocolate Chip Cookie Recipe & Styling ggplot Text
Today’s post is a recipe for median chocolate chip cookies that’s also a ggplot chart with color-coded text in the subtitle. The recipe is based on summary stats from 200+ recipes in eightportions.com’s “Recipe Box” data whose titles contain the text “chocolate chip cookies”, and the colored text was made possible with ggtext.
Tl;dr: The recipe:
I will briefly cover the source data, its preparation, and one way to add colored text and unicode symbols (◍) to ggplot text elements such as caption, subtitle, and title with the help of ggtext.
Recipe Box Data
The original data came in a zip file containing three .json files downloaded from eightportions.com. The files contain about 125K recipes from three different food websites. I queried these for titles containing the text “chocolate chip cookies” and used tidyjson to combine the results into a single data frame. The columns of the data frame include title and id along with named lists for ingredients and instructions. Here’s a preview of what it looks like:
## Rows: 230
## Columns: 4
## $ title <chr> "Whole-Grain Chocolate Chip Cookies", "Chocolat~
## $ ingredients <named list> <"1 1/4 cups all-purpose flour", "1 cup ~
## $ instructions <named list> "Combine the flours, wheat germ, baking ~
## $ id <chr> "o.rcUMxY8gpETc6XU9EKrnDqGWFUB3i", "vv2lfdBzwfy~
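The title filter described above can be sketched in base R. This is not the code used for the post (which relied on tidyjson); the toy data frame is a hypothetical stand-in, and matching case-insensitively is an assumption.

```r
# Toy stand-in for the combined recipe data (hypothetical rows)
recipes <- data.frame(
  id = c("a1", "b2", "c3"),
  title = c(
    "Whole-Grain Chocolate Chip Cookies",
    "Lemon Bars",
    "Chewy chocolate chip cookies"
  ),
  stringsAsFactors = FALSE
)

# Keep rows whose title contains the phrase; ignoring case is an assumption
is_ccc <- grepl("chocolate chip cookies", recipes$title, ignore.case = TRUE)
cookies <- recipes[is_ccc, ]

nrow(cookies)
```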
Data preparation and clean-up relied heavily on regex to extract the mixed and proper fractions and convert them into numerical values. For example, here’s the regex pattern I used to extract the baking temperature from the instructions.
"\\d+\\b(?=(\\sdegrees|°F|°F.|°.))"
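As a rough illustration of how a pattern like this can be applied, here is a minimal base R sketch. The helper name extract_temp() and the sample instruction strings are hypothetical, and the degree sign is written as the escape \u00b0 so the block does not depend on file encoding. The lookahead requires perl = TRUE.

```r
# Same pattern as above, with the degree sign written as \u00b0
temp_pattern <- "\\d+\\b(?=(\\sdegrees|\u00b0F|\u00b0F.|\u00b0.))"

# Hypothetical helper: first temperature-like number per string, NA if none
extract_temp <- function(x) {
  m <- regexpr(temp_pattern, x, perl = TRUE)  # lookahead needs perl = TRUE
  out <- rep(NA_character_, length(x))
  out[m > 0] <- regmatches(x, m)              # regmatches() returns matches only
  as.numeric(out)
}

# Made-up instruction snippets for illustration
steps <- c(
  "Preheat oven to 375 degrees F.",
  "Heat the oven to 350\u00b0F and line two baking sheets.",
  "Stir in the chocolate chips."
)

temps <- extract_temp(steps)
```

Because the lookahead only checks what follows the digits, the unit itself is left out of the match and the result converts cleanly to a number.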
As with most data projects, data cleanup and preparation took up the majority of the time. In this case I feel it yielded a pretty rich data set that’s got me interested in possible future work.
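The fraction handling mentioned earlier is not shown in the post, so the following is only a sketch of one way it could be done. The function name parse_quantity() and the sample ingredient strings are hypothetical, and it assumes the quantity sits at the start of the ingredient line.

```r
# Hypothetical helper: convert a leading quantity such as "1 1/4", "3/4",
# or "2" into a number; returns NA when the string has no leading quantity
parse_quantity <- function(x) {
  vapply(x, function(s) {
    # Mixed fraction: whole number, space, numerator/denominator
    mixed <- regmatches(s, regexec("^\\s*(\\d+)\\s+(\\d+)/(\\d+)", s, perl = TRUE))[[1]]
    if (length(mixed) == 4) {
      n <- as.numeric(mixed[-1])
      return(n[1] + n[2] / n[3])
    }
    # Proper fraction: numerator/denominator
    frac <- regmatches(s, regexec("^\\s*(\\d+)/(\\d+)", s, perl = TRUE))[[1]]
    if (length(frac) == 3) {
      n <- as.numeric(frac[-1])
      return(n[1] / n[2])
    }
    # Plain whole or decimal number
    whole <- regmatches(s, regexec("^\\s*(\\d+(?:\\.\\d+)?)", s, perl = TRUE))[[1]]
    if (length(whole) == 2) {
      return(as.numeric(whole[2]))
    }
    NA_real_
  }, numeric(1), USE.NAMES = FALSE)
}

# Made-up ingredient lines for illustration
qty <- parse_quantity(c(
  "1 1/4 cups all-purpose flour",
  "3/4 cup sugar",
  "2 eggs",
  "pinch of salt"
))

# With numeric quantities, a per-ingredient summary is a one-liner
median(qty, na.rm = TRUE)
```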
Styling ggplot Text
The ggtext library allows markdown to pass through into just about any portion of a ggplot that displays text. If you know how to add colored text to your markdown documents with html, adding colored text to your plot text follows the same path. See the package vignette for several examples of how to use it.
For example, one could add colored text, like this right here, to their markdown documents with a function that uses glue to insert values into an html chunk like this:
# Wrap text in an html span with the given color and font size (in points)
text_sc <- function(x, ptsz = "14", col = "#F2E750") {
  glue::glue("<span style='color:{col};font-size:{ptsz}pt;'>{x}</span>")
}
Running the function like the following:
text_sc("TINY TEXT!", ptsz = "7", col = "#B84841")
within the knitted markdown document results in this: TINY TEXT!
Coded in markdown, it even knits with unicode characters. Pasting unicode directly into the markdown is still a bit buggy, but with the utf8 package you can write your inline markdown by referencing the hex code of the unicode character you’re interested in. So, in-line code like this:
text_sc( utf8::as_utf8("\u25cd"), ptsz = "20")
would look like this in the knitted output: ◍.
You can also pass the unicode symbol through ggtext, and it will render in the plot in RStudio. Note the yellow dot used to call out NTH in an alternate version of the recipe I created:
Unfortunately, knitting to html is limited and certain special characters, including the dot, are not supported. Fortunately, you can still save the plot as a .png and post it as an image, as I’ve done above. Note that the first recipe plot, which was knitted to html for this post, uses an O to represent NTH in the subtitle because of this limitation.
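To connect the pieces above, here is a minimal sketch of passing a styled span into a plot subtitle with ggtext. It is not the code behind the recipe chart: the mtcars data, labels, and colors are placeholders, and text_sc() is redefined so the block runs on its own. The key step is theme(plot.subtitle = element_markdown()), which tells ggplot to render the subtitle as markdown/html instead of plain text.

```r
library(ggplot2)
library(ggtext)

# Same idea as text_sc() above, redefined so this block is self-contained
text_sc <- function(x, ptsz = "14", col = "#F2E750") {
  glue::glue("<span style='color:{col};font-size:{ptsz}pt;'>{x}</span>")
}

# An "O" stands in for the dot so the subtitle also survives knitting to html
marker <- text_sc("O", ptsz = "12", col = "#B84841")

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(color = "#B84841") +
  labs(
    title = "Colored text in a subtitle",
    subtitle = paste0("Each ", marker, " is one car")
  ) +
  # Render the subtitle as markdown/html rather than plain text
  theme(plot.subtitle = element_markdown())

# ggsave("styled-subtitle.png", p)  # save as .png if knitting drops a symbol
```

Swapping element_markdown() into other theme elements such as plot.title or plot.caption should work the same way.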
Unicode in ggplot
Maybe you’re in a situation where you don’t need the features of ggtext but would still like to use a unicode symbol? That can work, but you will be limited to standard ggplot options. One option is to add the unicode symbol to your data argument as a column translated with utf8::as_utf8(). Another option is to add unicode directly into ggplot arguments, as I do for shape and axis.title.x in the simple example below:
library(ggplot2)
library(magrittr) # provides %>%

crown <- utf8::as_utf8("\u265a") # ♚

mtcars %>%
  ggplot(aes(wt, mpg)) +
  # a single character can be used as the point shape
  geom_point(shape = crown, size = 10, alpha = .5) +
  labs(title = "adding styled unicode to x-axis title",
       x = glue::glue("{crown}")) +
  theme(axis.title.x = element_text(size = 30, color = "red"))
But remember, without ggtext, you’ll be limited to one color per text field.
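The first option mentioned above, carrying the symbol as a column in the data, might look like the sketch below. The mapping from cylinder count to chess symbols is made up for illustration, and whether the glyphs render depends on the fonts available to your graphics device.

```r
library(ggplot2)

# Hypothetical lookup: one chess symbol per cylinder count
symbols <- c("4" = "\u265f", "6" = "\u265e", "8" = "\u265a") # pawn, knight, king

# Add the symbol as a regular column, translated with utf8::as_utf8()
cars <- mtcars
cars$symbol <- utf8::as_utf8(unname(symbols[as.character(cars$cyl)]))

# Map the column to the label aesthetic and draw it with geom_text()
p <- ggplot(cars, aes(wt, mpg, label = symbol)) +
  geom_text(size = 8, alpha = .5) +
  labs(title = "unicode symbols carried as a data column")
```

Unlike a fixed shape argument, a column lets the symbol vary row by row.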
Despite the knitting-to-html limitation, ggtext makes it easy to adjust the styling of the text feeding into your ggplot. If you must knit to html, adding unicode to standard ggplots is straightforward using utf8. These styling options open up a wide variety of plot enhancements, from replacing your plot’s legend to color-coding and styling your axis text for improved clarity or information density.
That concludes today’s post! It’s been almost a year since the start of the pandemic, and with cases dropping and vaccine distribution increasing, I’m feeling hopeful for the summer. While you wait to get your vaccine, why not bake a batch of median chocolate chip cookies? They are not half bad (intentional pun and accurate description).