In cooking I alternate between following recipes exactly, for fear that any sort of deviation might ruin the outcome, and trying to throw things together arbitrarily, with occasionally edible results. Could this problem be solved the way I like to approach other problems, i.e. by analyzing a nice data set, preferably of user-contributed knowledge?
So a little over a year ago, I proposed the idea of using ingredient networks to evaluate recipes at a “Wacky Wednesday” faculty meeting, where School of Information faculty gather and pitch ideas to each other. The mix of interest and skepticism with which the idea was greeted was enough to motivate me to work on the problem with my PhD student Edwin Teng. Soon thereafter, Yu-Ru Lin, from Northeastern and Harvard, joined us on the project, and lent it her insight and machine learning expertise.
A lot of fun findings ensued (you can download the paper on arXiv):
1) If one examines complementary ingredients, two main communities fall out, one sweet, the other savory (see image above).
There is also a smaller, third community of ingredients for mixed drinks.
2) Recipe reviews are a goldmine of data. There are ample suggestions for modifications (additions, deletions, increases, decreases, substitutions). These could be used to create “flexible” recipes, suggesting a range for the quantity of an ingredient, and possible substitutes. In fact, a substitute network reveals global communities of interchangeable ingredients.
3) Ingredient networks can be used to predict recipe ratings. “These networks encode which ingredients go well together, and which can be substituted to obtain superior results, and permit one to predict, given a pair of related recipes, which one will be more highly rated by users.” It appears that the substitute network in particular encodes nutrition information, e.g. users’ preferences for “healthier” variants for a recipe.
4) The hypothesis presented in Catching Fire, that humans have evolved to prefer cooking methods that extract more energy value from food, is consistent with recipe ratings. Recipes that call for heating (baking, boiling, grilling) are rated more highly on average than those that call only for mechanical preparation methods (chopping, mixing). Chemical methods (marinating & brining) give a slight additional boost.
5) US regional preferences are easily discernible, e.g. frying being popular in the South, and grilling being popular on the West Coast and in the mountain regions. It would be interesting to study how these are affected by the availability of ingredients and cultural influences.
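To make the idea of a complementary-ingredient network from finding 1 a bit more concrete, here is a minimal sketch in Python. It is not the code behind the paper. It assumes that two ingredients are linked when they appear together in recipes more often than chance would predict, and it uses pointwise mutual information (PMI) as the edge weight, which is an assumption made for illustration. The function name `complement_edges`, the `min_pmi` threshold and the toy recipes are all made up.

```python
from collections import Counter
from itertools import combinations
from math import log


def complement_edges(recipes, min_pmi=0.0):
    """Return {(a, b): pmi} for ingredient pairs that co-occur more than chance.

    Hypothetical helper: PMI is an assumed edge weight, not taken from the post.
    """
    n = len(recipes)
    ingredient_counts = Counter()
    pair_counts = Counter()
    for recipe in recipes:
        items = sorted(set(recipe))  # dedupe and fix pair order
        ingredient_counts.update(items)
        pair_counts.update(combinations(items, 2))

    edges = {}
    for (a, b), n_ab in pair_counts.items():
        # PMI = log( p(a, b) / (p(a) * p(b)) ), with probabilities as recipe fractions
        pmi = log(n_ab * n / (ingredient_counts[a] * ingredient_counts[b]))
        if pmi > min_pmi:
            edges[(a, b)] = pmi
    return edges


# Toy data, invented for illustration only.
recipes = [
    ["flour", "sugar", "butter"],
    ["flour", "sugar", "egg"],
    ["sugar", "butter", "egg"],
    ["garlic", "onion", "olive oil"],
    ["garlic", "onion", "tomato"],
    ["onion", "olive oil", "tomato"],
]
edges = complement_edges(recipes)
```

On this toy input every edge stays within either the sweet or the savory ingredients, which is the kind of split a community detection tool would then pick up on a real data set.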
Also, stay tuned for some fantastic related work by YY Ahn, Sebastian Ahnert, James Bagrow and Laszlo Barabasi, getting to the bottom of recipe preferences by analyzing networks of flavor compounds in food pairings.
Finally, a short thanks for some of the tools we used:
Gephi for visualizing the networks
Map generator for detecting communities; here are two examples: