
Quantifying Cupping – Part 2: The Art and Science of Commercial Coffee Tasting




The CTA Cupping System 

The Coffee Trading Academy (CTA) Cupping System is a hybrid of the two methods we discussed in part 1. It evaluates coffee with a simple scoring system across 4 categories.


The Four Basic Categories (Acidity, Body, Flavor, Defects) 

The four categories evaluated are Acidity, Body, Flavor, and Defects.  


Acidity is the tartness experienced on the sides of the tongue in a good, black coffee. There are different types of acidity: for example, citric acidity is the bright, lemony acidity found in citrus fruits, whereas malic acidity is the tartness found in a tart apple. However, for our purposes we are simply evaluating the intensity of general acidity in the coffee.


Consumers have different preferences for acidity. Personally, I enjoy an acidic, winey black coffee when drinking it hot, but I prefer a low-acidity, chocolatey, full-bodied coffee for iced coffee with oat milk. However, acidity is almost universally considered a positive characteristic for coffee, and it is most commonly associated with washed, fermented, high-grown arabica coffees.


Body is the fullness of the coffee mouthfeel. It is associated with the physical materials in the coffee drink, including oil content and dissolved solids from the roasted bean. A full-bodied coffee feels pleasantly thick and viscous on the tongue and coats the mouth after the sip is swallowed.


Body is again associated with denser, higher-grown beans, but naturals often have more body than washed coffees.


Flavor is the amount of coffee taste and aroma in the coffee. This has to do with the amount of dissolved solids in the beverage but also the flavor profile of the region where the coffee is grown. The environmental characteristics of certain soils, altitudes and microclimates can affect the development of the coffee bean and generate specific flavor profiles for different regions. 




Defects have a few different sub-categories, but they generally fall into one of two groups: external contaminants or improper processing. External contaminants add unwanted outside flavors to the coffee, such as dirty, phenol and baggy. These arise when the coffee comes into contact with or is mixed with earth, is stored near chemicals or gasoline, or is left in jute bags for too long. Improper processing defects can occur if immature or pest-infected beans are not removed, or if the beans are fermented too long or become moldy.

Defects are of course a negative for all coffees, although what is considered a defect in some coffees may be a desirable feature in others.   


For example, “earthy” is typically a negative term describing a dirty coffee, but it can be a positive characteristic for certain Indonesian coffees. Similarly, “Rioy” is a defect term that can mean oldish and even medicinal/phenol tasting, yet some coffee is sold specifically as Rio coffee and was even sought out by some Middle Eastern markets for this flavor.

Monsooned Malabar is another example: a washed-out coffee that is dried under excessively moist conditions, which produces a distinct flavor that would be considered a defect in other coffees but is deliberately cultivated in this particular preparation.


Quantification 

Quantification is the process of turning experiences into numbers. I’ve become a big believer in quantification over my years in the coffee industry. It is essential for drawing meaningful conclusions, performing complex analysis and avoiding ambiguity.


In my system for quantifying the experience of cupping, I use 3 numbers: 1, 2 and 3, along with an optional +/- modifier. This allows for 9 possible scores ranging from very weak to very strong. 


Each category is scored 1-3, with 2 being average, 1 being weak and 3 being strong. The words weak and strong are important here. What we are measuring is intensity. Intensity is more objective, whereas words like “poor” or “good” are more subjective. As noted in my preference for iced coffee above, one individual may prefer weaker acidity and thus, if judging quality, might score a coffee differently from someone who prefers a stronger acidity. However, if both are measuring intensity, then we can trust the average of these two cuppers' scores as representative of the actual intensity of the coffee.


Scoring Defects 

Defects are scored similarly to the other categories, but here we are scoring the amount of defects in the cups, so the score counts as a negative. If there are no defects, the score is 0. If 2 cups have defects, the score is 2. If there is one mild defect, the score is 1-.


Now, a scale of 3 integer scores (1, 2, 3) may seem like rather blunt strokes with which to paint all of the complexity and nuance of a particular coffee. That is true, but it is also an asset. Blunt strokes work in your favor in a few important ways: speed, accuracy, and training.


Speed 

Well-established scientific research (footnote: Cowan, 2010, “The Magical Mystery Four: How Is Working Memory Capacity Limited, and Why?”) suggests that humans can only hold 3-4 concepts in short-term working memory at a time. So holding mental concepts of a 6-10 scale with decimals is mentally taxing, if not impossible.




Instead, when we grade this way, we “chunk” information. We think to ourselves that bad coffee experiences are in the 6-7 range, average ones are in the 7-9 range and good ones are in the 9-10 range. Then we apply a second layer of analysis, where we compare the coffee we are tasting to others in that range. So if we are tasting a coffee with mediocre acidity, we might think: OK, this is below average, but it's not horrible, so we rate it as a high poor acidity, like 6.75.


It is actually more difficult than this, because this is a linear scale. A 6.75 is only 0.25 points away from a 6.5 or a 7, so the cupper has to think hard about other coffees that scored 6.75, 6.5 and 7 to decide which of those 3 categories it truly belongs in.


It is much easier and faster to think: this is a weak acidity coffee, so it is a 1 on a 1-3 scale.


Accuracy 

Now, it may seem that scoring on only 3 levels would make us less accurate than using a 6-10 scale with decimals. However, I would suggest that our system is more accurate if we assume that coffee quality follows a normal distribution, whereas a scale like 6-10 with decimals treats scores as if they were spread uniformly. If you are not up to speed on your statistics, think of it like this: if all coffees were rated 0-3 in 0.25 increments (e.g. 1.25, 2.5, 2.75, 0.75, 2, etc.) and there were equal numbers of each possible score, then a uniform scale would be more accurate.


However, if we assume there is a normal distribution, this would mean that the majority of coffees (some 68%) would be within 1 standard deviation of the average score. This makes more sense, and matches my experience in the coffee market. Most coffees are commercial grade coffees.
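The share of a normal distribution that sits within one standard deviation of the average can be checked directly. The snippet below is only an illustrative sketch: the function name `share_within` is my own, and it assumes cup scores really are normally distributed, which is the working assumption of this section rather than a measured fact.

```python
import math


def share_within(k: float) -> float:
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


# Roughly two thirds of coffees land in the "average" bucket under this assumption.
print(f"{share_within(1):.1%}")  # 68.3%
print(f"{share_within(2):.1%}")  # 95.4%
```

In other words, if "2" covers the band around the average, most samples on the table will be 2s, and the 1s and 3s are the exceptions worth noticing.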


Therefore, if you are cupping commercial coffees, this is the most important criterion to evaluate: is this a standard commercial coffee, or is it better or worse? Beyond that, we can add a little bit of nuance with the + or – addition to the score.


When we translate this into a numerical value, we add or subtract 0.1. So a 3+ would be a 3.1 and a 2- would be a 1.9, while a plain 1 would just be a 1.0. This takes the broader nuance in commercial coffees into consideration without overemphasizing it or expending a lot of mental energy.
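For anyone recording scores in a spreadsheet script or app, the translation from written score to number can be sketched as below. This is a minimal illustration, not part of the CTA form itself: the function name `score_to_number` is hypothetical, and it assumes scores are written as a digit 1-3 followed by an optional + or - sign.

```python
def score_to_number(score: str) -> float:
    """Convert a written cupping score such as '2', '3+' or '2-' to a number."""
    score = score.strip()
    modifier = 0.0
    if score.endswith("+"):
        modifier, score = 0.1, score[:-1]
    elif score.endswith(("-", "–")):  # accept a hyphen or an en dash
        modifier, score = -0.1, score[:-1]
    base = int(score)
    if base not in (1, 2, 3):
        raise ValueError(f"base score must be 1, 2 or 3, got {base}")
    # Round to one decimal to avoid floating-point noise such as 1.9000000000000001.
    return round(base + modifier, 1)


print(score_to_number("3+"))  # 3.1
print(score_to_number("2-"))  # 1.9
print(score_to_number("1"))   # 1.0
```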


Training 

The final benefit in using a simple system like this is that it allows a novice cupper to easily categorize their sensory experiences into very simple categories. Since the majority of coffees will fall into the 2 category, this will allow new cuppers to easily put most of the coffees that they experience into this category. A particularly strong or weak cup will stand out and easily fall into the extreme categories of 1 or 3, and this will be the primary focus for a new cupper. 


Over time as the cupper gains experience, they will gain references among each category (but especially the 2s) and will be able to differentiate between strong 2s and weak 2s, and eventually between strong and weak 1s and 3s. 

It may seem that this is overly simplified for a coffee: merely below average, average, or above average.


However, keep in mind three things:  

First, this also includes the +/- differentiator, so when talking about an average coffee (score “2”) we are also able to differentiate it from both low average and high average (2- and 2+). Second, this scoring is across 4 dimensions (acidity, body, flavor and defects), so if we want more complexity and depth, the data is available to us. Third and last, this is a system for scoring commercial coffees rather than specialty coffees, so adding additional scales, dimensions and complexity is excessive and unnecessary.




One final benefit of learning with this system is that it translates well into other systems. If a cupper trained on the 3-point system above then needs to use an SCA form with a 6-10 scale, they can quickly identify which mental bucket the coffee fits into on the 3-point scale and then calibrate with others to find where those mental references fit into the 6-10 scale.

 

Tracking and Trading Coffee 

Once your coffees have been scored on paper, you should enter the scores into a spreadsheet, database or your enterprise software to associate with the representative samples. If you want to keep it simple, you can create an online form or app to record the scores directly. 



Download a free copy of our cupping form, here:


It is essential to incorporate all of the notes and data into the spreadsheet or database, because this will provide you insight into trends, averages and history for different coffees.  


Additionally, this will provide you with rich data to share with your clients or suppliers about what you need, what your experience with specific coffees is, and what you might have on offer.


If a client is looking for a full-bodied coffee with medium acidity, you can easily go through your database and filter for 3- body (and above) coffees with acidity between 2- and 2+.
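As a rough illustration of that filter, here is a sketch over a handful of made-up samples. The sample IDs, the scores and the function name `full_body_medium_acidity` are all hypothetical. It assumes scores are stored in numeric form, so 3- is 2.9, 2- is 1.9 and 2+ is 2.1.

```python
# Hypothetical sample records with scores already converted to numbers.
samples = [
    {"id": "A", "acidity": 2.0, "body": 3.0, "flavor": 2.1},
    {"id": "B", "acidity": 3.1, "body": 2.9, "flavor": 3.0},
    {"id": "C", "acidity": 1.9, "body": 2.9, "flavor": 2.0},
    {"id": "D", "acidity": 2.1, "body": 2.0, "flavor": 2.0},
]


def full_body_medium_acidity(records):
    """Return IDs of samples with body of 3- or above and acidity from 2- to 2+."""
    return [
        r["id"]
        for r in records
        if r["body"] >= 2.9 and 1.9 <= r["acidity"] <= 2.1
    ]


print(full_body_medium_acidity(samples))  # ['A', 'C']
```

The same condition maps directly onto a spreadsheet filter or a database query if that is where your scores live.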


Similarly, if you are looking for a coffee with more acidity, you can reference a recent sample and say that you are looking for a coffee with more acidity than sample X or a similar acidity to sample Y. 


Summarizing and Analyzing 


Scoring 

Once the coffee has been scored across the 4 different dimensions, you add the 3 positive characteristics together and divide by 3. This will give you the average score and provide a simple metric that can be used to give an overall view on the intensity of a coffee.  
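As a quick worked example of that average, here is a small sketch. The function name `intensity_average` and the sample scores are made up for illustration, and the scores are assumed to be in numeric form already.

```python
def intensity_average(acidity: float, body: float, flavor: float) -> float:
    """Average of the three positive characteristics, rounded to two decimals."""
    return round((acidity + body + flavor) / 3, 2)


# A coffee scored 2+ acidity, 3 body and 2- flavor:
print(intensity_average(2.1, 3.0, 1.9))  # 2.33
```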


Defects 

Defects can be subtracted from the overall score, but in general, if you are getting consistent defects from a coffee, it will probably need to be earmarked as a low-grade coffee and traded accordingly. So if you have 3 cups and one has a moderate (average) defect, you would subtract 1. For one strong defect you would subtract 1.1, and for one minor defect you would subtract 0.9. For two moderate defects you would subtract 2 (1 + 1). For one moderate defect and one strong defect you would subtract 2.1 (1 + 1.1).
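The deduction arithmetic above can be sketched as follows. The names `DEFECT_DEDUCTION`, `defect_deduction` and `net_score` are my own labels, and subtracting the deduction straight from the three-category average is one reading of "subtracted from the overall score", so treat this as an assumption rather than a fixed rule.

```python
# Deduction per defective cup, by strength of the defect.
DEFECT_DEDUCTION = {"minor": 0.9, "moderate": 1.0, "strong": 1.1}


def defect_deduction(defective_cups):
    """Total deduction for a list of defect strengths, one entry per defective cup."""
    return round(sum((DEFECT_DEDUCTION[level] for level in defective_cups), 0.0), 1)


def net_score(average: float, defective_cups) -> float:
    """Average intensity score minus the total defect deduction."""
    return round(average - defect_deduction(defective_cups), 2)


print(defect_deduction(["moderate"]))              # 1.0
print(defect_deduction(["moderate", "moderate"]))  # 2.0
print(defect_deduction(["moderate", "strong"]))    # 2.1
print(net_score(2.33, ["moderate"]))               # 1.33
```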


Analyzing the Data 

Once you have started to collect some data, the super-power of quantifying cupping becomes available: analysis. 


Analyzing your cupping data enables your roasting, exporting or trading operation to gather insights about your suppliers, origin operations, availability in the market as well as developing a communication language with colleagues in your business and throughout the industry. 


Looking at the scores from different suppliers will tell you which are best able to match your expectations, or which clients you are able to match expectations for. Moreover it will allow you to isolate exactly where any discrepancies are, since you can point to the dimensions (acidity, body, flavor) where there might be disagreement. 


Patterns 

There are 3 primary patterns that we will want to look out for when analyzing our cupping data: intraseasonal, interseasonal and supplier trends.


IntraSeasonal 

First is the evolution of an origin over a season. If we are cupping a single origin over the course of a season, we will typically find a quality curve that starts off poor, improves towards the peak of the season and then tapers off again. This is because the first beans picked may include more premature beans, or because farmers hurry processing to get the coffee to market and gain some cash flow. As the flow progresses, there are typically more ripe beans and more care can be taken during harvesting and preparation. Finally, towards the end of the season, farmers may be simply picking whatever coffee is left, struggling to fill out the last bags with whatever dubious quality remains, or perhaps cutting corners as they wrap up the season.



In an origin operation, this curve can provide insights into what proportion of the coffee being purchased will fit into different qualities as the season progresses. In destination markets, your origin supplier may insulate you from this curve somewhat through their quality control, but tracking the samples in this manner will give you more specific insight into typical availabilities.
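A season's quality curve can be drawn from nothing more than the cupping date (or harvest period) and the overall score of each sample. The sketch below uses invented numbers and hypothetical period labels ("early", "peak", "late") purely to show the grouping step. Real data would more likely be grouped by week or month.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (harvest period, overall score) pairs for one origin in one season.
cuppings = [
    ("early", 1.9), ("early", 1.7),
    ("peak", 2.3), ("peak", 2.1),
    ("late", 1.8), ("late", 2.0),
]


def quality_curve(records):
    """Average overall score per period, in the order the periods first appear."""
    by_period = defaultdict(list)
    for period, score in records:
        by_period[period].append(score)
    return {period: round(mean(scores), 2) for period, scores in by_period.items()}


print(quality_curve(cuppings))  # {'early': 1.8, 'peak': 2.2, 'late': 1.9}
```

Running the same grouping for several seasons side by side gives a simple season-to-season comparison.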





InterSeasonal 

Second is the evolution of origins from season to season. If we are cupping a single origin over the course of several seasons, we will typically find that the standard curves persist, but that the overall quality may be higher or lower in a given season, or that a season that starts later or earlier will see the quality curve shift.


Most often this variation will occur because of weather patterns that are particular to a given season, but we may also see how general farming practices or trends in climate or labor conditions may impact quality as well. If a trend is clear, this may give us insight that our strategy may need to be adjusted with a particular origin. 



Supplier Trends 

Finally, we can look for trends, averages and evolutions in quality among suppliers. There is a mindset among some suppliers that “if you’re not getting rejections, then you’re shipping too high quality.” Therefore, you will want to be aware of which suppliers may try to toe the line of what is acceptable.


We may also see changes within a supplier if there is turnover of key personnel.

Additionally, if we are making efforts to improve quality with a particular supplier, perhaps a farmers' co-op or just a relationship with an exporter, it will be helpful to see if the quality scores are moving in the correct direction.


Ultimately, we will be using similar types of analysis with suppliers as we might use for origins more generally such as trendlines, moving averages, overall averages, mean, median and bell curves. 
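As a small illustration of those summary measures, the sketch below computes a mean, a median and a moving average over one supplier's overall scores. The scores are invented, `moving_average` is a helper written for this example, and the 3-sample window is an arbitrary choice.

```python
from statistics import mean, median

# Hypothetical overall scores for one supplier, in order of arrival.
supplier_scores = [2.1, 2.0, 2.0, 1.9, 1.9, 1.0]


def moving_average(values, window):
    """Mean of each run of `window` consecutive values, rounded to two decimals."""
    return [
        round(mean(values[i - window + 1 : i + 1]), 2)
        for i in range(window - 1, len(values))
    ]


print(round(mean(supplier_scores), 2))     # 1.82
print(round(median(supplier_scores), 2))   # 1.95
print(moving_average(supplier_scores, 3))  # [2.03, 1.97, 1.93, 1.6]
```

Here the median stays close to 2 while the moving average drops sharply at the end, which is the kind of signal that would prompt a conversation with the supplier.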


Trading Your Scores 

My background and perspective on the physical coffee market is as a trader and if you are buying and selling coffees too (as a roaster, exporter or importer) then you also need to think like a trader. 


For that reason, I encourage the coffee industry to cup its coffee “open”, that is, knowing exactly what the coffee is being sold as. If you are buying a coffee that is being marketed to you as a Kenyan AA, then you will want to know how it stacks up against your reference points of other Kenyan AAs that you have traded.


Cup quality is not the only factor that determines price, but it is an important one. Cup quality is often the reason that suppliers will point to if they are demanding a premium price, and similarly cup quality is often what will be sacrificed if a buyer demands a low price. 


However, cup quality can also be used simply as a benchmark. A buyer may look for the cheapest quality that passes certain minimum cupping requirements. Or we may cup past crop coffees to see if there are any that might be neutral enough to safely blend into a bargain blended roast. 


Regardless of the exact strategy, having a quantifiable cupping system will allow you to set up objective criteria amongst your team and suppliers to meet your needs. 


Conclusion 

We’ve described coffee cupping as both art and science.

Those of us in the coffee business take great pride in the “art” side of coffee cupping. Cupping requires diverse experience, knowledge, training, dedication and repetition from human traders. 


However, this art is not enough when it comes to our business. The science, or strict methodology that we apply to our subjective sensory experience allows us not only to improve our skill, but perhaps more importantly, to communicate it. 

Quantifying our sensory analysis enables us to communicate across QC departments, trade desks, origins, languages, seasons, and markets. Adapting our sensory experiences in this way empowers better analysis, clearer conversations with suppliers and clients, and more disciplined trading decisions.

 

Contrary to what we might expect, quantifying cupping is not about removing the human experience from coffee, but rather about being explicit. When we are explicit about our sensory experience of cupping it actually improves our ability to both understand and communicate it. And if you are serious about trading coffee, those are two skills that you cannot do without. 


This is part 2 of a two-part blog. Read part 1 here.




