Count number of occurrences based on 2 conditions or Regexp

NGix · Aug 20, 2013 · Viewed 11.8k times

How can I count the number of occurrences in a range based on:

  1. A regular expression

  2. 2+ conditions; let's say cells that contain "yes" and / or "no"

What I've got for the moment:

COUNTIF(B5:O5; "*yes*")

I tried to use COUNTIF(B5:O5; {"*yes*", "*no*"}) or COUNTIF(B5:O5; "(*yes*)|(*no*)"), but neither of them worked.

Or, how do I count cells that contain certain domain names (yahoo.com, hotmail.com, and gmail.com) using a regexp? For example:

(\W|^)[\w.+\-]{0,25}@(yahoo|hotmail|gmail)\.com(\W|$)

Answer

Floris · Aug 21, 2013

The most pedestrian solution to your problem (tested in Excel and Google Docs) is to simply add the results of several COUNTIF formulas:

=COUNTIF(B5:O5, "*yes*") + COUNTIF(B5:O5, "*no*")

This expression counts the cells that contain "yes" plus the cells that contain "no". It double counts a cell such as "yesno" or "noyes", since that cell matches both patterns. You could try to take out the doubles with

=COUNTIF(B5:O5, "*yes*") + COUNTIF(B5:O5, "*no*") - COUNTIF(B5:O5, "*no*yes*") - COUNTIF(B5:O5, "*yes*no*")

But that will still get you in trouble with a string like noyesno: it matches all four patterns, so it contributes 1 + 1 - 1 - 1 = 0 and is not counted at all.
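To see the miscounts without a spreadsheet at hand, here is a minimal Python sketch. It only approximates COUNTIF: the `countif` helper and the sample `cells` list are made up for illustration, and `fnmatchcase` stands in for the `*` wildcard matching.

```python
from fnmatch import fnmatchcase

def countif(cells, pattern):
    """Rough stand-in for COUNTIF with * wildcards (case-insensitive)."""
    return sum(fnmatchcase(str(c).lower(), pattern.lower()) for c in cells)

# Hypothetical sample row standing in for B5:O5.
cells = ["yes", "no", "yesno", "noyesno", "maybe"]

# Plain sum of the two COUNTIF calls: cells with both words count twice.
naive = countif(cells, "*yes*") + countif(cells, "*no*")

# Attempted correction: subtract the cells that contain both words.
corrected = naive - countif(cells, "*no*yes*") - countif(cells, "*yes*no*")

# Reference value: each cell containing "yes" or "no" counted once.
actual = sum(("yes" in c) or ("no" in c) for c in cells)

print(naive, corrected, actual)  # prints: 6 3 4
```

With this sample the plain sum overcounts (6), the corrected formula undercounts (3) because noyesno is subtracted twice, and the value you actually want is 4.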

However, there is a rather clever trick in Google Docs that may just be a hint of the solution you are looking for:

=COUNTA(QUERY(A1:A9, "select A where A matches '(.*yes.*)|(.*no.*)'"))

The QUERY function works like a mini database. In this case it looks at the table in range A1:A9 and selects only the elements in column A that match (in the preg regex sense of the word) the expression that follows: in this case, "anything followed by yes followed by anything, or anything followed by no followed by anything". In a simple example I made, this counts a yesnoyes only once, which makes it exactly what you were asking for (I think...).
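The same "one regex, one count per cell" idea can be sketched in Python. This is an illustration rather than what QUERY does internally: it assumes that `matches` has to match the whole cell (which is why the pattern is wrapped in `.*`), and the `count_matching` helper and sample data are hypothetical.

```python
import re

# The pattern from the QUERY formula above.
PATTERN = re.compile(r"(.*yes.*)|(.*no.*)")

def count_matching(cells, pattern=PATTERN):
    """Count cells whose full text matches the pattern, once per cell."""
    return sum(1 for c in cells if pattern.fullmatch(str(c)))

# Hypothetical sample column standing in for A1:A9.
cells = ["yes", "no", "yesnoyes", "noyesno", "maybe", ""]

print(count_matching(cells))  # prints: 4
```

Because every cell is tested once against a single alternation, a cell like yesnoyes or noyesno contributes exactly 1, however many times the words occur in it.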

Right now your range B5:O5 is several columns wide, and only one row high; that makes it hard to use the QUERY trick. Something rather less elegant (but that works regardless of the shape of the range) is this:

=countif(arrayformula(isnumber(find("yes",A1:A9))+isnumber(find("no",A1:A9))),">0")

The sum of the ISNUMBER results acts as an element-wise OR; unfortunately, the regular OR function doesn't seem to work on individual elements of an array. As before, this finds the cells that contain either "yes" or "no", and each such cell is counted only once.
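For anyone who finds the array formula hard to read, the same logic looks roughly like this in Python. The `count_either` helper and the sample data are invented for illustration. The substring test is case-sensitive, as FIND is, so "YES" is not counted here.

```python
def count_either(cells, needles=("yes", "no")):
    """Count cells that contain at least one of the needles."""
    # One 0/1 flag per needle, summed per cell: this plays the role of
    # isnumber(find("yes", ...)) + isnumber(find("no", ...)).
    sums = [sum(n in str(c) for n in needles) for c in cells]
    # Equivalent of countif(..., ">0"): a sum of 1 or 2 counts once.
    return sum(1 for s in sums if s > 0)

# Hypothetical sample data; works for any shape once flattened to a list.
cells = ["yes", "no", "yesno", "noyesno", "maybe", "YES"]

print(count_either(cells))  # prints: 4
```

Summing the flags and testing for > 0 is what makes this an OR: a cell with both words has a sum of 2 but still counts once.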