Citizen Science: What, Why, and How

This is a post I’ve been intending to write for a long, long time.  It’s a lot easier to write about my day to day life as an ecologist and PhD student.  In fact, any time I turn my computer on to write something that isn’t about me or about my personal research, I get this super intense surge of imposter syndrome.  I’ll stop the unnecessary preamble there for now.  It’s just my attempt to keep my writing in this space authentic, as I think it’s important to be honest about the struggles we face, even if they are mundane (Ermahgerd, writing a blog?  What if someone *gasp* reads it?!)

Training citizen scientists out at Stebbins Cold Canyon UC NRS

The term citizen science has been buzzing around in scientific circles for a number of years now.  When I first drafted this last week, the first annual conference of the Citizen Science Association was just wrapping up in San Jose, CA.  This conference showcased the amazing body of scholarly research concerning citizen science, which is telling us a larger and more coherent story about the practice every day.  I have had the pleasure of working with professional scientists and educators whose whole course of study revolves around the design of these endeavors and the training of their participants.  I will offer here the briefest of introductions based on my own reading and experience and a little anecdote about a citizen science group I help facilitate in my area.  For a peer-reviewed take on the matter, look to the fabulous overview by Bonney and colleagues, published in Science in 2014 (so it’s short and sweet) and entitled “Next Steps for Citizen Science” (1).

What is Citizen Science anyway?

First things first, what is citizen science anyway?  Well, first of all, it is science.  That’s a major point to emphasize.  Data collected by these projects should answer scientific questions or test specific hypotheses.  Second, this is science being performed by individuals who are (in most cases) not formally trained as research scientists.  There is a huge variety within the citizen science genre, but, in my experience, most projects fall into three main categories:  atlas/survey, monitoring, and manipulative/experimental.  

Atlas or survey projects use the increased person power provided by citizen scientists to attempt to catalog all of something.  Whether it’s ants in your backyard, critters in a park, or bees in your garden, these sorts of projects tap into our collective observation skills to gather useful data about when and where things occur.  For me, personally, online games that help map things (proteins, neurons!!) fall into this same category.  Others might disagree.  These sorts of projects don’t generally ask you to make observations at any set interval; they just want to know what you saw.  Monitoring-based citizen science projects are also observation-based, but time and place are more important.  Most monitoring projects are hoping to capture any changes that are occurring over time.  These changes could be in the timing of events, the composition of plant and animal communities, or an indicator of environmental health.  Truly manipulative citizen science projects are rarer, and tend to occur in conjunction with specific researchers.  If anyone has cool examples of projects like these, put them in the comments below!  And, because things rarely fit into three neat little categories, feel free to share other examples.

Wait, is Rachel out of a job now?

Expanding leaves. Photo Cred: Allie Weill, hand model Rachel Wigginton

Some might ask what the point of such a practice is, as scientists, myself included, spend years training to do what we do.  I want to point out here at the get-go that there are many studies looking at the accuracy of data collected by citizen science projects (see 2, 3, and 4 for examples), and with proper training, data collected by these projects can absolutely be used to answer scientific questions (1).  Even so, formally trained scientists are still needed to design the protocols, set up sites, and perform proper training for volunteers.  I can tell you from personal experience, doing this is a lot of effort up front.  There are also the back-end costs of coordinating volunteers, keeping citizens engaged in the project, and curating the data that is gathered.  So I’m clearly not out of a job thanks to these projects, but why would I even bother?

There are a few really great incentives for scientists to make citizen science work.  First, and non-research related, you are doing some amazing outreach.  We all remember that moment during our scientific training when things started to click, and I can almost 100% guarantee you that “click” didn’t come in the classroom.  It came when you were taking your first baby steps into actually doing science.  By working with citizen scientists, you are allowing people who might never get to have a hands-on experience with science to get up close and personal with a study system.  You will be amazed how quickly these citizens start thinking like scientists.  I can imagine few things more gratifying than having a volunteer tell you a project has changed how they look at the world.

The second, and totally research related, reason to get involved in citizen science is the increase in data resolution these sorts of projects afford us.  Certainly, a few intrepid undergraduates and I will know a lot about native Spartina restoration in the SF Bay by the end of my PhD research (at least I hope!).  But when we attempt to scale up, both in time and space, it’s hard to get a handle on some questions without a bigger team.  Think about one of the oldest citizen science projects, the Audubon Christmas Bird Count.  Obviously, no single researcher, or even a team of researchers, could have maintained that many survey points each year or kept observations going all the way back to 1900!  And before you start pooh-poohing all this data, return to my above comment.  With the proper training, citizen science data can be used for papers that get published in peer-reviewed journals.  Take, for example, the 90 or so publications based on data from eBird (1).  We need to stop seeing citizen science as just a way to pad a broader impacts statement and start treating it as a scientific opportunity.

The inspiring folks of the CPP Stebbins
Case Study: California Phenology Project at Stebbins Cold Canyon

Now, I’ll be the first to admit that not every research question, or even every research program, has an appropriate place for citizen science to fit.  As with any tool, there is no need to force a square peg into a round hole.  For example, I work in a very sensitive habitat type on plants and invertebrates living in the soil.  There are several endangered plants and animals in my system.  Getting citizens involved in a meaningful way isn’t in the cards for me, at least not yet.  So, here is my pitch for why, even if you can’t address your own research, you should still stick a toe in the citizen science pool.

During the spring of the first year of my PhD program, I was working on a Conservation Management degree certificate.  This certificate entails taking some course work, which I was going to take anyway, and doing a group project.  Our group was approached by some staff from the UC Natural Reserve System, who had a bit of extra funding for a community outreach project.  We decided a citizen science project would fit that bill.  So, in the fall of 2013, a group of students and two faculty advisers (one ecology, one education) started meeting to discuss what sort of project would fit well at our nearby UC NRS site, Stebbins Cold Canyon.  Stebbins is unique for several reasons.  First, it’s one of the few UC NRS sites that is open to the public, with some pretty stellar hiking trails.  Second, one of the said hiking trails recently got listed on some online hiking forum, and the rate of visitation has gotten pretty high in the last few years.  Last, most of those visitors have no idea that this is a research site where science is actively happening.

We went through a large list of existing citizen science projects and also emailed all the researchers doing work out at Stebbins.  We really wanted to make sure any data we gathered ended up with a scientist at the other end.  There is already a complete species list for the site, so we nixed any atlas-style citizen science projects.  In the end, we settled on starting a new site for the California Phenology Project, which is a subset of the National Phenology Network.  This project is part of a nationwide effort to capture changes in the timing of life events for plants and animals (i.e., phenology) as they relate to climate change.  I think this is a prime example of the type of question that can really only be addressed with a veritable army of data points.

Water in the creek at Stebbins Cold Canyon UC NRS
If you foolishly think, like I did, that simply starting a new site for an existing project wouldn’t be that much work, you’re very wrong.  Because this is such a well-thought-out project, with some pretty specific scientific questions, the requirements for establishing a phenology trail are fairly in-depth.  Then you have recruiting and training volunteers, making sure they are entering data, coordinating all the monitoring sessions, etc., etc., etc.

And if you’ll recall, this isn’t even the system I study.  I don’t even really study climate change.  But you know what, being involved with this project has been one of the most rewarding parts of my graduate career to date.  First, I feel like I’m really addressing the needs of a land manager (UC NRS) by educating visitors about the scientific use of Stebbins.  Not only do our volunteers now know tons more about the site, but they are always telling us about conversations they have had with others on the trail.  More importantly, I absolutely feel like our team has helped to change the world views of all our volunteers.  I feel as though I have seen these people start to think about and interact with the world like scientists.  They have told us they notice the differences in flowering times between locations.  They think about what our recent rain storm will mean for the phenology of the plants up at the canyon.  They note the differing numbers of pollinators on plants at different phenological stages.  I could go on.  It’s magnificent.  And very, very humbling.  Because sometimes I forget to be awed by the first toyon berries of the year.  Then we get an email from a volunteer who is so excited to be the first one with a “yes” data point in the ripe fruit column.

We have started to see conference presentations utilizing the NPN and CPP data sets.  This is real science, y’all.

Sold.  How can I get involved?

As a scientist, you can start your own citizen science project to assist you with your research!  Look for the standards and best practices explained by the Citizen Science Association.  Or, you can do like we did, and expand an existing project.  If you are an interested citizen, look for a project going on in your area.  Zooniverse is a great online repository of the many, many projects going on right now.  There are all levels of involvement, from coding video from the comfort of your own home, to hitting the trail with a data sheet after a weekend of rather intense training.  

And hey, if you are a scientist or a citizen who wants to get involved with the CPP at Stebbins Cold Canyon, check out our website and drop us a line!  We’d love to have you on our team.

1. Bonney, R., Shirk, J. L., Phillips, T. B., Wiggins, A., Ballard, H. L., Miller-Rushing, A. J., & Parrish, J. K. (2014). Next Steps for Citizen Science. Science, 343(6178), 1436–1437.
2. Crall, A. W., Newman, G. J., Stohlgren, T. J., Holfelder, K. A., Graham, J., & Waller, D. M. (2011). Assessing citizen science data quality: an invasive species case study. Conservation Letters, 4(6), 433–442.
3. Delaney, D. G., Sperling, C. D., Adams, C. S., & Leung, B. (2008). Marine invasive species: validation of citizen science and implications for national monitoring networks. Biological Invasions, 10(1), 117–128.
4. Galloway, A. W., Tudor, M. T., & Vander Haegen, W. M. (2006). The reliability of citizen science: a case study of Oregon white oak stand surveys. Wildlife Society Bulletin, 34(5), 1425–1429.
5. Gardiner, M. M., Allee, L. L., Brown, P. M., Losey, J. E., Roy, H. E., & Smyth, R. R. (2012). Lessons from lady beetles: accuracy of monitoring data from US and UK citizen-science programs. Frontiers in Ecology and the Environment, 10(9), 471–476.

Modeling Logistic Growth Data in R

My dog rocks. Wilson is friendly to almost everyone (mailmen excepted) and he’s very soft. We’ve had him since he was a puppy and because the wife and I are dorky scientists, we’ve collected (non-invasive) data from him since day one. So today we’ll be modeling growth data, courtesy of Wilson, using R, the “nls” function, and the packages “car” and “ggplot2”. For reference, I drew on this appendix from Fox and Weisberg (2010).


library("car"); library("ggplot2")
76,75) #Wilson’s mass in pounds
452,482,923, 955,1308) #days since Wilson’s birth
data<-data.frame(mass,days.since.birth) #create the data frame
plot(mass~days.since.birth, data=data) #always look at your data first!

wb growth curve

Wilson’s growth looks like a logistic function. As a puppy, he put on the pounds quickly (yep, I remember that), and he has flattened out around 75 lbs (thank god). Although I will say that he still thinks he is a lap dog.

A logistic growth model can be implemented in R using the nls function. “nls” stands for non-linear least squares.

The logistic growth function can be written as

y <-phi1/(1+exp(-(phi2+phi3*x)))

y = Wilson’s mass, or could be a population, or any response variable exhibiting logistic growth
phi1 = the first parameter and is the asymptote (e.g. Wilson’s stable adult mass)
phi2 = the second parameter; it shifts the curve left or right along the x axis (the inflection point falls at x = -phi2/phi3)
phi3 = the third parameter and is also known as the growth parameter, describes how quickly y approaches the asymptote
x = the input variable, in our case, days since Wilson’s birth
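To make the roles of the three parameters a bit more concrete, here is a minimal sketch. The function name `logistic` and the variable names are my own, and the parameter values are simply rounded from the model summary further down, so treat them as illustrative. The curve reaches half of its asymptote where phi2 + phi3*x = 0, i.e. at x = -phi2/phi3, which is why phi2 mostly slides the curve left or right.

```r
# Logistic curve in the parameterization used in this post
logistic <- function(x, phi1, phi2, phi3) {
  phi1 / (1 + exp(-(phi2 + phi3 * x)))
}

# Illustrative values, rounded from the fitted model's summary table
phi1 <- 71.57
phi2 <- -2.462
phi3 <- 0.01703

# x at which phi2 + phi3 * x = 0: the inflection point of the curve
x.mid <- -phi2 / phi3

# The curve's height at the inflection point is half the asymptote
y.mid <- logistic(x.mid, phi1, phi2, phi3)

# Far to the right, the curve flattens out at phi1
y.late <- logistic(5000, phi1, phi2, phi3)

print(c(x.mid = x.mid, y.mid = y.mid, y.late = y.late))
```

With these values the inflection point lands at roughly 145 days, which matches the "puppy packs on the pounds" phase in the raw data plot.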

One important difference between “nls” and other models (e.g. ordinary least squares) is that “nls” requires initial starting parameters. This is because R will iteratively evaluate and tweak model parameters to minimize model error (hence the least squares part), but R needs a place to start. There are functions in R that obviate the need for supplying the initial parameters. These are called “self starting” functions, and in our case it would be the “SSlogis” function. But for now, we’ll skip that and give R some initial parameters manually.
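For the curious, here is a rough sketch of what the self-starting route might look like. To keep it runnable on its own, it uses made-up data (a logistic curve plus a little noise, NOT Wilson's measurements), and the names `toy`, `ss.fit`, and `phi` are mine. Note that `SSlogis` uses a different parameterization, Asym/(1 + exp((xmid - x)/scal)), so the last step converts its estimates back to the phi1, phi2, phi3 used in this post.

```r
set.seed(1)

# Synthetic stand-in data (NOT Wilson's measurements): logistic curve + noise
days <- c(seq(30, 450, by = 30), seq(500, 1300, by = 100))
mass <- 72 / (1 + exp(-(-2.5 + 0.017 * days))) + rnorm(length(days), sd = 1)
toy <- data.frame(days, mass)

# SSlogis picks its own starting values, so no start= argument is needed
ss.fit <- nls(mass ~ SSlogis(days, Asym, xmid, scal), data = toy)

# Convert SSlogis estimates to this post's parameterization:
# phi1 = Asym, phi2 = -xmid/scal, phi3 = 1/scal
est <- coef(ss.fit)
phi <- c(phi1 = unname(est["Asym"]),
         phi2 = unname(-est["xmid"] / est["scal"]),
         phi3 = unname(1 / est["scal"]))
print(phi)
```

The trade-off is convenience versus transparency: the self-starter saves the linearization step, but you have to translate its parameters if you want them on the same footing as the hand-written formula.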


coef(lm(logit(mass/100)~days.since.birth,data=data)) #logit() comes from the car package

This calls for the coefficients of a linear model (slope and intercept) using the logit transform (log of the odds) and scaling the y by a reasonable first approximation of the asymptote (e.g. 100 lbs).

     (Intercept) days.since.birth 
    -1.096091866      0.002362622

Then, we can plug these values into the nls function as starting parameters.

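#build the model, start is the starting parameters, trace=TRUE will return the iterations
model<-nls(mass~phi1/(1+exp(-(phi2+phi3*days.since.birth))),
 start=list(phi1=100,phi2=-1.096,phi3=0.002),data=data,trace=TRUE)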


3825.744 :  100.000  -1.096   0.002
3716.262 :  81.463877204 -0.886292540  0.002512256
3489.696 :  66.115027751 -0.731862991  0.003791928
1927.422 :  63.447368011 -1.036245947  0.007113719
204.0813 :  71.10755282 -2.06528377  0.01379199
123.3499 :  71.76966631 -2.39321544  0.01639836
121.3052 :  71.62701380 -2.45163194  0.01692932
121.2515 :  71.58084567 -2.46029145  0.01701556
121.2502 :  71.57256235 -2.46167633  0.01702943
121.2501 :  71.57121657 -2.46189524  0.01703163
121.2501 :  71.57100257 -2.46192995  0.01703198
121.2501 :  71.57096862 -2.46193546  0.01703204

The first column is the residual sum of squares, and the remaining columns are the model parameters (phi1, phi2, and phi3). R took 11 iterations to reach model parameters it was happy with. Calling summary() on the fitted model gives the estimates:
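If you want to convince yourself of what that first trace column is, here is a small self-contained sketch. It runs on synthetic stand-in data (NOT Wilson's measurements), gets starting values from a linear fit on the logit scale as described above (using base R's `qlogis` in place of car's `logit`, so no extra packages are needed), and then checks that the residual sum of squares computed by hand matches what `deviance()` reports for the fitted model. The names `toy`, `start`, and `fit` are my own.

```r
set.seed(42)

# Synthetic stand-in data (NOT Wilson's measurements): logistic curve + noise
days <- c(seq(30, 450, by = 30), seq(500, 1300, by = 100))
mass <- 72 / (1 + exp(-(-2.5 + 0.017 * days))) + rnorm(length(days), sd = 1)
toy <- data.frame(days, mass)

# Starting values from a linear model on the logit scale;
# 100 is a rough first guess at the asymptote, qlogis() is base R's logit
start.lm <- coef(lm(qlogis(mass / 100) ~ days, data = toy))
start <- list(phi1 = 100,
              phi2 = unname(start.lm[1]),
              phi3 = unname(start.lm[2]))

# Fit the logistic model by non-linear least squares
fit <- nls(mass ~ phi1 / (1 + exp(-(phi2 + phi3 * days))),
           data = toy, start = start)

# Residual sum of squares, by hand and as reported for the fit
rss.by.hand <- sum(residuals(fit)^2)
rss.reported <- deviance(fit)
print(c(by.hand = rss.by.hand, reported = rss.reported))
```

Adding `trace = TRUE` to the `nls` call prints this same quantity at every iteration, so the last value in the trace should agree with `deviance(fit)`.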



Formula: mass ~ phi1/(1 + exp(-(phi2 + phi3 * days.since.birth)))

      Estimate Std. Error t value Pr(>|t|)    
phi1 71.570969   1.201983   59.54  < 2e-16 ***
phi2 -2.461935   0.162985  -15.11 6.88e-11 ***
phi3  0.017032   0.001227   13.88 2.42e-10 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.753 on 16 degrees of freedom

Number of iterations to convergence: 11 
Achieved convergence tolerance: 1.985e-06

We can see that our initial parameters weren’t too far off.
Next, let’s create the model predictions and plot the data.

#set parameters (estimates taken from the model summary above)
phi1 <- 71.570969
phi2 <- -2.461935
phi3 <- 0.017032
x<-c(min(data$days.since.birth):max(data$days.since.birth)) #construct a range of x values bounded by the data
y<-phi1/(1+exp(-(phi2+phi3*x))) #predicted mass
predict<-data.frame(x,y) #create the prediction data frame

#And add a nice plot (I cheated and added the awesome inset jpg in another program)
ggplot(data=data,aes(x=days.since.birth,y=mass))+ #observed data
geom_point()+
labs(x='Days Since Birth',y='Mass (lbs)')+
scale_x_continuous(breaks=c(0,250,500,750,1000,1250))+
geom_line(data=predict,aes(x=x,y=y), size=1) #fitted curve

Wilson growth

Hey, that looks like a pretty good model! It suggests that Wilson will asymptote at 71.57 lbs (Wilson, lose some weight buddy!). The model is under-predicting his weight in the later region of the data, probably because of the two data points near the inflection point. Overall, I’m pretty happy with the model though!
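One way to put a number on that kind of under-prediction is to look at the residuals (observed minus fitted) for the older ages only: a positive average means the curve sits below the data there. The sketch below shows the idea on synthetic stand-in data (NOT Wilson's measurements), so the bias it reports is just noise. The names `toy`, `fit`, and `late.bias`, the starting values, and the 400-day cutoff are arbitrary choices of mine. It also uses `predict()` with `newdata`, which saves typing the formula out by hand.

```r
set.seed(7)

# Synthetic stand-in data (NOT Wilson's measurements): logistic curve + noise
days <- c(seq(30, 450, by = 30), seq(500, 1300, by = 100))
mass <- 72 / (1 + exp(-(-2.5 + 0.017 * days))) + rnorm(length(days), sd = 1)
toy <- data.frame(days, mass)

# Fit the logistic model; starting values here are rough guesses
fit <- nls(mass ~ phi1 / (1 + exp(-(phi2 + phi3 * days))),
           data = toy, start = list(phi1 = 70, phi2 = -2, phi3 = 0.02))

# predict() evaluates the fitted curve at new x values (here, ages 1 and 2 years)
new.days <- data.frame(days = c(365, 730))
pred <- predict(fit, newdata = new.days)

# Mean residual for the later observations; positive = model under-predicts
late <- toy$days > 400
late.bias <- mean(residuals(fit)[late])

print(pred)
print(late.bias)
```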

There are some other packages (e.g. package “grofit” looks promising) and growth functions (e.g. Gompertz) worth exploring because they can streamline some of the code, but we’ll save that for a future post.

Thanks for the data Wilson!

A Caterpillar Mystery in the Bahamas

I’ve been in the Bahamas for the last two weeks, studying the effect of resource pulses (hurricane-wrecked seaweed) on island communities with this project. In doing so, we kept coming upon these strange shelters on wild guava (Psidium longipes), locally called Bahama stopper, since the hard wooded bush/tree would apparently stop any progress you try to make into the thick coppice.

What on earth is this? ~  4 cm tall (pretty damn big by insect standards).
Inside some of these shelters (2/13), there was an odd larva, apparently a beetle larva, or so I initially thought. Because I am mostly a caterpillar person, I didn’t really pay it much mind. 
A really terrible picture, but notice the “antennae”. About 2-3 cm long. 
Louie, while processing insect samples one night, noticed that some things were not right about the apparent beetle larva – namely it had prolegs, the fleshy appendages that give caterpillars the appearance of having more than the six legs all insects have. I then looked at the “antennae” and found that they were not segmented, a dead giveaway that this was, in fact, a caterpillar and the “antennae” were actually tentacles (yes, that is the technical term for the fleshy projections that many caterpillars have – monarchs for instance).
Several of the shelters were torn like this, suggesting predation (by a bird [?]). This was the only shelter with lines affixing it at the top – many had lower lines.
I got much more interested after that, and sent along these pictures to Charley Eiseman, a good friend and probably the person on earth with the most knowledge about insect shelters. His blog – linked above – is simply phenomenal and if anyone was going to know the answer, he would. Very quickly (within a few minutes), he had correctly found the family of the moth – Mimallonidae. The amazing part here is that Charley has never seen a member of this family! Mimallonidae is an extremely small family by Lepidoptera standards (~200 spp.). Only 3 of those occur regularly in the US; a fourth is described from the US in Brownsville, TX, but is probably a tropical stray. He even dug pretty deep and found a very likely species identity, Cicinnus packardii – known from Cuba and known to feed on other Psidium species. While I do not know this for sure, it seems the most likely candidate, as the larva matches the few images of Cicinnus online very well, and the other mimallonid genera less so.

After a bit more searching, we came upon a young larva feeding in a leaf press on P. longipes, which was not what I expected. This family is known as the “sack-bearers” and I was expecting something more along the lines of a bagworm (Psychidae), instead of a leaf presser.

A young (2nd, 3rd instar?) larva of this Cicinnus sp. ~ 8mm

Which brings us to the strange, pitcher plant-like shelters. The larva is oriented vertically inside the shelter, with a strange butt plate plugging up the bottom hole and the head just below the upper hole. What function the little hood serves is mysterious – perhaps shading the larva from the hot Bahamian sun or fierce rains? The better-known Cicinnus species of the US, C. melsheimeri, feeds on old oak leaves (too tannic for most caterpillars) and constructs a shelter, sort of like the pictured ones, of frass pellets, silk and oak leaves, in which it spends the winter as a larva prior to pupation in the spring. This seems to be the case for this species as well – in two cases, I found spent pupal skins.

Spent pupal skin (successful emergence!) inside one of the shelters. You can also see the construction of silk and what appears to be finely ground frass (caterpillar poop – a common building material for cats).

Interestingly, I did find one that fit the description of the C. melsheimeri shelters well.

This was the only shelter anchored into leaves (it was vacant, unfortunately). You can see well the frass pellets forming the top of the shelter here.
The same shelter, with a Psidium leaf forming one side. 

These guys kept me occupied for quite a while (I even dreamt about them!) and seem like a worthy avenue for future rearing efforts… there are a great many questions that remain about the shelters: Why the strange shape with a hood? Why build a free-standing shelter, as opposed to anchoring it to a stem like most moths? Why wait around in a shelter instead of pupating right away? Do the shelters protect inhabitants from predators and parasitoids?

Perhaps the prettiest of all found. I like the subtle banding.

Many thanks to Charley, Julia Blyth, John De Benedictis, Louie Yang, Jonah Piovia-Scott and Jenn McKenzie (who was the only one that could find occupied shelters) for help with the identification and finding of these guys.