9 Lab Practices
This section is intended as a starting point toward establishing more formal lab policy regarding methodological transparency, data management practices, and research ethics above and beyond the mandated baseline. Best practices in these domains are constantly evolving so I expect this document will be regularly updated. These policies (and less formal recommendations) will be mostly applicable to graduate students who will be conducting original research in the lab over a longer term, but honours students will also be expected to adopt some of these practices, and others interested in research may find some tidbits of interest.
These guidelines have been heavily influenced by the open science movement and the avalanche of best practice recommendations that have come out of it. As mentioned earlier, we are, as a lab, currently in a bit of a period of transition in attempting to bring our own practices in line with what we believe should be done from an open science perspective. Some of the steps toward improving practices may involve a steep learning curve such that it may take weeks, months, or even years for it to be feasible for you to fully implement them; shifting toward a script-based workflow, for example, may be especially daunting for those with little or no programming experience. Some steps can also be easy to forget if you are not used to incorporating them into your research workflow (such as hitting the “register” button on a pre-registration). There are bound to be stumbles, but as they say, “perfect is the enemy of good” – you might not be able to make your work 100% open, transparent, and reproducible tomorrow, but there are still steps you can take to make it more open, transparent, and reproducible. My main expectation of students and others conducting original research in the lab is thus that you be willing to learn and strive to improve your research and data management practices, and of course feel free to push me to improve mine when needed.
9.1 Summary of Expectations
The sections below are organized roughly according to the stages of the research process, from hypothesis generation and experiment planning to long-term data management. These sections go into some detail and discuss a number of points of advice that should be treated as suggestions, or things I want you to think critically about, rather than formal policy. To make it clear from the outset what I do consider policy for anyone conducting independent research in the lab:
- All research and analysis plans should be pre-registered, preferably on the Open Science Framework (OSF) unless you have reason to prefer a different option.
- To the extent that is possible, applicable, and ethical, your data, analysis scripts, and experimental materials should be made available to others in a way that does not require them to contact you for access (e.g., posted on the OSF or deposited in a dedicated data repository). These should be shared as publicly as possible given the constraints of your materials in terms of confidentiality, participant consent, and copyright considerations. Some examples of complete lab projects hosted on OSF: Variability in Free and Cued Recall, Animacy and Memory.
- You should get into the habit of maintaining detailed records and organizing all files – physical and digital – related to your lab work in a fashion that will be straightforward for others (and your future self) to navigate and understand. Good practices include clear documentation of your storage system/organizational approach and of things like the meaning of variable names within files (e.g., in the form of “readme” files that detail the folder structure); using descriptive and consistent names for files, folders, and variables; extensively annotating any scripts used for data analysis or experiment programming; and keeping records of project milestones and timelines (OSF wiki pages for projects are a useful medium for this). A hypothetical example layout is sketched just after this list. A formal data management plan is encouraged and may be helpful to you, but is not required outright.
- Back up all lab data regularly, and ensure they are always stored in at least 3 locations. Your office computer and the backup server available through your office computer can count as two; the third may be a personal machine or hard drive, or the lab hard drive if needed. Cloud-based services are convenient and may be acceptable for some purposes, but should not be used for identifiable or potentially sensitive data, or anything else you would prefer kept private.
- Related to the two preceding points, you will be expected to create a well-organized and extensively documented archive comprising all critical materials each time a project is completed. This too should be stored in multiple locations, including the lab hard drive and at least one on-site computer to facilitate access by others when you are not around or move on to other things.
- Finally, and less concrete than the others, I hope that you will be ever mindful of the barriers to scientific participation, access to knowledge, and the pursuit of scientific careers, and think critically about the ways you can help break these down in your own work.
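To make the record-keeping and organization expectations above a bit more concrete, here is one hypothetical example of how a project’s files might be laid out and documented; the specific folder and file names are illustrative only, not a required structure:

```
project-name/
  README.txt        <- documents the folder structure, file naming scheme, and
                       the meaning of variable names in the data files
  materials/        <- stimulus lists, questionnaires, experiment program files
  data/
    raw/            <- untouched data files exactly as generated by the experiment
    processed/      <- cleaned/aggregated files produced by the analysis scripts
  analysis/         <- extensively annotated analysis scripts
  docs/             <- preregistration, ethics documents, project notes and timelines
```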
9.2 Open Science in Brief
For those who are unfamiliar, it may be useful to start with a brief introduction to the “open science” movement, as I will use much of the associated terminology throughout the rest of this document. In brief, open science encompasses a philosophy regarding what science should be – a collaborative, transparent, and accessible endeavour, with researchers accountable to both each other and the public – and a set of ever-evolving methodological practices aimed at bringing us closer to this goal. The utopia envisioned by open science proponents (e.g., Nosek & Bar-Anan, 2012; Nosek, Spies, & Motyl, 2012; OSF Strategic Plan) is in many ways similar to what children are taught about what science is and how it works. Unfortunately, most who are familiar with its inner workings, whether in academia or industry, are rapidly disabused of this rosy view of science as guided solely by curiosity, conscientiousness, and a commitment to sharing and updating knowledge for the betterment of all. Societal and economic realities, existing incentive structures within research and academia, and standard human weaknesses and follies such as desire for fame, power, and wealth intermix to create a science that often falls well short of what we may want it to be.
The open science movement in its current form is often traced back to the late 2000s/early 2010s, when concern regarding a number of high-profile failures to replicate research results, as well as appreciation and understanding of a panoply of poor research and analytic practices that were (and perhaps still are) in widespread use, began to grow in psychology and other fields (e.g., Open Science Collaboration, 2015; Ioannidis, 2005; Simmons, Nelson, & Simonsohn, 2011). Organizations such as the multidisciplinary Center for Open Science and the psychology-focused Society for the Improvement of Psychological Science (SIPS) began to spring up in the mid-2010s with the goals of finding solutions to the various problems that had been identified and advocating for the adoption of better methodological, analytic, and communicative practices as well as larger scale structural changes.
The open science movement is not a monolith, and many who would count themselves among its proponents disagree on its definition and the relative importance of various kinds of “openness” that have ended up under the open science umbrella. These include open in the sense of transparency regarding research plans (stating hypotheses and planned analyses from the beginning, before analyses are undertaken) and methods (sharing data, analysis code, and other research materials publicly, or at least openly with other researchers, to facilitate reproducibility); open source (the adoption of software and systems that make their source code freely available for scrutiny and modification); open access (making research outputs such as published manuscripts freely available); and openness in the sense of facilitating collaboration, taking steps to make science as a whole more inclusive of groups who have been historically excluded or marginalized, and making knowledge more available and accessible to the public. I will mostly focus on the first sense in this document as it is the one most amenable to lab-level policy changes, but some of the others will crop up here and there.
9.3 Hypothesis and Experiment Planning
9.3.1 Pre-registration
Although the formal act of pre-registration will probably be one of the last steps you undertake in the experiment-planning process, this step will be much easier if you have it in mind from the start, and as it is one of the most important from an open science perspective it is worth discussing first. Much of the below is adapted from Lindsay, Simons, and Lilienfeld (2016).
9.3.1.1 What is pre-registration?
Pre-registering a study involves creating an unalterable, time-stamped copy of your research plan, ideally before you start data collection (but at the very least before you look at your data).
9.3.1.2 Why should I pre-register?
Pre-registration is one of the major methodological pillars of the recent push for more open science, and for good reason; done properly, it addresses a number of known problems simultaneously. One of the major benefits of pre-registering hypotheses and analyses is that it limits questionable research practices like HARKing (Hypothesizing After the Results are Known; Kerr, 1998) by removing any doubt as to which of your results were predicted a priori and which were unexpected. The act of compiling a thorough pre-registration is also very likely to benefit you personally by forcing you to really think through the specifics of your hypotheses, design, and analysis plans before you put them into practice, and reminding you of exactly what those hypotheses and plans were when a particular study wraps up some weeks or months after you originally generated them.
9.3.1.3 What should I pre-register?
The very short answer: as much as possible, but anything is better than nothing. If time has gotten away from you and you are supposed to start analyzing your data tomorrow (not that this has ever happened to me, of course), it is probably better to pre-register a few lines stating your hypotheses than nothing at all. As cultural and institutional expectations change, so too will this advice; I expect (and hope) that in ~5 years, detailed pre-registrations will be so entrenched that any forum where you might publish or disseminate your research will require them, and there will be little point in anything that falls short of those standards. But for now I think even baby steps have some benefit, if nothing else for the purpose of getting used to adding the pre-registration step into your workflow.
The more detailed answer:
1. Research question(s). At the very least, a pre-registration should include a clear statement of the research question(s) your experiment and/or analysis is designed to address.
a. Confirmatory research: If you are conducting a study with the goal of testing a specific hypothesis or replicating a known effect, you should state your hypotheses as precisely as possible. “I expect [variable A] to be higher in [condition X] than [condition Y]” is better than “I expect [variable A] to differ between conditions”. Arguably best of all is predicting an effect within a specific range of effect sizes (Velicer et al., 2008), if you have some theoretical or empirical basis for doing so. If your prediction involves an interaction, you should describe the form you expect this interaction to take. For complex predictions of this sort you may find it easier to provide a figure than to describe the predicted pattern verbally.
b. Exploratory research: If you are doing exploratory research and don’t really have a specific hypothesis, it is perfectly fine to say so, and still worth pre-registering. Confirmatory studies often also have exploratory elements, so it is common to have an assortment of explicit confirmatory hypotheses and vaguer, more exploratory questions.
c. Purely analytic approaches: Most pre-registration advice focuses on original research, but there is also merit in pre-registering secondary analyses of existing data, computational modelling projects, and (perhaps especially) large-scale analytic projects such as meta-analyses. These will also often have both confirmatory and exploratory components.
2. Data analysis plan. In part, this goes hand in hand with the first point; especially if you have confirmatory hypotheses, you should go into the project with some idea of how exactly you will determine whether your data support them or not. Ideally, this plan should include:
a. Which statistical test(s) you plan to conduct with which variable(s). It is most important to provide this information as pertains to your primary hypotheses, but if you have exploratory tests in mind it is worth including these as well. For any analyses you have planned, make sure to state your independent and dependent variables, as well as moderators, covariates, etc. as applicable. Describe any steps you will need to take to get from your raw data to the variables you will be analyzing (e.g., adding up multiple responses on a questionnaire, transforming variables to different scales, determining parameters via model fitting, etc.).
b. Exclusion and/or outlier criteria. Decisions to exclude outliers or other data from further analyses represent a major source of researcher degrees of freedom, so it is good practice to get into the habit of thinking about how you will do this before your data even exist. You should state:
i. How you will identify outliers, and how you plan to deal with them. How you should determine what constitutes an outlier on some measure and whether or not you should exclude them from further analysis is a separate question. What matters at this stage is that you are transparent regarding your plans, even if you have none, so that anyone evaluating what you end up doing can compare with the original plan (and decide whether they think your decision was justifiable).
ii. A priori exclusion criteria. If you foresee any other reasons you might want to exclude a participant’s data – in whole or in part – from further analysis, these should also be pre-registered. Common criteria at the participant level in the kind of research typical of our lab might include performance below a certain threshold, failing an attention check question, or reported lack of naiveté at debriefing. At the within-participant level, examples might include responses made too quickly or too slowly, certain kinds of errors, or some subset of trials (e.g., the first n trials in some task that requires practice). Unforeseen reasons may well arise throughout data collection or come to mind at the analysis stage; pre-registration does not prevent you from implementing these later, but again makes the whole decision-making process transparent to others.
c. Relevant evidentiary “thresholds”. E.g., for NHST approaches, you should include your planned alpha level and how you will correct for multiple comparisons (a brief sketch of applying one such correction follows this list).
d. Any planned stopping rules, or other less principled data “peeking” plans.
e. Plans for dealing with missing data. This will tend to be heavily intertwined with both points (a) and (b), as common approaches to missing data handling involve exclusion from further analyses and transformation (e.g., imputing missing values from other participants). Additionally, data might be missing on some measure crucial to your exclusion criteria (e.g., if you plan to exclude left-handed individuals from analysis, you ought to consider what you will do if someone skips the handedness question).
f. How you plan to evaluate whether your data violate any statistical assumptions, and how you will deal with this.
3. Method and design.
a. Participants: type (e.g., UVic students), any restrictions on participation, intended sample size (with justification, e.g., power analysis based on designated effect size) or stopping rule.
b. Variables: Describe any variables you will manipulate and/or collect data for, especially those central to your hypotheses. For independent variables, describe their levels and the nature of the manipulation (including whether it is within- or between- subjects).
c. Materials and procedural details: Essentially a standard methods section of the sort you might include in a scientific paper, but preferably with a level of detail exceeding what is often seen in the literature.
4. Background information and justification. This is the least crucial for pre-registration purposes, but given adequate time you may wish to include it.
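As one illustration of how a pre-registered plan for multiple comparisons (point 2c above) can later be applied at the analysis stage, here is a minimal sketch in Python using statsmodels; the p-values and the choice of the Holm method are purely hypothetical stand-ins for whatever you actually pre-register:

```python
# Hypothetical example: applying a pre-registered Holm correction
# to the p-values from three planned comparisons.
from statsmodels.stats.multitest import multipletests

p_values = [0.012, 0.034, 0.21]  # placeholder p-values from three planned tests

reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="holm")

for p_raw, p_adj, sig in zip(p_values, p_adjusted, reject):
    print(f"raw p = {p_raw:.3f}, Holm-adjusted p = {p_adj:.3f}, significant: {sig}")
```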
For examples of lab preregistrations, see the following OSF projects:
9.3.1.4 How do I create a pre-registration?
The most popular platform for filing pre-registrations, and the only one our lab has used so far, is the Open Science Framework (OSF). You are not obligated to use the OSF if you have a preferred alternative, but if you are new to pre-registration and/or open science more generally I highly recommend it as a starting point. The OSF also facilitates other beneficial research practices, such as collaboration, version control, and sharing data, so it will be referenced in other sections of this document as well.
The basic steps you will have to follow to file a pre-registration on the OSF are (1) create a project, (2) upload your pre-registration plan to the project page (or link the project with some other service where the document is located; more detail below), and (3) create and approve a formal registration before you analyze your data (ideally before you start collecting your data). This third step is the actual act of pre-registration that creates a permanent, time-stamped copy of your project and any accompanying documents, so although the vast majority of the work and time that go into a pre-registration will be at the writing stage, it is crucial not to skip this last step. In addition to the links above, the OSF provides a number of other detailed, step-by-step guides to using the service, and there is no one “correct” way to use it as far as project organization and storage, so I will not attempt to provide a tutorial here.
If you are pre-registering a study for a specific purpose, e.g., a competition such as the OSF’s preregistration challenge or to comply with the guidelines of a journal you intend to publish in, there may be specific requirements as far as the contents and formatting of your pre-registration document(s). For general purposes, though, this document can be written and formatted however you want, using whichever program(s) you prefer. Try not to worry too much about writing quality; although you’ll still want to write clearly enough to make sure someone outside the project could read the pre-registration and understand your plans/easily evaluate whether or not you stuck to them after the study has been conducted, it doesn’t need to be a publication-quality document. Bullet points and sentence fragments are fine.
There are a number of pre-registration templates available that you might consider using. I find the AsPredicted approach a particularly accessible starting point. AsPredicted is a preregistration service that asks the author to answer a series of 7 pre-set questions (with additional space for information that does not neatly fit with any of these questions), and then generates a nicely formatted pdf along with a unique URL at which the pdf will be permanently stored; this can serve as a standalone pre-registration, or it can be used in concert with the OSF. The OSF actually has the AsPredicted template and several other such templates, including some more specialized ones (for the prereg challenge, for replication studies, etc.) built in at the registration stage. When you initiate a project registration on the OSF you will have the option of selecting one of these templates, so you can, if you want, skip the step of uploading a separate document altogether and just answer the necessary questions here. The Pre-Registration in Social Psychology template developed by van ’t Veer and Giner-Sorolla (2016) is a particularly solid option (and although developed with social psychology in mind is just as applicable to most research in our lab).
9.3.1.5 What if I change my mind about something I’ve already pre-registered?
You cannot change the registered copy of your plan, but you can make changes to the project itself and file a new registration. It only makes sense to do this as long as data analysis has not yet begun; for example, if you come across a new analytic method halfway through data collection and decide it makes more sense than what you were originally planning, it would be worthwhile to make that change and re-register the project with a descriptive comment so there is a clear record of when that change was made (and evidence it was not made on the basis of how the planned analyses turned out). You are still free to make changes or conduct unplanned, exploratory analyses once data analysis has begun, but at this point there is nothing “pre” about such changes.
The OSF facilitates updating your original registration via:
1. Straightforward version control. You can edit the original file detailing your preregistration, upload it to the same project, and, providing it has the same name as the original, OSF will recognize the new file as the “current version”. Copies of all previous versions will still be retained.
2. Allowing multiple registrations for the same project. You can do the above regardless of whether you have already registered the project or not; the difference if the project is already pre-registered is that any changed/added files will not appear in the pre-registration copy. If you make changes following the initial registration but still haven’t analyzed your data, it is worth re-registering the project, thus creating a new time-stamped copy (but also maintaining the original). You can, as far as I know, do this as many times as you want. Although all versions will be retained such that anyone could in theory figure out what had changed each time, it is good practice to include a brief description of the changes when prompted (e.g., “changed planned sample size”, “lowered presentation time at study to get performance off ceiling”).
A personal, admittedly anecdotally-rooted recommendation for anyone getting used to pre-registration or who may be a bit unnerved by the thought of creating a permanent record of your no doubt error-ridden first draft: get into the habit of registering early and often. Register as soon as you fill out/upload the first version of your plan, even if data collection/analysis will not start anytime soon. Re-register if you change anything prior to data collection, and each time you change something after data collection has begun. There may be some downsides to this in the form of confusion on your end as to which is the “correct” pre-registration for a particular purpose, and potentially raised eyebrows if others see you have a long list of pre-registrations for the same project (but this should be mitigated by the total transparency of this process). But if the process intimidates you at all, registering early and often may help entrench the habit and take the edge off the finality of hitting that “Register” button.
Preregistration is a major step toward making your research more transparent, and composing a thorough preregistration can also encourage you to think more critically about the whats, whys, and hows of the research you’re conducting. You might catch logical errors in the arguments underlying your hypotheses and/or planned methodological approach, recognize certain elements of your research questions are vague or ill-formed and sharpen them accordingly, or realize you aren’t exactly sure of the best way to analyze your data to address your primary hypotheses. The potential benefit to science of doing things this way thus extends beyond preventing (or at least rendering transparent) more obviously questionable research practices.
The remainder of this section focuses on this theme of thinking critically about the decisions made at the stage of hypothesis and experiment planning, including decisions about research participants, experimental setup, and stimulus selection. Some of these decisions are often made with so little thought they do not even feel like decisions; they are simply the default, or how things are usually done in a particular lab or the field as a whole. I will not get into higher-level questions of how to formulate better hypotheses nor basic research methods/experimental design principles (although these are both worthwhile topics). Instead, I will focus specifically on lower-level changes that can be made at each step to make your research more open in the open science sense – i.e., more reproducible, transparent, and easily accessible – and in the sense of inclusivity.
9.4 Human Research Ethics
The dominant ethics framework governing scientific research involving humans in Canada is the Tri-Council Policy Statement, currently in its second edition (TCPS 2). Eligibility for tri-council agency funding (NSERC, SSHRC, or CIHR) is contingent on compliance with this policy (and other tri-council policies pertaining to other kinds of research), and institutions involved in research in Canada have their own research ethics committees responsible for ensuring all research conducted under their jurisdiction complies with these guidelines (and relevant laws). Prior to conducting any data collection in the lab, lab members must complete the TCPS 2: CORE certification course. At UVic the relevant committee, from whom approval must be obtained prior to initiating any data collection (aside from informal pilot testing with friends, other lab members, etc.), is the Human Research Ethics Board (HREB).
Although seeking HREB approval will not likely be one of the first chronological steps in your research planning process, the decisions discussed in later sections of this manual must of course be made with ethical principles in mind, so I have opted to start that discussion here. The vast majority of research conducted in this lab is in the form of experiments or surveys administered on campus or online, all of which unambiguously require formal HREB approval. The only possible exceptions are:
Projects based on the analysis of existing data from our lab, or other labs, for which ethics approval was obtained prior to collection: You generally do not need to apply for a new HREB approval to re-analyze data from previous projects in our lab or to analyze data obtained from other researchers, providing the original approval can reasonably be assumed to cover your intended use.
Meta-analyses: Similar to the above, projects based on synthesizing data that have already been disseminated in scientific articles or at conferences, and/or unpublished data that have been provided to you, do not require new HREB approval.
Projects based on the analysis of existing data that are publicly and legally available: You do not have to apply for HREB approval to conduct analyses using public datasets (e.g., from Statistics Canada), or other publicly accessible information with “no reasonable expectation of privacy” (e.g., data collected from books; newspaper articles; or online sources such as websites, forums or social media posts, providing they are entirely public [i.e., not restricted to friends or members]).
Determinations such as whether people have a reasonable expectation of privacy in a given context are not always clear cut. As such, if you are planning to conduct a project that falls into one of these categories and are at all uncertain as to whether it meets the exemption criteria, you should always err on the side of contacting someone at the HREB.
If you are planning a new study, I and/or your other supervisor(s) will help you determine whether your proposal falls under an already approved ethics protocol, can be accommodated by amending an existing protocol, or will require an entirely new HREB application. This manual does not aim to provide a guide to that process, which the HREB website has well covered, nor to fully outline the TCPS guidelines. Instead, I hope to (1) reiterate the importance of a few fundamental ethical research principles worth keeping in mind when planning, designing, and disseminating the results of a study, and steps that can be taken to bring psychological research in closer adherence with these principles (above and beyond complying with HREB policies, which is of course non-negotiable), and (2) draw particular attention to situations where open science ideals and human research ethics may conflict, and suggest ways of balancing these concerns.
9.4.1 Potential harms, benefits, and the importance of “taking the perspective of the participant”
Most of the research we conduct is considered “minimal risk” from an ethical perspective. Standard memory experiments wherein participants are tested on memory for a list of banal words or pictures, or asked to watch a video and then determine whether someone in the video is featured in a subsequent lineup, are generally unlikely to cause harm, and tend not to be deemed any riskier to participants than the baseline of everyday life. As the TCPS emphasizes, however, researchers should not assume all participants will perceive the balance of harms and benefits the same way they do. It is important to keep in mind that research participants (UVic undergraduates or otherwise) come from a variety of backgrounds and may have histories of oppression, abuse, and/or trauma we are not privy to. In the specific context of memory research, this possibility should be kept in mind when:
1. Selecting stimuli: Most of the stimulus sets we use are relatively neutral or unlikely to cause harm (lists of nouns, images of paintings, photos of faces, videos of staged, non-violent crimes, etc.). Certain research questions (e.g., regarding the influence of emotion on memory) may require stimuli that are overtly violent, sexual, or generally unpleasant. Although the HREB will make the ultimate decision as to whether such a project is justified by the potential benefits, this is a question worth considering very seriously before you even get to this stage.
2. Developing questions, and deciding whether to ask for certain information: There is a need to balance collecting information that may be relevant to your research question (or which you are just curious about) with not only the confidentiality standard, but ethical treatment more generally. Some would argue that collecting certain kinds of demographic information about your sample – even if they are not directly relevant to your research question – is a kind of methodological transparency, and that this is important information for would-be replicators of your work to have. There is some validity to this, but for most purposes in our lab I think an aggregate-level summary based on the overall characteristics of the UVic participant pool is probably sufficient. That said, I am not steadfastly opposed to collecting extraneous information related to variables you think may be interesting to analyze in an exploratory fashion providing you collect and store this information with due care. If you do opt (or need) to collect potentially sensitive and personal information related to things like ethnicity, gender identity, sexual orientation, etc., here are a few considerations and suggestions to keep in mind:
i. Privacy and confidentiality. Collecting rich demographic details can compromise the anonymity of data. If the data collected contain demographic details judged to be sufficiently important, one way to protect anonymity is to store such data separately from other measures. When collecting personal information from participants, do so in a way that enables them to use their own terms and that allows them to opt out of answering if that is their preference. Ultimately, you will have to think critically about what you will do with this information, how you will store it, whether you will remove it from any data you share, etc. to ensure confidentiality is preserved (discussed more below).
ii. Transparency. In addition to your ethical mandate to be transparent about how this information will be stored and whether you plan to share it as part of the consent process (also discussed more below), you should be open with participants about why you are collecting this information. If you do not need it, but are just interested in using it in exploratory analyses, say so. Participants can then decide whether they are okay with providing this information for the described purpose.
iii. Autonomy, sensitivity, and respect. Keep in mind that a question that may not seem at all fraught or sensitive to you personally might be perceived differently by others. Consider leaving all personal/demographic questions open-ended so participants are entirely in control of how they describe themselves, or including “rather not say” and/or “prefer to self-describe” options. If you think you need categorical response options, be sensitive in selecting and wording these options. For example, you should avoid the common well-intentioned formulation of gender options as “male, female, or transgender”, and options that conflate sexual orientation and gender identity. This article by Sarai Rosenberg is an excellent resource on respectfully requesting demographic information, including that related to gender identity and sexual orientation specifically. See also Cameron and Stinson’s excellent piece on measuring gender.
3. Discussing the purposes and implications of your research (e.g., when debriefing participants): In talking about research, we want to convey to participants why it is important, and an easy way to do this is to bring up concrete, evocative, real world examples. This is to be encouraged, but these examples should be chosen with care and an understanding that what may be, to us, a subject of intellectual inquiry foremost may map onto painful lived experiences for others. Perhaps one of the most widely known real-world influences of academic memory research has been in the context of debates (especially explosive in the 1990s, but still ongoing) regarding the validity of recovered memories, especially the potential of suggestive therapeutic practices to elicit false memories of childhood sexual abuse. This is undoubtedly a hugely important research subject students of psychology should be aware of, but is sometimes discussed in academic circles with a level of disconnect I imagine may be anywhere from alienating to deeply hurtful to individuals who have personally dealt with the traumas of abuse. Such examples should not be presented lightly or without warning.
9.4.1.1 Informed Consent
I will not deal with defining informed consent or reviewing the basic elements of the process in this document (although certain elements are discussed below), but a crucial part of the HREB application will be conveying how participant consent will be assured, and any relevant forms must be provided. The lab has a number of templates you can adapt if needed (see e.g., here and here). With respect to open science and data management, it is important to incorporate clear information regarding which data will be shared, where you intend to share them, and who will have access to the data. Here is an example from a consent form recently submitted for HREB approval:
To protect your privacy, all of the information that you provide to us after this consent form will be coded by a number untraceable to you. We will retain the resulting data files indefinitely. It is anticipated that the results of this study will be shared with others in published academic papers and presentations at scholarly conferences and may also be described in theses written by students. We also plan to share our data files publicly on the Open Science Framework (see osf.io) to facilitate collaboration with other researchers, increase the transparency and accessibility of our research, and ensure our analyses can be fully checked and reproduced by anyone who wishes to do so. These data files will include any responses you enter into the computer program used for the experiment, and in some cases the number of seconds taken to respond (i.e., response/reaction times). You will not be asked to enter any sensitive or potentially identifying information into this program, so your data file will be fully anonymized, labelled only with the untraceable number mentioned above. Date and timestamps will also be removed from any files shared publicly to ensure that even someone who happens to know exactly when you participated would be unable to identify your data.
In most cases (especially if you are participating in a group session) we will be unable to identify your file after you finish the experiment, so if you are not comfortable with your data being shared in this way you should decline to participate. That said, near the end of the session you will be given an opportunity to change your mind about your anonymous data being shared.
(See some of the lab OSF pages linked above for other examples of consent form language)
For experiments where it may be difficult to explain exactly what data participants are consenting to share before they actually take part, you should include a secondary data consent step at the end of the study, giving participants an opportunity to withdraw consent to share their data after they have entered all the data they will be asked to enter. This may lead to some amount of data loss from very cautious, privacy-conscious folks, but I think it is good practice. If you find you are losing >5% of data to opt-outs at this step, you may want to reconsider your wording, or whether there may be particular questions you are asking that are producing this attrition (in which case you could either remove these questions or, if they are important, plan to exclude them from the data you share).
9.5 Participants
This section discusses questions of determining the nature of your study sample and sample size in more detail than is probably necessary for most projects, but the main points can be summarized as follows:
Generalizability concerns: Although we rely on the UVic undergraduate participant pool for most of our research, this is a skewed sample of Canadians (let alone humanity as a whole, the population psychological researchers are ostensibly studying) on many dimensions (education, socioeconomic status, etc.). For certain questions and goals it may be necessary or advisable to seek out alternative populations by collaborating with other researchers, advertising in the community, or recruiting participants online.
Online data collection: Collecting data online has the potential to mitigate problems related to the unrepresentative nature of undergraduate samples; however, this approach does come with its own concerns such as lack of control over the experimental setting, the prevalence of professional participants (i.e., people who have participated in a lot of academic research) on platforms designed for this purpose, and the ethics of compensation. If you are planning to collect data online using a crowdsourcing platform, you should be cognizant of the business model of the platform you intend to use and the reality of who participates on that platform. As a general rule, participants should be paid at least the US minimum wage, and work should only be rejected (i.e., not paid for) in extreme cases.
Fairness and equity in participation: Groups of people should only be prevented from participating in research on the basis of personal/demographic characteristics if the research question necessitates this, and excluding data after the fact (instead of restricting participation outright) is not a viable option.
Principled sample size planning: Sample sizes should not be pulled from thin air or based on what other people have done, but should be based on principled consideration of factors including the effect size you expect (or the smallest one you are interested in detecting) and how important it is to limit the probability of type I and II errors. There are programs available to calculate the necessary sample size given these and other parameters. Alternatively, you may wish to look into approaches that allow for validly stopping data collection once evidence in either direction reaches a certain threshold.
9.5.1 Who will they be?
The “right” answer to this will of course depend on your research question, but many questions asked in this lab are indifferent on this point such that choosing participants tends to come down to ease of access. At a broad level, by far the most common answer in psychological research is “undergraduate students”, and this lab is no exception (indeed, the population is even more restricted, namely “undergraduate students registered in a Psychology course who choose to participate in research for bonus points”). There is a growing appreciation of the problems this may pose for generalizability in scientific disciplines that ostensibly aim to understand human behaviour as a whole. The bulk of social science research is conducted with WEIRD participants (Henrich, Heine, & Norenzayan, 2010) – that is, individuals hailing from Western, Educated, Industrialized, Rich, and Democratic regions, meaning many effects and conclusions in the literature are based on a highly skewed demographic sample, and psychology undergraduate pools like UVic’s represent a further restricted sample that tends to be mostly young and female-identifying.
In cognitive psychology, the underlying goal of many of the questions we ask is to better understand the fundamental mechanisms of memory, attention, perception, etc. At some level, of course, these mechanisms and the machinery that allows for them are shared by all humans; an optimistic view, then, is that it probably doesn’t matter who participates in a given experiment, as the basic principles we eventually uncover will generalize to everyone. Of course, because we can only ever peer at this level indirectly, this is quite a logical leap. The truth is that we don’t know how generalizable much of our research is, and it is entirely possible there are oft-cited effects floating around that we think tell us something about how the mind/brain works, but are in fact specific to WEIRD participants for some reason we don’t yet understand. This is a field-level concern that will require large scale, systemic changes to make serious progress on. At the level of an individual experiment, one obvious solution is to collect data from a broader-than-usual sample, perhaps by recruiting from the community rather than only the undergraduate pool or, better yet – in terms of both ease of recruitment/administration and potential demographic span – the internet. The following two sections will discuss some important considerations associated with collecting data from undergraduate and online samples, by far the two most common approaches for past research in this lab.
9.5.1.1 UVic Undergraduates
For much of the work we conduct, it makes sense to at least start by running studies with participants from the UVic undergraduate research pool, so it is very likely you will be working with this kind of sample at some point during your time in the lab. Much of the discussion surrounding human research ethics focuses on ethical treatment of participants, and this will be discussed in the Experiment section, but there are also ethical considerations involved in deciding who does and does not get to participate.
“Researchers shall not exclude individuals from the opportunity to participate in research on the basis of attributes such as culture, language, religion, race, disability, sexual orientation, ethnicity, linguistic proficiency, gender or age, unless there is a valid reason for the exclusion.” (TCPS 2, Chapter 4, 2014)
If you are planning to run your study with UVic undergraduate participants, you will have the option to impose eligibility criteria and restrict participation on certain bases. It is an ethical imperative that participation be open to as many people as reasonably possible given the constraints of your research question. The goal of absolute inclusivity will sometimes conflict with other pragmatic and/or ethical considerations such that “reasonably possible” can be a matter of personal judgment. This section is not intended to be prescriptive, but attempts to set out some general guidelines and suggest possible alternatives to restricting participation under particular circumstances.
The last clause in the above quotation leaves a lot of wiggle room, and inclusion/exclusion criteria are a clear example of harms that can become amplified at the collective level. It is not a huge problem if one experiment requires “normal or corrected-to-normal” vision, for example, and this exclusion may even be necessary, but if visually impaired participants are excluded from most research at UVic this is clearly unfair, and if people with visual impairments are underrepresented in psychological research in general this is a failure of science to serve society.
To be clear, there are valid reasons for exclusion, and for particular research questions exclusions may be unavoidable. Exclusion criteria can even, somewhat paradoxically, serve to increase fairness and equity overall, e.g., if you are studying some effect that has predominantly been investigated with WEIRD participants and want to expand your inquiry to underrepresented groups. But such criteria should only be imposed after careful consideration of ways you might adjust the design of your experiment and/or data analysis to allow for as few restrictions on participation as possible.
For the purposes of recent and ongoing research in our lab, I can think of no instances where it has been or would be remotely justifiable to exclude people from participating on the basis of race, gender or gender identity, age, sexual orientation, national origin, culture, or religion. This does not mean none of these variables are ever relevant to our research questions – e.g., there is extensive research into memory for faces, voices, etc. of individuals who are similar vs. dissimilar from participants on the basis of various demographic characteristics – but any such questions are typically addressed at the analysis stage rather than restricting participation. This discussion will therefore focus on exclusion criteria most likely to come up in our work: English fluency, visual impairments (including restricted colour vision), and hearing impairments.
Because “successful” participation depends to some extent on comprehending verbal instructions (written and/or described aloud) and our research questions sometimes directly pertain to semantic meaning, prior experience with certain words/concepts, etc., it may sometimes seem appropriate to impose exclusions on the basis of English fluency or “first language” requirements. Similarly, the validity of a participant’s data for the purposes of a given research question may depend on their ability to see and comprehend an image, discriminate colours, etc. More rarely, we may use auditory stimuli, which may pose a challenge for participants with various hearing impairments and/or the interpretation of such data. In all of these cases, whenever possible, it is better to make exclusions at the data analysis stage rather than restrict participation outright. There are several ways to address this:
1. Exclusion decisions can be left up to experimenter discretion. This is particularly easy in one-on-one sessions – if you can tell a participant is struggling with the experiment or does not understand the instructions, you can simply make a note next to their participant number after the session, and train RAs to do the same.
2. Participants can be excluded from analyses on the basis of their responses to a direct question about the criterion of interest. We often include a few multiple choice and/or open-ended questions at the end of experiments, so you might ask about the criterion in question at this stage and decide to exclude participants who select a certain response.
3. Participants can be excluded on some indirect basis that will tend to capture most cases relevant to the criterion in question. If some part of the experiment cannot be performed successfully without understanding the instructions, discriminating between certain colours, etc., setting a simple performance cut-off should suffice. Alternatively, I have seen examples of tasks or questions designed to evaluate things like English proficiency without participants necessarily knowing, particularly in the context of experiments administered online. This should be balanced with the ethical imperative of informed participation and thorough debriefing.
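To make the performance cut-off option above more concrete, here is a minimal sketch in Python (pandas) of participant-level exclusion applied at the analysis stage rather than at sign-up; the file name, column names, and the 60% accuracy criterion are all hypothetical and would need to match your own pre-registered criteria:

```python
# Hypothetical sketch: excluding participants at the analysis stage
# based on a pre-registered accuracy cut-off, rather than restricting
# who may sign up for the study in the first place.
import pandas as pd

data = pd.read_csv("recognition_trials.csv")  # hypothetical trial-level data file

# Mean accuracy per participant (hypothetical column names).
accuracy = data.groupby("participant_id")["correct"].mean()

# Keep participants at or above the pre-registered 60% cut-off.
retained_ids = accuracy[accuracy >= 0.60].index
analysed = data[data["participant_id"].isin(retained_ids)]

print(f"Excluded {accuracy.size - retained_ids.size} of {accuracy.size} participants")
```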
9.5.1.2 Online samples
The internet seems to provide a clear if imperfect solution to the problem of overreliance on WEIRD samples in psychology. Of course, conducting research online is not a panacea; regular internet access remains an inequitably distributed privilege (Poushter, 2016) such that even a sample representative of the entire internet user base would be far from representative of humanity, and in reality demographics tend to be further skewed by factors like how participants are recruited, publishing surveys in English only, and providing little or no financial compensation. Nonetheless, the internet can at least in theory provide access to a more diverse sample than the UVic participant pool and even the local community, so an online sample may be worth considering if you are interested in trying to improve the overall generalizability of your research, establish the generalizability of a particular effect, or specifically reach out to underrepresented populations.
Providing you follow ethics guidelines and receive explicit HREB approval for a given recruitment/advertisement method, you can recruit participants online in a number of ways, e.g., by linking to your study on social media, posting on any of the numerous dedicated platforms for research recruitment, or even advertising on classified websites. The best approach in a particular case will depend on your goals in collecting data online, and this is something you will discuss with me and others involved in your project, but some guidance on crowdsourcing platforms in particular is provided below.
9.5.1.2.1 Crowdsourcing platforms
An increasingly popular approach to online data collection among academics is the use of crowdsourcing platforms, including Amazon’s Mechanical Turk (MTurk), Crowdflower[1], and Prolific. We have used the first two of these for research in the past and do not currently have a formal lab policy regarding which platform should be used, or forbidding the use of any particular platform. With that said, for reasons discussed in more detail in the appendix for those interested, I would strongly discourage the use of MTurk unless you have access to sufficient funding to ensure participants are paid at least the US federal minimum wage of $7.25/hr[2] or the minimum wage in the region you are targeting. I would also strongly encourage you to take a look at these Guidelines for Academic Requesters developed by MTurk workers. I am less familiar with the demographics and inner workings of other, smaller platforms, but would encourage generous compensation as a general rule. Prolific takes part of the decision-making out of your hands by imposing a minimum payment of £6.00 (British Pounds) / $8.00 (US Dollars) per hour. Concerns have also been raised regarding data quality on sites like MTurk, such as the potential for bot infiltration and the implications for generalizability and validity of collecting psychological data from people who are essentially “professional participants”. These considerations are all worth weighing seriously.
[1] At the time of writing (mid-2018), Crowdflower has recently rebranded as Figure Eight and now seems to be more specifically targeted toward tasks designed to train artificial intelligence (i.e., things algorithms still struggle with such as image transcription and sentiment analysis). While it is still possible to conduct more typical psychological experiments using the platform, even the company itself urges caution given issues with quality control.
[2] An admittedly arbitrary reference point, but most MTurk workers are American (Difallah, Filatova, & Ipeirotis, 2018; Ross, Zaldivar, Irani, & Tomlinson, 2010; see also the MTurk Tracker, Ipeirotis, 2010). Some have raised concerns that such guidelines stand in opposition to the ethical mandate to avoid “undue compensation” and thus may reach the threshold of being coercive. The HREB will be the ultimate arbiter on whether this is the case given the amount you have proposed and the population you are trying to reach, but I do not find it a convincing argument in the vast majority of cases.
9.5.2 Sample size planning
A major thread of the replication “crisis” and responses to it has been increasing attention to the high prevalence of underpowered studies in psychology and neuroscience (Button et al., 2013; Vankov, Bowers, & Munafò, 2014). Low statistical power – to which small sample sizes are one contributor – compromises researchers’ ability to detect true effects, especially if they are small, and means effects that are detected are likely to be overestimated (because when power to detect smaller effects is low, only samples in which the effect happens to be large by chance will produce significant results). At the collective level, the tendency for studies to be underpowered compromises the validity of meta-analyses and can lead to skewed perceptions of just how robust some effects are. As with many issues in psychology, this is not a new discovery (Cohen, 1962), but the potential scope of the problem and the need to change practices in response have only begun to be taken seriously on a large scale within the last decade.
Part of the response to this has been a movement toward more principled sample size planning. There are differing opinions on the best way to approach this and what exactly those principles should be, and you may develop your own philosophy over time, but you should at least get into the habit of running some sample size calculations[1] prior to initiating a new experiment. You can do this using free programs such as G*Power, which offers a point-and-click interface, or R packages such as pwr. Recently, powerful, flexible, and easy-to-use simulation-based power analytic methods have been developed for a variety of experiment designs (e.g., Caldwell et al.’s Superpower R package). Generally, the key things you will have to consider are the following (a minimal worked example follows at the end of this section):
The size of the effect you are looking to detect. If you are working with a very well-known, widely studied effect, there may be reference points in the literature for how large you should expect this effect to be. Sample size planning in attempts to replicate previous work is often based around the effect size reported in the original sample, which seems reasonable enough on its face. Although basing sample size estimates on effect sizes from the literature is common and perhaps better than nothing, you should be very cautious with this approach for the reason outlined above – that is, widespread low statistical power means effect sizes in the literature are very likely to be overestimates. Estimates from meta-analyses may be a better way to go, as such approaches often incorporate corrections for publication bias, but even this is not failsafe. For all but the most robust, widely replicated effects, effect sizes from the literature should only guide your decision, not be strictly relied upon (unless you are explicitly only interested in finding an effect of that size).
This leads to my own personal preference as far as the effect size upon which sample size planning should be based, namely the smallest effect size of interest (or SESOI; Lakens, 2014). If you are only interested in detecting a sizable effect, use a correspondingly large effect size in these calculations; if you think the effect is important enough that detecting a difference of 0.2 or 0.3 standard deviations between groups would still be worth the additional sample size burden, that is a perfectly principled decision. You may find in the course of these calculations that for some designs the required sample size to detect such small effects stretches the limits of what is practical (or even ethical, if you are relying on a limited pool such as the UVic one that other researchers also draw from), and will have to balance these considerations. Ultimately, there is no correct answer to the question of what your sample size should be, but you should get into the habit of thinking critically about it and justifying your choices in preregistration documents and manuscripts.
Your alpha level (probability of a type I/false positive error). 0.05 (5%) is the go-to benchmark, and although arbitrary it is perfectly fine unless you have a principled a priori reason (e.g., planned multiple comparisons) to adjust it.
Your desired power level (1 – beta, beta being the probability of a type II/false negative error). Although alpha and beta, and the notion of a need to balance them in considering your analytic approach, are often introduced in tandem in introductory statistics courses, historically the norm seems to have been (and may still be, despite some positive change) to largely neglect consideration of beta after that. But determining how much statistical power you want to have – that is, how confident you want to be of avoiding false negative errors – is central to sample size planning. Much like setting the alpha level, there is some inherent arbitrariness in this choice, and like all other parameters that go into sample size determination it ultimately comes down to trade-offs between desirability and pragmatism. A common benchmark is 80% (so a 20% chance of failing to detect a true effect under your various other parameters), but I would advise aiming higher when practical.
Your design. If it is valid and possible to address your research question using within-subjects manipulations, the same sample size will yield much higher statistical power in such a design than in a between-subjects one. Of course, this is not always desirable or even possible. There may also be additional parameters you have to set depending on your particular design – for example, in a mixed design with both within- and between-subjects manipulations, you will need to set things such as the anticipated within-subjects correlation and the number of factors of each type.
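Putting these parameters together, here is a minimal sketch of an a priori sample size calculation using the pwr package mentioned above; the effect size, alpha, and power values are placeholders you would replace with your own justified choices.

```r
# install.packages("pwr")  # once, if not already installed
library(pwr)

# How many participants per group are needed to detect d = 0.4 with
# alpha = .05 (two-tailed) and 90% power in a between-subjects design?
pwr.t.test(d = 0.4,             # smallest effect size of interest (Cohen's d)
           sig.level = 0.05,    # alpha
           power = 0.90,        # desired power (1 - beta)
           type = "two.sample",
           alternative = "two.sided")

# For comparison, a within-subjects (paired) design, where d refers to the
# standardized mean of the difference scores and n is the number of pairs
pwr.t.test(d = 0.4, sig.level = 0.05, power = 0.90, type = "paired")
```

For mixed or more complex factorial designs, the simulation-based approach offered by Superpower is likely a better fit than these analytic formulas.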
[1] Or considering alternative approaches to strict a priori sample size planning, such as optional stopping if you plan to use Bayesian methods (Rouder, 2014) or analogous sequential analytic approaches in the NHST context (Lakens, 2014). I am less familiar with these approaches, but they seem to offer great advantages in terms of data collection efficiency (i.e., not collecting data from more participants than is really necessary), so I do hope to move toward implementing them.
9.6 Experiment Design
This section will briefly discuss the open science-y considerations associated with the actual design and programming of an experiment. From this perspective, the ideal experiment is one that can be programmed using open source, freely available means, such as OpenSesame, PsychoPy (both of which use Python, but offer free graphical user interfaces [GUIs] that allow you to do at least some things via point-and-click), or jsPsych (de Leeuw, 2015), which uses JavaScript. This offers advantages for reproducibility, knowledge-sharing, and research efficiency, as anyone can in theory examine and test the resulting experiment script, clearly see what settings were used, and adapt it for their own purposes rather than starting from scratch. These tools also allow you to avoid the costs associated with purchasing proprietary software such as E-Prime, SuperLab, or Matlab, which can be steep or insurmountable for researchers with little access to funding or working at underprivileged institutions. More widespread adoption of such tools can thus indirectly benefit such researchers.
That said, there are various reasons open source tools may not currently work or make sense for you – perhaps you have inherited a project from someone else that relies on proprietary software, or need access to some feature that is not currently available or is difficult to implement using these tools (e.g., I am not sure whether any of them yet offer as fine-grained control over timing/screen refresh rates as E-Prime; this is not a concern for most behavioural experiments in our lab, but may be if you are interested in EEG). If this is the case, there are still steps you can take during and after the course of designing your experiment to improve the transparency, accessibility, and reproducibility of your research. You can share your experiment design files (e.g., the .es/.es2/.es3 files generated by E-Prime) on the OSF or elsewhere so that at least those who have access to the program can reproduce your work. You can (and should) also carefully document, report, and share the details of your method, including any settings you adjust in a point-and-click fashion. You should do this in sufficient detail that someone would be able to recreate your exact experiment in whichever program they might have access to.
Finally, if you are using any custom code/script in your experiment – whether Python, Javascript, the Visual Basic-based language used by E-Prime, etc. – you should get into the habit of annotating extensively, as discussed in more detail in the Data Analysis section.
9.6.1 Stimulus Selection
Another decision point at which your research can be made more open and reproducible is in choosing and/or developing stimuli, which may be words, sentences, images, videos, etc. The ideal here is to use stimuli that are freely available, or to design your stimuli with the intention of making them freely available (what this means will depend on the stimuli; e.g., if you are taking photos of people, you will have to design your consent process and ethics application to ensure you have approval to share them in perpetuity). This will make it possible for others to reproduce your work, conduct replications of your studies that are as exact as possible, or use the stimuli in original research. However, it may also be important for your research question that certain kinds of norm data – e.g., word frequency, visual complexity, emotion ratings – be available for your stimuli, and the goals of “freely available” and “extensively normed” will sometimes be in conflict. There may also be valid justifications for keeping certain stimulus sets, or other experimental material, difficult to access; it may be important for data quality or research validity that materials not become widely known, or that the experiment setting be participants’ first ever encounter with a particular set of materials. Questionnaires used in clinical or other assessment settings, for example, would lose diagnostic value if the details, scoring criteria, etc., were widely available.
If you are using a stimulus set designed by others (freely available or otherwise), make sure to keep track of the source so you can provide proper credit in any presentations or publications based on studies using these stimuli. Relatedly, you should pay attention to any details provided regarding usage rights – although the fair dealing provision of the Canadian Copyright Act allows for even copyrighted material to be used freely for research purposes under most circumstances (UVic Libraries is a great resource for details on copyright considerations, including fair dealing), certain kinds of licenses may prohibit you from modifying any stimuli or sharing the set outside the research context. You should also be careful to track any changes you may make to the set (e.g., removing images) and your reasons for doing so.
9.7 Experiment
9.7.1 Ethical treatment of participants
If you are working, studying, or volunteering in this lab, you most likely have at least some familiarity with the fundamental ethical pillars of psychological research: free, ongoing, and informed consent; the importance of minimizing possible risks and harms, and ensuring these are outweighed by benefits; maintaining participant confidentiality, and so on. Although the HREB is responsible for ensuring a given project adheres to these principles, anyone involved with testing participants or handling data in the lab has a responsibility to ensure they are implemented in practice. Below are some guidelines for ensuring ethical conduct and going beyond what is required to make participants' experience as pleasant as is reasonably possible. If you are new to testing participants, this will be part of your training, so don't worry if you are overwhelmed by the information below; you will be given examples directly relevant to the project you'll be working on and are encouraged to ask questions during this training and on an ongoing basis.
9.7.1.1 Free, ongoing, and informed consent
Consent must be fully voluntary and informed, and can be withdrawn at any time. This is, in theory, accomplished by providing participants with a detailed consent form outlining what they can expect, what will be done with their data/how we will ensure confidentiality, and emphasizing their right to decline participation outright or leave at any point. You will find most participants do not take this step very seriously and it starts to seem like an administrative box-ticking exercise. Still, you can do your best to ensure the integrity of this process by:
Trying to make sure participants read and understand the consent form before signing – you can only do so much here, but if someone very obviously just flips the sheet over to sign, please ask them politely (and discreetly in a group setting) to read the contents before starting the experiment.
Letting participants know they are welcome to ask questions before signing if they have any concerns.
Reiterating at the beginning of the experiment that participants are free to leave at any time without penalty, and to request their data not be used.
Being prepared to deal with the (very rare) event in which a participant does choose to withdraw consent during or after the experiment. Participants must be allowed to leave without question at any time (although you can of course ask if they are okay, need you to contact anyone, etc.), and any request their data not be used must be honoured – these are the only circumstances under which data should be wholly deleted.
9.7.1.2 Minimizing unnecessary risks and potential harms
Most of this is taken care of in designing the experiment, and for most experiments we conduct participants will not generally be at risk of anything more serious than boredom. But it is not possible to eliminate the possibility of harm in the experimental setting entirely – unexpected harms may arise, or participants may experience crises that are not directly related to the experiment itself. In the event a participant is visibly in crisis or approaches you with concerns related to the experiment or otherwise:
Try to respond with basic compassion and discretion, even to concerns that may seem silly or invalid to you. In a group setting you may want to offer privacy if this seems appropriate (e.g., leaving the room to talk in the hallway).
Keep in mind that you are not a mental health (or legal, security, etc.) professional nor are you expected to behave as such, and offer to refer them to services as needed. You will be provided a list of relevant contacts as part of your testing materials, including supports for personal crises and reporting ethical concerns.
Safety, security, and mental health – both participants’ and yours – are more important than research, and you should ALWAYS feel free to end an experimental session early for these reasons. In a group setting, something like “sorry, something unexpected has come up and I have to end the session early; you’ll still receive full credit, and someone will contact you with debriefing information” is entirely sufficient, but these details can also be sorted out afterwards so don’t worry if you don’t remember in the moment.
Related to this, you do NOT have to tolerate abuse, harassment, or mistreatment by a participant, even if they are in crisis or you feel somehow responsible for the situation. If an interaction escalates to this point, feel free to take measures such as asking a participant to leave and contacting emergency services or Campus Security if your safety is at risk.
9.7.1.3 Debriefing
It is important that participants are made aware of the purpose of the research they have contributed to. There is a particular ethical imperative, clearly stated in policy documents such as the TCPS, to do so if the experimental design involved any deception. In such cases, the deception must be disclosed, and participants must be provided with justification as to why it was necessary and explicitly given an opportunity to withdraw their data. But even for run-of-the-mill, deception-free, minimal risk studies – i.e., most of what is conducted in our lab – it is important people be given the opportunity to glean some educational value from the participation experience. For studies conducted on campus, researchers in our lab usually deliver a verbal debriefing at the end of each study session. It is possible to administer a written debriefing instead, and there are certain circumstances under which this may be desirable or necessary, such as when a session runs late (in which case the debriefing can be emailed) or for experiments conducted in groups that are likely to have large variations in completion times, such that it would be unduly inconvenient to make all participants stay until the others are finished. However, personal experience suggests printed debriefing sheets are rarely read, so discussing the experiment verbally is usually preferable in terms of making it slightly more likely participants learn something. For online studies, debriefing is usually delivered in the form of a text summary at the end of the experiment.
There are some circumstances under which participants should be debriefed even if they do not finish the experiment. A good rule of thumb in the on-campus context is that almost everyone who receives credit for a study should be debriefed regarding the purpose of that study, the only exception being individuals who could validly participate in the same study in the future – e.g., if a computer glitch at the very beginning forces you to end a session, or the experimenter does not show up at all, individuals in such sessions can participate another time for full credit and should not be debriefed unless they indicate they are not interested in participating. Potential participants with disabilities that preclude them from standard participation should be fully debriefed, as should individuals who opt to withdraw consent during the experiment (although they of course should not be forced to stay to hear the debriefing information if they are dealing with an urgent situation or clearly uncomfortable).
You are encouraged to consider going above and beyond in terms of providing opportunities for participants to learn from their participation. An easy example is making yourself available after experimental sessions to answer questions and further discuss the study with participants who might want to do so, or inviting them to contact you via email after the fact. Researchers also sometimes offer to send out the results of a study to interested participants once it is complete, but given the realities of academic life it is easy for this to slip down the priority list and be forgotten when the time actually comes to deal with the results. We are moving toward implementing a system that is less susceptible to the vagaries of researchers’ prospective memory, namely identifying projects on the OSF with particular distinctive tags or keywords that are easy to provide participants with either verbally or in printed form.
9.7.2 An example experiment session
Putting all of this together, what can you expect out of a typical in-person experiment session with UVic undergraduates? The exact procedures will vary from experiment to experiment, but a typical session will generally resemble the following:
Prior to the session:
Ensure with your supervising graduate student or lab manager that the testing space has been booked
Check the experiment SONA page to see how many participants have signed up for the session
Arrive 5-10 minutes prior to the session start time and do any setup necessary (e.g., logging into computers and loading local/online experiments, putting any experiment instructions up on the projector)
As participants arrive, have them sign the research participation record
At the scheduled start time, notify participants that the experiment will be starting (there can be some leeway with this, e.g., waiting 3-5 minutes until all participants have arrived and settled in)
During the session:
If applicable, introduce participants to the experiment using any provided scripts. Let them know they can raise their hand if they have any questions, and remind them that they are free to withdraw from the study at any time without reason or consequence and will still receive full compensation (this is included in the consent forms, but it is important to reiterate it to participants)
If debriefing is included in the experiment program, let participants know that they can leave once they’ve finished the experiment. Otherwise, tell participants that they should not leave until instructed to do so, but to indicate in some way that they’ve finished the experiment (e.g., by placing the keyboard on top of the PC) so you have an idea of when everyone has finished.
Do your best to answer any questions or resolve any issues raised by participants. Obviously you won’t be able to answer questions about the true study purpose prior to debriefing, but questions of clarification or technical issues are fair game. If you encounter a question you can’t answer or an issue you can’t resolve, don’t stress. Merely provide what information you can and make a note, and contact your supervising graduate student or lab manager after the session.
After the session:
Once all participants have completed the experiment, if there is a verbal debriefing read that to the participants and give them an opportunity to ask any questions.
Once debriefing is complete and the participants have left, copy any local experiment data from the PCs (if applicable) to a lab USB, log off the computers, and ensure the room is as you found it
Drop off the research participation record sheet and any lab USBs in Cornett A179
Contact your supervising graduate student or lab manager if there were any issues or unresolved questions during the session
9.8 Maintaining and storing records
This section will mostly focus on the paper records that should or can be maintained while running on-campus experiments. Considerations pertaining to secure storage of digital data files will be discussed in more detail in the Data Management section.
Depending on the experiment and the setting in which it is conducted, the paper records for which you will be responsible may include:
Signed consent forms: Because these contain participant names, they should be securely stored on lab premises whenever possible and treated with particular care when in transit/accidentally taken off premises. They should be retained for 5 years (in case of the very rare event an ethical complaint is filed with the HREB) and then securely shredded.
PRPS Record of Participation form (available here): This form should be filled in every session with the names of participants who receive credit and of individuals marked as excused or unexcused no-shows, along with the date and an indication of credit amount/status (e.g., for a half-hour study, you would mark 1 for participants who show up, -1 for unexcused no-shows, and 0 or N/A for excused no-shows). Participants who do show up to the experiment should also be asked to fill in their student numbers. These records should also be securely stored and shredded, but can be destroyed as soon as 3 months after the end of term because they are primarily kept for the purposes of settling any credit discrepancies.
Anonymous record sheet: The exact form this takes will vary depending on the experiment, but most studies will require some kind of record of participant numbers, their associated experimental condition/group, testing times/dates, and a place for comments/notes. This sheet provides an easy reference point for the experimenter (e.g., to avoid reusing participant numbers) and a means of implementing quasi-random assignment (e.g., by pre-populating a spreadsheet with participant numbers and randomly shuffled group IDs, and testing individuals in this order). Additionally, the experimenter should use this sheet to note any errors on their part or unexpected events that may have implications for data quality (e.g., if someone does not seem to understand the instructions or is visibly not paying attention). Such events should be described in enough detail that Future You (or the experimenter in charge of the study) can understand it and make an informed decision regarding whether the corresponding participant(s) should be excluded from analysis. Because this record is fully anonymous (providing you take care not to put any sensitive participant information on it, nor note any participant numbers on the Record of Participation sheet discussed above), it does not need to be securely destroyed and can be stored indefinitely. A better option is to create a digital copy of any information from this sheet that may be needed in the future, such as the participant numbers of individuals who were excluded from analyses and the corresponding justification(s), and store this alongside the rest of the data associated with the study.
When running computerized experiments, it is always a good idea to check the file for the first participant (in each condition, if applicable) to make sure everything is recording as expected. Data files should be backed up regularly (options for this discussed in more detail below), ideally after every experiment session, and changes should never be made to the original files.
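As a concrete example of that first-participant check (file and variable names here are hypothetical), assuming the experiment writes one .csv file per participant:

```r
# Quick sanity check on the first participant's data file
dat <- read.csv("data/raw/p001_conditionA.csv")

str(dat)                                # expected variables and data types?
nrow(dat)                               # expected number of trials?
table(dat$condition, useNA = "ifany")   # condition/counterbalancing recorded?
summary(dat$rt)                         # response times in a plausible range?
```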
9.9 Data Analysis
This is arguably the stage of the scientific process where the greatest number of “researcher degrees of freedom” (see Simmons et al., 2011) come in. Many transparency-related and methodological concerns related to data analysis can be dealt with long before you actually get to this point by preregistering your analysis plans and potentially seeking analytic advice/comments from others. This section will deal with issues related to implementation and ensuring reproducibility.
9.9.1 Reproducibility
Ideally, the entire data analysis process should be reproducible by anyone given your raw data files, from the initial data cleaning, checking, and inspection all the way up to the final analyses (and even beyond, to the manuscript or research summary). The most straightforward way to implement this is by establishing a well-annotated, script-based workflow, ideally in languages that can be run using programs that are entirely free and open source (e.g., Python, R/RStudio). Depending on your particular background and experience, this may be an overwhelming prospect, and learning how to adapt each step of your workflow may require a substantial time commitment. Lab policy is therefore flexible on this point: you are expected to make as much of your data analysis workflow reproducible as is possible given your current skillset, and to prioritize filling in the gaps as you proceed through your career in the lab.
My own bias is to encourage you to use R or Python. Both are open source, highly flexible, have vast communities that are constantly developing new packages and ways of doing things such that it is almost certain someone has already solved the problems you may need to solve (and can help you if not), and may be helpful to you in various career pursuits. However, the learning curve is not trivial for either, so it may not be feasible for you to jump right in. Below I will outline some general principles related to enhancing reproducibility that may help guide your choice of program or suggest ways you might increase the reproducibility of work done in “imperfect” programs.
Scripting is better than point-and-click. Documenting your entire data merging, cleaning, organizing, and analysis process in code makes it possible for others to reproduce and check your work (and for you to remember exactly what you did). Once you have a script you are confident in, adapting it for future projects reduces the possibility of human error introduced by steps that point-and-click interfaces often require, such as copying and pasting or manually re-selecting options every time.
Some programs that are mostly used for working with data in a point-and-click fashion also have custom scripting options, such as SPSS’s syntax editor or Microsoft’s Visual Basic, which can be used with Excel. JASP does not, as far as I know, have customizable scripting options at this stage, but ensures reproducibility by storing analysis settings in the output file; it is probably the best option reproducibility-wise if you prefer or are more used to point-and-click interfaces.
Open source is better than proprietary. In addition to the general benefits of open source programs and languages (better security because more people can access, identify problems with, and modify the source code; less vulnerability to the vicissitudes of the market; greater likelihood of still being available far into the future, as others can take over development and updating if the original developers move on), using them in your analysis workflow will increase the number of people who will be able to reproduce your analyses. This is good for transparency and error-checking, as well as collegial knowledge-sharing (e.g., others may adapt your code for their own purposes) and making research more economically accessible.
JASP, mentioned above, is open source, and includes a number of Bayesian analyses. PSPP is an open source SPSS-like program that can work with SPSS files and offers syntax options compatible with SPSS’s. There are also open source equivalents to Excel.
Annotate, annotate, annotate! Regardless of the program you end up using, one of the best things you can do to ensure the reproducibility of your workflow and to benefit your future self is to keep track of everything you do with your data and why. If using R, Python, or other script-based systems, this means commenting or annotating your code to clarify what each section or component is doing and why you have set custom options a certain way. There may be other things it is useful to note, such as instructions on how to adapt particular sections to do something different or use different settings, what to do if a given section results in an error, etc.
I advise always erring on the side of over-commenting; what seems intuitively obvious to you now may not be so in a few weeks, let alone to others who are less familiar with the language or packages you are using.
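As an illustration of the kind of commenting I have in mind, a small chunk of a hypothetical R cleaning script might look like this (file paths, variable names, and cutoffs are placeholders):

```r
## Exclude practice trials and implausibly fast responses ------------------
# The raw files contain both practice and experimental trials; the 'phase'
# column distinguishes them. The 250 ms cutoff follows our (hypothetical)
# preregistered exclusion criteria.
dat <- read.csv("data/raw/all_participants.csv")

dat_clean <- subset(dat, phase == "experimental" & rt >= 250)

# Record how much data the exclusions removed, for reporting in the manuscript
prop_excluded <- 1 - nrow(dat_clean) / nrow(dat)
prop_excluded
```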
9.9.2 Version Control
Version control systems (VCSs) are a means of managing files likely to undergo multiple revisions and maintaining copies of each version in an integrated way. If you have ever ended up with a mess of files named along the lines of “term paper.doc”, “term paper final.doc”, term paper final FINAL.doc”, etc., you are familiar with the gist of version control, but for obvious reasons it is preferable to formalize this process and automate some of the organizational elements.
Formal version control can be useful for things like papers, but its advantages are perhaps especially apparent when it comes to code, such as the R scripts you may develop for data analysis and other purposes, particularly those you are collaborating on with others. In the case of code, you will regularly be adding new lines or sections and tweaking things as you discover they don't work as intended. Inevitably, you will sometimes make changes to your code that unexpectedly make things worse or break the entire process, and you may want to revert to an earlier version; used properly, a VCS makes this kind of thing relatively painless. They also usually allow for some kind of straightforward comparison across versions so you can see exactly what has been added, deleted, or changed, and – in collaborative situations – who has done so.
The OSF has its own built-in version control system, so this is one option for incorporating version control into your workflow. However, you will have to remember to manually upload each new version of the file in question to take advantage of this function; it is possible to sync your files with the OSF via a third-party storage provider like Dropbox, but in this case you would be limited to the number of versions maintained by that provider.
An increasingly popular VCS is Git, which has an accompanying hosting service called GitHub that can serve as an online repository for your project files and even generate and host websites. Although Git primarily has a programming/code-based focus, it can also work with other file types. Git integrates smoothly with RStudio, and the OSF allows you to link projects with GitHub pages, so as you develop your own research workflow you may find Git a useful addition. We are in the early stages of learning it in this lab, but there are a number of online resources and tutorials available (e.g., this one by Jenny Bryan with reference to R/RStudio integration specifically) if you are interested.
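If you want to try Git from within RStudio, the usethis R package provides helpers for initializing a repository and connecting it to GitHub; a minimal sketch, assuming you have an RStudio project open and a GitHub account with a personal access token configured:

```r
# install.packages("usethis")
library(usethis)

use_git()      # put the current RStudio project under Git version control
use_github()   # create a matching GitHub repository and push the project to it
```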
9.9.3 Statistical Considerations
9.9.3.1 Common NHST pitfalls
Misinterpretation of p-values: These are so common there's an entire (pretty good) Wikipedia article about them. It is easier to start with what a p-value is not: it is not the probability that the null hypothesis is true (or false), and it is not the probability that the observed pattern of results occurred by chance alone. p-values are probabilities, but they are long-run error probabilities: if the true state of the world is that the null hypothesis is true (e.g., there is no difference in variable x between populations A and B), and you re-ran the same study and analyses using samples of the same size some very high number of times (say 1000), only a proportion p of those samples would produce an effect equal to or greater than the one you observed (so if you get p = 0.04, about 40 of 1000 samples drawn from a population in which the null hypothesis is true would be expected to produce effects at least as large as the one you observed).
Misinterpretation of confidence intervals: I learned in my first year of graduate school that my understanding of confidence intervals (CIs) was 100% wrong, and have been engaged in a gradual process of becoming slightly less wrong ever since. Again, it is useful to start with what they are not, and I will borrow heavily from a paper by Hoekstra, Morey, Rouder, and Wagenmakers (2014). Consider a 95% CI with lower and upper bounds of 0.3 and 0.7, respectively. This CI does not indicate there is a 95% chance the true population mean is in the 0.3-0.7 range, nor that you would find means in this range 95% of the time if you repeated your experiment and analyses a great number of times. Really, a 95% CI tells you nothing about the particular range it captures in any one case; instead, if you repeated the study and analyses that generated the CI a great number of times, 95% of the intervals constructed in this way would contain the true population mean.
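If these long-run framings feel slippery, a quick simulation can make them concrete; the minimal sketch below (base R, made-up population parameters) tracks how often "significant" p-values arise when the null is true and how often 95% CIs capture a known population mean.

```r
set.seed(123)

# 1) p-values under a true null: simulate 1000 two-group experiments in which
#    both groups come from the same population (i.e., no real effect)
p_values <- replicate(1000, {
  a <- rnorm(30, mean = 0, sd = 1)
  b <- rnorm(30, mean = 0, sd = 1)
  t.test(a, b)$p.value
})
mean(p_values < 0.05)   # ~5% of these null "experiments" are significant at alpha = .05

# 2) CI coverage: in 1000 samples from a population with a known mean,
#    roughly 95% of the 95% CIs will contain that true mean
true_mean <- 0.5
covered <- replicate(1000, {
  x  <- rnorm(40, mean = true_mean, sd = 1)
  ci <- t.test(x)$conf.int
  ci[1] <= true_mean && true_mean <= ci[2]
})
mean(covered)           # proportion of intervals containing the true mean (~0.95)
```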
The correct interpretations of p-values and CIs are neither catchy nor intuitive, and some people argue they are fundamentally uninformative or uninteresting and advocate for a full shift away from frequentist approaches/NHST and toward likelihood-based or Bayesian approaches. I subscribe to the more measured perspective of Lakens and others who emphasize that these approaches are all just tools designed for slightly different aspects of the overarching problem of uncertainty in sampling and measurement; none is inherently superior to the others, but one may be more appropriate than another for a particular purpose. Assuming your statistical background is, as with most students of psychology, primarily rooted in NHST, I do encourage you to branch out to other approaches, but will not insist you use one or the other.
9.9.3.2 Null effects
You are probably familiar with the idea that standard NHST only allows for two conclusions: you can reject the null hypothesis (if you get a significant result), or fail to reject it (if you get a non-significant result), but you can never accept it. But null results can be theoretically and practically important – it can be perfectly reasonable for your primary hypothesis to be that you will find a null effect (e.g., no difference between groups), and even if not it may be useful for you to have more information on exactly how null-ish your results are before making any decisions about methodological changes or abandoning your original hypothesis altogether. There are both frequentist and Bayesian approaches to the problem of quantifying support for the null hypothesis/null effects, and I am sure I’m not aware of all of them, but will introduce you to two here.
9.9.3.2.1 Equivalence testing
The essence of equivalence testing is that although the NHST/frequentist framework does not allow you to test the hypothesis that a given effect size is exactly zero, you can define a range of effect sizes around zero that you consider too small to be worthwhile. For example, you might decide that in the context of your effect of interest, Cohen’s d values falling between -0.1 and 0.1 are so small as to be meaningless to you. Equivalence testing then enables you to statistically reject the hypothesis that the effect in question is large enough to be outside these bounds. In other words, the approach does not exactly provide support for a null effect, but does provide compelling evidence against the possibility the effect is large enough to be of interest (however you have defined it).
There are different kinds of equivalence testing, but one that is commonly used is the two one-sided test (TOST) procedure, which Daniël Lakens has laudably attempted to make accessible to psychologists via tutorials (Lakens, 2017) and an accompanying R package, TOSTER.
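As a sketch of what this looks like in practice, the TOSTtwo() function in TOSTER accepts summary statistics and equivalence bounds expressed in Cohen's d (newer versions of the package offer t_TOST()/tsum_TOST(), which work similarly); all numbers below are placeholders.

```r
# install.packages("TOSTER")
library(TOSTER)

# Test whether a between-group difference is statistically equivalent to zero,
# with equivalence bounds of d = -0.3 and d = 0.3 (i.e., effects smaller than
# |d| = 0.3 are treated as too small to be of interest)
TOSTtwo(m1 = 5.2, sd1 = 1.1, n1 = 60,   # group 1 summary statistics
        m2 = 5.0, sd2 = 1.2, n2 = 60,   # group 2 summary statistics
        low_eqbound_d  = -0.3,
        high_eqbound_d =  0.3,
        alpha = 0.05)
```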
9.9.3.2.2 Bayes factors
The intuitive way Bayesian approaches allow you to quantify relative support for different models/hypotheses, including those in which there is no effect, is a major advantage. This takes the form of the Bayes factor, which is a likelihood ratio conveying the relative likelihood of the observed data under one hypothesis versus another. E.g., you might have a Bayes factor of 5 indicating the data are 5x more likely under the alternative hypothesis than the null hypothesis (and the inverse will of course give you the ratio corresponding to the opposite). These analyses can be conducted in JASP or using various R packages (e.g., BayesFactor). For a comprehensive course on Bayesian statistics more generally, see Richard McElreath’s Statistical Rethinking.
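A minimal sketch using the BayesFactor package, with simulated data standing in for your own:

```r
# install.packages("BayesFactor")
library(BayesFactor)

set.seed(123)
group_a <- rnorm(40, mean = 0.3, sd = 1)
group_b <- rnorm(40, mean = 0.0, sd = 1)

bf <- ttestBF(x = group_a, y = group_b)  # BF10: data under alternative vs. null
bf
1 / bf                                   # BF01: relative evidence for the null
```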
9.10 Disseminating and Communicating Results
Once you have finished a study and analyzed your data, the next step will be to do something with your results. This might be presenting at a conference, publishing in a journal, sharing your results in some informal way, or some combination of the above.
One recommendation I think is good practice (although I regularly forget to do it myself) is writing a concise, 1-3 page summary of your study purpose, methods, and results while all are fresh in your mind. You may sometimes find yourself moving onto other projects very quickly and – memory being fallible as we know it is – forget many of the details when the time comes to publish or revisit your work for other purposes. Although you should always be able to track down these details if you follow the other advice in this manual, having a brief summarized version may save you time, and also provides an easy way to share your results with others if they are not yet or never published formally.
9.10.1 Reproducible Documents
Although reproducibility initiatives tend to focus on methods and analyses, efforts to go beyond this and generate fully reproducible theses/dissertations, journal publications, and even conference posters are also proliferating. RMarkdown, for example, is a tool for R/RStudio that allows you to write and format reports in a language called markdown, and embed all the code for your statistical analyses, figures/tables, etc. within the same document[1]. You end up with some (fairly ugly-looking) code that, when "knit", will run all of your analyses, create your figures and tables, and so on, and embed the results within the text of a nicely formatted pdf, Word, or html file. This saves you from having to copy and paste or manually rewrite your statistical results, which is a potential point of failure, and can make editing much less of a hassle – if you decide to, say, remove an outlier's data from all of your analyses, you will likely just have to change a single line of code and re-generate the document instead of tracking down all the references to those data in text, figures, and tables, and manually changing them. Although the initial learning curve can be steep, it will save you time in the long run.
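To give a flavour of what this looks like, a fragment of an RMarkdown document might read as follows (the data frame and variable names are hypothetical); when the document is knit, the chunk runs invisibly and the inline `r` expressions are replaced by their computed values.

````markdown
```{r descriptives, echo = FALSE}
# compute descriptive statistics from the cleaned data
m_rt  <- mean(dat_clean$rt)
sd_rt <- sd(dat_clean$rt)
```

Mean response time was `r round(m_rt)` ms (SD = `r round(sd_rt)` ms),
consistent with our predictions.
````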
A related system (and one that can be integrated with RMarkdown) I am less familiar with but may also be worth looking into is LaTeX, which seems to have particular advantages for work that requires embedding mathematical formulas.
[1] Recently, Rmarkdown has been superseded by the similar but expanded Quarto (in which this lab manual was created), which offers the same functionality with many new features.
9.10.2 Data Sharing
Regardless of what comes of your results, you should plan to share your data as publicly as possible given the applicable ethical constraints. Although it may make sense to share your data in multiple forms/at multiple levels of analysis, an ideal to work toward is being able to share your data exactly as they were collected (see e.g., Jeff Rouder’s [2016] notion of “born-open data”) – this may not always be possible or advisable given the nature of the data you are working with, but is a means of maximizing transparency and working toward a fully reproducible research workflow. The Tri-council policy on data management, including guidelines and expectations for data sharing, is still in the draft stage, but it seems likely the councils will gradually move toward requiring (1) all data and code associated with journal publications and other formal “research outputs” arising from grant-funded work be stored in an institutional or recognized third party digital repository, and possibly (2) that formal data management plans (DMPs) be submitted as part of grant applications. Using digital repositories and developing DMPs are practices worth adopting even if they are not mandated; I will discuss the former here, and the latter in the final section.
There are a wide range of digital repositories available – some are generalist in the sense of being multidisciplinary and/or not limited to data specifically, such as the OSF, Figshare, and Zenodo, while others are specific to the social sciences or Psychology or dedicated exclusively to research data, not other parts of the research workflow. The service that makes the most sense for you will depend on the sensitivity and format of your data, as well as your purpose in sharing them (e.g., some journals or grant agencies may only recognize certain services or require you submit to their own). These services differ with respect to features such as where data are hosted, privacy and protected access options (e.g., only allowing verified researchers to access your data), documentation and metadata requirements, whether submissions are open or curated, and the formats for which they are specialized. Eventually it may make sense to standardize which services we use as a lab, but for the time being a few examples include:
The UVic instance of Dataverse (more information about the international project here), which does not currently have third party protected access options but does allow for various levels of restricted access;
ICPSR, a large international repository for social science data that offers multiple levels of restricted access, including an option to release data only to researchers whose applications are verified by ICPSR staff. ICPSR is a paid service, but UVic currently has an institutional subscription.
Databrary, which is specialized for video data and also allows for multiple levels of restricted access.
An exciting recent development I will continue to keep an eye on is the Federated Research Data Repository, which has not yet fully launched but is dedicated to hosting research data from Canadian institutions.
9.10.2.1 Privacy and confidentiality concerns
The best case scenario in terms of methodological transparency is to publicly share enough information that anyone could reproduce your entire analysis, from start to finish, including merging, cleaning, exclusions, etc. Ideally, this means sharing the original raw, unaltered data files. However, this ideal must be balanced against privacy and confidentiality concerns. The small decrease in start-to-finish reproducibility that comes from sharing data files with some information removed and/or changed is far preferable to the risk that someone who participated in your study, under the assurance that their confidentiality would be maintained, could be identified.
The TCPS distinguishes between 5 categories of data:
1. Directly identifying information, which includes identifiers like names, social insurance numbers, phone numbers, etc.;
2. Indirectly identifying information, which can be reasonably expected to identify an individual in combination (e.g., date of birth, city, distinctive or rare personal characteristics);
3. Coded information, which does not include any direct identifiers but can still be linked to individuals given access to some code;
4. Anonymized information, which contains no direct identifiers (or has been stripped of any such information), cannot be linked to individuals via code, and does not contain sufficient indirect information to identify a particular individual; and
5. Anonymous information, which is and has always been free of any identifying information.
These categories are not always clear cut; technological advances have already substantially blurred the lines between what information can and cannot be considered potentially identifying, and we have no idea what might be possible within a few years let alone decades. As we progress toward increasing reproducibility, transparency, and collaboration by sharing data and analysis code more widely, our ethical duty to do everything we can to protect participant confidentiality must be kept in mind. Some of the implications here are obvious; data files containing participant names should of course not be shared publicly nor with anyone not directly associated with the research project, and things like highly personal and/or sensitive stories or responses to questions, audio and video recordings of experiment sessions, etc. should not be shared unless this was approved by the HREB and participants consented to this use of their data. Similar considerations apply to things such as stimulus sets comprising photos or videos of people; while making stimulus sets publicly available is desirable, this should only be done with the express consent of the depicted individuals, and the exact nature of the original indication of consent must be kept in mind when using older stimulus sets or those obtained from other labs. Someone consenting to their photo being shared in 2005 likely did so with a very different understanding of who was likely to access it and the potential uses it could be put to than someone providing such consent today.
Things get particularly sticky when it comes to the possibility of identifying an individual on the basis of some combination of indirectly identifying information and/or metadata. Essentially, it can be assumed that the more data we collect, the more this possibility increases; this is especially obvious in the context of data such as birthdate, location, and various demographic information, but as machine learning grows more complex and powerful these possibilities may extend to information we don’t currently see as risky.
For our purposes, the most straightforward way to protect participant privacy and confidentiality in most of the research we do is by fully anonymizing the data that are stored. Because the vast majority of our data collection is computerized, this is usually easily implemented by the use of participant numbers or IDs; these serve to individuate participants, but providing (1) there is no other information in the file that could reasonably be used to identify the participant (e.g., extensive demographic information, responses to personal questions, sufficiently long responses to reveal a distinctive writing style, etc.) and (2) these numbers or IDs are not stored in connection with the participant’s name anywhere else, the risk the participant will ever be identified on the basis of, let alone harmed by, public release of their data file drops near zero the second they leave the experimental session.
There are a number of things we can do to keep this risk as low as possible. One, of course, is to limit the collection of personal information to only what is directly relevant to your research question. If you don't care about participant gender or have any reason to believe it is important to what you are studying, maybe you don't need to ask participants for this information. That said, there are valid arguments to be made for collecting more information than you need. Exploratory analyses of variables secondary to your research aims can reveal patterns worthy of further investigation, your a priori sense of which variables are (or are not) important to what you are studying need not be correct, and you – or other researchers studying or trying to meta-analyze similar questions – may come to lament the lack of certain data later. You can also take particular care to make sure participant names and anonymized IDs are not stored together somewhere outside the original data file, and train RAs to do the same.
Multi-session experiments: There is some research for which it is not feasible to store data in a fully anonymized form, as the researcher needs some way of tracking the same individuals over time or across sessions. Probably the most common case of this in our lab will be experiments involving a memory test after some delay; if the study material is not the same for everyone, we will need some way to figure out who’s who in the test data, or at the very least which condition they were in at study.
Probably the best approach in terms of balancing security and ease is to still use participant numbers or IDs in the data files themselves, but keep a record elsewhere matching these IDs with names. This record should be securely stored, accessible only to those who need it, and securely destroyed as soon as it is no longer needed. Maintaining this record on paper is arguably more secure than keeping a digital record as it is easier to control access to and to permanently destroy a physical copy, but practical circumstances may outweigh this advantage - e.g., if a project involves multiple experimenters and/or testing locations, a paper record may be not only inconvenient but also at a higher risk of being lost. Providing reasonable measures are taken to ensure security, either format is fine.
Anonymizing after the fact: Although I have not worked with them myself, there are a number of means by which you can anonymize data using R, including the methods described here.
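To illustrate the general idea (not the specific methods linked above), a minimal base R sketch of stripping direct identifiers and replacing them with arbitrary codes might look like this; all names, paths, and columns are hypothetical.

```r
set.seed(123)

# 'raw' contains direct identifiers collected for scheduling purposes
raw <- data.frame(name  = c("Alex", "Blake", "Casey"),
                  email = c("a@uvic.ca", "b@uvic.ca", "c@uvic.ca"),
                  score = c(12, 15, 9))

# Assign each person an arbitrary ID in random order, then drop identifiers
raw$pid <- sample(sprintf("P%03d", seq_len(nrow(raw))))
anon <- raw[, c("pid", "score")]

anon
write.csv(anon, "data/shared/anonymized_scores.csv", row.names = FALSE)
```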
9.11 Data Management: Short and Long Term
Last but not least, it is important research data – including “data” in the usual sense of participant data and results, but also other critical research materials including stimuli, experiment files, and important references – be managed with an eye toward making it easy not only for you to find the files you need tomorrow or next week, but also for you or others unfamiliar with the project to find what might be needed or helpful 5 or 10 years down the line. It’s a bit morbid, but I try to regularly ask myself “if I died tomorrow, would someone else given access to my files be able to take over this project?” The usual answer is “not without some amount of pain and/or cursing my spirit”, but I am taking steps to improve the situation, and the earlier in your research career you start adopting good data management practices the easier it will be in the long run. I have already touched on some topics relevant to data management above, including data sharing, reproducibility, and version control.
9.11.1 Data Management Plans (DMPs)
More and more funding agencies are moving toward requiring data management plans as part of applications, and it is a good habit to get into regardless. DMPs generally include high-level information such as why and how you are collecting the data in question; information about the data themselves (what kind, which variables, what file format, how much, how sensitive/confidential, etc.); how you will structure, organize, and store your files; plans for sharing and longer-term preservation; and the kinds of documentation/metadata that may be required for your data to feasibly be used by others in the near or far future. The UK-based Digital Curation Centre offers a useful checklist here that may be helpful in introducing the kinds of questions you should start thinking about early in your research career and before data collection begins on a given project. Starting with a standardized tool that guides you through the process, such as the Portage Network's DMP Assistant, will also probably make your life easier.
9.11.2 File Management and Storage
As someone who still occasionally finds myself with ~10+ files on my desktop named various iterations of “asdglkj”, I can assure you you will thank yourself later for establishing a good file management system now. Ideally, your system should be intuitive and well-documented enough that others who may eventually need access to some of your files will also thank you.
9.11.2.1 Backups
All digital lab data – experiment files, stimuli, raw data, analysis scripts, etc. – should be stored in at least 3 separate locations: for example, your office computer, your home computer, and an external hard drive. If using one of the office or testing room computers as one of your storage locations, you should use Tivoli Storage Manager to regularly back up your machine, which will back your data up on a secure server (and can serve as one of your three locations if used regularly enough).
By far the most convenient way to ensure data are stored in 3 locations is by using a cloud storage service capable of syncing data across multiple devices (e.g., Dropbox or Google Drive). Because many of these services store the data locally on each synced machine as well as “in the cloud” (i.e., on the company’s servers, wherever they are located), this is a near-effortless way to work with your files in various locations and know you will always have recent copies elsewhere in case one link in the chain breaks down without having to worry about manually transferring files from place to place on a regular basis. Ultimately, unless you are tech-savvy enough to securely encrypt files on your end before uploading them to such services, you should assume the contents of any files stored in this way are fully accessible by higher-level employees at these companies and in many cases any governments with jurisdiction over the server location. Unlikely as it may be that anyone would ever actually do so, if you are dealing with sensitive data collected at UVic or elsewhere – or any data, sensitive or otherwise, participants have been assured would be securely/locally stored – you should not store it in the cloud. This also applies to any other university or student data you may be working with, e.g., as part of your duties as a Teaching Assistant. For files you are not concerned about keeping private (e.g., your own papers), and participant data you have consent to share publicly (e.g., via the Open Science Framework), it is fine to use such services.
9.11.2.2 Lab hard drive
After an experiment has wrapped up, you will be expected to transfer all relevant data onto a shared external hard drive following the file naming system outlined below. At a minimum, this should include:
1. Any files (including stimuli) or scripts required to run the experiment, and for online studies for which this is not possible a detailed enough description that someone could reproduce the full procedure;
2. All completely raw, unaltered data files, including those of participants who were ultimately excluded from data analysis (if data are not already in .txt, .csv, .xml, or .html format, please convert to one of these formats and include these copies as well – this is primarily a “future-proofing” mechanism, as files in non-proprietary, plain text formats are more likely to be immune to obsolescence; a brief example of such a conversion follows this list);
3. Any files (e.g., merged data, or data at other stages of processing that could not be automated) or scripts required to reproduce your analyses, as applicable;
4. Sufficiently robust documentation regarding variable and file names – e.g., in the form of a “readme” file – that someone completely new to the project could navigate your files and understand what’s going on within them.
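As a sketch of the format conversion mentioned in item 2, assuming an SPSS .sav file and the haven R package (file names are hypothetical):

```r
# Convert a proprietary-format data file to plain-text .csv for archiving
# install.packages("haven")
library(haven)

dat <- read_sav("data/raw/exp1_raw.sav")                    # original SPSS file
write.csv(dat, "data/raw/exp1_raw.csv", row.names = FALSE)  # plain-text copy
```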
Additional information you might include are things like the final summary recommended above, more extensive results such as informative figures and tables you have generated, background information, and key references justifying or otherwise relevant to the project.
The archive you create for this purpose should also adhere to the backup rules above; in this case, the hard drive would of course count as one of the three locations, but should not be the only place these archives are stored.
9.11.2.3 Guidelines for naming and organizing files
I will not prescribe a particular approach to file naming and organization for your personal use; as long as you develop a system that works for you, that’s fine. That said, I will want the final archived version of each project formatted according to these guidelines, and you may find it useful to adapt some of them in your own system.
9.11.2.3.1 General advice
Much of the following has been adapted from a slide presentation given by Jenny Bryan at a Reproducible Science workshop in 2015.
· In naming both files and variables, use hyphens and/or underscores rather than spaces to separate words. This preserves human readability while vastly improving machine readability.
· File and variable names should follow the general principle of being “descriptive but brief”.
· Some recommend naming files themselves fairly generically and relying on folder structure to identify what the file is. I prefer some redundancy here so that files will be identifiable even if they somehow end up stored out of context (maybe unlikely, but better safe than sorry). You don’t have to reproduce the whole folder structure in the name, but at least include the project name/number (if applicable), the year, and, if it is something you have authored or generated (e.g., a script or highly processed data), your initials.
· You may wish to include a version number for files you are likely to change substantially but may also need to revisit old versions of, but using a formal version control system can obviate the need for this.
· You can’t always rely on “date created/modified” metadata to determine the history of a given file, as files may get moved around and this information sometimes gets overwritten in the process. So for files for which you anticipate this being important information, it may be worth including a date in the filename. If you do so, make sure to stick with a uniform format across files, preferably the near-universal YYYY-MM-DD.
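If you generate files from a script, you can bake this convention in directly; a minimal sketch (the project label and initials are placeholders drawn from the example below):

```r
# Build a descriptive, dated file name in YYYY-MM-DD format
file_name <- paste0("metamemory-judgments_processed-data_",
                    format(Sys.Date(), "%Y-%m-%d"), "_ABC.csv")
file_name   # e.g., "metamemory-judgments_processed-data_2024-01-15_ABC.csv"
```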
9.11.2.4 Lab hard drive format
The following is a sample folder structure of the sort I would want you to store on the lab hard drive (albeit ultimately zipped or otherwise compressed), with each additional bullet level representing a new folder level. This specific example pertains to a project with multiple paths/sub-projects conducted over multiple years.
- metamemory-judgments_08-06-391b [parent project name with HREB identifier]
    - materials-based-bias-effect [child project name]
        - 2018-2019 [school year, as this is the usual data collection schedule]
            - svdw_naming [individual study/experiment name]
                - readme.txt [file containing necessary documentation]
                - data
                    - processed
                    - raw
                - experiment
                    - stimuli
                - preregistration
                - references
                - results
9.12 Reference management
Although it used to be the norm for students and researchers to end up with huge piles of printed or photocopied research articles at the end of a project, the more common modern equivalent is the folder full of pdfs with unintelligible names. Better-organized folks than myself may be able to keep on top of this manually in a way that works for them, but in my experience it can spiral out of control very quickly, especially when you are trying to juggle literature associated with multiple projects and/or courses at the same time. Even if you think your current system is working, I highly recommend trying out a reference manager or two early in your career. Most of these tools offer some combination of: the capacity to extract important metadata such as article titles, journal and author names, and publication dates from pdfs or other file formats; the ability to organize and rename your files in a meaningful way (e.g., into folders named by first author, with files named “[Article title (Year)].pdf”); in-program folder- and/or tag-based systems for sorting articles by project, class, or topic(s); plugins that allow for easy in-text citations and bibliography generation in programs like Microsoft Word; and cloud-based syncing systems that allow you to access articles across different devices.
UVic Libraries provides a guide to some of the most commonly used reference managers, including a comparison of various features that might be of importance to you. I can vouch for the ease of use and high-quality metadata extraction of both Zotero and Mendeley, but in terms of open science and data preservation Zotero, which is open source, is a better bet. Mendeley is currently owned by the academic publishing giant Elsevier, and thus comes with some privacy and ethical concerns, as well as some risk of losing control over your library/annotations or becoming locked in to the platform given its proprietary nature. York University’s library provides a detailed comparison of various features and pros and cons of the two programs.
Other points of caution, regardless of which program you use:
1. Be aware of your privacy settings (e.g., whether the program you are using adds citation records into a searchable public library by default, and how to turn this function off or exclude certain documents, such as personal communications or draft manuscripts you have been sent privately)
2. If you have any references you want to keep private and are using a program with sync functionality, look into where its servers are located (neither Zotero nor Mendeley currently offers the option to store data within Canada).
3. Metadata extraction is not perfect, so you will still want to glance through your document list anytime you add a large number of papers to make sure everything looks okay. Similarly, you should double-check any bibliographies you generate for the same purpose and to ensure APA compliance.