A real life case study on reporting inconsistencies: what would you do?

(Note: If you haven’t done so yet, be sure to read my earlier blog post as an introduction to the 4 papers with 150+ inconsistencies)

Many scientists will at some point in their academic career take part in a discussion game about research ethics built around case descriptions. These cases typically start with a description of a tricky scenario, for example two scientists arguing about who should be first author, or a dilemma about data management and scientific fraud. The interesting part of these discussions is the wide array of opinions on what you should do if you found yourself in one of those scenarios.

This blog post is such a case. It is not a fictional scenario but a real-life case which is currently unfolding:

Case Description: Three researchers – Jordan, Nick, and Tim – find over 150 reporting inconsistencies in 4 published papers. Although they have contacted all the corresponding authors of these papers, only the senior author, Dr. Brian Wansink, has replied. However, Dr. Wansink declined to share the anonymized dataset underlying the 4 papers or to offer any explanation for the inconsistencies, and eventually stopped replying.

Given this scenario, what would you do?

After our initial discovery we tried many different possible ‘solutions’ to this case of suspected research misconduct. First we wanted to be absolutely sure about our calculations of the 150+ inconsistencies, so all three of us checked everything. Having found no errors in our calculations, we proceeded to write a thorough pre-print, which we published at PeerJ Preprints and shared with others.
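To give a flavour of what this kind of checking involves, here is a minimal sketch in Python. It illustrates the two most common checks (the granularity of means, in the spirit of the GRIM test, and the recomputation of test statistics from reported summary statistics); it is not the actual script we used, and the numbers in it are made up:

```python
import math

def grim_consistent(reported_mean, n, decimals=2):
    """GRIM-style check: can the mean of n integer scores round to reported_mean?

    A mean of n integer responses must equal k / n for some integer k, so we
    test whether any integer sum near n * reported_mean rounds back to it.
    """
    candidate = round(reported_mean * n)
    return any(
        round(k / n, decimals) == round(reported_mean, decimals)
        for k in (candidate - 1, candidate, candidate + 1)
    )

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Recompute an independent-samples t statistic (pooled variance) from
    reported summary statistics, for comparison with the reported t value."""
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

# Made-up numbers, purely for illustration:
print(grim_consistent(3.58, 45))   # True: 161/45 = 3.5778, which rounds to 3.58
print(grim_consistent(3.57, 45))   # False: no sum of 45 integers gives a mean rounding to 3.57
print(round(pooled_t(3.58, 1.20, 45, 3.02, 1.10, 50), 2))  # 2.37; compare to the reported t
```

In practice one also has to allow for the fact that the reported means and SDs are themselves rounded, so a reported test statistic is only flagged as inconsistent when it falls outside the entire range of values compatible with that rounding.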

We received a lot of responses, and within a few days thousands of people had read and downloaded our pre-print. The statistician Andrew Gelman blogged about it four times (1, 2, 3, 4); the Everything Hertz podcast discussed it quite thoroughly; Slate magazine published an article after interviewing one of us; and the investigation was featured on Retraction Watch. To our knowledge, two more media outlets are preparing articles about our investigation.

While these efforts helped to raise awareness, they did not by themselves change the situation. We therefore thought it would be best to find and contact some kind of ethics board at the researcher’s university. So we wrote the following long and detailed letter to the Office of Research Integrity and Assurance (ORIA) at Cornell University, with a CC to the Institutional Review Board (IRB) at Cornell:

We are contacting you regarding four published articles of which Dr. Brian Wansink, of the Cornell Food and Brand Lab, is the senior author. The four articles involved are:

  1. Just, D. R., Sığırcı, Ö., & Wansink, B. (2014). Lower buffet prices lead to less taste satisfaction. Journal of Sensory Studies, 29(5), 362–370.
  2. Just, D. R., Sigirci, O., & Wansink, B. (2015). Peak-end pizza: prices delay evaluations of quality. Journal of Product & Brand Management, 24(7), 770–778.
  3. Kniffin, K. M., Sigirci, O., & Wansink, B. (2016). Eating Heavily: Men eat more in the company of women. Evolutionary Psychological Science, 2(1), 38–46.
  4. Siğirci, Ö., & Wansink, B. (2015). Low prices and high regret: how pricing influences regret at all-you-can-eat buffets. BMC Nutrition, 1(36), 1–5.

On November 21, 2016, Dr. Wansink published a blog post in which he gives a frank description of the research practices in his lab. The reactions to this blog post by members of the research community, typified by a number of the comments written by readers of the blog, suggest that many people regard the post as describing what are sometimes referred to as “questionable research practices” (e.g., generating hypotheses after having looked at the data). Other reactions concerned the hiring and employment policies in the Food and Brand Lab that appeared to be suggested by the blog post. Although both of these issues are important, neither is the focus of our request here.

The blog post lists five articles that were co-authored by the graduate student whose experience at Cornell is the focus of the post. Four of these articles have a common theme, namely patterns of consumption among diners at an all-you-can-eat buffet, featuring pizza, salad, and other food items. My colleagues and I have read these four articles, which all appear to be based on a single dataset, and identified a total of over 150 statistical and other errors and inconsistencies in them.

The nature and severity of these errors and inconsistencies vary widely. For example, many of the reported means and standard deviations (SDs) are mathematically impossible given the reported sample sizes. Many of the reported test statistics are internally inconsistent, such as reported t or F statistics that are incompatible with the means, SDs, and sample sizes used to calculate them. There are also multiple inconsistencies between the articles, such as samples and means which should be identical but are reported differently in each article; inconsistent descriptions of what observations were or were not made; and inconsistent reporting of the numbers of participants in each condition, both within and across studies.

Naturally, it would be easier for my colleagues and me to determine exactly what might be causing these errors and inconsistencies if we had access to the data set. Between January 5 and January 10, 2017, I had a cordial exchange of e-mails with representatives of the Food and Brand Lab, in which I requested access to these data. They explained that it was indeed possible to share data, although they “would then need to revise the IRB request for the study by adding you to the project as a co-author who has access to the data that is agreed upon.” It appears that the lab personnel initially thought that my colleagues and I might want to conduct supplementary analyses of our own to investigate other aspects of the patterns of consumption in the data. However, when I made it clear that our aim was to identify the source of the errors and inconsistencies that we had found, I received (and have still received, as of this writing) no further reply.

We considered the number of errors and inconsistencies that we found to be so high that it was worth writing up the methodology and results of our investigation. A preprint detailing our findings is attached to this letter, and can also be found here: https://peerj.com/preprints/2748v1/. At this time, this manuscript is under consideration by a peer-reviewed journal for possible publication.

Our investigation of these articles has already gained a considerable amount of attention, with our preprint receiving over 4,400 views and 2,900 downloads in the eight days that have elapsed since its appearance. The four articles in question have also been the subject of a number of blog posts, including no fewer than four by the widely respected statistician Dr. Andrew Gelman of Columbia University. We also understand that both Slate magazine and the Guardian newspaper are preparing stories about this matter.

Although, as mentioned above, neither I nor my colleagues have received any further direct communication from the Food and Brand Lab about our request for the data, we recently noted that Dr. Wansink has added an update to his blog in which he explains why these data were not made public. An analogous comment was also added to the PubPeer pages for one of the articles in question; although this comment is not signed, we presume that it was posted with the authorization of Dr. Wansink.

I must confess that my colleagues and I find Dr. Wansink’s apparent attitude to our request to be a little frustrating. First, as already mentioned, I have received no reply to my formal request for a copy of the data set, although more than three weeks have now passed; yet Dr. Wansink and his team appear to be happy to communicate about this matter on social media. Second, these posts (i.e., Dr. Wansink’s blog, and the post on PubPeer that we presume originates from an authorized source within the Food and Brand Lab) only address the question of why the data from these studies have not been made generally available in a publicly accessible repository; they do not explain why an anonymized version of the dataset could not be shared with bona fide researchers. (If Dr. Wansink does not consider us to be bona fide researchers, he has so far failed to explain why this might be the case; certainly, when the representative of the lab indicated that it would be necessary to add my name retroactively to the IRB request, there was no mention of any vetting procedure that might be required.) Third, it appears that the data set in question has recently been shared with what Dr. Wansink, in his blog post, refers to as “a non-coauthor Stats Pro [sic],” whose identity and institutional affiliation have not been revealed.

Given the severity and scope of the errors and inconsistencies that we have identified in these articles, we believe that the further investigation of these problems, based on the data set, ought not to be left solely to an unnamed person appointed by the Principal Investigator himself.  To do so would, we feel, risk giving the impression that Cornell is not acting in a fully transparent, collegial, and scientific manner here.  We therefore hope that you will add your institutional support to our request to the Food and Brand Lab to give us access to the full data set that was collected at Aiello’s restaurant and used as the basis of these four articles. This would be of immense help to us in our efforts to verify the errors and inconsistencies that we found, and to identify their origin.

Although we do not find all of the reasons given by Dr. Wansink for why the data set cannot be freely shared with the public to be especially compelling, we are conscious of the need to respect the confidentiality clauses that were included in the consent forms that were signed by participants.  Hence, we would be happy to give any assurances that might be required regarding the handling of these data, including the signing of non-disclosure agreements or other documents concerning the confidentiality and ethical protection of the participants in these studies, in order for the data to be released to us so that we may continue and deepen our reanalysis.  We would also be happy for the names of participants, as well as any other genuinely personally identifiable information (e.g., addresses or phone numbers), to be removed from the data set before it is sent to us.

Thank you for your time, and your consideration of this matter.

This is where it becomes more interesting. If you were the chair of the Office of Research Integrity and Assurance (ORIA) at Cornell University, what would you do? How would you reply to our letter?

Personally, I would deem it reasonable to agree to our request for anonymized data. In addition, I would argue that the sheer volume of errors constitutes sufficient reason to initiate an investigation into the veracity of the research performed at the Cornell Food and Brand Lab. This is strengthened by the fact that Jordan found even more errors in six other, highly cited papers by Dr. Brian Wansink. However, I am not an expert in these matters and am personally involved, so perhaps my judgment is not the best.

This is how the Office of Research Integrity and Assurance decided to respond:

Thank you for your inquiry. Cornell University supports open inquiry and vigorous scientific debate. In the absence of sponsor or publisher data sharing requirements, however, Cornell allows its investigators to determine if and when it is appropriate to release raw data, subject to any IRB imposed limitations.

To clarify, the IRB is the Institutional Review Board at Cornell University, which has so far not replied.

How should we respond?

What would you do?

Do you agree with the reply of the ORIA?

How do you hope the IRB will respond?

Share your thoughts!

11 thoughts on “A real life case study on reporting inconsistencies: what would you do?”

  1. I wonder whether it would be more appropriate to report this to the journal(s) the papers were published in. Shouldn’t they be made responsible for following up on this, or threaten the authors with retraction of the publications?

    • Thanks for your comment! I agree that the journals certainly have a shared responsibility in guaranteeing the veracity of the papers they publish, especially once reporting inconsistencies have been found.

  2. The ORIA reacted as it should, and as it is required to do.

    Although I admire that you discuss all of this publicly, I warn you that it may have bad repercussions for you; my advice, for the moment, is to deal with this case more delicately, and only to discuss it publicly once it is resolved.

    Sending your report to the journals is one way to go. I am curious what will happen. In a similar case I was/am involved in, the journals referred us to the integrity officer of the primary researcher’s university. I also believe this is something you should do anyway, IF you suspect misconduct.

  3. 1. I propose you contact the editors and/or the publishers of the Journal of Sensory Studies, the Journal of Product & Brand Management and Evolutionary Psychological Science and ask them (1) to publish ASAP an expression of concern because the raw research data are (currently) ‘unavailable’, and (2) to publish a thorough response / rebuttal to your concerns as set out in the PeerJ preprint.

    2. The authors of the paper in BMC Nutrition have refused to give you unrestricted access to an anonymized version of the raw data of this paper. I therefore propose you contact the Research Integrity Group of the publisher BioMed Central ( https://www.biomedcentral.com/about/who-we-are/research-integrity-group ) and urge them to retract the paper in BMC Nutrition ASAP, for the simple reason that the authors refuse to give others access to the raw data. Tell them as well that they must immediately issue an expression of concern in case they hesitate to retract the paper. Urge them to send you a response acknowledging receipt of your requests, and urge them to give a thorough response / rebuttal to your concerns in the preprint.

    3. I propose you always mention in your correspondence about this topic the recent paper by Smith & Roberts (Time for sharing data to become routine: the seven excuses for not doing so are all invalid, https://f1000research.com/articles/5-781/v1 , first published 29 April 2016).

    4. I propose you always mention in your correspondence about this topic that sharing the raw research data of published material (papers, dissertations, etc.) is mandatory for all researchers affiliated with any of the 14 research universities in The Netherlands. Always refer to the VSNU Code of Conduct http://www.rug.nl/about-us/organization/rules-and-regulations/algemeen/gedragscodes-nederlandse-universiteiten/code-wetenschapsbeoefening-14-en.pdf and http://www.rug.nl/about-us/organization/rules-and-regulations/algemeen/gedragscodes-nederlandse-universiteiten/wetenschappelijke-integriteit-12-en.pdf (item 10 of the Preamble of the Code connects both documents with each other).

    5. Always refer as well to RUG (the University of Groningen), which did not hesitate to punish researchers who were unwilling to share raw research data; see https://www.nrc.nl/nieuws/2015/07/01/universiteit-integriteit-in-geding-bij-taalfoutonderzoek-a1415044 for background.

    6. Ask both the ORIA and the IRB of Cornell for a comment on the views in Smith & Roberts and on the verdict of RUG.

  4. In reply to the other “anonymous” above: perhaps that’s a good idea. I checked the “author instructions”, and more specifically the “ethical responsibilities for the authors”, of the journal “Evolutionary Psychological Science”, and they read:

    “Upon request authors should be prepared to send relevant documentation or data in order to verify the validity of the results. This could be in the form of raw data, samples, records, etc. Sensitive information in the form of confidential proprietary data is excluded.”

    http://www.springer.com/psychology/personality+%26+social+psychology/journal/40806

    Side note: the following is also stated:

    “A single study is not split up into several parts to increase the quantity of submissions and submitted to various journals or to one journal over time (e.g. “salami-publishing”).”

  5. Cornell University states: (1) “we allow our investigators to determine if and when it is appropriate to release raw data”, and (2) “we support open inquiry and vigorous scientific debate”. Excuse me very much, but these two statements contradict each other. The first statement implies, in my opinion, that Cornell University is unable to help or assist you in getting access to the raw research data of the four papers. It therefore seems to make little sense to put a lot of time and effort into communicating with Cornell University about the wish to get access to this set of raw research data.

    See, e.g., http://blogs.lse.ac.uk/impactofsocialsciences/2015/07/03/data-secrecy-bad-science-or-scientific-misconduct/ for some reflections on the relationship between a refusal to share raw research data and scientific misconduct.

    • To be honest, I am sensitive to arguments about why data should not (always) be shared. Obtaining the data myself is far from my goal. My aim is to have a scientific literature which is accurate; this is clearly endangered by 4 papers with 150+ inconsistencies such as mathematical impossibilities and contradictory reporting. Getting the researchers to share the data with an outside party for a formal investigation would be one way to go about it. If all authoritative parties decide that the data cannot be shared then so be it, but then there is still the question of the veracity of the literature. There are several options, ranging from the authors correcting the errors (and being accountable for the veracity of the corrections) to retraction of the papers.

  6. Tim wrote: “If you were the chair of the Office of Research Integrity and Assurance (ORIA) at the Cornell University, what would you do? How would you reply to our letter?” See below for an example.

    Dear Dr. Tim,

    Congratulations on your preprint, in which you challenge several findings in the four papers of Dr. Wansink. A great piece of work, and fully in line with how Cornell University trains its students to conduct research.

    Cornell University supports open inquiry and vigorous scientific debate. I have therefore ordered Dr. Wansink to send you, without any delay, the full set of the requested raw research data of the four papers. Everyone at Cornell University must always work according to Recommendation 7 (‘Safeguarding and Storing of Primary Data’) of the ‘Proposals for Safeguarding Good Scientific Practice’ of the DFG, the main German funding agency [1]. It is therefore very easy for Dr. Wansink to ensure that you receive, without any delay, by email or by wetransfer.com, the requested sets of raw research data.

    Please contact me if you have not received the full set within the next 24 hours. Be assured that I will take firm measures against Dr. Wansink if it turns out that you have not received the full set of the raw research data of the four papers within 24 hours. (Dr. Wansink is at the moment not sick, on leave, or at a site without internet access for a prolonged period of time.)

    Please inform me about the outcome of your enquiries, and please don’t hesitate to continue challenging papers from Cornell University.

    Best wishes,

    ———
    [1] http://www.dfg.de/download/pdf/dfg_im_profil/reden_stellungnahmen/download/empfehlung_wiss_praxis_1310.pdf

    Recommendation 7 states, for example:
    (a): “Primary data as the basis for publications shall be securely stored for ten years in a durable form in the institution of their origin.”
    (b): “Being able to refer to the original records is a necessary precaution for any group if only for reasons of working efficiency. It becomes even more important when published results are challenged by others.”
    (c): “Experience indicates that laboratories of high quality are able to comply comfortably with the practice of storing a duplicate of the complete data set on which a publication is based, together with the publication manuscript and the relevant correspondence.”

  7. Klaas: BMC have not refused us anything. Indeed, I think it is very likely that they do not have a copy of the data. Although BMC as a group has had a post up since 2011 saying that they require data sharing, I have established that BMC Nutrition only added this requirement to its policies in early 2016, after the relevant article was published.

    More generally, to those commenters who have suggested that we write directly to the journal editors: We are currently considering our options in this regard. The advice that I, at least, have received from various parties about this type of issue in other cases has been mixed; it seems that, if writing to the journals is to be effective, the timing (relative to other events that might occur along the way) can be critical.

  8. It doesn’t really matter whether you are right or wrong. Cornell has a policy and they are sticking to it. They NEED to do that, or else they risk being sued by Wansink.

    Instead, I recommend you encourage Cornell or the funding agency that supported the work to initiate an investigation themselves. You may be seen as biased. That is what worked in the Hauser case.

  9. Tim, Nick (and Jordan), thanks for the response, and please excuse some delay in my reply. I tend to think that it is an excellent decision to consider the next steps carefully and to take your time before taking them.

    1. My experiences with contacting the editors (and publishers) of journals are mixed (which is a mild conclusion). My experiences communicating with the editors of BMJ Open, for example, are excellent. Invariably, I get a response the next day, and all responses are always very professional. This resulted in an offer to submit an Eletter, in a quick publication of my Eletter, and in the editor of BMJ Open putting online older versions of the paper in question, together with the reviews of those versions. See http://bmjopen.bmj.com/content/6/11/e012047.responses (see my comment at https://www.researchgate.net/publication/310778980 for the submitted version of my Eletter and for more insight into the involvement of Elizabeth Moylan in our efforts to retract the Basra Reed Warbler article).

    Elizabeth Moylan, on the other hand, does not communicate with me. The same holds for her co-author. I have until now also not received a copy of her ICMJE form. In my opinion this is no problem at all for me. It is, on the other hand, at least in my opinion, a problem for Elizabeth Moylan, as people will start to wonder why her form has been ‘unavailable’ for such a prolonged period of time. The editor of BMJ Open told me that the author(s) are at the moment preparing a response to my Eletter. The editor also told me that the ICMJE form of Elizabeth Moylan is not in the possession of the editors of BMJ Open.

    So, in my opinion, it is no problem at all that the raw research data of (a large number of) papers from the lab of Wansink are at the moment ‘unavailable’. It seems likely to me that many people will start to wonder what’s going on and/or will become curious why the raw research data of these papers are ‘unavailable’.

    I tend to think that it is good to contact the editors and/or the publishers of the papers in question and to ask them for their views on your concerns (both with regard to the contents of the papers and with regard to the ‘unavailability’ of the raw research data of these papers). Note that it is not to be taken for granted that all journals and/or publishers will respond in the same way BMJ Open responded to my queries.

    2. I tend to think that it is an excellent decision of all of you to continue critically reviewing more papers from the lab of Wansink (et al.) and to publish the findings quickly, at for example http://steamtraen.blogspot.nl/2017/02/a-different-set-of-problems-in-article.html and https://medium.com/@OmnesRes/cornells-alternative-statistics-a8de10e57ff#.v6kfwbdy8 and https://medium.com/@OmnesRes/cornell-and-the-first-law-of-fooddynamics-cb2ed34d7e7f#.lywc2lf7h (and at a variety of other sites as well, including the comments).

    Please note that reviewing (a large number of) the papers of the lab of Wansink is simply how science works. So please continue reviewing more papers (maybe you can also ask other people to conduct part of these reviews). I would like to suggest that you consider composing some sort of digital version of a ‘special issue with reviews of papers from the lab of Wansink’.

    3. It seems to me that your current activities fit with the views of both Richard Gill and Dave Fernig.

    Copy/pasted from http://www.math.leidenuniv.nl/~gill/#smeesters :
    “In physics, the interesting experiments are immediately replicated by other research groups. Interesting experiments are experiments which push into the unknown, in a direction in which there are well-known theoretical and experimental challenges. Experiments are repeated because they give other research groups a chance to show that their experimental technique is even better, or to genuinely add new twists to the story. In this way, bad reporting is immediately noticed, because experiments whose results cannot be replicated immediately become suspect. Researchers know that their colleagues (and competitors) are going to study all the methodological details of their work, and are going to look critically at all the reported numbers, and are going to bother them if things don’t seem to match or important info is missing. In particular, if the experiment turns out to be methodologically flawed, you can be sure someone is going to tell that to the world.”

    Copy/pasted from http://ferniglab.wordpress.com/2013/03/28/correct-correction/ : “If there is nothing intrinsically wrong with the data, the refusal to share data to the point at which recourse has to be made to the journal editors (who can do no more than request the data) can only lead the onlooker to a rather dark conclusion. This conclusion is perhaps best summed up by Marcellus’ words in Hamlet, Act 1, scene iv: ‘Something is rotten in the state of Denmark’.”

    4. I am at the moment unsure whether it makes much sense to spend a lot of time communicating with the ORIA and/or the IRB at Cornell.
