Brent Strickland and Aysu Suben have a recent paper out in which they identify (and try to demonstrate experimentally) an important source of potential experimenter bias in the survey-style experiments that comprise the bulk of the work in X-phi.
The problem is, simply, that knowledge of the experimental hypothesis being tested may affect the design of the experimental stimuli - which, in X-phi, are typically short vignettes. Subjects are exposed to these vignettes and their immediate, intuitive responses are recorded for analysis.
Strickland and Suben attempt to demonstrate how knowledge of the hypothesis under investigation might introduce this kind of bias by doing a replication-with-a-twist of an earlier experimental result of Knobe and Prinz's.
I was going to recap the entire argument and experiment, but in this case a stellar summary is already up over at the experimental philosophy blog. The comments section on this entry is of a characteristically high quality (seriously, read the comments - they really push the debate forward).
Further, the original paper is only 11 pages, and really straightforward - so read that if the X-phi blog summary wasn't enough.
Now that you're back and know all about the experiment ...
I don't think that anyone has yet pointed out just how problematic this may be, given that the aim of many of the experiments in X-phi is specifically to test our natural intuitions about broadly philosophical issues.
My (not yet fully developed) worry is this: if people do in fact have a set of intuitions about the issues that our experimenters are investigating, then those intuitions could introduce a bias into the stimuli anyway.
Let's take the idea of freedom of will as an example. Let's assume that people are naturally incompatibilist - that is, let's assume that people untainted by years of reading philosophy intuitively hold that freedom of will is not possible in a fully deterministic universe.
Suppose, further, that we want to test this hypothesis with vignettes and surveys, but are afraid that our exposure to the hypothesis will bias the stimuli we create. I fail to see how it will help to have the stimuli drawn up by someone who, by our own hypothesis, holds a pre-theoretical intuition about determinism and free will. It seems possible, even likely, that the stimuli they create will be influenced by this pre-theoretical (or, if they are philosophers, their pet-theoretical) understanding of the issue at hand.
A step in the right direction, toward a much more productive - and interesting - approach, is suggested in a comment by Strickland:
I'm most interested in comparing the effectiveness of different possible solutions. One simple starting point would be to have three groups of on-line experimenters : (1) receives hypothesis A (2) receives hypothesis B (3) receives no hypothesis. Then you give all m-turkers clear instructions on the types of sentences they need to build (e.g. all sentences must have a group as a grammatical subject and must contain the verb "desire". then the experimenter can choose the tense and any complements)....
This is much more thorough. Unlike the Strickland and Suben experiment, which only had groups (1) and (2) from the quote above, this design includes a "hypothesis neutral" group. That would let us not only test people's responses to the given vignettes, but also see how vignettes written by hypothesis-naive stimuli creators behave in comparison to those created by subjects who had been exposed to a hypothesis.