The facial items varied considerably in the reliability of coder judgments as well as in their criterion (empirical and convergent), content, and face validity. Observational scales should provide behavioral cues that correspond to empirical descriptions of the facial expression of pain.