Critique of ADHD in Female Adolescents
I created this paper for my graduate research methods class at the University of Rochester. The professor asked us to use critical techniques from our class textbook, "Research and Evaluation in Education and Psychology," by Mertens (2019), and other class material to analyze a paper of our choice. I chose to critically analyze Rucklidge and Tannock's (2001) research on female adolescents with ADHD, which I have included below.
Summary of Article
Below is a summary of the research conducted by Rucklidge and Tannock (2001) on the psychiatric, psychosocial, and cognitive functioning of female adolescents with Attention Deficit Hyperactivity Disorder (ADHD).
Problem
Rucklidge and Tannock (2001) assert that researchers primarily derive their knowledge of ADHD from studies conducted on boys. They argue that females are a neglected area of study in ADHD research, which raises significant public health concerns.
Research Questions
Rucklidge and Tannock (2001) claim that the limited existing research on females with ADHD has primarily focused on two areas: comparing females with ADHD to females without ADHD and comparing females with ADHD to males with ADHD. They believe that previous research neither sufficiently examines the emergence of impairments in adolescence nor considers variables across the multiple domains of psychiatric, psychosocial, and cognitive functioning.
To address this gap, Rucklidge and Tannock conducted a controlled quantitative quasi-experiment. Their study aimed to compare females with ADHD to those without ADHD and to males with ADHD. They compared across dimensions of psychiatric, psychosocial, and cognitive functioning, with a particular focus on unmedicated adolescent participants, to better understand the contributing factors to the emergence of ADHD.
Methods
Subjects
A total of 107 subjects, aged 13 to 16 years, were included in this study: 24 females with ADHD and 35 males with ADHD, both groups having a confirmed and current diagnosis, along with 28 control females and 20 control males. The researchers also collected demographic and characteristic data from each individual.
Dependent Measures
Dependent measures included instruments measuring anxiety, depression, distress, drug use, attributional style, locus of control, the impact of positive and negative life events, perception of childhood satisfaction, and academic and intellectual functioning.
Procedures
The researchers conducted sessions with participants over six hours on-site at a pediatric facility in downtown Toronto. They mailed questionnaires to parents and teachers to gather additional data. If concerns arose from the questionnaire responses, the researchers followed up with interviews with the parents.
Results
Researchers categorized results across three domains: psychiatric, psychosocial, and cognitive. For the psychiatric domain, researchers found no significant differences in participant age, age of onset, parental marital status, treatment taken for ADHD, ADHD subtype, or comorbidities. I have included further details on results in Appendix A - Details on Results. In brief, Rucklidge and Tannock summarize their findings first by comparing females with ADHD to control females, noting that females with ADHD showed significant impairments compared to female controls, particularly in the psychosocial domain. They then compare females with ADHD to males with ADHD, noting that, contrary to previous research, both groups exhibited similar levels of comorbid psychiatric disorders. However, in line with previous research, females with ADHD did exhibit more psychosocial impairment than males with ADHD.
Discussion
Rucklidge and Tannock note certain limitations in their research, such as incongruencies between the reports of parents and teachers and the self-reports of participants, specific group differences in some variables, and the recruitment process. They applied additional controls to address some of these differences, which did alter results. More details are given in Appendix B - Details on Discussion.
Critical Analysis of Article
Mertens (2019) provides a guide for critical analysis, detailing twelve questions for internal validity, ten for external validity, and three for general validity for Experimental, Quasi-Experimental, and Single-Group Designs.
I have detailed answers to these questions in Appendix C - Details on Critical Analysis. After applying these critical questions, I summarized these critiques into major themes, which I have provided below.
The primary threats I see affecting internal validity include a lack of timeframe or project plan, which could have led to history or maturation threats, bias in testing and instrumentation, threats from statistical regression and differential selection due to the small sample size and lack of controls, and threats due to communication bias.
The research lacks information on a timeframe and project schedule. The researchers do not specify whether interviews and tests were conducted within a short period or over several years; this raises concerns about history and maturation effects, particularly given that adolescent participants experience rapid changes in their environments and personal lives. Without a clear timeframe, assessing how external events or individual maturation might have influenced the results is challenging. It also makes the research very difficult to replicate.
Testing threats are present, as some participants, especially the males with ADHD, had previous assessments. The males with ADHD might have been familiar with the tests, potentially influencing their responses. Additionally, the study used test administrators with different levels of training (doctoral-level psychologists versus graduate students) without reporting standardized training or interrater reliability calculations, calling into question the standardization of test administration. The extensive 6-hour battery of tests could have led to participant fatigue, potentially affecting later test results.
Statistical regression issues may also exist due to the small sample size and high proportion of new female ADHD referrals. With 75% of the females being new referrals, the source of these referrals and their impact on the results need more exploration. Differential selection problems arose as the study did not fully control variables such as IQ, parental education, medication use, and ADHD severity. These uncontrolled differences introduce potential confounding factors, complicating the interpretation of the findings.
Communication bias and experimenter influence are significant concerns. Given their strong preexisting stance on female ADHD, the researchers' personal biases and involvement could have influenced the study's outcomes. Detailed information on the experimenters' characteristics and roles, as well as the population sampling and control group recruitment, is necessary to ensure objectivity.
External validity is compromised by a lack of testing details, multiple treatment interference, effects of observation, novelty, or disruption, and, again, communication bias with experimenter influence.
There is insufficient detail for successful replication, including a lack of information on population pools, controls, participant characteristics, timeline, interview techniques, and interviewer training. This lack of detail affects the operationalization and generalizability of the findings.
Extensive use of multiple tests within a short period, along with additional questionnaires, could lead to fatigue, affecting results. The order and standardization of test administration were not specified, raising concerns about multiple treatment interference, replication, and generalizability.
The awareness of being observed by researchers might influence participants' behavior (Hawthorne effect). The novelty or disruption of bringing participants to Toronto for extensive testing might increase the self-reporting of symptoms, particularly among new female referrals, affecting the generalizability of results.
Experimenter influence and communication bias are evident again as the study does not clarify what researchers told the participants about the research, which, once again, could lead to the Hawthorne effect. If participants knew the study aimed to promote women's health, it might have increased symptom self-reporting among women. Additionally, the lack of detailed information on who conducted the tests and their potential biases, along with the researchers' partisan stance on female ADHD, could have affected objectivity and external validity.
General validity concerns are also significant. The sparse documentation of the study's implementation procedures leaves us wondering if the researchers followed the planned procedures accurately. This lack of documentation makes replicating the study or generalizing its findings difficult. Additionally, the researchers did not control for essential variables such as socioeconomic status, ethnicity, or comorbidities, which limits the findings' applicability to a broader population.
Since the control participants did not have ADHD, ethical concerns regarding the control group were minimized. However, not knowing what researchers told the control participants or how they were selected, combined with their potential biases, raises ethical and validity issues that could have influenced the study's outcomes.
Rucklidge and Tannock's (2001) study faces significant internal, external, and general validity challenges. Key issues such as an unclear timeframe, inconsistent sampling methods, lack of control over confounding variables, and insufficient procedural details undermine the study's reliability and replicability.
Reflection
After initially reading Rucklidge and Tannock's (2001) paper, I recall feeling impressed. It sounded persuasive and took a strong position that there is a big gap in ADHD research around women, and this has significant consequences for public health, mainly because it seemed that women were more severely affected by the condition than males. I was impressed by the technical language that Rucklidge and Tannock used and their confidence in positioning their argument.
My opinion changed when I did a critical analysis using Mertens' (2019) techniques. I was left feeling that this paper needed improvement, as the researchers put forward strong opinions not founded in operational science. The researchers' experiment seems arbitrary, and the results appear discredited. I developed the personal position that the authors wanted to believe that girls with ADHD need extra attention. As a result, they biased their work to convey this belief, unlike other researchers who might approach the topic with a balanced perspective.
The study's lack of operationalization, detailed procedural descriptions, and thorough control measures significantly undermines its credibility. These shortcomings highlight the importance of comprehensive documentation and rigorous methodology to ensure the reliability and generalizability of research findings.
To my mind, the researchers may have simply recruited a small sample of extremely affected and motivated females, not representative of the population with ADHD, and encouraged them to express their complaints extravagantly while recruiting equally biased groups of males with ADHD and controls, encouraging them to report being unaffected.
References
Mertens, D. M. (2019). Experimental and quasi-experimental research. In Research and evaluation in education and psychology: Integrating diversity with quantitative, qualitative, and mixed methods (5th ed., pp. 155–156). SAGE Publications, Inc.
Rucklidge, J. J., & Tannock, R. (2001). Psychiatric, psychosocial, and cognitive functioning of female adolescents with ADHD. Journal of the American Academy of Child and Adolescent Psychiatry, 40(5), 530–540. https://doi.org/10.1097/00004583-200105000-00012
Appendix A - Details on Results
For the psychiatric and demographic domains, researchers did find significant differences in socioeconomic status and fathers' educational level between girls with ADHD and female controls. Researchers also found differences in overall impairment and ADHD symptoms, with females with ADHD being more impaired and showing more negative symptoms than both males with ADHD and female controls.
For the psychosocial domain, researchers found no significant differences in drug use, attribution of positive events, or any psychosocial variables between male and female controls. Compared to female controls, females with ADHD exhibited significantly higher levels of depression, anxiety, suicidal thoughts, interpersonal problems, negative mood, negative self-esteem, negative life events, impact of negative life events, global attribution to negative events, external locus of control, overall psychological distress, and dissatisfaction with their teachers. Additionally, females with ADHD showed more depression, overall psychological distress, impact of negative life events, external locus of control, and negative self-esteem than males with ADHD.
Finally, researchers looked at the cognitive domain. There were no significant differences between male and female controls, and there were no significant differences in academic achievement between males and females with ADHD. Females with ADHD had lower IQ, reading, spelling, arithmetic, and processing speed scores and were more distractible than female controls. Females with ADHD had lower vocabulary scores than males with ADHD, yet higher coding and processing speed scores.
Rucklidge and Tannock (2001) hypothesized that participants' IQ or their father's education level could explain some differences. When they controlled for these factors, they eliminated differences in locus of control and vocabulary. They also explored other controls related to medication use and date of diagnosis, which eliminated differences in vocabulary and coding scores, respectively.
Appendix B - Details on Discussion
Rucklidge and Tannock (2001) state that their paper is the first of its kind to compare the psychiatric, psychosocial, and cognitive functioning of female adolescents with ADHD with that of controls and males with ADHD. Despite results indicating more significant impairment of females with ADHD than males with ADHD, parents and teachers did not note any perception of difference between females with ADHD and males with ADHD. Does this indicate that teachers and parents are biased in their observations? Or that females with ADHD self-report more negatively than males with ADHD? It was also noteworthy that females with ADHD had lower vocabulary scores while males with ADHD had lower processing speed index scores, and there was no difference in these areas in the control groups.
Rucklidge and Tannock eliminated some gender differences when they controlled for the severity of ADHD, noting that the severity of ADHD, not just gender, potentially creates increased psychological distress. It is also possible that both males and females experience the same level of problems, yet females are more likely than males to report the problems. Furthermore, hormonal differences or differences in treatment at home or school between genders could have caused the differences. This sample was also unusual compared to previous research, as participants mainly had inattentive-type ADHD rather than the hyperactive or combined type.
Rucklidge and Tannock note certain limitations, including how recruitment was done. Males were recruited based on previous assessments, whereas 75% of the females in this study were new referrals. Another limitation was that around 10% of the sample were taking non-stimulant psychotropic medication, which could have affected the results. Small sample sizes combined with extensive analysis could have produced spurious results.
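To illustrate why extensive analysis can produce spurious results, the sketch below computes the chance of at least one false positive across a set of significance tests. This is my own illustration and not a calculation from Rucklidge and Tannock: the function name, the 0.05 significance level, and the test counts are hypothetical, and the formula assumes the tests are independent, which is unlikely to hold exactly for correlated measures.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n_tests
    independent tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Hypothetical test counts; the actual number of comparisons in the
# study is not stated in this critique.
for n_tests in (1, 10, 20, 40):
    rate = familywise_error_rate(0.05, n_tests)
    print(f"{n_tests:>2} tests: {rate:.2f}")
```

Under these assumptions, the chance of at least one spurious finding grows quickly as the number of comparisons increases, which is the concern raised above.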
Appendix C - Details on Critical Analysis
Below is a critical analysis of the research conducted by Rucklidge and Tannock (2001). I have used techniques described by Mertens (2019) as a guide for critical analysis.
Internal Validity
Mertens (2019) identifies many threats to internal validity that should be considered when critically analyzing a quantitative experimental, quasi-experimental, or single-group design. I will enumerate these threats in consideration of Rucklidge and Tannock's research.
History
History refers to the possibility that events other than the independent variable caused the results. Rucklidge and Tannock do not specify the timeframe in which they conducted this research. For example, did they interview each participant on the same day, at their convenience, or in blocks? What about the initial screening with parents, subsequent questionnaires for parents and teachers, and follow-up calls with parents? Establishing a timeframe is essential to determine whether the researchers conducted this study over a short period, such as one month, or extended it across multiple years. If the researchers conducted the study over multiple years, for example from 1999 to 2001, many events, such as Y2K, macroeconomic conditions, political events, or natural disasters, could have affected the results. A timeline for the research would greatly help in understanding the context.
Maturation
As with history, if the study ran over an extended timeframe, personal events may have changed the results for individuals. Adolescence is an especially formative period, and stressors like school, social relationships, or family changes could affect the results obtained. Once again, a timeframe would have been helpful, along with an explanation of how maturation effects were mitigated.
Testing
Testing refers to participants becoming wiser about test answers when researchers give tests multiple times. In this research, Rucklidge and Tannock appear to administer the battery of tests only once and in one sitting. Having a single administration of tests alleviates some of the testing concerns. However, questions remain: If researchers had exposed these participants to the tests before, could this have shaped the results? We also know that in this research, the males mainly came from previous assessments, indicating they may have been familiar with the tests and testing process. Seventy-five percent of females were new referrals who may not yet have encountered the tests. In this scenario, males may have known which answers to give to appear less negatively impacted, while females may not.
Instrumentation
Instrumentation as a threat refers to differences in test administration. In this case, each participant only had one testing session. However, a doctoral-level clinical psychologist conducted the psychiatric tests, while psychology graduate students administered the psychological tests. The battery included a large number of tests, totaling around six hours. The researchers should have specified how they standardized the use of instruments. For example, the researchers should have informed us how they trained the clinical psychologist and psychology graduate students to use the correct technique. Additionally, they did not report interrater reliability. How can we be sure that the researchers used the instruments consistently for over 100 participants in this study?
It would be helpful if the researchers provided information on the exact number and nature of individuals conducting these diagnostics, along with information on training for standardized technique, observations for interrater reliability, and a schedule and timeframe of the diagnostic sessions. One other point in the paper is that culturally, men may be more likely to underreport negative symptoms. I think this is a crucial point that needs further exploring: Are the instruments written in a way that favors someone who is emotionally in touch with themselves and emotionally expressive, for example, a woman, over a man who may not be in touch with his feelings or may not feel it is socially acceptable to complain about them? I wonder how much of the difference seen between men and women in ADHD (and psychological research in general) stems from men not being willing or able to disclose negative emotions. Women, in contrast, are culturally encouraged to disclose them, leading to an appearance of greater suffering in women, which is really just a greater communication of suffering.
Statistical Regression
Statistical regression refers to groups being extreme and skewing results. In this case, the samples were relatively small, with the researchers only studying 24 females with ADHD. As stated, 75% of these females were new referrals to this clinic. More information on this would be beneficial; for example, with the majority of females with ADHD being new referrals, how could this have affected the results? What was the source of these new referrals? The researchers say they came from advertisements in pediatric offices or new referrals at the hospital. More information on the population used for referrals would be helpful.
Differential Selection
Were there differences in the experimental and control groups besides the variables the researchers studied? In this case, the researchers call out specific differences, for example, differences in fathers' educational level, IQ of girls in the group with ADHD, medication use, and severity of ADHD symptoms in the females with ADHD group. When the researchers controlled for these differences, the results did change. Nevertheless, they did not control for many other factors; the researchers did not control for socioeconomic status, ethnicity, culture, or comorbidities, for example. With such a small sample, it is feasible that the researchers introduced confounding variables.
Experimental Mortality
Experimental mortality refers to participants dropping out of the study. In this case, our participants underwent one 6-hour battery of tests, supplemented with additional questionnaires from their teachers and parents. The researchers do not mention dropouts, so we can assume that each participant completed the experiment. The researchers excluded some participants: out of 123 adolescents, they eliminated five due to low IQ (< 80) and 11 because their ADHD symptoms were not strong enough. It would have been nice if the researchers had explicitly said that no dropouts had occurred.
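As a quick check on participant flow, the exclusions reported above can be reconciled with the group sizes given in the Subjects section. The sketch below is my own bookkeeping, not an analysis from Rucklidge and Tannock, and the variable names are hypothetical; the counts are those quoted in this critique.

```python
# Counts as quoted in this critique (Subjects and Experimental Mortality).
screened = 123
excluded_low_iq = 5               # IQ below 80
excluded_subthreshold_adhd = 11   # ADHD symptoms not strong enough

group_sizes = {
    "females_with_adhd": 24,
    "males_with_adhd": 35,
    "control_females": 28,
    "control_males": 20,
}

retained = screened - excluded_low_iq - excluded_subthreshold_adhd
total_in_groups = sum(group_sizes.values())

# Participants neither excluded nor assigned to a reported group.
unaccounted = retained - total_in_groups
print(f"retained={retained}, in groups={total_in_groups}, unaccounted={unaccounted}")
```

The retained count matches the reported total, so the figures are consistent with no dropouts after screening, although the paper does not state this explicitly.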
Selection-maturation
Selection-maturation refers to differential selection due to biological or psychological differences. From a biological perspective, the researchers did not control for hormonal differences in this experiment. No other biological characteristics were discussed, such as height, weight, or physical health conditions. The researchers focused on participants aged 13-16, a critical developmental period when there may be vital biological differences between males and females. From a psychological perspective, the researchers used no controls for comorbid psychological conditions, so we do not know if groups contained implicit biases.
Experimental Treatment Diffusion
Experimental treatment diffusion refers to experiment and control groups being close enough to share ideas. In our case, the researchers provided few details on how they conducted the interviews. For example, were they on the same day? Were all participants sitting in a waiting room together and discussing things before entering for their tests? Or, if the tests were on different days, was there communication between participants around their tests? Or communication between parents and teachers who were also doing questionnaires and interviews? How many teachers were involved? Did one teacher have numerous participants, and did they diffuse information between them? The researchers shared very little information on these practices.
Compensatory Rivalry by the Control Group
Compensatory rivalry occurs when a control group tries hard to get specific test scores. We are left wondering what precautions were put in place to ensure the control group was unbiased. The researchers told us the control groups came from hospital staff and community resources. The researchers need to provide more information in this regard. It is a very vague description. How can we ensure that hospital staff and community resources as control group members were randomized correctly and did not have biases?
Compensatory Equalization of Treatments
Compensatory equalization of treatments refers to whether or not the researchers gave extra resources to the control group. The researchers provided little information on our control groups besides that they were hospital staff or came from community resources. We do not know what incentives the control group had nor what information they were told or believed. They could have been given extra informational or belief resources that skewed the results. Potentially, as hospital staff and community members, they wished for the research to be a success, so they downplayed their own psychological disturbances.
Resentful Demoralization of the Control Group
Was the control group demoralized in any way? For example, did they feel left out? The researchers should have explained how they asked the control group to participate. For example, did they know the study was on ADHD? Did the participants know the researchers were comparing them as individuals without ADHD to individuals with ADHD? Did they know it was a study on gender differences? The researchers provided very little information here.
External Validity
Mertens (2019) also identifies threats to external validity. Below, I discuss these threats in relation to Rucklidge and Tannock's research.
Treatment Detail
Treatment detail refers to how much detail researchers use to describe the treatment. This research does not offer enough detail for successful replication, and it needs to be better operationalized. For example, the researchers should have told us in detail what population pools they used for recruitment. They should also have provided more information on the controls they used and on the characteristics of the participants beyond those they investigated. Finally, we needed more information on the exact timeline and techniques of the interviews, the training provided to the interviewers, and the information shared with the participants.
Multiple Treatment Interference
In this example, the researchers used many testing instruments: over a dozen tests in 6 hours and additional questionnaires for the parents and teachers. We do not know the order in which researchers administered the tests or whether that order was standardized. It appears likely that fatigue could have affected the researchers and participants, with the results of the later tests being affected more than the results of the earlier tests. Exact information on how researchers carried out this procedure would make replicating this experiment easier.
Hawthorne Effect
The Hawthorne effect refers to participants trying harder because researchers are observing them. In this research, the researchers do not tell us how they informed participants about the study. Therefore, we cannot determine what prompts participants were given or how closely they believed the researchers were observing them. For example, did the control group know they were part of an ADHD gender study? What the researchers tell participants, and what participants believe, can influence their efforts during the experiment. The researchers shared little information on this, making the study hard to generalize.
Novel or Disruptive Treatment
Novel or disruptive treatments can affect results. In this case, bringing participants to Toronto for a 6-hour battery of tests could be seen as novel or disruptive. Researchers gave participants lunch and parking, but they tell us nothing else about incentives to participate. This research may have been a novel experience for the participants, and this novelty may have increased the self-reporting of symptoms for those with ADHD, particularly women, if researchers told them the study aimed to promote women's public health. Furthermore, 75% of the women with ADHD were new referrals, while the men with ADHD were mostly previous clinical clients. Therefore, there may be a disparity in novelty and disruption between men and women. The researchers provided few details on how they populated the control groups, which adds to the uncertainty.
Experimenter Influence
Researchers give us little information on who the experimenters are; they tell us only that the diagnostician, a doctoral-level clinical psychologist, was Rucklidge and that graduate psychology students administered the psychological tests. The researchers provide no other information on the experimenters' characteristics. The research does not tell us if Tannock participated in conducting the study and, if so, in what way. Rucklidge and Tannock appear to take a position on the field by citing research that highlights the neglect of women in previous ADHD studies as a public health concern, without introducing research that holds an opposing view. Their position is partisan. They do not use objective language to say, for example, that some researchers feel previous studies have neglected women, which may pose a public health risk, and that they want to find out if this is true. Instead, their language indicates that they come into the research with a point they want to prove rather than an objective, open mind. I wonder if this became an undeclared or unconscious bias that has tainted their work.
Pretest / Posttest Sensitization
In this case, the adolescents underwent a single battery of tests in what appears to have been one sitting. Consent forms and questionnaires were also sent to parents and teachers, with a potential follow-up interview for parents. We do not know how much teachers or parents communicated pretest/posttest information with adolescents and whether or not they could have sensitized them to information.
Interaction History
Once again, we do not know the timeframe under which the researchers conducted this work, so we cannot say with certainty which historical events may have caused an interaction effect here. We also do not know how representative the sample was of its population or how diverse its characteristics were.
Influence of Measurement
As with the threats to internal validity, we are not sure the researchers used the instruments in a standardized manner. Little documentation exists on how the instruments were used, including how researchers were trained or observed for interrater reliability. Once again, we also do not know how having so many instruments administered at once affected their reliability. Fatigue of researchers and participants and test interaction may have played a role.
Time Lapse Between Treatment and Measurement
In this case, the researchers could not manipulate the independent variable because the study was a quasi-experiment. Participants were assigned to groups based on characteristics that already existed. This experiment had no active intervention or treatment.
Other Threats to Validity
Finally, Mertens (2019) offers other threats to validity for quantitative experimental, quasi-experimental, and single-group designs.
Implemented as Planned
Implementing as planned relates to whether proper steps were in place to ensure the researchers carried out the experiment according to plan. This research needs more documentation of its experimental procedures. The sparsity of documentation leaves us to wonder if researchers carried out work according to plan, and it also makes it harder for us to replicate this research or to generalize its findings.
Strength of Treatment
In this case, there was no treatment, although I have concerns regarding how consistently and reliably the dependent variables were measured. The research needed to document planning and provide adequate operational definitions.
Ethics of Denying Control Treatment
Our control participants, in this case, did not have ADHD. Control participants without ADHD minimize the ethical concerns of not providing them with ADHD treatment. However, the researchers did not explain what they told the control participants. How were they selected? What kind of assignment to groups did researchers consider for the desired controls? Ethics is still a concern here, as researchers may have given participants information prompting them to join the experiment that biased their participation, or researchers may have selected them in an unethical and biased manner.