The Sydney Learning and Understanding Test, which contains items on Letter Series, Analogy, Knowledge and Matrices, is a collection of tasks developed by psychologists to assess human cognitive abilities within the framework of fluid and crystallized intelligence. One of our major goals was to determine the reliability and consistency of the Sydney Learning and Understanding Test and the extent of measurement error. We also examined the concept of validity and discussed the theoretical perspective behind the test. The test was administered to 1227 persons, of whom 58% were female and 42% were male. The participants were adults above 18 years of age, with a mean age of 31.44 years. The results indicated fairly high internal consistency and validity, but they cannot be used to infer anything about predictive validity.
The Sydney Learning and Understanding Test was designed to test four types of mental problems: Letter Series, Analogies, Knowledge and Matrices. Letter Series items require the participant to identify the letter that should come next in a series of letters. Analogies require the participant to work out the relationship between two words and then to choose, from five options, a fourth word that stands in the same relationship to a third word. The Knowledge questions test general or academic knowledge. The Matrices questions require the participant to identify the pattern of movement of a set of three signs across the matrices.
This test is a collection of tasks developed by psychologists to assess human cognitive abilities within the framework of fluid and crystallized intelligence (Horn & Cattell, 1966). The battery that forms the basis of our analysis was provided for teaching purposes and should not be taken to reflect ‘Intelligence Quotient’. It is intended to measure fluid intelligence (Gf) and crystallized intelligence (Gc). We need to explain some of the terms used here before we proceed further.
Gf and Gc take their names from the two most extensively studied broad intellectual abilities, fluid and crystallized intelligence. The crystallized and fluid intelligence measured by this test pertain to the elements of formal education and acculturation present in the content of the test. It is believed that Gf depends to a lesser extent than Gc on formal education experiences (Horn and Noll, 1994). Gf/Gc theory includes other abilities such as SAR, TSR, Gv and a few more, but the battery is designed to measure only Gf and Gc.
One of our major goals here is to determine the reliability and consistency of the Sydney Learning and Understanding Test and the extent of measurement error, if any. We shall also examine the concept of validity. In addition, we shall endeavour to set out the theoretical perspective behind this test.
The concept of intelligence is related to two aspects: the ability to solve complex problems (the complexity aspect) and the ability to acquire new knowledge (the learning aspect) (Carlstedt, Gustafsson & Ullstadius 2000). Tests designed to measure general cognitive ability (G) have met with considerable success in practical applications, even though the construct of G is poorly understood. That explains why the measuring instrument has not developed much over the last several decades (Carlstedt, Gustafsson & Ullstadius
A test must be reliable and consistent. The greater its reliability, the more its results can be trusted, since reliability is a precondition for validity. Because of measurement error, the score obtained on a test is not identical to the test taker's true score. The obtained score is therefore given by the formula X = T + e, where X is the obtained score, T is the true score and e is the error of measurement (Gregory, 2004).
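The decomposition X = T + e can be illustrated with a small simulation. The sketch below is not part of the study: the score scale, the standard deviations and the function names are assumptions chosen for illustration. It uses the standard classical test theory definition of reliability as the ratio of true-score variance to obtained-score variance.

```python
import random
import statistics


def simulate_scores(n, true_sd, error_sd, seed=0):
    """Simulate n test takers under X = T + e (all values hypothetical)."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n)]  # T
    errors = [rng.gauss(0.0, error_sd) for _ in range(n)]       # e, mean zero
    observed = [t + e for t, e in zip(true_scores, errors)]     # X = T + e
    return true_scores, errors, observed


def reliability(true_scores, observed):
    """Reliability as var(T) / var(X), the classical test theory ratio."""
    return statistics.pvariance(true_scores) / statistics.pvariance(observed)


# Assumed standard deviations: 10 for true scores, 5 for error.
true_scores, errors, observed = simulate_scores(n=5000, true_sd=10.0, error_sd=5.0)
rel = reliability(true_scores, observed)
print(round(rel, 2))  # close to the theoretical 100 / (100 + 25) = 0.8
```

With these assumed values the theoretical reliability is 0.8; the simulated figure varies slightly around it because of sampling. In a real data set T is never observed, which is why reliability has to be estimated indirectly, for example through internal consistency.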
The validity of a test has two aspects: (1) validity of measurement and (2) validity for decisions. For greater content validity, each test item must fall within the content domain, and the items must be distributed proportionately across the sectors of that domain. While opinion is divided on whether content validity can also provide validity for decisions (e.g. decisions to recruit for jobs requiring specific skills), the majority of experts hold that content validity is relevant only to the validity of measurement, not to the validity of decisions based on the test score (Murphy, 2005, Ch. 8).
It is important to point out here that the long-standing belief in stable innate capacities, and in the limits that such supposed capacities place on an individual’s ultimate performance potential, is of doubtful validity. “The reviewed empirical evidence from the acquisition of expert performance contradicts this theory and demonstrates that individuals gradually acquire increasingly complex mechanisms roughly ordered along an ordinal path leading to elite levels of performance” (Ericsson, 2003). This has a crucial implication for the test: a test taker can improve his or her score without this implying that the measuring instrument (the test) is doubtful, less reliable or less valid.
A question raised about tests like the Sydney Learning and Understanding Test is whether there is a substantial correlation between working memory (WM) capacity and scores on general (fluid) intelligence tests, as is ubiquitously found in intelligence research. People with high WM capacity can keep many elements in memory and are therefore good at storing the sub-results needed within an item (Verguts & Boeck, 2002). If this correlation really exists, it has several consequences for tests like the Sydney Learning and Understanding Test. Such tests would really be tests of WM capacity rather than of exactly the traits they are intended to measure, and would consequently have low content validity and even lower decision validity. “Current proposal implies two things: First, people use the same rule through out the test and become “primed” to use these rules. Second, the amount of priming is a factor of individual differences related to WM capacity” (Verguts & Boeck, 2002).
Against this account, Verguts & Boeck (2002) have argued that another factor may be partly responsible for the correlation, namely that people with a high WM capacity can store many solution principles across items. They conducted two experiments that supported this alternative explanation.
In another study, Unsworth & Engle (2005) strongly suggested that “the number of goals or sub-results that can be held in memory does not account for the shared variance between working memory span measures and fluid intelligence. Thus the results do not support the hypothesis advanced by Carpenter et al (1990) that the link between individual differences in working memory capacity and intelligence is due to differences in the ability to hold a certain number of items in the working memory.” In other words, working memory tasks are consistently good predictors of fluid ability for a different reason, namely the ability to control attention.
The Sydney Learning and Understanding Test was administered to 1227 persons, of whom 58% were female and 42% were male. The participants were adults above 18 years of age, with a mean age of 31.44 years. They were healthy adults familiar with the English language. The directions and questions were read out in a clear and confident manner, and the participants fully understood each task. They were thanked for their cooperation and assured of confidentiality. They were put at ease about whether or not they could answer any given question, and they were invited to ask any questions they had. The test proceeded as planned after their age, language and sex had been recorded. After the session, the participants were thanked and the nature of the research was explained to them.
The battery was designed to measure only Gf and Gc. The guidelines were followed to ensure that all participants received exactly the same instructions. Otherwise, wide variations in results could have taken place. The test-battery included a blank score sheet to record the participant’s answers. The answer sheets were scored at a later date.
On average, the participants answered 82% of the Letter Series items correctly (standard deviation 15%), followed by Analogy, Knowledge and Matrices. Matrices proved the most difficult: only 46% of the Matrices items (standard deviation 22%) were answered correctly.
All item-wise results were statistically significant at p < 0.05 (two-tailed) for Pearson’s product-moment correlation coefficient; this significance test does not apply to the factor loadings and factor correlations. The two factors were extracted through an Exploratory Factor Analysis (EFA) with oblique rotation. After rotation, Gf accounted for 49.36% and Gc for 22.7% of the total variance. In the between-subjects t-tests for mean differences between females and males (df = 1225), none of the results was statistically significant at alpha = 0.05 (two-tailed).
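The degrees of freedom reported for the sex comparison follow from the sample size: a between-subjects t-test on two groups has n1 + n2 - 2 degrees of freedom, and 1227 - 2 = 1225. A minimal sketch of such a test is given below. It assumes the pooled-variance (Student) form, which the df value suggests but the report does not state, and the scores and function name are hypothetical, not the study data.

```python
import math
import statistics


def pooled_t_test(group_a, group_b):
    """Independent-samples t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# Hypothetical proportion-correct scores, for illustration only.
females = [0.80, 0.75, 0.90, 0.85, 0.70]
males = [0.78, 0.82, 0.88, 0.74, 0.79]
t_stat, df = pooled_t_test(females, males)
print(df)  # 8 for these two groups of five

# Degrees of freedom for the full sample described in the text.
total_participants = 1227
df_reported = total_participants - 2
print(df_reported)  # 1225
```

With 1225 degrees of freedom the two-tailed critical value at alpha = 0.05 is roughly 1.96, so a difference is declared non-significant when the absolute t statistic falls below that value.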
From the available results, we arrive at a number of conclusions. Table 2 gives the correlation coefficients between age and the test scores. This table shows that as people get older, their rate of correct responses on Analogy items tends to decrease. The same is true for Matrices. Letter Series and Matrices share 46% of their variability. A large percentage of the Knowledge variance (94%) is accounted for by the Gc factor. After rotation, Gf and Gc together account for 72.06% of the variance in the test items (49.36% and 22.7% respectively). The alpha coefficients for the two intelligence factors, Gf with 3 items and Gc with 2 items, were 0.71 and 0.62 respectively.
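Alpha coefficients of the kind reported above are commonly computed as Cronbach's alpha, which compares the sum of the item variances with the variance of the total score. The sketch below assumes that this is the coefficient meant and uses hypothetical scores for three subtests treated as items; it does not reproduce the study's values.

```python
import statistics


def cronbach_alpha(item_scores):
    """Cronbach's alpha.

    item_scores is a list of items; each item is a list holding one score
    per participant, in the same participant order across items.
    """
    k = len(item_scores)
    item_variances = [statistics.variance(item) for item in item_scores]
    totals = [sum(scores) for scores in zip(*item_scores)]  # total per participant
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical proportion-correct scores for six participants.
letter_series = [0.90, 0.80, 0.70, 0.95, 0.60, 0.85]
analogy = [0.70, 0.60, 0.50, 0.80, 0.40, 0.65]
matrices = [0.60, 0.50, 0.30, 0.70, 0.20, 0.50]

alpha = cronbach_alpha([letter_series, analogy, matrices])
print(round(alpha, 2))
```

Alpha approaches 1 when the items rise and fall together across participants. It also tends to be lower when a scale has few items, which is worth bearing in mind for a factor measured by only two or three items.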
A reliability coefficient ranges from 0 to 1, where 0 indicates complete unreliability and 1 indicates complete reliability. A correlation coefficient can take values from -1.00 to +1.00, where a negative sign indicates an inverse correlation. In Table 2, the correlations between age and all categories of test items are negative. The Knowledge items are a partial exception: their correlation with age is only -0.09, which is close to zero. The high correlation between Gc and Knowledge (0.97) provides further evidence in favour of Gc as a factor of intelligence that depends on formal education.
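The percentages of shared variance quoted in this section are related to correlation coefficients through the squared correlation. The short sketch below checks two of the reported figures; it assumes that the 0.97 value can be squared in this way, and the function name is hypothetical.

```python
import math


def shared_variance(r):
    """Proportion of variance shared by two variables correlated at r."""
    return r ** 2


# Gc and Knowledge: a correlation of 0.97 corresponds to about 94% shared variance.
gc_knowledge_r = 0.97
print(round(shared_variance(gc_knowledge_r) * 100))  # 94

# Letter Series and Matrices: 46% shared variance implies |r| of about 0.68.
implied_r = math.sqrt(0.46)
print(round(implied_r, 2))  # 0.68
```

The square root recovers only the magnitude of the correlation, not its sign, so the direction of the relationship has to be read from the table itself.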
As for the further reliability of this test, we would have liked to administer the same test to the same and to different individuals over a period of time, so as to assess its consistency under controlled conditions. However, the high percentage (72.06%) of the test items’ variance explained by the two factors Gf and Gc gives us confidence in the reliability of the instrument. In other words, the instrument is highly trustworthy in so far as its questions can be said to measure the factors associated with G. Gc correlated positively with the Analogy and Knowledge items (0.49 and 0.97), whereas Letter Series and Matrices had negative correlations with it. That is quite understandable given the very concept of Gc. Similarly, Gf showed high correlations of 0.79, 0.51 and 0.85 with Letter Series, Analogy and Matrices respectively. The lower correlations of Gf with Analogy (0.51) and Knowledge (-0.02) can be understood in terms of fluid intelligence depending to a lesser extent than crystallized intelligence on formal education and acculturation.
Conclusion: We may say that the test is fairly reliable in terms of the factors it is designed to represent. However, nothing can be said about its predictive or decision-making value. For instance, it cannot be inferred from the results that those scoring high on this test will perform well in real-life tasks that call for the attributes of Gf and Gc, because what constitutes Gf and Gc, and what really accounts for success in solving these items, is still a matter of study and research.