Army General Classification Test

The Army General Classification Test (AGCT) is the predecessor to the AFQT, boasting a g-loading of ~0.92. This comprehensive 40-minute test evaluates verbal, quantitative, and spatial abilities and is accepted by Mensa, Intertel, and other high-IQ societies.

Keep in mind that reattempts are invalid: there is only one form, so score increases on a reattempt are expected. For an accurate score, please wait at least 6 months before reattempting. This test is also intended for native English speakers.

Scratch paper is ALLOWED while calculators are NOT ALLOWED. The score at the end will have a standard deviation of 15. PLEASE CAREFULLY READ THE INSTRUCTIONS AND UNDERSTAND THE SAMPLE PROBLEMS BEFORE TAKING THE TEST.

History and purpose

During World War II, concerns grew over the misassignment of soldiers to unsuitable roles and the underutilization of more capable soldiers. In response, the US Army invested heavily in commissioning an intelligence and aptitude test, resulting in the early forms of the AGCT. After the end of World War II, the AGCT continued to undergo constant improvements and revisions to ensure its accuracy. It amassed an enormous sample of more than 12 million soldiers, over 5 thousand times larger than the samples of modern professional tests.

Because drafted soldiers spanned a wide range of ages, the test was tailored to provide accurate scores for everyone from teenagers to middle-aged adults. Furthermore, because the intended test takers were drafted soldiers of all classes and lifestyles, the test was designed with questions that minimized reliance on prior knowledge from education and culture. Interestingly enough, scores were still found to correlate highly with schooling.

A test of ‘g’

In order to rehabilitate this test for modern use, three things had to be done:

The original score distribution had to be re-normalized by correcting for skew
Norm obsolescence, if any, had to be ascertained and accounted for
The g-loading had to be estimated

New distribution

The original distribution is highly left-skewed. This is because those charged with the norming underestimated the number of easy questions on the test. This resulted in a test that discriminates well in the low range (you don’t want to draft morons), but not as effectively in the higher range.

In order to correct for this flaw, the test had to be re-normalized. With percentile rank-equating, it was possible to generate new aligned norms.
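
The exact equating procedure is not spelled out here, so the following is only a minimal sketch of one common way to build normalized norms from percentile ranks: each raw score is assigned its mid-percentile rank, and that rank is mapped to the normal-curve score with the same percentile (mean 100, SD 15). The function name `normalize_scores` and the frequency counts are hypothetical and are not real AGCT data.

```python
from statistics import NormalDist


def normalize_scores(raw_counts, mean=100.0, sd=15.0):
    """Map raw scores to normalized scores via mid-percentile ranks.

    raw_counts: dict {raw_score: number of examinees with that score}.
    Returns a dict {raw_score: normalized score on a mean/sd scale}.
    """
    total = sum(raw_counts.values())
    std_normal = NormalDist()  # standard normal, mean 0 and SD 1
    norms = {}
    below = 0  # number of examinees scoring strictly below the current raw score
    for raw in sorted(raw_counts):
        count = raw_counts[raw]
        if count == 0:
            continue  # no examinees at this score, so no percentile rank to assign
        # Mid-percentile rank: everyone below plus half of those tied at this score.
        percentile_rank = (below + 0.5 * count) / total
        norms[raw] = mean + sd * std_normal.inv_cdf(percentile_rank)
        below += count
    return norms


# Hypothetical left-skewed raw-score frequencies (illustrative only).
toy_counts = {0: 1, 1: 2, 2: 4, 3: 8, 4: 15, 5: 30, 6: 40}
toy_norms = normalize_scores(toy_counts)
for raw, score in toy_norms.items():
    print(raw, round(score, 1))
```

Because the mapping goes through percentile ranks, the normalized scores follow a normal shape regardless of how skewed the raw-score distribution was, which is the point of the correction described above.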

Norm obsolescence

It is normal to wonder if a test from 1941, 82 years ago, is still valid today. Consider this:

In 1980, during the renorming of the ASVAB, the AGCT was compared against it, and the percentiles were found to match closely at all ranges. Over the 39 years since 1941, the Flynn effect (commonly cited at roughly 3 points per decade) would have predicted a systematic inflation of nearly 12 points. What was found instead was a simple fluctuation of the sign of the difference between the tests throughout the range.

This can easily be attributed to either sampling error or error of measurement. The comparison shows no Flynn effect for this test.

Before the test was released on the subreddit, it was given to dozens of people within the community with known scores from professional tests. More often than not, the AGCT ended up being one of their lower scores rather than one of their higher ones. This gives me great confidence to declare that the AGCT is not an obsolete test.

Construct validity

The ‘g-loading’ is the degree to which a test correlates with the ‘g factor’, or general intelligence. A higher g-loading means the test is a better measure of general intelligence, and figures above 0.8 are generally considered to be great. These correlations are usually derived through factor analysis. As item data for the original test is impossible to come by, we can first estimate its g-loading by proxy from its successors, the ASVAB and AFOQT.

Factor analyzing these two batteries and deriving composites from the subtests that most resemble the AGCT in content was the only way to get an appraisal of its construct validity.

From the ASVAB, the pseudo-AGCT composite yielded a g-loading of .92, whereas the AFOQT pseudo-AGCT composite had a g-loading of .90. Averaging the two gives an estimate of ~.91.
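
The computation behind the pseudo-AGCT composites is not given here. As a rough illustration only, the g-loading of a unit-weighted composite of standardized subtests can be obtained from the subtest g-loadings and their intercorrelations: it is the sum of the loadings divided by the square root of the sum of all entries of the correlation matrix. The loadings below are hypothetical, and the correlations are assumed to follow a single-factor model (each correlation equals the product of the two loadings).

```python
from math import sqrt


def composite_g_loading(loadings, corr):
    """g-loading of a unit-weighted sum of standardized subtests.

    loadings: g-loading of each subtest.
    corr: full correlation matrix of the subtests (list of lists, 1.0 on the diagonal).
    """
    n = len(loadings)
    # Variance of the sum of standardized scores is the sum of all correlations.
    composite_variance = sum(corr[i][j] for i in range(n) for j in range(n))
    # Covariance of the sum with g is the sum of the loadings (g has unit variance).
    return sum(loadings) / sqrt(composite_variance)


# Hypothetical subtest g-loadings (not taken from the ASVAB or AFOQT).
loadings = [0.80, 0.75, 0.70]
# Assumption: a single common factor, so r_ij = loading_i * loading_j off the diagonal.
corr = [
    [1.0 if i == j else loadings[i] * loadings[j] for j in range(3)]
    for i in range(3)
]
composite = composite_g_loading(loadings, corr)
print(round(composite, 3))
```

Under these assumptions the composite loads on g more highly than any single subtest, which is why a composite built from several moderately g-loaded subtests can reach figures around .90.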

Furthermore, the g-loading of the AGCT can be calculated directly using data from the automated AGCT form at CognitiveMetrics. In a sample of 1,734 test takers (M = 121.7, SD = 12.95), the reliability is 0.941, or 0.956 after correcting for range restriction.
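
The correction formula is not stated here, but the corrected figure is consistent with the standard adjustment of reliability for range restriction, which assumes the error variance is the same in the restricted sample and the full population: r' = 1 - (s² / S²)(1 - r). A minimal sketch, assuming this is the formula used and taking the population SD to be 15 (the function name is illustrative):

```python
def reliability_unrestricted(r_restricted, sd_restricted, sd_population):
    """Adjust a reliability coefficient for range restriction.

    Assumes equal error variance in the restricted sample and the population:
    r' = 1 - (s^2 / S^2) * (1 - r)
    """
    variance_ratio = (sd_restricted / sd_population) ** 2
    return 1.0 - variance_ratio * (1.0 - r_restricted)


# Figures from the CognitiveMetrics sample: reliability 0.941, SD 12.95.
# The population SD of 15 is an assumption based on the test's score scale.
corrected = reliability_unrestricted(0.941, 12.95, 15.0)
print(round(corrected, 3))  # 0.956
```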

The g-loading in this sample is 0.816. After correcting for range restriction and SLODR (Spearman's law of diminishing returns), it comes to 0.925, in line with the estimates above. The unadjusted g-loadings of the subscales are 0.535 for V (verbal), 0.733 for Q (quantitative), and 0.597 for S (spatial). It isn’t possible to correct these for SLODR due to the lack of individual norms, but after correcting for range restriction, the g-loadings are 0.659 for V, 0.733 for Q, and 0.646 for S.

A g-loading of 0.925 is highly impressive for an 82-year-old test. Factorial validity is manifest.

More about the AGCT:

1. https://sci-hub.wf/10.1037/0021-9010.77.6.875

2. https://clearinghouse-umich-production.s3.amazonaws.com/media/doc/79410.pdf

3. https://www.yumpu.com/en/document/read/15323423/the-asvab-score-scales-1980-and-world-war-ii-cna