
The Law of Anomalous Numbers
Author(s): Frank Benford
Source: Proceedings of the American Philosophical Society, Vol. 78, No. 4 (Mar. 31, 1938), pp. 551-572
Published by: American Philosophical Society
Stable URL: http://www.jstor.org/stable/984802
Accessed: 08/02/2014 11:22

This content downloaded from 184.174.224.243 on Sat, 8 Feb 2014 11:22:35 AM All use subject to JSTOR Terms and Conditions

THE LAW OF ANOMALOUS NUMBERS


FRANK BENFORD
Physicist, Research Laboratory, General Electric Company, Schenectady, New York
(Introduced by Irving Langmuir)
(Read April 22, 1937)
ABSTRACT

It has been observed that the first pages of a table of common logarithms show more wear than do the last pages, indicating that more used numbers begin with the digit 1 than with the digit 9. A compilation of some 20,000 first digits taken from widely divergent sources shows that there is a logarithmic distribution of first digits when the numbers are composed of four or more digits. An analysis of the numbers from different sources shows that the numbers taken from unrelated subjects, such as a group of newspaper items, show a much better agreement with a logarithmic distribution than do numbers from mathematical tabulations or other formal data. There is here the peculiar fact that numbers that individually are without relationship are, when considered in large groups, in good agreement with a distribution law, hence the name "Anomalous Numbers."

A further analysis of the data shows a strong tendency for bodies of numerical data to fall into geometric series. If the series is made up of numbers containing three or more digits the first digits form a logarithmic series. If the numbers contain only single digits the geometric relation still holds but the simple logarithmic relation no longer applies.

An equation is given showing the frequencies of first digits in the different orders of numbers 1 to 10, 10 to 100, etc.

The equation also gives the frequency of digits in the second, third ... place of a multi-digit number, and it is shown that the same law applies to reciprocals.

There are many instances showing that the geometric series, or the logarithmic law, has long been recognized as a common phenomenon in factual literature and in the ordinary affairs of life. The wire gauge and drill gauge of the mechanic, the magnitude scale of the astronomer and the sensory response curves of the psychologist are all particular examples of a relationship that seems to extend to all human affairs. The Law of Anomalous Numbers is thus a general probability law of widespread application.

PART I: STATISTICAL DERIVATION OF THE LAW

It has been observed that the pages of a much used table of common logarithms show evidences of a selective use of the natural numbers. The pages containing the logarithms of the low numbers 1 and 2 are apt to be more stained and frayed by use than those of the higher numbers 8 and 9. Of

course, no one could be expected to be greatly interested in the condition of a table of logarithms, but the matter may be considered more worthy of study when we recall that the table is used in the building up of our scientific, engineering, and general factual literature. There may be, in the relative cleanliness of the pages of a logarithm table, data on how we think and how we react when dealing with things that can be described by means of numbers.

Methods and Terms

Before presenting the data collected while investigating the possible existence of a distribution law that applies to numerical data in general, and to random data in particular, it may be well to define a few terms and outline the method of attack.

First, a distinction is made between a digit, which is one of the nine natural numbers 1, 2, 3, ... 9, and a number, which is composed of one or more digits, and which may contain a 0 as a digit in any position after the first. The method of study consists of selecting any tabulation of data that is not too restricted in numerical range, or conditioned in some way too sharply, and making a count of the number of times the natural numbers 1, 2, 3, ... 9 occur as first digits. If a decimal point or zero occurs before the first natural number it is ignored, for no attention is to be paid to magnitude other than that indicated by the first digit.

The Law of Large Numbers

An effort was made to collect data from as many fields as possible and to include a variety of widely different types. The types range from purely random numbers that have no relation other than appearing within the covers of the same magazine, to formal mathematical tabulations that admit of no variation from fixed laws. Between these limits one will recognize various degrees of randomness, and in general the title of each line of data in Table I will suggest the nature of the source. In every group the count was continuous from the beginning to the end, or in the case of long tabulations, to a sufficient number of observations to insure a fair average.


The number counted in each group is given in the last column of Table I.
TABLE I
PERCENTAGE OF TIMES THE NATURAL NUMBERS 1 TO 9 ARE USED AS FIRST DIGITS IN NUMBERS, AS DETERMINED BY 20,229 OBSERVATIONS

                                          First Digit
Group  Title                1     2     3     4     5     6     7     8     9  Count
A      Rivers, Area      31.0  16.4  10.7  11.3   7.2   8.6   5.5   4.2   5.1    335
B      Population        33.9  20.4  14.2   8.1   7.2   6.2   4.1   3.7   2.2   3259
C      Constants         41.3  14.4   4.8   8.6  10.6   5.8   1.0   2.9  10.6    104
D      Newspapers        30.0  18.0  12.0  10.0   8.0   6.0   6.0   5.0   5.0    100
E      Spec. Heat        24.0  18.4  16.2  14.6  10.6   4.1   3.2   4.8   4.1   1389
F      Pressure          29.6  18.3  12.8   9.8   8.3   6.4   5.7   4.4   4.7    703
G      H.P. Lost         30.0  18.4  11.9  10.8   8.1   7.0   5.1   5.1   3.6    690
H      Mol. Wgt.         26.7  25.2  15.4  10.8   6.7   5.1   4.1   2.8   3.2   1800
I      Drainage          27.1  23.9  13.8  12.6   8.2   5.0   5.0   2.5   1.9    159
J      Atomic Wgt.       47.2  18.7   5.5   4.4   6.6   4.4   3.3   4.4   5.5     91
K      n^-1, √n, ...     25.7  20.3   9.7   6.8   6.6   6.8   7.2   8.0   8.9   5000
L      Design            26.8  14.8  14.3   7.5   8.3   8.4   7.0   7.3   5.6    560
M      Digest            33.4  18.5  12.4   7.5   7.1   6.5   5.5   4.9   4.2    308
N      Cost Data         32.4  18.8  10.1  10.1   9.8   5.5   4.7   5.5   3.1    741
O      X-Ray Volts       27.9  17.5  14.4   9.0   8.1   7.4   5.1   5.8   4.8    707
P      Am. League        32.7  17.6  12.6   9.8   7.4   6.4   4.9   5.6   3.0   1458
Q      Black Body        31.0  17.3  14.1   8.7   6.6   7.0   5.2   4.7   5.4   1165
R      Addresses         28.9  19.2  12.6   8.8   8.5   6.4   5.6   5.0   5.0    342
S      n, n^2, ... n!    25.3  16.0  12.0  10.0   8.5   8.8   6.8   7.1   5.5    900
T      Death Rate        27.0  18.6  15.7   9.4   6.7   6.5   7.2   4.8   4.1    418
       Average           30.6  18.5  12.4   9.4   8.0   6.4   5.1   4.9   4.7   1011
       Probable Error    ±0.8  ±0.4  ±0.4  ±0.3  ±0.2  ±0.2  ±0.2  ±0.2  ±0.3
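The counting rule described under Methods and Terms (take the first digit from 1 to 9, ignoring sign, decimal point and leading zeros) can be sketched in a few lines. This is only an illustrative sketch: the function names and the sample values are hypothetical and are not data from the paper.

```python
from collections import Counter


def first_digit(value):
    """Return the first non-zero digit of a number, ignoring sign,
    decimal point and leading zeros (so 0.0042 and 4200 both give 4)."""
    for ch in str(value):
        if ch in "123456789":
            return int(ch)
    return None  # the value contained no non-zero digit (e.g. 0)


def first_digit_percentages(values):
    """Percentage of times each digit 1..9 occurs as a first digit."""
    counts = Counter(d for d in map(first_digit, values) if d is not None)
    total = sum(counts.values())
    return {d: 100.0 * counts[d] / total for d in range(1, 10)}


# Hypothetical sample, not data from the paper.
sample = [0.0042, 17, 1938, 230.5, 0.9, 12, 4200, 1.5, 86, 1011]
print(first_digit_percentages(sample))
```

Applied to a real tabulation, each row of a table such as Table I would correspond to one call of `first_digit_percentages` on the numbers of that source.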

At the foot of each column of Table I the average percentage is given for each first digit, and also the probable error of the average. These averages can be better studied if the decimal point is moved two places to the left, making the sum of all the averages unity. The frequency of first 1's is then seen to be 0.306, which is about equal to the common logarithm of 2. The frequency of first 2's is 0.185, which is slightly greater than the logarithm of 3/2. The difference here, log 3 - log 2, is called the logarithmic integral. These resemblances persist throughout, and finally there is 0.047 to be compared with log 10/9, or 0.046.


The frequency of first digits thus follows closely the logarithmic relation

$$F_a = \log\left(\frac{a + 1}{a}\right) \qquad (1)$$

where $F_a$ is the frequency of the digit $a$ in the first place of used numbers.


TABLE II
OBSERVED AND COMPUTED FREQUENCIES

Natural  Number    Observed   Logarithm  Observed -  Prob. Error
Number   Interval  Frequency  Interval   Computed    of Mean
1        1 to 2    0.306      0.301      +0.005      ±0.008
2        2 to 3    0.185      0.176      +0.009      ±0.004
3        3 to 4    0.124      0.125      -0.001      ±0.004
4        4 to 5    0.094      0.097      -0.003      ±0.003
5        5 to 6    0.080      0.079      +0.001      ±0.002
6        6 to 7    0.064      0.067      -0.003      ±0.002
7        7 to 8    0.051      0.058      -0.007      ±0.002
8        8 to 9    0.049      0.051      -0.002      ±0.002
9        9 to 10   0.047      0.046      +0.001      ±0.003
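The computed column of Table II follows directly from Eq. (1). The short sketch below recomputes the logarithm intervals and the observed-minus-computed differences from the observed averages; the variable and function names are illustrative only.

```python
import math

# Observed first-digit frequencies (the averages of Table I divided by 100).
observed = {1: 0.306, 2: 0.185, 3: 0.124, 4: 0.094, 5: 0.080,
            6: 0.064, 7: 0.051, 8: 0.049, 9: 0.047}


def log_interval(a):
    """Eq. (1): the logarithmic interval log10((a + 1) / a)."""
    return math.log10((a + 1) / a)


# Columns: digit, observed frequency, computed interval, difference.
for a in range(1, 10):
    computed = log_interval(a)
    print(f"{a}  {observed[a]:.3f}  {computed:.3f}  {observed[a] - computed:+.3f}")
```

Because the nine intervals together span log 1 to log 10, the computed frequencies sum to unity without any normalisation.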

There is a qualification to be noted immediately, for Table I was compiled from numbers composed in general of four, five and six digits. It will be shown later that Eq. (1) is a distribution law for large numbers, and there is a more general equation that applies when considering numbers of one, two, ... significant digits.

If we may assume the accuracy of Eq. (1), we then have a probability law of the most general nature, for it is a probability derived from "events" through the medium of their descriptive numbers; it is not a law of numbers in themselves. The range of subjects studied and tabulated was as wide as time and energy permitted; and as no definite exceptions have ever been observed among true variables, the logarithmic law for large numbers evidently goes deeper among the roots of primal causes than our number system unaided can explain.

Frequency of Digits in the qth Position

The second-place digits are ten in number, for here we must take 0 into account. Also, in considering the frequency $F_b$ of a second-place digit $b$ we must take into account the digit $a$ that preceded it. The logarithmic interval between two digits is now to be divided into ten parts corresponding to the ten digits 0, 1, 2, ... 9. Let $a$ be the first digit of a number and $b$ be the second digit; then using the customary meaning of position and order in our decimal system a two-digit number is written $ab$, and the next greater number is written $ab + 1$. The logarithmic interval between $ab$ and $ab + 1$ is $\log (ab + 1) - \log ab$, while the interval covered by the ten possible second-place digits is $\log (a + 1) - \log a$. Therefore the frequency $F_b$ of a second-place digit $b$ following a first-place digit $a$ is

$$F_b = \frac{\log\left(\frac{ab + 1}{ab}\right)}{\log\left(\frac{a + 1}{a}\right)} \qquad (2)$$

As an example, the probability $F_b$ of a 0 following a first-place 5 in a random number is the quotient

$$F_b = \frac{\log\frac{51}{50}}{\log\frac{6}{5}}.$$

It follows that the probability for a digit in the qth position is

$$F_q = \frac{\log\left(\frac{abc \cdots p(q + 1)}{abc \cdots pq}\right)}{\log\left(\frac{abc \cdots o(p + 1)}{abc \cdots op}\right)} \qquad (3)$$

Here the frequency of $q$ depends upon all the digits that precede it, but when all possible combinations of these digits are taken into account $F_q$ approaches equality for all the digits 0, 1, 2, ... 9, or

$$F_q \approx 0.1. \qquad (4)$$

As a result of this approach to uniformity in the qth place the distribution of digits in all places in an extensive tabulation of multi-digit numbers will be also nearly uniform.
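The quotient of Eq. (2) is easy to evaluate numerically. The sketch below assumes the reading given in the text, that "ab" denotes the two-digit number 10a + b; the function name is illustrative.

```python
import math


def second_digit_given_first(a, b):
    """Eq. (2): frequency of second digit b (0..9) after first digit a (1..9)."""
    ab = 10 * a + b  # the two-digit number written "ab" in the text
    return math.log10((ab + 1) / ab) / math.log10((a + 1) / a)


# The worked example in the text: a 0 following a first-place 5.
example = second_digit_given_first(5, 0)
print(round(example, 4))

# For any first digit the ten second-digit intervals exhaust the first-digit
# interval, so the ten conditional frequencies sum to unity.
for a in range(1, 10):
    total = sum(second_digit_given_first(a, b) for b in range(10))
    assert abs(total - 1.0) < 1e-12
```

The example comes out a little above one tenth, which is consistent with the remark that second-place digits are already close to uniform.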

TABLE III
FREQUENCY OF DIGITS IN FIRST AND SECOND PLACES

Digit  First Place  Second Place
0      0.000        0.120
1      0.301        0.114
2      0.176        0.108
3      0.125        0.104
4      0.097        0.100
5      0.079        0.097
6      0.067        0.093
7      0.058        0.090
8      0.051        0.088
9      0.046        0.085
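The second-place column of Table III can be recovered by summing the logarithmic intervals over every possible first digit, and the same summation over longer leading blocks shows the approach to 0.1 stated in Eq. (4). The sketch below is one way of carrying out that summation; weighting the preceding digits by the logarithmic law is an assumption of this sketch, taken from the construction of Eq. (2), and the function name is hypothetical.

```python
import math


def digit_frequency(position, digit):
    """Frequency of `digit` in the given place (1 = first) of a number whose
    leading digits follow the logarithmic law."""
    if position == 1:
        return math.log10(1 + 1 / digit) if digit != 0 else 0.0
    # Sum over every possible block of leading digits that precedes `digit`.
    start, stop = 10 ** (position - 2), 10 ** (position - 1)
    return sum(math.log10(1 + 1 / (10 * lead + digit))
               for lead in range(start, stop))


# Columns: digit, first place, second place, third place.
for d in range(10):
    print(d, f"{digit_frequency(1, d):.3f}", f"{digit_frequency(2, d):.3f}",
          f"{digit_frequency(3, d):.4f}")
```

The third-place values printed in the last column differ from 0.1 only in the third decimal, which illustrates how quickly the later places become uniform.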

Reciprocals

Some tabulations of engineering and scientific data are given in reciprocal form, such as candles per watt, and watts per candle. If one form of tabulation follows a logarithmic distribution, then the reciprocal tabulation will also have the same distribution. A little consideration will show that this must follow, for dividing unity by a given set of numbers by means of logarithms leads to identical logarithms with merely a negative sign prefixed.

The Law of Anomalous Numbers

A study of the items of Table I shows a distinct tendency for those of a random nature to agree better with the logarithmic law than those of a formal or mathematical nature. The best agreement was found in the arabic numbers (not spelled out) of consecutive front page news items of a newspaper. Dates were barred as not being variable, and the omission of spelled-out numbers restricted the counted digits to numbers 10 and over. The first 342 street addresses given in the current American Men of Science (Item R, Table IV) gave excellent agreement, and a complete count (except for dates and page numbers) of an issue of the Readers' Digest was also in agreement.

On the other hand, the greatest variations from the logarithmic relation were found in the first digits of mathematical tables from engineering handbooks, and in tabulations of such closely knit data as Molecular Weights, Specific Heats, Physical Constants and Atomic Weights.
TABLE IV
SUMMATION OF DIFFERENCES BETWEEN OBSERVED AND THEORETICAL FREQUENCIES

No.  Group  Nature                      Summation
1    D      Newspaper Items                   2.8
2    F      Pressure Lost, Air Flow           3.2
3    G      H.P. Lost in Air Flow             4.8
4    R      Street Addresses, A.M.S.          5.4
5    P      Am. League, 1936                  6.6
6    Q      Black Body Radiation              7.2
7    O      X-Ray Voltage                     7.4
8    M      Readers' Digest                   8.4
9    A      Area Rivers                       9.8
10   T      Death Rates                      11.2
11   N      Cost Data, Concrete              12.4
12   S      n, ... n^8, n!                   13.8
13   L      Design Data Generators           16.6
14   B      Population, U. S. A.             16.6
15   I      Drainage Rate of Rivers          21.6
16   K      n^-1, √n, ...                    22.8
17   H      Molecular Wgts.                  23.2
18   E      Specific Heats                   24.2
19   C      Physical Constants               34.9
20   J      Atomic Weights                   35.4
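The paper does not spell out how the summations of Table IV were formed. One plausible reading, assumed here, is the sum over the nine digits of the absolute difference between the observed percentage and the theoretical percentage of Eq. (1). The sketch below applies that assumed metric to two rows of Table I; the function name is hypothetical, and other rows need not be reproduced exactly by this reading.

```python
import math

# First-digit percentages for two rows of Table I.
newspapers = [30.0, 18.0, 12.0, 10.0, 8.0, 6.0, 6.0, 5.0, 5.0]   # group D
pressure = [29.6, 18.3, 12.8, 9.8, 8.3, 6.4, 5.7, 4.4, 4.7]      # group F

# Theoretical percentages from the logarithmic law, digits 1..9.
theoretical = [100 * math.log10((a + 1) / a) for a in range(1, 10)]


def summed_difference(observed_percent):
    """Sum of absolute differences from the logarithmic law (assumed metric)."""
    return sum(abs(o - t) for o, t in zip(observed_percent, theoretical))


print(round(summed_difference(newspapers), 1))
print(round(summed_difference(pressure), 1))
```

Under this reading the two groups come out in the same order as in Table IV, with the newspaper items showing the smaller total departure from the law.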

These facts lead to the conclusion that the logarithmic law applies particularly to those outlaw numbers that are without known relationship rather than to those that individually follow an orderly course; and therefore the logarithmic relation is essentially a Law of Anomalous Numbers.
PART II: GEOMETRIC BASIS OF THE LAW

The data so far considered have been composed entirely of used numbers; that is, numbers as they are used in everyday affairs. There must be some underlying causes that distort what we call the "natural" number system into a logarithmic distribution, and perhaps we can best get at these causes by first examining briefly the frequency of the natural numbers themselves when arranged in the infinite arithmetic series 1, 2, 3, ... n, where n is as large as any number encountered in use.

FIG. 1. Comparison of observed and computed frequencies for multi-digit numbers.

Let us assume that each individual number in the natural number system up to n is used exactly as often as every other individual number. Starting with 1, and counting up to 10,000, for example, 1 would have been used 1,112 times, or 11.12 per cent of all uses. If the count is extended to 19,999 there are 9,999 1's added, and first 1's occur in 55.55 per cent of the 19,999 numbers. When number 20,000 is reached there is a temporary stopping of the addition of first 1's and 90,000 of the other digits are added to the series before 1's are again brought into the series, at 100,000. At this point the percentage of 1's is again reduced to 11.112 per cent as illustrated in curve A of Fig. 2. This curve is plotted to a semi-logarithmic scale, $F_n$ against $\log n$. If the equations for A are written for the three discontinuous but connected sections 10,000-20,000, 20,000-99,999 and 99,999-100,000 the area under the curve will be very closely 0.30103, where the entire area of the frame of coordinates has an area 1. But an integration by the methods of the calculus is merely a quick way of adding up an infinite number of equally spaced ordinates to the curve and from this addition finding the average height of


the ordinates and hence the area under the curve. But if we are satisfied with a result somewhat short of the perfection of the integral calculus we may take a finite number of equally spaced ordinates and by plain arithmetic come to practically the same answer.

FIG. 2. Linear frequencies of the natural number system between 10,000 and 100,000. (Curve A: for 1; curve B: for 9.)

By definition each point of A represents the frequency of first 1's from 1 up to that point, and an integration (by calculus or arithmetic) under curve A gives the average frequency of first 1's up to 100,000. The finite number corresponding to equally spaced ordinates now represents a geometric series of numbers from 10,000 to 100,000, and it is substantially this series of numbers, in this and other orders of


the natural number scale that lead to the numerical frequencies already presented.

Curve B of Fig. 2 is for 9 as a first digit. The frequency of 9's decreases in the number range from 10,000 to 89,999 and then increases as 9's are added from 90,000 to 99,999, and an integration under curve B leads to a good numerical approximation to the logarithmic interval log 10 - log 9, as called for by the previous statistical study.

Geometric and Logarithmic Series

The close relationship of a geometric series and a logarithmic series is easily seen and hardly needs formal demonstration. The uniformly spaced ordinates of Fig. 2 form a geometric series of numbers, for these numbers have a constant factor between adjacent terms, and this constant factor is determined in size by the constant logarithmic increment.

Semi-Log Curves

A geometric series of numbers plotted to a semi-logarithmic scale gives a straight line. In the original tabulation of observed numbers the line of data marked "R" is designated simply as "street addresses." These are the street addresses of the first 342 people mentioned in the current American Men of Science. The randomness of such a list is hardly to be disputed, and it should therefore be useful for illustrative purposes.

In Fig. 3 these addresses are first indicated by the height of the lines at the base of the diagram. The height of a line, measured on the scale at the left, indicates the number of addresses at, or near, that street number. Thus there were five addresses at No. 29 on various streets. In order to make the trend clearer, the heights of these lines were summed, beginning at the left and proceeding across to the right. It was found that four straight lines could be drawn among these summation points with fair fidelity of trend, and these four lines represent four geometric series, each with a different factor between terms. Each line will give the observed frequency over the numerical range it covers, and hence satisfies the logarithmic relationship.

FIG. 3. Distribution and summation of first 342 street addresses, American Men of Science.

The Natural Numbers and Nature's Numbers

In natural events and in events of which man considers himself an originator there are plenty of examples of geometric or logarithmic progressions. We are so accustomed to labeling things 1, 2, 3, 4, ... and then saying they are in natural order that the idea of 1, 2, 4, 8, ... being a more natural arrangement is not easily accepted. Yet it is in this latter manner that a surprisingly large number of phenomena occur, and the evidence for this is available to everyone.

First, let us consider the physiological and psychological reaction to external stimuli.

The growth of the sensation of brightness with increasing illumination is a logarithmic function, as illustrated by Fechner's Law. The growth of sensation is slow at first while the rods of the retina are alone responsive, and a straight line on semi-logarithmic paper (the stimulus being on the logarithmic scale) can represent the intensity-brightness function in this region. When the cones come into action there is a sharp change in the rate of growth, and another straight line represents our working range of vision. When over-excitation and fatigue set in, a third line is needed; and thus three geometric series could be used to state the relation between illumination and the sensation of brightness. If the literature contained sufficient numerical references, the brightness function should give an extremely close approximation of the logarithmic law of distribution.

The sense of loudness follows the same rules, as does the sense of weight; and perhaps the same laws operate to make the sense of elapsed time seem so different at ages ten and fifty.

Our music scales are irregular geometric series that repeat rigidly every octave.

In the field of medicine, the response of the body to medicine or radiation is often logarithmic, as are the killing curves under toxins and radiation.

In the mechanical arts, where standard sizes have arisen from years of practical experience, the final results are often geometric series, as witness our standards of wire diameters and drill sizes, and the issued lists of "preferred numbers."

The astronomer lists stars on a geometric brightness scale that multiplies by 100 every five steps and the illuminating engineer adopts the same type of series in choosing the wattage of incandescent lamps.

In the field of experimental atomic physics, where the results represent what occurs among groups of the building units of nature, and where the unit itself is known only by mass action, the test data are statistical averages. The action of a single atom or electron is a random and unpredictable event; and a statistical average of a group of such events would show a statistical relationship to the results and laws here presented. That this is so is evidenced by the frequent use made of semi-log paper in plotting the test data, and the test points often fall on one or more straight lines.

The analogy is complete, and one is tempted to think that the 1, 2, 3, ... scale is not the natural scale; but that, invoking the base e of the natural logarithms, Nature counts $e^0, e^x, e^{2x}, e^{3x}, \ldots$ and builds and functions accordingly.


PART III: DIGITAL ORDERS OF NUMBERS

The natural number system is an array of numbers in simple arithmetic series, but on top of this we have imposed an idea taken from a geometric series. Numbers composed of many digits are ordinarily separated into groups of three digits by interposing commas, and here we unknowingly give evidence of the use of these numbers on a geometric scale. For convenience of description the natural numbers 1 to 10 are called the first digital order numbers, those from 10 to 100 the second digital order, etc. It will be noted that 10 is both the last number of the first order and the first number of the second order, and when an integration is carried out, as will


be done later, 10 appears as both an upper and a lower limit, and it is thus used in this case as a boundary line rather than a unit zone in the natural number system.

FIG. 4. Linear frequencies of the natural numbers in the first three orders.

In Fig. 4 the curves show the frequency with which the natural numbers occur in the Natural Number System. Beginning at the left edge, where 1 is the only number, its frequency is 1; that is, until a second number is added 1 is the entire number system. When 2 is reached the frequency is 0.50 for 1 and 0.50 for 2. At 5, for example, the frequency for each of the first 5 digits is 0.20, and the equal division continues until 9 is reached. At 10, the digit 1 has appeared twice and has a frequency of 0.20 against 0.10 for each of the other eight digits that have appeared but once.

It will be observed that the curve rising from 9 on the scale of abscissae is for only the digit 1, while the curve continuing downward from 9 is for the digits 2 to 9 inclusive. At 19 the frequency curve for 2 rises to join the curve for 1 at 29 and 1


and 2 have a common curve until 99 is reached and a third 1 is about to be added to the series. At any ordinate the curves therefore tell the frequency of first digits in the total number of natural numbers up to that point.

FIG. 5. Continuous and discontinuous functions in the neighborhood of the digit 9.

The curves are drawn as if we were dealing with continuous functions in place of a discontinuous number system. The justification for using a continuous form is that the things we use the number system to represent are nearly always perfectly continuous functions, and the number, say 9, given to any phenomenon will be used in some degree for all the infinite sizes of phenomena between 8 and 10 when we confine ourselves to single digit numbers.


FIG. 6. Theoretical and observed frequencies of single digits 1 to 9. (Observed: frequency of footnotes in 10 books each having at least one page with ten footnotes; 2,968 observed.)

An enlarged sketch of the linear frequency curves at the junction of the first and second orders is given in Fig. 5. The lines h-b and b-j are the computed ratios of 1 in this region, while the line 8-b for the ratio of 9 begins at 8, for as soon as size 8 is passed there is a possibility of our using a 9, while for size 8 1/2 the chances are about equal for calling it either 8 or 9. The summation of area under the curve 8-b-c is taken as the probability of using a 9 for phenomena in this region. This is about equivalent to knowing accurately the size of all phenomena in this region and deciding to call everything between 8.5


and 9.5 by the number 9. Once 9 is passed the curve for 1, b-j, begins to rise in anticipation of the phenomena between 9 and 10 that will be called 10.

It has been noted that for high orders of numbers the areas under the curves of Fig. 2 are proportional to the frequency of use of the first digit. The same demonstration will now be made with the aid of the calculus in regions that are markedly discontinuous.

Selecting the third digital order, Fig. 4, the area under the 1-curve can be written

$$A_1''' = \int_{100}^{199} y_1\,dx + \int_{199}^{999} y_2\,dx + \int_{999}^{1000} y_3\,dx, \qquad (5)$$

where the ordinates of the first rising section of the curve are

$$y_1 = \frac{a - 88}{a}. \qquad (6)$$

The descending section of the curve has ordinates

$$y_2 = \frac{111}{a}, \qquad (7)$$

and the last rising section between 999 and 1,000 has ordinates

$$y_3 = \frac{a - 888}{a}. \qquad (8)$$

The curves are plotted to semi-logarithmic coordinates and

$$x = \log_e a, \qquad (9)$$

$$dx = \frac{da}{a}. \qquad (10)$$

The integrals after making these substitutions give the value

$$A_1''' = \log_e \frac{1990}{999} + \frac{8}{1000}.$$

A similar operation yields for the 1-curve in the second digital order

$$A_1'' = \log_e \frac{190}{99} + \frac{8}{100}$$


and in the first order

$$A_1' = \log_e \frac{10}{9} + \frac{8}{10}.$$

From the symmetry of these solutions, and from running through the solutions for the eight other first digits, we can write the general equation for the Law of Anomalous Numbers

$$F_1^r = \left[\log_e \frac{10\,(2 \cdot 10^{r-1} - 1)}{10^r - 1} + \frac{8}{10^r}\right]\frac{1}{N},$$

$$F_a^r = \left[\log_e \frac{(a + 1)\,10^{r-1} - 1}{a \cdot 10^{r-1} - 1} - \frac{1}{10^r}\right]\frac{1}{N}, \qquad (11)$$

where $N = \log_e 10$ is the factor to convert the expressions from the natural logarithm system, base e, to the common logarithm system, base 10.

If high orders of r are considered, as was unwittingly done in the original statistical work, these expressions simplify by dropping the terms -1 in both numerator and denominator, and the numerical terms having $10^r$ in the denominator become negligible. Hence the general equations become

$$F_1^r = \log_{10} \frac{2}{1}, \qquad (12)$$

$$F_a^r = \log_{10} \frac{a + 1}{a}, \qquad (13)$$

but these two expressions no longer have a difference in form, and they may be merged into

$$F_a^r = \log_{10} \frac{a + 1}{a}, \qquad (14)$$

which was the relationship originally observed for multi-digit numbers.

In Table V numerical values are given for the theoretical frequencies of used numbers for the first, second, third and limiting digital orders.
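The area integral for the 1-curve can be cross-checked by the "plain arithmetic" route mentioned in Part II: take a finite number of ordinates equally spaced on the logarithmic scale and average them. The sketch below assumes exact integer counts for the curve (the fraction of the integers 1 to n that begin with 1); the function names and the number of ordinates are illustrative choices, not part of the paper.

```python
import math


def ones_fraction(n):
    """Fraction of the integers 1..n whose first digit is 1."""
    count, power = 0, 1
    while power <= n:
        # Integers power .. 2*power - 1 begin with 1; clip the block at n.
        count += min(n, 2 * power - 1) - power + 1
        power *= 10
    return count / n


def average_on_log_scale(lo, hi, ordinates=2000):
    """Mean of ones_fraction at points equally spaced in log n (a geometric
    series of n), i.e. the area under the curve on a semi-log plot."""
    ratio = (hi / lo) ** (1 / ordinates)
    points = [round(lo * ratio ** (k + 0.5)) for k in range(ordinates)]
    return sum(ones_fraction(n) for n in points) / ordinates


N = math.log(10)
# Closed form for the third order: the area of Eq. (5) divided by N.
closed_form_third_order = (math.log(1990 / 999) + 8 / 1000) / N

print(ones_fraction(10_000), ones_fraction(19_999))
print(average_on_log_scale(100, 1000), closed_form_third_order)
print(average_on_log_scale(10_000, 100_000), math.log10(2))
```

The first line reproduces the 11.12 per cent and roughly 55.6 per cent quoted in Part II; the second and third lines compare the arithmetic averages with the closed-form third-order value and with the common logarithm of 2.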

TABLE V
THEORETICAL FREQUENCIES IN VARIOUS DIGITAL ORDERS

First   First Order  Second Order  Third Order   Limiting
Digit   1 to 10      10 to 100     100 to 1000   Order
1       0.39319      0.31786       0.30276       0.30103
2       0.25760      0.17930       0.17638       0.17609
3       0.13266      0.12432       0.12487       0.12494
4       0.08152      0.09479       0.09669       0.09691
5       0.05348      0.07631       0.07889       0.07918
6       0.03575      0.06366       0.06662       0.06695
7       0.02352      0.05444       0.05764       0.05799
8       0.01456      0.04742       0.05078       0.05115
9       0.00772      0.04190       0.04537       0.04576
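The entries of Table V can be regenerated from Eq. (11) and its limiting form, Eq. (14). The sketch below is one direct transcription of those expressions, with the digit 1 handled by the first expression and the digits 2 to 9 by the second; the function names are illustrative.

```python
import math

N = math.log(10)  # the factor N = log_e 10 of Eq. (11)


def order_frequency(a, r):
    """Eq. (11): frequency of first digit a (1..9) in the r-th digital order."""
    scale = 10 ** (r - 1)
    if a == 1:
        area = math.log(10 * (2 * scale - 1) / (10 * scale - 1)) + 8 / (10 * scale)
    else:
        area = math.log(((a + 1) * scale - 1) / (a * scale - 1)) - 1 / (10 * scale)
    return area / N


def limiting_frequency(a):
    """Eq. (14): the limiting form for high orders."""
    return math.log10((a + 1) / a)


# One row per first digit: first, second, third and limiting orders.
for a in range(1, 10):
    row = [order_frequency(a, r) for r in (1, 2, 3)] + [limiting_frequency(a)]
    print(a, "  ".join(f"{value:.5f}" for value in row))

# Within any one order the nine frequencies should sum to unity.
for r in range(1, 7):
    assert abs(sum(order_frequency(a, r) for a in range(1, 10)) - 1.0) < 1e-12
```

The closing loop checks numerically, for the first six orders, the condition that the frequencies within an order sum to unity.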

The frequencies of the single digits 1 to 9 vary enough from the frequencies of the limiting order to allow a statistical test if a source of digits used singly can be found. The footnotes so commonly used in technical literature are an excellent source, consisting of units that are indicated by numbers, letters or symbols.

The procedure of collecting data for the first-order numbers was to make a cursory examination of a volume to see if it contained as many as 10 footnotes to a page, for obviously no test of the range 1 to 9 could be made if the maximum number fell short of the full range. The numbers here recorded in Table VI are the number of footnotes observed on consecutive pages, beginning on page 1 and continuing to the end of the book, or until it seemed that a fair sample of the book had been obtained. The books used were the Standard Handbook for Electrical Engineers, Smithsonian Physical Tables, Handbuch der Physik and Glazebrook's Dictionary of Applied Physics.

In Table VI the observed percentages of single digits 1 to 9 are given along with the number of pages used in each volume and the number of footnotes observed. The frequency for 1 is seen to be 43.2 per cent as against the theoretical frequency of 39.3 per cent, and for the digit 9 the observations agree with theory with $F_9' = 0.8$ per cent.


In general the agreement with theory is as good as the computed probable errors of the observation.

TABLE VI
COUNT OF FOOTNOTES

                    Pages             Frequencies, in Per Cent                    Total
    Volume          Used       1     2     3     4     5     6     7     8     9  Count
1.  S. H. E. E.     All    55.1  22.7  12.3   5.0   2.4   1.7   0.3   0.3   0.3    586
2.  Sm. Phy. Ta.    All    56.3  22.1   6.6   6.1   5.0   2.2   1.1   0.6   0.0    181
3.  H. der Phy.     360    37.2  25.7  12.1   9.5   4.8   5.2   2.6   2.2   0.9    230
4.  H. der Phy.     360    29.7  26.6  14.6  11.0   8.0   5.9   1.8   1.0   1.4    287
5.  H. der Phy.     365    19.5  17.4  17.7  11.9  11.3   9.2   6.1   5.8   1.1    293
6.  H. der Phy.     361    52.8  23.6   8.5   5.5   4.0   3.2   0.8   0.0   1.6    127
7.  H. der Phy.     360    33.0  27.5  11.8  10.7   4.3   5.9   2.8   2.4   1.6    254
8.  H. der Phy.     360    56.8  23.2   6.7   7.6   2.4   1.4   0.5   1.4   0.0    211
9.  Glazebrook I    All    49.5  22.3  13.7   6.9   2.3   1.5   1.5   1.5   0.8    394
10. Glazebrook V    All    41.7  25.2  13.4   9.1   4.7   3.2   1.7   0.5   0.5    405
    Observed Ave.          43.2  23.6  11.8   8.3   4.9   3.9   1.9   1.6   0.8   2968
    Predicted Ave.         39.3  25.7  13.3   8.1   5.3   3.6   2.4   1.5   0.8
    Difference             +3.9  -2.1  -1.5  +0.2  -0.4  +0.3  -0.5  +0.1   0.0
    Probable Error         ±3.0  ±0.6  ±0.7  ±0.5  ±0.6  ±0.5  ±0.4  ±0.4  ±0.4

Summation of Frequencies

One of the conditions that must be met by these expressions for the frequencies of the integers is that, in any one order, the sum of the frequencies must equal unity; that is, the sum of their probabilities must equal certainty.

Selecting the first-order digits, Eq. 11, and remembering the logarithmic rule that the sum of the logarithms of a group of numbers is equal to the logarithm of their combined products, we have the probability $P'$

$$P' = \log_{10} \frac{10 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7 \cdot 8 \cdot 9}{9 \cdot 1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7 \cdot 8} + \left[\frac{8}{10} - \frac{1}{10} - \frac{1}{10} - \frac{1}{10} - \frac{1}{10} - \frac{1}{10} - \frac{1}{10} - \frac{1}{10} - \frac{1}{10}\right]\frac{1}{N},$$

which reduces to

$$P' = \log_{10} 10 + 0 = 1.$$

In a similar manner from the complete set of equations


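indicated by Eq. 11 we have

$$P'' = \log_{10} \frac{190 \cdot 29 \cdot 39 \cdot 49 \cdot 59 \cdot 69 \cdot 79 \cdot 89 \cdot 99}{99 \cdot 19 \cdot 29 \cdot 39 \cdot 49 \cdot 59 \cdot 69 \cdot 79 \cdot 89} + \left[\frac{8}{100} - \frac{1}{100} - \frac{1}{100} - \frac{1}{100} - \frac{1}{100} - \frac{1}{100} - \frac{1}{100} - \frac{1}{100} - \frac{1}{100}\right]\frac{1}{N}$$

$$= \log_{10} 10 + 0 = 1$$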

and similar proof can be worked out for the other orders.

Summary of Part III

Single digits, regardless of their relation to the decimal point and also regardless of preceding or following zeros, have a specific natural frequency that varies sharply from the logarithmic ratios. The second digital order, which is composed of two adjacent significant digits, has a specific frequency approximating the logarithmic frequency; and for three or more associated digits the variation from the latter frequency would be extremely difficult to find statistically.

The basic operation

$$F = \int \frac{da}{a} \quad \text{or} \quad F_a = \int_a^{a+1} \frac{da}{a}$$

in converting from the linear frequency of the natural numbers to the logarithmic frequency of natural phenomena and human events can be interpreted as meaning that, on the average, these things proceed on a logarithmic or geometric scale. Another way of interpreting this relation is to say that small things are more numerous than large things, and there is a tendency for the step between sizes to be equal to a fixed fraction of the last preceding phenomenon or event. There is


no necessity or implication of limits at either the upper or the lower regions of the series.

If the view is accepted that phenomena fall into geometric series, then it follows that the observed logarithmic relationship is not a result of the particular numerical system, with its base, 10, that we have elected to use. Any other base, such as 8, or 12, or 20, to select some of the numbers that have been suggested at various times, would lead to similar relationships; for the logarithmic scales of the new numerical system would be covered by equally spaced steps by the march of natural events.

As has been pointed out before, the theory of anomalous numbers is really the theory of phenomena and events, and the numbers but play the poor part of lifeless symbols for living things.
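The remark on other bases can be tried numerically. The sketch below takes a geometric series as a stand-in for "phenomena that fall into geometric series" and counts its leading digits when written in base 8, 10, 12 or 20, comparing them with the analogue of the logarithmic law in that base. The constant factor 3, the number of terms and the function names are arbitrary assumptions of this sketch, not values from the paper.

```python
import math


def leading_digit_frequencies(base, factor, terms=20000):
    """Empirical leading-digit frequencies, written in `base`, of the
    geometric series factor**1, factor**2, ..., factor**terms."""
    step = math.log(factor, base)
    counts = [0] * base
    for k in range(1, terms + 1):
        mantissa = (k * step) % 1.0          # fractional part of log_base(term)
        digit = min(int(base ** mantissa), base - 1)
        counts[digit] += 1
    return [c / terms for c in counts]


def logarithmic_law(base):
    """Expected frequency log_base((a + 1) / a) for a = 1 .. base - 1
    (index 0 is a placeholder, since 0 is never a leading digit)."""
    return [0.0] + [math.log((a + 1) / a, base) for a in range(1, base)]


# For each base, print the largest gap between observed and expected frequency.
for base in (8, 10, 12, 20):
    observed = leading_digit_frequencies(base, factor=3)
    expected = logarithmic_law(base)
    worst = max(abs(o - e) for o, e in zip(observed, expected))
    print(base, round(worst, 4))
```

The factor should not be a rational power of the base: a series of powers of 2 written in base 8, for instance, only ever begins with 1, 2 or 4, so it cannot fill the logarithmic scale evenly.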
