BU-857-M

Sequential Procedure for Testing Germination Rates of Seeds Stored in Seedbanks
September 1984

by

Issac Bekele1
Department of Crop Science, The University of The West Indies, St. Augustine, Trinidad

1 Paper prepared when the author was a visiting fellow in Biometrics Unit, Cornell University, Ithaca, N.Y.

Summary

Samples of seeds stored for long-term conservation in seedbanks have to be monitored regularly in order to check the viability status of the seeds. In previous work, each inspection has been regarded as a separate statistical test of the null hypothesis that the sample needs regeneration. Here an overall procedure that treats each inspection as part of a single process and subjects the inspections to overall error rates is developed. Properties of the procedure are examined and compared with those of other procedures.

Key words

Conservation; Germination test; Overshoot; Power 1 type tests; Seedbanks.

1. Introduction

In technologically advanced countries farmers use modern cultivars (high yielding, disease resistant, etc.) as opposed to the traditional varieties which have commonly been used by farmers in developing countries. But in recent times these latter farmers have been slowly shifting to introduced cultivars and abandoning the traditional varieties. Continuous use of modern cultivars with desirable characteristics is feasible only if a broad genetic base, which can be used as a pool for producing new varieties, is retained for each species of crop plant. This shift has exposed the natural gene pool to extinction (Frankel and Bennett, 1970).
In an attempt to control this process of genetic erosion, measures are being undertaken at different levels throughout the world. Seeds of different species of traditional cultivated crops are being systematically collected and stored under conditions believed to prolong the survival of the seeds. Such storage facilities are termed 'genebanks' or 'seedbanks'. This approach is believed to be the cheapest and safest method of conserving crop genetic materials.
Each sample of seeds is given a unique identification number either at point of collection from the fields or time of exchange and is referred to as an accession. All the accessions are kept under similar conditions but each is monitored separately.

Although under proper storage conditions the process of aging is believed to slow down, regular germination tests should be carried out on samples taken from an accession to check whether viability has dropped to a level that requires regeneration of the accession. It has been argued that increases in the percentage of cells of surviving seeds which show chromosome aberrations, and in the incidence of mutant phenotypes in succeeding generations, are correlated with loss of viability (Abdella and Roberts, 1968 and 1969). Let $p$ denote the proportion of viable seeds and $p_{\min}$ be the minimum $p$ such that its consequences for chromosome aberrations in surviving seeds and mutant phenotypes in succeeding generations are within tolerable limits. Hence the accession can be kept in storage without a need for regeneration as long as $p$ does not drop below $p_{\min}$. But if $p$ drops to $p_{\min}$, then the accession must be regenerated and new seeds stored.
Monitoring viability involves germinating seeds sampled from the accession. Usually the first test is carried out $t_1$ years after initial storage, and a formal statistical test is made using the data from the germination test to determine whether or not to regenerate the accession. If the evidence is against regeneration, the seeds are kept in the store until the next inspection time; otherwise the accession is regenerated and new seeds are stored.
Thus before regenerating an accession, a number of tests are carried out on groups of seeds sampled from it at different stages of its life in the store. Since these tests are destructive, sufficient seeds must be stored initially to ensure availability of seeds for exchange, successive tests and regeneration when it is necessary. Hence, it is evident that both the frequency of inspection and the number of seeds used for each test are important factors in determining the initial size of an accession. Therefore, adoption of a statistical procedure that requires fewer seeds for tests is highly desirable.
The sizes of the overall error rates are also important. The error rate that most needs to be controlled is the probability of failing to regenerate the accession in time. If this rate is high, in the long run the seedbank would lose some of its most valuable genetic material. Secondly, it is desirable that the procedure stop at, or as close as possible to, the true time of regeneration, because this could cut down on the long-term cost of the seedbank.
A sequential probability ratio test (SPRT) for testing percentage germination of seeds has been suggested for use in seedbanks (Ellis, Roberts and Whitehead, 1980 and Whitehead, 1981). The SPRT and the fixed sample approach both consider inspections at different times as unrelated statistical problems rather than parts of an overall process, and result in separate significance statements (inspection-wise error rates). Although in both cases the inspection-wise error rates are known, the overall error rates are unknown. Nevertheless, it is possible to estimate the unknown overall error rates for each of these approaches by computer simulation for comparison purposes.

At inspection time $t_i$, the new procedure makes use of information from all inspections up to time $t_{i-1}$ and updates it with current information from the germination test. Based on this cumulated information about the viability condition of the seeds, a decision is made whether or not to regenerate the accession. Hence the whole monitoring process is treated as a single act. The method is based on the assumption that, for any fixed time period $t$, the number of germinating seeds out of $n$ tested is binomially distributed with probability of germination $p(t)$. In addition, it is assumed that the logit of $p(t)$ is a linear function of $t$. The test procedure is developed with some modification analogous to the power 1 type tests of Darling and Robbins (1967, 1968) for iid normal random variables.

2. Formulation of the Problem

Let $p(t_i)$ denote the germination rate of the accession at time $t_i$ and $T$ be the true time of regeneration ($T$ is unknown). Next let

$$p_0 = p(t_0) \quad\text{and}\quad p_{\min} = p(T).$$

$p_0$ is the initial germination rate and $p_{\min}$ is the terminal germination rate. Hence $T$ denotes the true time it takes for $p(t_i)$ to drop from $p_0$ to $p_{\min}$. At each inspection time a germination test is made and the following hypotheses assessed:

$$H_0: p \le p_{\min} \qquad H_A: p > p_{\min}$$

The accession is kept in the store as long as the evidence supports $H_A$ and there are sufficient seeds for future testing. Now consider a case where tests are carried out on a single seed basis and let $t_1, t_2, \ldots, t_i, \ldots$ denote predetermined inspection times (note that the $t_i$'s need not all be different, since in practice tests are carried out on a number of seeds at any given inspection time). Define

$$x_i = \begin{cases} 1, & \text{if a seed planted at } t_i \text{ germinated} \\ 0, & \text{otherwise.} \end{cases}$$

If

$$P(x_i = 1) = p(t_i),$$

then $x_i$ is a Bernoulli random variable with parameter $p(t_i)$. The loglikelihood of $p(t_i)$ is given by:

$$l(p(t_i)) = \sum_i x_i \log\{p(t_i)/(1 - p(t_i))\} + \sum_i \log\{1 - p(t_i)\} \tag{2.1}$$

Let

$$\operatorname{logit}(p(t_i)) = \log\{p(t_i)/(1 - p(t_i))\} \tag{2.2}$$

be denoted by $R(t_i)$. Assume that $R(t_i)$ has the following form:

$$R(t_i) = R_0 - \beta t_i \tag{2.3}$$

where $R_0$ is the logit of $p_0$. $\beta$ is the rate of deterioration of seeds per unit time on a logistic scale. It is a general parameter that includes the true rate of deterioration.

Hence the loglikelihood of $p(t_i)$ reparameterized in terms of $\beta$ is:

$$l(\beta) = \sum_i x_i (R_0 - \beta t_i) - \sum_i \log\{1 + \exp(R_0 - \beta t_i)\}. \tag{2.4}$$

Under this parameterization, it is desirable to regenerate the accession when $R(t_i)$ drops to $R_1$ $(= R(T))$, and to maintain the accession in the store otherwise. $R_1$ is the logit of $p_{\min}$.
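The linear-logit model and the reparameterized loglikelihood can be illustrated numerically. The following is a minimal Python sketch, not the author's code; the function names are hypothetical, and the values $p_0 = 0.99$, $p_{\min} = 0.85$ and $T = 100$ years are borrowed from the simulations in the Discussion purely for illustration. It assumes the slope that takes $R(t)$ from $R_0$ to $R_1$ at time $T$ is $(R_0 - R_1)/T$.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def expit(r):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-r))

def germination_prob(t, r0, beta):
    """p(t) under the linear-logit model R(t) = R0 - beta * t."""
    return expit(r0 - beta * t)

def loglik(beta, r0, times, outcomes):
    """Loglikelihood in beta for single-seed outcomes x_i in {0, 1} at times t_i."""
    total = 0.0
    for t, x in zip(times, outcomes):
        r = r0 - beta * t
        total += x * r - math.log(1.0 + math.exp(r))
    return total

# Illustrative values only: p0 = 0.99, pmin = 0.85, T = 100 years.
r0, r1, T = logit(0.99), logit(0.85), 100.0
beta_T = (r0 - r1) / T                      # slope taking R(t) from R0 to R1 at T
p_at_T = germination_prob(T, r0, beta_T)    # recovers pmin up to rounding
```

With this slope the germination rate at $T$ comes back as $p_{\min}$, which is a quick check that the logit scale and the probability scale are being converted consistently.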

3. Test Procedure

The test statistics are defined and the stopping rule is given below. An approximate overshoot correction is incorporated into the procedure.

3.1 Derivation of Test Statistics

If $s$ denotes the current time, it is desirable to regenerate the accession when $s$ coincides with $T$, where $s < T$ until then.

Suppose that each time an inspection is made it is pretended that 'it is now time to regenerate the accession'. Let $\beta_s$ denote the rate of deterioration of seeds under this pretense. Hence, at time $s$ we have the following logistic regression line:

$$R_s(t_i) = R_0 - \beta_s t_i \quad\text{for } t_i = t_1, t_2, \ldots, s \tag{3.1.1}$$

where

$$\beta_s = (R_0 - R_1)/s. \tag{3.1.2}$$

The true logistic regression line is

$$R_T(t_i) = R_0 - \beta_T t_i \tag{3.1.3}$$

where

$$\beta_T = (R_0 - R_1)/T. \tag{3.1.4}$$

$\beta$ includes all $\beta_s$'s and $\beta_T$.

The hypothesis can now be expressed in terms of $\beta$ as follows:

$$H_0: \beta_s \le \beta_T \qquad H_A: \beta_s > \beta_T$$

Figure 3.1.1 shows the relationship between $R_s(t_i)$ and $R_T(t_i)$.

(Figure 3.1.1 goes here)
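The relationship between the pretended line and the true line can be checked numerically. The sketch below is an illustration under assumed values ($p_0 = 0.99$, $p_{\min} = 0.85$, $T = 100$, inspections every 20 years); `pretended_probs` is a hypothetical helper name, and the slopes are taken to be $(R_0 - R_1)/s$ and $(R_0 - R_1)/T$.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(r):
    return 1.0 / (1.0 + math.exp(-r))

def pretended_probs(r0, r1, s, times):
    """p_s(t_i) from R_s(t_i) = R0 - beta_s * t_i with beta_s = (R0 - R1) / s."""
    beta_s = (r0 - r1) / s
    return [expit(r0 - beta_s * t) for t in times if t <= s]

r0, r1 = logit(0.99), logit(0.85)    # illustrative p0 and pmin
T = 100.0                            # illustrative true regeneration time
times = [20.0, 40.0, 60.0]
s = times[-1]                        # current inspection time, here s < T
beta_s, beta_T = (r0 - r1) / s, (r0 - r1) / T
p_s = pretended_probs(r0, r1, s, times)          # pretended germination rates
p_T = [expit(r0 - beta_T * t) for t in times]    # true germination rates
```

For $s < T$ the pretended line is steeper than the true line, so every pretended rate lies below the corresponding true rate, and the pretended rate at the current time $s$ equals $p_{\min}$.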

Now from (3.1.2) and (3.1.4),

$$\beta_s > \beta_T \quad\text{as long as } s < T.$$

Hence it is desirable to regenerate the accession when $\beta_s = \beta_T$. Otherwise, define

$$Z = \sum_i t_i\,(Y_i - E_s(Y_i)) \tag{3.1.5}$$

and

$$V = \sum_i t_i^2\, n_i\, E_s(Y_i/n_i)\,[1 - E_s(Y_i/n_i)]. \tag{3.1.6}$$

Summation is over all inspection times up to the current time $s$. $Y_i$ is the number of germinating seeds among the $n_i$ seeds tested at time $t_i$, and $E_s$ is the expectation under the pretended assertion 'it is now time to regenerate the accession' (refer to Appendix B). Hence

$$E_s(Y_i) = n_i\, p_s(t_i).$$

The $p_s(t_i)$'s are computed from the logits derived from $R_s(t_i)$.

$$E(Z) \begin{cases} > 0 & \text{for all } s < T \\ = 0 & \text{at } s = T \\ < 0 & \text{for } s > T. \end{cases} \tag{3.1.7}$$

So $E(Z)$ is a decreasing function of $t$, and $Z$ has a different distribution at each time $t_i$. Now, by analogy to the Darling and Robbins (1967, 1968) procedure (Appendix A), modified (Appendix B) to serve the requirements of seedbanks, the following stopping rule can be used:

$$\text{regenerate the accession if } Z \le a(V); \quad \text{continue otherwise} \tag{3.1.8}$$

where

$$a(V) = \{(V+1)[\log(V+1) - 2\log 2\alpha]\}^{1/2}. \tag{3.1.9}$$

$\alpha$ is the type I error of the Darling and Robbins procedure and it can be chosen as small as desired. Then the following hold:

$$P(\text{stopping too late}) < \alpha \tag{3.1.10}$$

$$P(\text{stopping too early}) \to 0 \text{ as } n \to \infty \tag{3.1.11}$$

The test terminates with probability 1 as $n \to \infty$ (refer to Appendix C for proofs).

So at each inspection time, $Z$ and $a(V)$ are computed and, based on the evidence, either the accession is regenerated or sampling is continued.

The procedure controls the probability of stopping too late as desired. Secondly, the test terminates with probability 1 as $n$ increases at $t_i = T$.
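The decision made at a single inspection can be sketched in Python. This is an illustrative reading of the rule, not the author's implementation: it takes $Z = \sum t_i (Y_i - n_i p_s(t_i))$, $V = \sum t_i^2 n_i p_s(t_i)(1 - p_s(t_i))$ and the boundary $a(V) = \{(V+1)[\log(V+1) - 2\log 2\alpha]\}^{1/2}$, with $p_s(t_i)$ obtained from the pretended slope $(R_0 - R_1)/s$. The function names and the example counts are hypothetical.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(r):
    return 1.0 / (1.0 + math.exp(-r))

def boundary(v, alpha):
    """a(V) = sqrt((V + 1) * (log(V + 1) - 2 * log(2 * alpha)))."""
    return math.sqrt((v + 1.0) * (math.log(v + 1.0) - 2.0 * math.log(2.0 * alpha)))

def score_and_information(r0, r1, times, counts, sizes):
    """Z and V at the current time s, taken to be the last entry of `times`."""
    s = times[-1]
    beta_s = (r0 - r1) / s                # pretended rate of deterioration
    z = v = 0.0
    for t, y, n in zip(times, counts, sizes):
        p = expit(r0 - beta_s * t)        # p_s(t_i), so E_s(Y_i) = n_i * p
        z += t * (y - n * p)
        v += t * t * n * p * (1.0 - p)
    return z, v

def decide(r0, r1, times, counts, sizes, alpha=0.05):
    """Apply the stopping rule: regenerate if Z <= a(V), otherwise continue."""
    z, v = score_and_information(r0, r1, times, counts, sizes)
    return "regenerate" if z <= boundary(v, alpha) else "continue"

# Hypothetical first inspection at year 20 with 100 seeds, p0 = 0.99, pmin = 0.85.
r0, r1 = logit(0.99), logit(0.85)
all_germinated = decide(r0, r1, [20.0], [100], [100])   # far above pretended 85%
at_threshold = decide(r0, r1, [20.0], [85], [100])      # exactly the pretended 85%
```

Here a test in which every seed germinates gives $Z = 300$ against a boundary of about 259, so sampling continues, whereas a test that returns exactly the pretended rate gives $Z \approx 0$ and the accession is regenerated.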
3.2 Correction for Overshoot

Examination of the properties of the procedure indicates that it is certainly conservative. The probability of failing to stop is lower than the desired level $\alpha$, and secondly, for small sample sizes the procedure could lead to early stoppings. Therefore, an approximate correction is incorporated into the procedure by analogy to Siegmund (1979) and Whitehead (1981).

At the current inspection time $s$, information increases at rate $I_s$, where

$$I_s = n_s\, s^2\, p_s(s)\,(1 - p_s(s)). \tag{3.2.1}$$

Then an approximate correction is

$$\delta_s = 0.583\sqrt{I_s}. \tag{3.2.2}$$

The procedure (3.1.8) becomes

$$\text{regenerate the accession if } Z_c \le a(V); \quad \text{continue otherwise} \tag{3.2.3}$$

where

$$Z_c = Z + \delta_s. \tag{3.2.4}$$

The correction increases at a smaller rate than $V$, and therefore the properties (3.1.10) and (3.1.11) still hold. The effect of the correction factor can be especially marked when small sample sizes are used.
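The size of the correction is easy to compute. The sketch below assumes the information rate at the current inspection is $I_s = n_s s^2 p_s(s)(1 - p_s(s))$, the correction is $0.583\sqrt{I_s}$, and the corrected statistic is $Z$ plus that correction; these readings, the function names and the example values are assumptions made for illustration. Under the pretense that it is now time to regenerate, $p_s(s)$ equals $p_{\min}$.

```python
import math

def overshoot_correction(n_s, s, p_s_at_s):
    """0.583 * sqrt(I_s), with I_s = n_s * s^2 * p_s(s) * (1 - p_s(s))."""
    info_rate = n_s * s * s * p_s_at_s * (1.0 - p_s_at_s)
    return 0.583 * math.sqrt(info_rate)

def corrected_z(z, n_s, s, p_min):
    """Corrected statistic Z_c = Z + correction, using p_s(s) = p_min."""
    return z + overshoot_correction(n_s, s, p_min)

# Hypothetical inspection: 100 seeds tested at year 20, pmin = 0.85.
correction = overshoot_correction(100, 20.0, 0.85)
z_c = corrected_z(0.0, 100, 20.0, 0.85)
```

Because the correction grows like the square root of the information while $V$ grows linearly in it, the adjustment matters most when few seeds are tested.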

4. Discussion

Computer simulation was used to examine the properties of the procedure and to make comparisons between different tests. Table 4.1 gives the estimated error probability ($\hat{\alpha}$) from 1000 replicates for each of two different sample sizes. $T$ was set at 100 years and $p_{\min}$ at 0.85. The value used for $\alpha$ was 0.05.

Table 4.1 Estimated error probabilities ($\hat{\alpha}$) for two initial germination rates $p_0 = 0.99$ and $0.95$.

p_0      n = 100    n = 1000
0.95     0.001      0.001
0.95     0.002      0.000

For each of these simulations, inspection intervals of equal size (five years) were used, with the first inspection at year five. The overall error rate was considerably smaller than $\alpha$, as expected. It is also important to note that the sample size has no appreciable effect on the error rate.
Table 4.2 gives estimates of the probability of stopping too late for the SPRT, the new approach and the fixed sample case. For each case an estimate of $\alpha$ based on 1000 runs is given. Two initial germination rates were used. Groups of 40 seeds were used for the SPRT, which led to the use of an average of 116 and 194 seeds for $p_0 = 0.99$ and $0.95$ respectively at any given inspection time. For the fixed sample case 467 seeds were used per test. An inspection interval of 20 years was used, starting with year 20, until the test terminated. $T$ was fixed at 100 years.
Table 4.2 Estimates of probability of stopping too late for the three procedures

Tests                      p_0 = 0.95    p_0 = 0.99
SPRT                       0.02          0.049
New Procedure*             0.009         0.01
Fixed Sample               0.25          0.058
New Procedure (n = 467)    0.004         0.004

* The new approach's estimates are based on sample sizes of 194 and 116 for $p_0 = 0.95$ and $0.99$ respectively, which are the same as the averages for the SPRT.

The fixed sample approach requires 467 seeds to achieve the same result as the SPRT. An elaborate comparison of the SPRT and the fixed sample approach is given by Ellis et al. (1980). The fixed sample approach is extremely wasteful compared to the other two.

The SPRT stops too late on average about 3.5 times more often than the new approach for the same average sample size. Hence the new approach performs better than the SPRT in this respect.

The use of the error rate to compare different procedures without considering the effect of inspection times could be unsatisfactory.

It would be interesting to see the magnitude of such an error when the inspection grid misses the desired time of regeneration. In fact this is one of the serious problems of predetermined inspection times. If the last inspection is carried out at $t_m$, where $t_m > T$, the error rate should be higher for any procedure. Its size of course depends on the difference $t_m - T$.
Simulation was carried out to study the effect of inspection times on error rates for the SPRT and the new approach (Table 4.3). Inspections were made at equal intervals of 20 years starting at 20 years for both cases. $T$ was fixed at 90 years and an initial germination rate of 0.99 was used. 1000 replicated runs were made for both approaches. Groups of 40 seeds were used for the SPRT, which led to the use of an average of 145 seeds per inspection time. So 145 seeds per test were used in the simulation for the new procedure.
Table 4.3 Frequency of stoppages at different times of inspection out of 1000 replicates each for SPRT and new procedure.

Inspection times (yrs)    20    40    60    80     100    120
SPRT                      0     0     0     260    738    2
New Procedure             0     0     39    745    216    0
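A simulation of this kind can be sketched as follows. This is a minimal Monte Carlo outline under stated assumptions, not the program used for the tables: it applies the uncorrected rule (regenerate when $Z \le a(V)$), uses the settings quoted above ($p_0 = 0.99$, $p_{\min} = 0.85$, $T = 90$, 145 seeds every 20 years, $\alpha = 0.05$), and adds a hypothetical 200-year horizon, a reduced number of replicates and a fixed seed. Its tallies should not be expected to match the published frequencies.

```python
import math
import random

def logit(p):
    return math.log(p / (1.0 - p))

def expit(r):
    return 1.0 / (1.0 + math.exp(-r))

def boundary(v, alpha):
    return math.sqrt((v + 1.0) * (math.log(v + 1.0) - 2.0 * math.log(2.0 * alpha)))

def stopping_time(rng, p0, p_min, true_T, n, interval, horizon, alpha):
    """Follow one accession; return the inspection time at which it is regenerated."""
    r0, r1 = logit(p0), logit(p_min)
    beta_T = (r0 - r1) / true_T               # true rate of deterioration
    times, counts = [], []
    s = interval
    while s <= horizon:
        p_true = expit(r0 - beta_T * s)
        counts.append(sum(rng.random() < p_true for _ in range(n)))  # binomial draw
        times.append(s)
        beta_s = (r0 - r1) / s                # pretended rate at the current time
        z = v = 0.0
        for t, y in zip(times, counts):
            p = expit(r0 - beta_s * t)
            z += t * (y - n * p)
            v += t * t * n * p * (1.0 - p)
        if z <= boundary(v, alpha):
            return s
        s += interval
    return None                               # not stopped within the horizon

def tally(replicates=200, seed=1):
    """Frequency of stopping at each inspection time over independent replicates."""
    rng = random.Random(seed)
    freq = {}
    for _ in range(replicates):
        t = stopping_time(rng, 0.99, 0.85, 90.0, 145, 20.0, 200.0, 0.05)
        freq[t] = freq.get(t, 0) + 1
    return freq

freq = tally()
```

Sorting `freq` by key gives a frequency table in the same layout as Table 4.3; the overshoot correction of section 3.2 could be added by increasing `z` before the comparison.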

When the last inspection is carried out after the true time of regeneration, which could happen in practice if predetermined times of inspection are used, the SPRT will stop more frequently at the first time after $T$ than at the last time before $T$. For the same average sample size, the new procedure, however, will stop more frequently at the last time before $T$ than at the first time after $T$.
Although adoption of statistical procedures with desirable properties, such as seed saving and ideally smaller error probabilities, is important, their vulnerability to changes in inspection times must also be accounted for. In practice this is a more serious problem because, for thousands of accessions of different species of crop plants, the desirable times of regeneration are not known. An objective method of estimating these inspection times should be sought.
Certainly the new procedure shows better performance, in terms of a smaller error rate, than the SPRT using the same average sample size per inspection. Even if the last inspection is carried out after the true time of regeneration, fewer accessions will be regenerated after $T$ years if the new procedure is used. Another important property of the new procedure is that it enables stochastic estimation of inspection times using germination information. Therefore, the procedure is a powerful statistical tool.

Acknowledgement

The author wishes to thank Dr. C. E. McCulloch, Dr. D. Robson and Dr. J. Whitehead for helpful and inspiring discussions and valuable comments at different stages of the development of this work. Assistance provided by Cornell University is highly appreciated.

References
Abdella, F. H. and Roberts, E. H. (1968). Effects of temperature, moisture and oxygen on the induction of chromosome damage in seeds of barley, broad beans and peas during storage. Annals of Botany 32, 119-136.
Abdella, F. H. and Roberts, E. H. (1969). The effects of temperature and moisture on the induction of genetic changes in seeds of barley, broad beans and peas during storage. Annals of Botany 33, 153-167.
Bekele, I. (1981). Monitoring Accessions in Seedbanks. M.S. Thesis (unpublished). University of Reading: Reading.
Darling, D. A. and Robbins, H. (1967). Inequalities for the sequences of sample means. Proceedings of the National Academy of Sciences 57, 1577-1580.
Darling, D. A. and Robbins, H. (1967). Confidence sequences of sample mean, variance and median. Proceedings of the National Academy of Sciences 58, 66-68.
Darling, D. A. and Robbins, H. (1968). Some further remarks on inequalities for sample sums. Proceedings of the National Academy of Sciences 60, 1175-1182.
Ellis, R., Roberts, E. H. and Whitehead, J. (1980). A new more economic and accurate approach to monitoring the viability of accessions during storage in seedbanks. Plant Genetic Resources Newsletter 41, 3-17.
Frankel, O. H. and Bennett, E. (1970). Genetic Resources in Plants - their Exploration and Conservation. Oxford: Blackwell Scientific Publications.
International Board for Plant Genetic Resources (1976). Report of the IBPGR Working Group on Engineering, Design and Cost Aspects of Long-term Seed Storage Facilities. Rome: IBPGR.
Siegmund, D. (1979). Corrected diffusion approximation in certain random walk problems. Advances in Applied Probability 11, 701-719.
Whitehead, J. (1981). The use of the sequential probability ratio test for monitoring the percentage germination of accessions in seed banks. Biometrics 37, 129-136.

Appendix A

Power One type tests for Normal Random Variables

Let $x_1, x_2, \ldots$ be iid normal random variables with $E(x_i) = \theta$ and $V(x_i) = 1$. Suppose interest lies in testing

$$H_0: \theta \le 0 \quad\text{versus}\quad H_1: \theta > 0.$$

Assume it is desirable to continue with sampling as long as $H_0$ is true, and to quit sampling otherwise and take some appropriate action.

Darling and Robbins have suggested the following type of procedure: continue with sampling as long as

$$S_m < a(m),$$

where

$$S_m = x_1 + \cdots + x_m \tag{A.1}$$

and

$$a(m) = \{(m+1)[\log(m+1) - 2\log 2\alpha]\}^{1/2}. \tag{A.2}$$

Under $H_0$: $S_m \sim N(0, m)$. Each time a sample is drawn, both $S_m$ and $a(m)$ are computed and compared. The procedure calls for termination of inspection when $S_m \ge a(m)$.

Darling and Robbins show that:

$$P_{H_0}(S_m \ge a(m) \text{ for some } m \ge 1) \le \alpha \tag{A.3}$$

$$P_{H_1}(S_m \ge a(m) \text{ for some } m \ge 1) \to 1 \text{ as } m \to \infty. \tag{A.4}$$

In the next section a modified version of this procedure, to suit the special case of monitoring percentage viability, is given.
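The behaviour of the boundary can be illustrated by simulation. The following Python sketch assumes the boundary $a(m) = \{(m+1)[\log(m+1) - 2\log 2\alpha]\}^{1/2}$ and unit-variance normal observations; the function names, the cap of 500 observations, the 400 replicates and the seed are all arbitrary choices made for illustration.

```python
import math
import random

def boundary(m, alpha):
    """a(m) = sqrt((m + 1) * (log(m + 1) - 2 * log(2 * alpha)))."""
    return math.sqrt((m + 1.0) * (math.log(m + 1.0) - 2.0 * math.log(2.0 * alpha)))

def first_crossing(theta, alpha, max_m, rng):
    """Smallest m <= max_m with S_m >= a(m), or None if the walk never crosses."""
    s_m = 0.0
    for m in range(1, max_m + 1):
        s_m += rng.gauss(theta, 1.0)      # x_m ~ N(theta, 1)
        if s_m >= boundary(m, alpha):
            return m
    return None

rng = random.Random(0)
# Count how many of 400 simulated walks cross the boundary within 500 steps.
crossed_null = sum(first_crossing(0.0, 0.05, 500, rng) is not None for _ in range(400))
crossed_alt = sum(first_crossing(1.0, 0.05, 500, rng) is not None for _ in range(400))
```

With $\theta = 0$ only a small fraction of walks should ever reach the boundary, while with $\theta = 1$ the sum grows linearly and overtakes the boundary, which grows roughly like $\sqrt{m \log m}$, within a few dozen observations.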

Appendix B

Derivation of test statistics

First, transitional test statistics $Z_1$ and $V_1$ are derived and the test procedure is outlined by analogy to the Darling and Robbins (1967, 1968) procedure. Then the statistics $Z$ and $V$ of section 3 are formally derived.

I. For $\beta$ close to $\beta_0$, the loglikelihood of $\beta$ can be expanded approximately as:

$$l(\beta) \approx l(\beta_0) + (\beta - \beta_0)\, l'(\beta_0) + \tfrac{1}{2}(\beta - \beta_0)^2\, l''(\beta_0) \tag{B.1}$$

where

$$l'(\beta_0) = \left. dl/d\beta \right|_{\beta_0} \quad\text{and}\quad l''(\beta_0) = \left. d^2 l/d\beta^2 \right|_{\beta_0}.$$

Next let $\theta = \beta_0 - \beta$; then

$$l(\beta) \approx l(\beta_0) - \theta\, l'(\beta_0) + \tfrac{1}{2}\theta^2\, l''(\beta_0).$$

To test

$$H_0: \theta \le 0 \quad\text{versus}\quad H_A: \theta > 0$$

the statistics $Z_1$ and $V_1$ can be used, where

$$Z_1 = -l'(\beta_0) \quad\text{and}\quad V_1 = -l''(\beta_0). \tag{B.2}$$

$Z_1$ is a linear function of the efficient score and $V_1$ is Fisher's information.

If $n_i$ seeds are used for the germination test at time $t_i$ and $Y_i$ denotes the number of seeds that germinated, then

$$Z_1 = \sum_i t_i\,(Y_i - n_i\, p_T(t_i)) \tag{B.3}$$

and

$$V_1 = \sum_i t_i^2\, n_i\, p_T(t_i)\,(1 - p_T(t_i)). \tag{B.4}$$

The sequential test is based on $S_m$ and $m$ of the Darling and Robbins test replaced by $Z_1$ and $V_1$, respectively. This analogy is reasonable since under $H_0$

$$Z_1 \sim AN(0, V_1).$$

Then by analogy to (A.3)

$$P_{H_0}(Z_1 \ge a(V_1) \text{ for some } V_1 > 0) < \alpha \tag{B.5}$$

where

$$a(V_1) = \{(V_1 + 1)[\log(V_1 + 1) - 2\log(2\alpha)]\}^{1/2} \tag{B.6}$$

and $\alpha$ can be chosen as small as desired. Use of the statistics $Z_1$ and $V_1$ requires knowledge of $\beta_0$. To overcome restrictions arising from this, the approach can be modified as follows:

II. Suppose at time $t_1$, $Z_1$ and $V_1$ were evaluated and the decision arrived at was to continue with sampling.

Now let $\beta_1$ satisfy:

$$-l'(\beta_1) = a\{-l''(\beta_1)\} \tag{B.7}$$

where

$$l'(\beta_1) = \left. dl/d\beta \right|_{\beta_1} \quad\text{and}\quad l''(\beta_1) = \left. d^2 l/d\beta^2 \right|_{\beta_1}.$$

$l(\beta)$ is the loglikelihood of $\beta$.

If $s$ denotes the current time, then

$$Z_1 < a(V_1) \iff s < (R_0 - R_1)/\beta_1 = T_1 \tag{B.8}$$

where $R_0$ and $R_1$ are the logits of $p$ at $t_0$ and $T$.

Given information on the status of the seeds in storage up to time $t_1$ $(= s)$, $T_1$ is the future time at which it would be necessary to undertake regeneration of the accession if $T_1 = T$. Then from (B.8), continue with inspection as long as

$$s < T_1, \tag{B.9}$$

that is, as long as

$$-l'((R_0 - R_1)/s) > a\{-l''((R_0 - R_1)/s)\}. \tag{B.10}$$

Then

$$-l'((R_0 - R_1)/s) = \sum_i t_i\,\{Y_i - E_s(Y_i)\} \tag{B.11}$$

$$-l''((R_0 - R_1)/s) = \sum_i t_i^2\, n_i\, E_s(Y_i/n_i)\,[1 - E_s(Y_i/n_i)] \tag{B.12}$$

where $E_s(Y_i)$ is the expectation of $Y_i$ under the pretense that it is now time to regenerate the accession. (B.10) holds for all times $t_i < T$. (B.11) and (B.12) are $Z$ and $V$ of (3.1.5) and (3.1.6).
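The identification of the score and information statistics with derivatives of the loglikelihood can be checked numerically. The Python sketch below assumes the binomial loglikelihood with $\operatorname{logit} p(t) = R_0 - \beta t$, compares $-l'(\beta)$ and $-l''(\beta)$ obtained by central finite differences with the closed forms $\sum t_i (Y_i - n_i p(t_i))$ and $\sum t_i^2 n_i p(t_i)(1 - p(t_i))$, and uses hypothetical data, step size and parameter values.

```python
import math

def expit(r):
    return 1.0 / (1.0 + math.exp(-r))

def loglik(beta, r0, times, counts, sizes):
    """Binomial loglikelihood in beta (constants dropped), logit p(t) = R0 - beta * t."""
    total = 0.0
    for t, y, n in zip(times, counts, sizes):
        r = r0 - beta * t
        total += y * r - n * math.log(1.0 + math.exp(r))
    return total

def z_and_v(beta, r0, times, counts, sizes):
    """Closed forms for -l'(beta) and -l''(beta)."""
    z = v = 0.0
    for t, y, n in zip(times, counts, sizes):
        p = expit(r0 - beta * t)
        z += t * (y - n * p)
        v += t * t * n * p * (1.0 - p)
    return z, v

# Hypothetical data: three inspections of 100 seeds each.
r0, beta0 = 4.595, 0.03
times, counts, sizes = [20.0, 40.0, 60.0], [98, 96, 93], [100, 100, 100]
h = 1e-4                                   # finite-difference step in beta
l_minus, l_mid, l_plus = (loglik(beta0 + d, r0, times, counts, sizes)
                          for d in (-h, 0.0, h))
z_numeric = -(l_plus - l_minus) / (2 * h)            # central difference for -l'
v_numeric = -(l_plus - 2 * l_mid + l_minus) / (h * h)  # second difference for -l''
z_exact, v_exact = z_and_v(beta0, r0, times, counts, sizes)
```

Agreement between the numerical and closed-form values supports reading the two statistics as the negative first and second derivatives of the loglikelihood evaluated at the chosen slope.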

Appendix C

Properties of the test Procedure

Now

$$Z = \sum_i t_i\,(Y_i - n_i\, p_s(t_i)) = Z_1 + \Delta \tag{C.1}$$

where

$$\Delta = \sum_i t_i\, n_i\,(p_T(t_i) - p_s(t_i))$$

and

$$V = \sum_i t_i^2\, n_i\, p_s(t_i)\,(1 - p_s(t_i)) = V_1 + \delta \tag{C.2}$$

where

$$\delta = \sum_i t_i^2\, n_i\,\{(p_s(t_i) - p_s^2(t_i)) - (p_T(t_i) - p_T^2(t_i))\}.$$

Under $H_0$

$$Z \sim AN(0, V),$$

since at $t_i = T$

$$\Delta = 0, \qquad \delta = 0.$$

$$a(V) \approx a(V_1 + \delta) = a(V_1) + d(\delta).$$

Hence,

$$a(V) \begin{cases} > a(V_1) & \text{for } t_i < T \\ = a(V_1) & \text{for } t_i = T \\ < a(V_1) & \text{for } t_i > T. \end{cases}$$

Properties

$$P(\text{stopping too late}) < \alpha \tag{C.3}$$

$$P(\text{stopping too early}) \to 0 \text{ as } n \to \infty \tag{C.4}$$

$$P(\text{stopping at desired time } T) \to 1 \text{ as } n \to \infty. \tag{C.5}$$

Proofs

(C.3):

$$P(\text{stopping too late}) = P_{H_0}(Z > a(V) \text{ for all } t_i \le T) < P_{H_0}(Z > a(V) \text{ for } t_i = T).$$

But at $t_i = T$

$$P_{H_0}(Z > a(V)) = P_{H_0}(Z_1 > a(V_1)) \le \alpha.$$

Hence the result.

(C.4):

$$P(\text{stopping too early}) = P_{H_A}(Z \le a(V) \text{ for any } t_i < T) = P_{H_A}(Z_1 \le a(V) - \Delta \text{ for any } t_i < T).$$

At any given time $t_i$, $a(V)$ and $\Delta$ are increasing functions of $n_i$. $a(V)$ increases by an order of $n^{1/2}$ and $\Delta$ by $n$. $\Delta > 0$ for all $t_i < T$. Then

$$a(V) - \Delta \to -\infty \text{ as } n \to \infty$$

and

$$P_{H_A}(Z_1 \le a(V) - \Delta \text{ for any } t_i < T) \to 0 \text{ as } n \to \infty$$

as required.

(C.5):

$$P(\text{stopping at time } T) = P_{H_A}(Z \le a(V) \text{ for } t_i = T) = P(Z_1 \le a(V) - \Delta \text{ for } t_i = T) \to 1 \text{ as } n \to \infty,$$

noting that $\Delta = 0$ at $t_i = T$ and $a(V) \to \infty$ as $n \to \infty$. It follows that the test terminates with probability 1 as $n \to \infty$.
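The sign pattern of the two shift terms used in these proofs can be verified numerically. The sketch below takes the shift in $Z$ to be $\sum t_i n_i (p_T(t_i) - p_s(t_i))$ and the shift in $V$ to be $\sum t_i^2 n_i \{p_s(t_i)(1 - p_s(t_i)) - p_T(t_i)(1 - p_T(t_i))\}$, and evaluates both before, at and after the true regeneration time. The function name and the settings ($p_0 = 0.99$, $p_{\min} = 0.85$, $T = 100$, 100 seeds every 20 years) are illustrative assumptions.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(r):
    return 1.0 / (1.0 + math.exp(-r))

def shifts(p0, p_min, true_T, s, times, n):
    """Shift in Z (Z - Z_1) and shift in V (V - V_1) over inspections t_i <= s."""
    r0, r1 = logit(p0), logit(p_min)
    beta_s, beta_T = (r0 - r1) / s, (r0 - r1) / true_T
    shift_z = shift_v = 0.0
    for t in times:
        if t > s:
            continue                       # only inspections up to the current time
        p_s, p_T = expit(r0 - beta_s * t), expit(r0 - beta_T * t)
        shift_z += t * n * (p_T - p_s)
        shift_v += t * t * n * (p_s * (1 - p_s) - p_T * (1 - p_T))
    return shift_z, shift_v

times = [20.0, 40.0, 60.0, 80.0, 100.0, 120.0]
before = shifts(0.99, 0.85, 100.0, 60.0, times, 100)     # s < T
at_T = shifts(0.99, 0.85, 100.0, 100.0, times, 100)      # s = T
after = shifts(0.99, 0.85, 100.0, 120.0, times, 100)     # s > T
```

Both shifts are positive before $T$, vanish at $T$ and are negative after $T$ for these settings, in line with the ordering of $a(V)$ and $a(V_1)$ stated above. The sign of the shift in $V$ relies on all germination rates involved being above one half.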

Title:

Stochastic Estimation of Inspection Times for Monitoring Viability of Seeds in 'Genebanks'.

Summary:

A procedure for estimating inspection times, based on the technique for monitoring viability of seeds suggested by Bekele (1984), is explained.

Introduction:

The background of the problem is summarized. The importance of objective estimation of the inspection times is explained. The distributional assumptions about survival of seeds are given and the model applied is developed.

Test Procedure:

The test statistics are briefly defined and the decision process is explained. The properties of the test are outlined.

Estimation of Inspection times:

A technique for estimating confidence sequences is given. The use of the confidence intervals for estimating inspection times is explained. Properties of the confidence interval are given.

Discussion:

Predetermined and estimated inspection times are compared using simulation results. Modification of estimated times is suggested.

[Figure: $R(t)$ plotted against $t$, with $0$, $s$ and $T$ marked on the time axis.]