FAQ on calibration

Rationale
From experience gained in inter-laboratory comparison studies, be it proficiency tests or method validation studies by collaborative trial, we know that the importance of instrument calibration and its effect on analysis results is frequently underestimated. Participants in some of the studies expressed their wish for more guidance on instrument calibration, although a number of international guidance documents are already available. This situation could be explained by the difficult language used in some of the guides, or by a lack of knowledge of where to find practical and easily understandable guidance. The more than five million hits obtained at the time of writing with one of the most prominent internet search engines for the term "instrument calibration" do not really help to improve the situation. We therefore felt it necessary to prepare this document, which aims to address those aspects of standard preparation and instrument calibration for the determination of polycyclic aromatic hydrocarbons (PAHs) and mycotoxins in food that seem to cause difficulties for some operators.
As in every question-and-answer scenario, a question can either be asked very specifically, resulting in a similarly specific answer, or rather generally, in which case the answer will also be rather general. This can lead to a dilemma: the answer to a question might not be the one of interest, either because it is too specific and not applicable to the (slightly different) question one has, or because it is too general and does not touch the specific aspect one had in mind. In other words: if a model is simple, it will likely be wrong; if it is complex, it will surely be impractical. For this guide, the compromise was to try to answer both relevant general issues and a few specific ones that are sometimes encountered.
The format of the guide was chosen on purpose: as a frequently asked questions (FAQ) document it remains open to address and include any question regarding standard preparation and instrument calibration that might come up in future.

Index

Standard preparation
1. Where do I get reference materials for PAH analysis from?
2. Which level of purity of reference materials is acceptable?
3. Which are the advantages of gravimetric standard preparation?
4. Which type of balance do I need for the preparation of calibration standards?
5. Is serial dilution of a standard solution for the preparation of calibration standards acceptable?
6. How shall I store PAH standard solutions?
7. Which containers shall I use for storage of standard solutions?
8. How shall I estimate the shelf life of my standard preparations?
9. Which type of volumetric glassware may I use for the preparation of calibration standards?
10. How do I verify the concentration of my standard preparations?
11. How many points do I need for a calibration curve?
12. How many replicates per calibration point?
13. Why shall the concentration levels of the calibration standards be equidistant?
14. Which range of concentration has the calibration to cover?
15. Which type of internal standard shall I use?
16. When do I need to prepare matrix matched calibration standards?
17. How do I determine matrix effects?
18. In which sequence shall I measure the calibration standards?

Evaluation of calibration measurements
19. How shall I test for linearity of the calibration?
20. Does a correlation coefficient (r) of 0.99 indicate linearity of calibration?
21. Which level of R is sufficient?
22. Which information can I get from the plot of residuals?
23. What is the residual standard deviation?
24. May I force the calibration curve through the origin?
25. What is homo- and heteroscedasticity?
26. How do I test for homoscedasticity / heteroscedasticity?
27. Linear regression or weighted linear regression: which shall I apply?
28. May I remove outliers?
29. How do I estimate confidence and prediction intervals?

General
30. Is there any internationally harmonised document on calibration?
31. Where can I get guidance on calibration?

Standard preparation

1. Where do I get reference materials for PAH analysis from?

A number of chemical suppliers have PAH standards in their assortment. A non-exhaustive list of suppliers and of links to other sources of information is given in the following:

- The International Society for Polycyclic Aromatic Compounds (ISPAC) has on its website a list of suppliers of polycyclic aromatic hydrocarbons and heterocyclic aromatic compounds, both neat and in solution (link: ISPAC Standards).
- A searchable database of suppliers of different chemicals is on the homepage of Chemindustry (www.chemindustry.com). An example of a search for suppliers of benzo[a]pyrene (neat and in solution) is given under the link "ChemIndustry: example of search for benzo[a]pyrene".
- A similar searchable database, which returns besides the names of different suppliers also some information on the product (e.g. packaging size), can be found on the webpage www.chemexper.com.
- A large collection of PAH reference substances, among them different certified reference materials, is included in the 2008/2009 catalogue of LGC. It contains single-substance reference materials (neat and in solution, native and labelled) as well as PAH mixtures (link: LGC standards).

Important suppliers of reference materials for PAHs in Europe (non-exhaustive list): ALFA Aesar, Chiron, Dr. Ehrenstorfer, SIGMA Aldrich, VWR.

Certified reference materials (CRMs) for PAHs: the Institute for Reference Materials and Measurements (IRMM), LGC, the National Institute of Standards and Technology (NIST).

2. Which level of purity of reference materials is acceptable?

A purity of 100 % would be desirable, but in reality most of the target PAHs (15+1 EU priority PAHs) are available on the market in purities above 95 %. Hence the operator has to choose a reference material with a purity that is suitable for the particular task. However, care must be taken that impurities do not interfere with the target analytes. The purity of the reference substances shall be considered in the calculation of the standard concentrations. The uncertainty of the purity shall be included in the measurement uncertainty estimate.
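The purity correction mentioned above can be sketched as follows. This is a minimal illustration, assuming a stock solution prepared from a weighed mass made up to a known volume; the function name and the example figures are hypothetical and not taken from this guide.

```python
def standard_concentration(mass_mg, purity_fraction, volume_ml):
    """Analyte concentration in mg/mL, corrected for the purity of the neat material.

    purity_fraction is the mass fraction of the analyte (e.g. 0.985 for 98.5 %).
    """
    if not 0.0 < purity_fraction <= 1.0:
        raise ValueError("purity must be a mass fraction in (0, 1]")
    return mass_mg * purity_fraction / volume_ml


# Hypothetical example: 10.2 mg of material with 98.5 % purity made up to 100 mL
concentration = standard_concentration(10.2, 0.985, 100.0)
print(f"{concentration:.5f} mg/mL")
```

Ignoring the purity in this hypothetical case would overstate the concentration by about 1.5 %.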

3. Which are the advantages of gravimetric standard preparation?

Weighing is more precise than the handling of volumes, which normally results in smaller uncertainties. Handling of low volumes of liquids is difficult due to the influence of many factors such as surface tension, and frequently leads to bias. For gravimetric standard preparation it shall be noted that the relative uncertainty of weighing increases with decreasing amounts of weighed substance. This has consequences for the selection of the type of balance and the weighing procedure applied. A prerequisite for gravimetric standard preparation is thermal equilibrium of the balance and of all chemicals and consumables used for the standard preparation. Thermal equilibration might take a couple of hours, especially in the case of large solvent volumes. Before starting with gravimetric standard preparation, make sure that the balance is working properly by applying suitable check weights.

4. Which type of balance do I need for the preparation of calibration standards?

An analytical balance with a readability of 0.1 mg, or of 0.01 mg for weighing amounts as low as about 30 mg, will be fit for the purpose, which means that the uncertainty of weighing is at an acceptable level. The US Pharmacopeia [1] defines the minimum permissible weight for a balance as the load that gives a relative uncertainty of less than 0.1 %. As a rule of thumb, the minimum weight can be estimated by multiplying the readability of the balance (e.g. 0.1 mg) by a factor between 3000 and 5000, which for a readability of 0.1 mg gives 300 mg to 500 mg.

However, the applicability of this rule of thumb depends on the precision of the balance and has to be evaluated experimentally according to Eq 1:

\[ \frac{3 \times s}{m} \leq 0.001 \qquad \text{(Eq 1)} \]

where s is the standard deviation of 10 measurements and m is the weighed amount.

It has to be noted that the minimum weight corresponds only to the amount of substance weighed and does not include the tare weight of the weighing vessel! In any case, care has to be taken that the balance is calibrated and working according to its specifications. Provisions on environmental conditions must also be respected; e.g. too low air humidity leads to electrostatic problems and might cause bias. Further information on the use of balances in standard preparation can be found in the open literature, e.g. in a paper published by Ch. Burgess and R.D. McDowall [2].
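As an illustration of the minimum weight criterion (three times the standard deviation of 10 measurements, divided by the weighed amount, not exceeding 0.001) and of the rule of thumb, a short sketch is given here. The replicate readings are invented for the example and the function names are hypothetical.

```python
import statistics


def minimum_weight(replicate_readings_mg, limit=0.001):
    """Smallest net amount m (mg) for which 3 * s / m <= limit holds.

    s is the sample standard deviation of the replicate readings.
    """
    s = statistics.stdev(replicate_readings_mg)
    return 3.0 * s / limit


def rule_of_thumb_range(readability_mg):
    """Rule-of-thumb estimate: readability multiplied by 3000 to 5000."""
    return 3000 * readability_mg, 5000 * readability_mg


# Invented readings (mg) of the same check weight, 10 replicates
readings = [50.01, 50.02, 50.00, 50.01, 50.03, 50.02, 50.01, 50.00, 50.02, 50.01]
print("minimum weight from precision (mg):", round(minimum_weight(readings), 1))
print("rule-of-thumb range for 0.01 mg readability (mg):", rule_of_thumb_range(0.01))
```

If the experimentally derived minimum weight is larger than the rule-of-thumb estimate, the experimental value is the one to respect.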

5. Is serial dilution of a standard solution for the preparation of calibration standards acceptable?

No!

Two aspects have to be taken into account in standard preparation by serial dilution. The probably more important aspect is the inability to identify biased standard preparations. Figure 1 A presents a standard preparation scheme in which bias in the preparation of dilution 1 (D1) from the stock standard solution (S) cannot be identified from the measurement results of the calibration standards (CS1 to CS5). Even worse would be scheme B, which includes a cascade of dilutions of the calibration standards. Besides the risk of unidentified bias, it leads to a high uncertainty of the concentration of standard CS5, which is prepared in six dilution steps. According to the law of error propagation, the relative uncertainty of CS5 is equal to the square root of the sum of the squared relative uncertainties of the preparations S to CS5, which is of course larger than the uncertainty of any other calibration standard shown in Figure 1.
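The growth of uncertainty along a dilution cascade can be illustrated with a short calculation. The sketch assumes independent preparation steps and an identical relative standard uncertainty of 0.5 % per step; both the figure and the function name are assumptions made for illustration only.

```python
import math


def combined_relative_uncertainty(step_uncertainties):
    """Root sum of squares of the relative uncertainties of independent steps."""
    return math.sqrt(sum(u ** 2 for u in step_uncertainties))


u_step = 0.005  # assumed relative standard uncertainty of a single preparation step

# Compare a standard made in one, two or six consecutive steps
for n_steps in (1, 2, 6):
    u = combined_relative_uncertainty([u_step] * n_steps)
    print(f"{n_steps} step(s): {100 * u:.2f} % relative uncertainty")
```

With equal step uncertainties the combined value grows with the square root of the number of steps, so a standard at the end of a six-step cascade carries roughly 2.4 times the uncertainty of a single step.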

Figure 1: Different schemes (A, B and C) for the preparation of calibration standards. Each arrow represents one dilution step (S: stock standard solution; D1: dilution 1; CS1 to CS5: calibration standard solutions). [Diagram not reproduced.]

The most appropriate of the three schemes is shown in Figure 1 C. The calibration standard solutions are prepared from independent dilutions of the stock standard solution. By doing so, an error in the preparation of an intermediate dilution (D1 to D5) should be detectable in the measurement results of the calibration standards.
Duplicating scheme C with two independent stock standard solutions provides the highest level of information about the correctness of the calibration standards. In practice the preparation of calibration standards needs thorough planning. The handling of low volumes or low masses shall be avoided as much as possible. In the case of PAHs, the preparation of the stock standard solutions is limited by the low solubility of some PAHs (e.g. dibenzopyrenes) in the majority of organic solvents.

6. How shall I store PAH standard solutions?

PAH standard solutions shall be stored in amber glassware in the dark because PAHs can be degraded by UV light. Room temperature (about 20 °C) is recommended for storage of PAH standard solutions by a number of suppliers. Opened commercial standards and in-house standard preparations should be stored cooled to avoid solvent losses. Do not put PAH standard solutions in the freezer, as the solubility of some PAHs might be affected at low temperatures.

7. Which containers shall I use for storage of standard solutions?

Amber glassware with Teflon-lined closures should be used. As a general rule, the headspace above the standard solution shall be as small as possible. It is also recommended to divide stock standard solutions for storage into several units of small volume, in order to preserve the composition of those portions that are not in use at the time.

8. How shall I estimate the shelf life of my standard preparations?

The shelf life of a product is the time during which the average characteristic of the product remains within an approved specification. Translated to standard preparation, this means that the change of the standard concentration, or of the associated uncertainty, must not exceed certain predefined limits. This sounds fine in theory, but causes several problems when put into practice.
The first constraint is the definition of the maximum tolerable change of concentration, which might be caused by degradation of the analyte, loss of solvent etc. The question to be answered is how much the change in composition of the standard preparations may contribute to the combined measurement uncertainty. There is no general guidance on this. An appropriate value has to be set on a case-by-case basis. However, a relative change of the standard concentration of 1 % to 2 % could be acceptable.
The second problem consists of the identification of changes in practice and, related to that, the set-up of the experimental plan to prove agreement with the predefined specifications. At the beginning of such studies little is known about the stability of the standard solutions. Hence the shelf life has to be estimated based on experience with similar substances, or on information from the literature. The duration of the study has to cover at least this first estimate of the shelf life. The tested standard solution must be independent of the standard solution that is used for instrument calibration in order to identify any changes, and hence to estimate its shelf life. Laboratories usually use one standard preparation at a time, as standard solutions are expensive. Hence requesting the preparation of a fresh standard solution for each set of shelf-life experiments would be unrealistic. In addition, the preparation of fresh standard solutions would make the determination of the shelf life superfluous.
It would be more economical to use a single, second, independent standard solution over the whole period of the shelf-life experiments. However, this does not provide the required information, because in the case of significant differences it is not possible to trace back which of the two standard solutions has changed. It might therefore be worth looking for alternatives. One possibility could be to use in the shelf-life experiments, as internal standard, a chemical that is available in large amounts at low cost. This chemical serves as reference point. The solution of the reference point has to be prepared freshly for each set of shelf-life experiments. The low cost allows large quantities to be used in the standard preparation, which lowers the risk of bias. In the experiments, relative response factors between the analyte and the reference chemical are determined, and any changes are monitored. The selection of a chemical serving as reference point depends on the properties of the analyte.
The integrity/stability of standard preparations has to be monitored over the whole shelf life of the standard preparation. Control charts shall be applied for this purpose. Repeated measurements shall be performed at each control point in order to estimate the variability of the measurements. The shelf life of the standard preparation can be shortened or extended depending on the experimental results.
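The monitoring of relative response factors described above could be sketched as follows. The definition of the relative response factor used here (response per unit concentration of the analyte divided by that of the reference chemical), the 2 % limit and all numbers are assumptions for illustration; a real study would plot the values in a control chart with limits derived from replicate measurements.

```python
def relative_response_factor(area_analyte, conc_analyte, area_ref, conc_ref):
    """Analyte response per unit concentration divided by that of the reference chemical."""
    return (area_analyte / conc_analyte) / (area_ref / conc_ref)


def relative_change(rrf_now, rrf_initial):
    """Relative change of the response factor with respect to the initial value."""
    return (rrf_now - rrf_initial) / rrf_initial


def within_limit(rrf_now, rrf_initial, max_rel_change=0.02):
    """True if the absolute relative change does not exceed the predefined limit."""
    return abs(relative_change(rrf_now, rrf_initial)) <= max_rel_change


# Invented control points: (day, analyte area, reference area), nominal concentrations fixed
conc_analyte, conc_ref = 10.0, 10.0
control_points = [(0, 1000.0, 500.0), (30, 995.0, 502.0), (90, 950.0, 501.0)]

rrf_initial = relative_response_factor(control_points[0][1], conc_analyte,
                                       control_points[0][2], conc_ref)
for day, area_analyte, area_ref in control_points:
    rrf = relative_response_factor(area_analyte, conc_analyte, area_ref, conc_ref)
    status = "ok" if within_limit(rrf, rrf_initial) else "outside limit"
    print(f"day {day}: RRF = {rrf:.3f} ({status})")
```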

9. Which type of volumetric glassware may I use for the preparation of calibration standards?

The contribution of glassware tolerance to the overall uncertainty of the method is very low but not negligible. Class A glassware according to ISO standard 1042:1983 shall be used. For light-sensitive substances the glassware shall be made of amber glass. The maximum tolerances for different volumes are also given in ISO standard 1042:1983 (for instance 0.04 ml for a 25 ml flask). However, it has to be pointed out that the handling (filling, emptying, parallax error) of volumetric glassware will probably contribute to the total uncertainty of the standard preparation to a larger extent than the tolerances according to ISO standard 1042:1983. Gravimetric standard preparation is considered superior to volumetric standard preparation with regard to precision.

10. How do I verify the concentration of my standard preparations?


The verification of the standard concentration is crucial for assuring the quality of analysis results. In a limited number of cases the concentration of standard preparations can be verified by application of reference methods, e.g. the concentration of aflatoxin standard solutions in methanol/water can be verified by photometry. More often, the concentration of a particular standard preparation can only be verified against other standard preparations. Best practice in that respect would be verification against a solution with certified values for the analyte(s). Such certified reference materials (CRMs) are frequently not available. Hence the concentration of the standard preparation shall be evaluated against an independent standard preparation. The minimum requirement is to verify the concentration of a new standard preparation against the concentration of the preceding standard preparation.

Bracketing calibration as detailed in ISO standard 11095:1996 shall preferably be applied for the verification measurements, as this technique usually yields greater accuracy than linear calibration.
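The idea behind bracketing can be sketched as a linear interpolation between two standards whose responses enclose the response of the solution under test. This is only an illustration of the principle under that assumption; it does not reproduce the procedure or the statistics of ISO standard 11095:1996, and the function name and numbers are hypothetical.

```python
def bracketing_estimate(y_sample, x_low, y_low, x_high, y_high):
    """Concentration estimated by linear interpolation between two bracketing standards.

    x_low, x_high: concentrations of the lower and upper standard
    y_low, y_high: their measured responses
    """
    if y_high == y_low:
        raise ValueError("the two standards give identical responses")
    if not min(y_low, y_high) <= y_sample <= max(y_low, y_high):
        raise ValueError("sample response is not bracketed by the two standards")
    return x_low + (y_sample - y_low) * (x_high - x_low) / (y_high - y_low)


# Invented example: new standard preparation measured between two reference solutions
estimate = bracketing_estimate(y_sample=150.0, x_low=10.0, y_low=100.0,
                               x_high=20.0, y_high=200.0)
print("estimated concentration:", estimate)
```

Because the two standards lie close to the unknown, only local linearity of the response is needed, which is one reason why bracketing can be more accurate than a calibration line over a wide range.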

11. How many points do I need for a calibration curve?


Before answering this question, the purpose of the calibration experiment has to be defined. One has to distinguish between the calibration of a measurement system and the check of the validity of that calibration. Both topics are treated in depth by international documents such as ISO standard 11095:1996 and the IUPAC "Guidelines for calibration in analytical chemistry".
The first case, called the "basic method" in ISO standard 11095:1996, is usually applied for the estimation of linear calibration functions. It encompasses the measurement of a certain number of reference materials (calibration standards) at different concentration levels. ISO standard 11095:1996 sets the minimum number of calibration points/levels for the basic method at three. However, it also says that the number of levels shall be increased for an initial assessment of the calibration function. This initial assessment corresponds to the operations performed during method validation to assess the linear range of a measurement method. The EURACHEM Guide "The Fitness for Purpose of Analytical Methods" specifies for that purpose at least six concentration levels plus blank. The above-mentioned IUPAC guideline does not specify any concrete number of calibration levels. Commission Decision 2002/657/EC stipulates at least five concentration levels, including zero, for the construction of a calibration curve. Other documents lay down different numbers of calibration levels. For example, ISO standard 15302:2007 specifies four calibration levels, whereas the LGC/VAM guide "Preparation of Calibration Curves" defines seven calibration levels, including blank, as the minimum requirement for an initial assessment of the calibration function. ISO 8466-1:1990 demands even ten calibration levels.
As can be seen, the design of calibration experiments and the number of calibration levels depend very much on the purpose of the experiment and on existing knowledge. The linearity of the instrument response had probably already been tested for the analysis method that became ISO standard 15302:2007; hence four calibration levels were regarded as sufficient for the estimation of the calibration function. Less knowledge of the shape of the calibration function requires measurements at more concentration levels. The inclusion of blank or zero levels in the calibration design is required if the blank or zero sample produces a signal of the same nature as the signal produced by the analyte. If the blank or zero sample does not produce any signal, it can be excluded from the calibration experiments.

In general three concentration levels are required to fit a non-linear function, and at least one more calibration level is needed for the statistical assessment of the calibration model. Increasing the number of calibration levels and the number of replicate analyses per level reduces the width of confidence and prediction intervals. However, the return in terms of narrowing confidence intervals diminishes with the number of calibration levels. Exceeding ten calibration levels does not provide any additional benefit. Figure 2 shows the confidence intervals for simulated calibration experiments performed at different numbers of calibration levels. Each calibration level was measured once. The underlying data are displayed in Table 1.

Table 1: Data of simulated calibration experiments including different numbers of calibration levels (responses per level; "-" means that the level was not included)

Number of calibration points:   2       3       4       7       9
Level 1                         1.05    1.05    1.05    1.05    1.05
Level 2                         -       -       -       2       2
Level 3                         -       -       -       2.8     2.8
Level 4                         -       -       3.85    -       3.9
Level 5                         -       5.1     -       5.1     5.1
Level 6                         -       -       -       5.85    5.85
Level 7                         -       -       7.1     7.1     7.1
Level 8                         -       -       -       -       8.05
Level 9                         8.8     8.8     8.8     8.8     8.8

Slope(x) + Intercept:
2 points: 0.9688x + 0.0813
3 points: 0.9688x + 0.1396
4 points: 0.9837x + 0.0357
7 points: 0.9871x + 0.0178
9 points: 0.9950x - 0.0139

Figure 2: Confidence intervals for simulated calibration experiments at two (black dashed line), three (red dashed line), four (green dashed line), seven (purple dashed line) and nine (grey dashed line) concentration levels (concentration points). The lines in the middle represent the calibration curves corresponding to the different scenarios.
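The regression lines of Table 1 and the narrowing of the confidence interval can be reproduced with a few lines of code. The sketch assumes that the concentration of each standard equals its level number (1 to 9), uses ordinary least squares, and evaluates the 95 % confidence interval of the fitted line only at the mean concentration; the tabulated Student t quantiles are hard-coded for the degrees of freedom that occur. With two points there are no residual degrees of freedom, so no interval is computed for that design.

```python
import math

# Responses from Table 1; concentration is assumed equal to the level number
DESIGNS = {
    2: {1: 1.05, 9: 8.8},
    3: {1: 1.05, 5: 5.1, 9: 8.8},
    4: {1: 1.05, 4: 3.85, 7: 7.1, 9: 8.8},
    7: {1: 1.05, 2: 2.0, 3: 2.8, 5: 5.1, 6: 5.85, 7: 7.1, 9: 8.8},
    9: {1: 1.05, 2: 2.0, 3: 2.8, 4: 3.9, 5: 5.1, 6: 5.85, 7: 7.1, 8: 8.05, 9: 8.8},
}

# Two-sided 95 % Student t quantiles, keyed by degrees of freedom (n - 2)
T_975 = {1: 12.706, 2: 4.303, 5: 2.571, 7: 2.365}


def fit_line(points):
    """Ordinary least-squares slope and intercept for a {x: y} mapping."""
    xs, ys = list(points.keys()), list(points.values())
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def ci_half_width_at_mean(points):
    """Half-width of the 95 % confidence interval of the line at the mean concentration."""
    n = len(points)
    if n < 3:
        return None  # no residual degrees of freedom
    slope, intercept = fit_line(points)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in points.items())
    s_res = math.sqrt(ss_res / (n - 2))  # residual standard deviation
    return T_975[n - 2] * s_res / math.sqrt(n)


for n_points, data in DESIGNS.items():
    slope, intercept = fit_line(data)
    half_width = ci_half_width_at_mean(data)
    print(n_points, f"{slope:.4f}x + {intercept:.4f}", half_width)
```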

The check of the validity of a calibration system has to be clearly distinguished from the initial calibration. This procedure is based on the information gained in an initial calibration experiment. ISO standard 11095:1996 applies the term "Control method" for the check of the validity of a calibration system. At least two, preferably three calibration levels are used to monitor via control charts the validity of the calibration function, and to detect any shifts or errors.

12. How many replicates per calibration point?


The ISO standard 11095:1996 demands at least two replicate analyses per calibration level and recommends as many as possible. At least two replicate analyses are necessary to evaluate the calibration for constancy of the residual standard deviation. This information is needed to decide on which regression model is most appropriate (see below).

As with the number of calibration levels, increasing the number of replicate analyses follows the law of diminishing returns. Hence more than five replicate analyses per calibration level do not provide much additional benefit.
NOTE: A very important thing to consider is that all performance data associated with a standard method are based on the calibration procedure described therein. If you deviate from this calibration procedure, it is your responsibility to demonstrate that the modified calibration procedure gives equivalent results.

13. Why shall the concentration levels of the calibration standards be equidistant?
The reason is that the higher the concentration of a calibration standard relative to the other standards, the more it is weighted in the calculation of the calibration curve (this is called leverage). As a result, the calculated slope and intercept might be influenced disproportionately by one data point. The effect is demonstrated with a simulated calibration experiment. In Figure 3 each calibration level corresponds in concentration to double the next lower concentration level. Two data points of the example, one at the highest concentration level and one at the lower end of the concentration range, were manipulated one at a time, and calibration curves were determined by linear regression. In each of the experiments one data point was given a relative offset of -20 %. The respective data points are indicated by bold dots.
The effect of the offset of the data point at the lower end of the calibration range (green dot) on the regression curve is marginal. The contrary is the case if the data point for the highest concentration level is biased. The signal value of this data point was changed from about 800 to 600. As a consequence, both the slope and the intercept of the calibration curve change significantly. This effect is based on the principle of the applied regression method, which aims to minimise the sum of the squared residuals. Since the residual (absolute signal value) caused by the relative offset (20 %) is much larger at the upper end of the calibration range than at the lower end, the data point at the upper end of the calibration range gets higher weight, as mentioned before. In practice such relative offsets are caused by, e.g., pipetting mistakes.

Figure 3: Simulated calibration experiments with a relative offset of -20% of one data point in each experiment. The offset of the red dot at the higher level has a much bigger influence on the resulting red calibration curve than the offset of the green dot on the resulting green calibration curve. [Plot not reproduced; title: Leverage; x-axis: Analyte concentration (0 to 80); y-axis: Signal (0 to 800).]

It has to be stressed that calibration designs based on standard concentrations that are multiples of the next lower concentration are strongly discouraged, although they are frequently found in practice. The difference in the effect of one biased calibration point on the regression curve is displayed in Figure 4, both for a set of six equidistant concentration levels and for a set of six unevenly distributed concentration levels (multiplication factor = 2). The offset of the data point at the highest concentration level has less influence on the regression curve in the calibration with equidistant concentration levels than in the calibration with unevenly distributed concentration levels.

Figure 4: Effect of one biased calibration point (at concentration level 80, offset of signal = -20%) on the regression curve of calibration experiments with equidistant (pink) and unevenly distributed (blue) analyte concentration levels. [Plot not reproduced; x-axis: Analyte concentration (0 to 100); y-axis: Signal (0 to 800); legend: Unevenly distributed, Equidistant, each with its linear regression line.]
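The leverage effect can be reproduced numerically. The sketch below is not the simulation behind Figures 3 and 4; it assumes an ideal response of 10 signal units per concentration unit, six levels ending at concentration 80 in both designs, and a -20 % offset applied to a single point. The concentration values and function names are chosen for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def slope_shift(xs, biased_index, rel_offset=-0.20, sensitivity=10.0):
    """Relative change of the fitted slope when one point gets a relative offset."""
    ys = [sensitivity * x for x in xs]  # ideal, error-free responses
    slope_ref, _ = fit_line(xs, ys)
    ys_biased = list(ys)
    ys_biased[biased_index] *= 1.0 + rel_offset
    slope_biased, _ = fit_line(xs, ys_biased)
    return (slope_biased - slope_ref) / slope_ref


doubling = [2.5, 5.0, 10.0, 20.0, 40.0, 80.0]      # each level twice the previous one
equidistant = [5.0, 20.0, 35.0, 50.0, 65.0, 80.0]  # equal spacing, same top level

print("doubling design, top point biased:   ", slope_shift(doubling, 5))
print("doubling design, low point biased:   ", slope_shift(doubling, 1))
print("equidistant design, top point biased:", slope_shift(equidistant, 5))
```

Under these assumptions the biased top point changes the slope by roughly 20 % in the doubling design and by roughly 15 % in the equidistant design, while the same relative offset at a low level changes it by less than 1 %.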

14. Which range of concentration has the calibration to cover?
The calibration shall cover at least the content/concentration range in which you will need to report results. The calibration range also defines the working range of the analysis method. The calibration standards have to be at concentration levels corresponding to the concentration levels of the ready-to-measure/inject sample. As a result, the calibration range can be very narrow (e.g. around a legislative limit), provided the interest concerns only this small working range. The upper limit of the concentration range that a calibration experiment may span is not defined. However, factors such as homo-/heteroscedasticity (see below) shall be taken into account in the design of the experiments. As a rule of thumb, the ratio between the highest and lowest concentration levels shall not exceed a factor between 10 and 20.
Occasionally the analyte content of test samples will exceed the concentration range for which the instrument is calibrated. In that case, caution is needed before simply diluting the test sample extract to bring it to a concentration level covered by the instrument calibration and re-analysing it. This might be possible in many cases, but is not per se applicable to all analysis methods, because dilution alters matrix effects. However, where shown by experiments to be appropriate, a dilution can be made.


15. Which type of internal standard shall I use?


The most important properties of a suitable internal standard are:
- the internal standard must behave the same as, or at least very similarly to, the analyte in question;
- the internal standard must not be present in the sample itself, otherwise the interpretation of the internal standard data can be jeopardised;
- the concentration of the internal standard added to the sample shall preferably be in the middle of the range of expected analyte concentrations.

There are different options for the choice of internal standards. The applicability of the different possibilities depends on the purpose of the internal standard and on the applied detection system. For example, if the analysis method comprises chromatography with optical detection (such as fluorescence or UV absorption), the chosen internal standard has to behave chemically and physically very similarly to the analyte (e.g. in extraction and clean-up steps), but must be chromatographically resolved from the analyte. Often analogues of the actual analyte are taken for this purpose. Structural isomers of target analytes (e.g. benzo[b]chrysene) are applied for the determination of PAHs in food by high performance liquid chromatography with fluorescence detection (HPLC-FLD). Another option is the application of fluorinated analogues of the target PAHs, because their chemical properties are very similar and chromatographic separation can easily be achieved. The same holds true for deuterium-substituted PAHs, which in HPLC also show slightly different retention characteristics compared to the native compounds. In the field of mycotoxins, aflatoxicol (a metabolite of aflatoxins) is used as internal standard for the determination of aflatoxins. Structurally similar substances have also been proposed for cases in which a derivatisation is required and the analogue (internal standard) must react in the same manner as the analyte.
Examples are the use of verrucarol for the determination of fusarium toxins (for GC methods), squaric acid for the determination of moniliformin (for HPLC-FL methods) or de-epoxy deoxynivalenol (DOM-1) for GC methods. If the chosen internal standard has a very different retention time, and therefore most likely a rather different chemical behaviour (e.g. in terms of polarity), it is likely that it also behaves differently from the analyte during extraction or clean-up. For this reason, close structural analogues of the analyte are preferably used.

In the case of chromatography coupled to mass selective detection, the substances of choice are isotope-labelled analogues of the analyte. Both substances (the analyte and the labelled internal standard) can then be detected at the same or a very similar retention time, which is necessary to compensate for matrix effects. The choice between deuterated and 13C-labelled substances needs to take several facts into account. The differently labelled substances might show significant physico-chemical differences. Perdeuterated substances, as commercialised for some PAHs, have, as mentioned above, different retention characteristics compared to the native compounds, which might cause problems when compensating for matrix effects in mass spectrometry. The possibility of deuterium-hydrogen exchange cannot be excluded with deuterated compounds. The loss of deuterium atoms in chemical reactions of the analyte, such as derivatisation reactions, might also lead to problems in distinguishing between the mass spectrometric signals of the native compound and the labelled analogue. This phenomenon is encountered in the determination of acrylamide by GC-MS after chlorination and consecutive dehydrochlorination. The hydrogen isotope clusters of some fragment ions of the labelled and native acrylamide overlap partially, which makes them unsuitable for quantitative analysis. 13C-labelled compounds do not cause such problems. However, the costs of this kind of labelled substance are substantially higher than for deuterated substances, and their availability is limited.

16. When do I need to prepare matrix matched calibration standards?


A matrix matched calibration is needed in those cases where the matrix (even after clean-up procedures) has an influence on the signal obtained for the analyte during measurement. Many analysis systems, e.g. LC-MS or GC-MS, are sensitive to matrix effects. Fluorescence detection can also be subject to matrix influences (e.g. fluorescence quenching). However, care must be taken that the matrix used to prepare the calibrant is sufficiently well matched to the matrix of the sample. Isotope dilution with isotope-labelled analogues of the target analyte is frequently applied to compensate for matrix effects. The basic assumption of this technique is that the relative response of the analyte and the labelled analogue stays constant.


17. How do I determine matrix effects?


Matrix effects can be identified from calibration curves obtained with matrix matched calibration standards and with calibration solutions in solvent. Matrix effects are encountered when the intercepts and/or the slopes of the regression lines for the two sets of calibration solutions are significantly different from each other. Ignoring these differences would lead in the former case (intercepts) to constant bias and in the latter case (slopes) to proportional bias. The procedure to identify matrix effects is the same as that used to estimate a recovery function. In the first step a calibration curve is constructed by linear regression with the calibration standards in solvent solution. In the next step another calibration curve is constructed from the measurement data of the matrix matched calibration standards. Before proceeding it must be ensured that the precision of the two calibration curves is comparable. Otherwise any significant difference between the calibration curves might be hidden by the different levels of precision. This is accomplished by testing the residual standard deviations of the two calibration curves for significant differences (with an F-test) at the 99% confidence level. The number of degrees of freedom for each calibration experiment is N-2 (with N = number of data). Provided that no significant difference between the residual standard deviations was identified, the measurement data (y-values) of the matrix matched calibration are inserted into the calibration function obtained with the calibration standards in solvent, and the corresponding concentration values (x-values) are calculated. These values are called "apparent concentration values" in the following. In the next step a linear regression is performed with the concentration data of the calibration standard solutions in solvent as x-values and the apparent concentration values as y-values. This regression line contains the information on matrix effects. A slope different from one indicates a potential concentration-proportional signal enhancement or signal suppression. An intercept different from zero indicates a concentration-independent bias. However, as the regression is based on a particular data set, the question has to be answered whether the deviations from the ideal values (slope = 1, intercept = 0) are significant or just random, as a consequence of the variability in the limited number of data points. To answer this question, the confidence intervals (95% confidence level) of the regression parameters have to be determined. If the confidence interval for the slope includes the value one, and that for the intercept includes the value zero, then it can be concluded that there is no statistically significant difference between the calibration with calibration standards in solvent solution and the matrix matched calibration. Matrix matched calibrations are an alternative to isotope dilution to compensate for matrix effects when using mass spectrometry for measurement.
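The procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: both calibrations are assumed to use the same concentration levels, the function names (`fit_line`, `matrix_effect_check`) and the example data are invented for this sketch, and the critical F and t values are assumed to be looked up in statistical tables by the user and passed in.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares straight line y = a + b*x.

    Returns intercept a, slope b, the residual standard deviation
    (N-2 degrees of freedom) and the sum of squares of x about its mean.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_mean - b * x_mean
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, math.sqrt(ss_res / (n - 2)), sxx


def matrix_effect_check(conc, y_solvent, y_matrix, f_crit, t_crit):
    """f_crit and t_crit are tabulated critical values supplied by the user."""
    n = len(conc)
    a_s, b_s, s_s, sxx = fit_line(conc, y_solvent)
    _, _, s_m, _ = fit_line(conc, y_matrix)

    # Step 1: compare the precision of both calibrations (larger variance on top).
    f_value = max(s_s, s_m) ** 2 / min(s_s, s_m) ** 2

    # Step 2: read the matrix matched signals off the solvent calibration.
    apparent = [(y - a_s) / b_s for y in y_matrix]

    # Step 3: regress apparent concentrations on nominal concentrations.
    a_r, b_r, s_r, _ = fit_line(conc, apparent)
    x_mean = sum(conc) / n
    se_b = s_r / math.sqrt(sxx)
    se_a = s_r * math.sqrt(1 / n + x_mean ** 2 / sxx)

    return {
        "precision_comparable": f_value <= f_crit,
        "slope": b_r,
        "slope_ci": (b_r - t_crit * se_b, b_r + t_crit * se_b),
        "intercept": a_r,
        "intercept_ci": (a_r - t_crit * se_a, a_r + t_crit * se_a),
    }


# Invented example data: six levels, roughly 20 % signal suppression in matrix.
conc = [1, 2, 3, 4, 5, 6]
y_solvent = [102, 199, 298, 403, 499, 599]
y_matrix = [79, 162, 238, 319, 403, 479]

# Tabulated values for 4 and 4 degrees of freedom: F (99 %) and two-sided t (95 %).
result = matrix_effect_check(conc, y_solvent, y_matrix, f_crit=15.98, t_crit=2.776)
```

With these invented data the confidence interval of the slope does not include one, while that of the intercept includes zero, which would point to a proportional matrix effect without constant bias.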


18. In which sequence shall I measure the calibration standards?


Generally, the sequence in which the calibration standards are measured should be random. The decision on whether to measure the calibration standards at the beginning of a sample sequence, at the beginning and at the end of the sample sequence, or randomly distributed over the sample sequence depends on the stability of the measurement system. ISO standard 11095:1996 specifies as general requirements that "the measurements from which the calibration function was calculated are representative of the normal conditions under which the measurement system operates" and "that the measurement system is in a state of control". If the measurement system is stable throughout the whole sample sequence, then all approaches will give equal results. However, the design of the measurement sequence has to be modified if any instrument drift is expected. Such modifications could consist of repeated analyses of the calibration solutions during the measurement sequence or the inclusion of an increased number of quality control samples in the measurement sequence.
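As a small illustration of the three designs mentioned above, a run list could be generated as follows. The function name `build_sequence`, the design labels and the vial names are hypothetical and serve only to show the idea; they are not taken from any standard.

```python
import random


def build_sequence(standards, samples, design="bracketing", seed=None):
    """Return a run list with the calibration standards in random order."""
    rng = random.Random(seed)

    def shuffled(items):
        items = list(items)
        rng.shuffle(items)
        return items

    if design == "start":
        # Standards once, before the samples.
        return shuffled(standards) + list(samples)
    if design == "bracketing":
        # Standards before and after the samples, to reveal drift.
        return shuffled(standards) + list(samples) + shuffled(standards)
    if design == "distributed":
        # Standards and samples randomly mixed over the whole sequence.
        return shuffled(list(standards) + list(samples))
    raise ValueError(f"unknown design: {design}")


standards = ["cal 1", "cal 2", "cal 3", "cal 4", "cal 5"]
samples = ["sample A", "sample B", "sample C"]
sequence = build_sequence(standards, samples, design="bracketing", seed=1)
```

Fixing the seed makes the randomised order reproducible, which can be helpful for documentation purposes.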

Evaluation of calibration measurements


19. How shall I test for linearity of the calibration?
For the purposes of this document, linearity means that the calibration can best be described by a straight line. Linearity may also mean that the model is linear in its parameters, which would also be true for a parabola, something that is not a straight line at all. A straight line can be described by Eq 5:

$$y_{nk} = \beta_0 + \beta_1 x_n + \varepsilon_{nk} \qquad \text{(Eq 5)}$$

with
$\beta_0$ = intercept
$\beta_1$ = line slope
$y_{nk}$ = the kth measured response of calibration level n
$x_n$ = the concentration of the analyte in calibration level n
$\varepsilon_{nk}$ = the residual for the kth measurement of calibration level n.

The residual is the difference between the measured response and the response value calculated from the calibration function:

$$\hat{\varepsilon}_{nk} = y_{nk} - \hat{y}_n$$

Plotting all residuals $\hat{\varepsilon}_{nk}$ over the fitted values $\hat{y}_n$ (residuals over fitted) results in the so-called residual plot. This plot is a very valuable diagnostic tool. If the points are evenly distributed around a horizontal line through zero, the straight line function is appropriate (see Figures 5 & 6). Another, more complex, approach is the lack-of-fit test. If the lack-of-fit test is not significant, then a straight line function describes the calibration data appropriately. Replicate measurements at each calibration level are a prerequisite for a lack-of-fit test.

Figure 5: straight line appropriate (residual plot: residuals versus fitted values)

Figure 6: straight line inappropriate (residual plot: residuals versus fitted values)
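The lack-of-fit test mentioned above is commonly carried out by splitting the residual sum of squares into a "pure error" part (scatter of the replicates around their level means) and a "lack-of-fit" part (deviation of the level means from the straight line). The following sketch shows one way to compute the test statistic; the function name and the data are invented, and the resulting F value still has to be compared with a tabulated critical value for the returned degrees of freedom.

```python
def lack_of_fit(x, y):
    """Return (F value, df lack-of-fit, df pure error) for a straight-line fit.

    x holds the concentration of every single measurement, so replicate
    measurements share exactly the same x value.
    """
    levels = {}
    for xi, yi in zip(x, y):
        levels.setdefault(xi, []).append(yi)
    n, m = len(x), len(levels)
    if m < 3 or n <= m:
        raise ValueError("need at least three levels and replicate measurements")

    # Ordinary least-squares straight line through all single measurements.
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    # Pure error: scatter of the replicates around their own level mean.
    ss_pe = sum(
        sum((yi - sum(ys) / len(ys)) ** 2 for yi in ys) for ys in levels.values()
    )
    ss_lof = ss_res - ss_pe
    df_lof, df_pe = m - 2, n - m
    return (ss_lof / df_lof) / (ss_pe / df_pe), df_lof, df_pe


# Invented example: five levels in duplicate.
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
y_curved = [94, 96, 179, 181, 254, 256, 319, 321, 374, 376]
y_straight = [99, 102, 199, 200, 299, 301, 400, 401, 498, 501]

f_curved, df_lof, df_pe = lack_of_fit(x, y_curved)
f_straight, _, _ = lack_of_fit(x, y_straight)
```

For 3 and 5 degrees of freedom the tabulated F value at the 95% level is 5.41; in this example the curved data set exceeds it clearly, the straight one does not.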

20. Does a correlation coefficient (r) of 0.99 indicate linearity of calibration?


No! The correlation coefficient is a measure of how much of the variability of y can be predicted by x. An r-value of 1 indicates that y can be completely predicted by x, and a value of 0 indicates that y cannot be predicted by x. A curved calibration function, which is markedly not a straight line, may still have a correlation coefficient of 0.99. In addition, the r-value may improve when a quadratic term is added to the calibration function, which then is certainly not linear in our sense of the word.
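A short numerical illustration of this point, using invented, noise-free data that follow a curved function: the correlation coefficient of the straight-line fit is nevertheless above 0.99, while the residuals show a systematic pattern.

```python
import math


def straight_line_diagnostics(x, y):
    """Fit y = a + b*x by least squares; return r and the residuals."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return sxy / math.sqrt(sxx * syy), residuals


# Invented, clearly curved response: y = 100*x - 2*x^2 for x = 1..10.
x = list(range(1, 11))
y = [100 * xi - 2 * xi ** 2 for xi in x]
r_value, residuals = straight_line_diagnostics(x, y)
```

Here `r_value` exceeds 0.99 although the data are exactly quadratic; the residuals are negative at both ends of the range and positive in the middle, which is the pattern a residual plot would reveal.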

21. Which level of R² is sufficient?


One cannot define a sufficient level of R²! The closer R² is to 1, the better the quality of the predictions made through the calibration. But certain calibration problems may never get beyond R² = 0.98, while for others 0.998 is a sign of an error.

22. Which information can I get from the plot of residuals?
The plot of residuals (see point 19) can show whether the assumption of linearity is met. But it can also be used to check for homo- or heteroscedasticity of the calibration data and it is an indicator of the residual variability.

23. What is the residual standard deviation?


The residual standard deviation is a measure of the goodness-of-fit of the calibration. The smaller the residual standard deviation, the closer the measured data points are to the calculated calibration curve. It is used to test the significance of the intercept and the slope.
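For a straight-line calibration fitted by ordinary least squares, the residual standard deviation is commonly calculated as shown below. The notation is an assumption made for this illustration: $\hat{\sigma}$ denotes the residual standard deviation, $N$ the total number of measurements, $y_i$ and $x_i$ the individual responses and concentrations, $\hat{y}_i$ the responses calculated from the calibration function, and $\bar{x}$ the mean of all $x_i$.

$$\hat{\sigma} = \sqrt{\frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{N - 2}}$$

The divisor $N - 2$ reflects the two parameters (intercept and slope) estimated from the data. The standard errors used for the significance tests of slope and intercept follow from it:

$$s_{\text{slope}} = \frac{\hat{\sigma}}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}}, \qquad s_{\text{intercept}} = \hat{\sigma}\,\sqrt{\frac{1}{N} + \frac{\bar{x}^2}{\sum_{i=1}^{N} (x_i - \bar{x})^2}}$$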

24. May I force the calibration curve through the origin?


If the test of significance shows that the estimated intercept is not significantly different from zero, then the intercept term is dropped from the calibration function and the calibration curve is assumed to originate at x = 0 and y = 0. Otherwise the intercept term must be kept and the calibration curve is assumed to originate at x = 0 and y = intercept.
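One possible way to carry out such a test is a t-test of the intercept against zero, sketched below. The function name, the returned keys and the data are invented for illustration, and the critical t value is assumed to be taken from a table for N-2 degrees of freedom.

```python
import math


def intercept_test(x, y, t_crit):
    """Test the intercept of an ordinary least-squares line against zero."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s_res = math.sqrt(ss_res / (n - 2))

    # Standard error of the intercept and the resulting t value.
    se_intercept = s_res * math.sqrt(1 / n + x_mean ** 2 / sxx)
    t_value = abs(intercept) / se_intercept

    # Slope of the line forced through the origin (used only if justified).
    slope_origin = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)

    return {
        "intercept": intercept,
        "t_value": t_value,
        "significant": t_value > t_crit,
        "slope_through_origin": slope_origin,
    }


# Invented data: the second set carries a constant offset of 30 signal units.
x = [1, 2, 3, 4, 5, 6]
y_a = [52, 99, 148, 203, 249, 299]
y_b = [82, 129, 178, 233, 279, 329]

# Two-sided t value for 95 % and 4 degrees of freedom, taken from a table.
res_a = intercept_test(x, y_a, t_crit=2.776)
res_b = intercept_test(x, y_b, t_crit=2.776)
```

In this example only the second data set has a significant intercept, so only for the first one would the line through the origin be used.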

25. What is homo- and heteroscedasticity?


Homoscedasticity is the term for calibration data having about equal variability over the whole calibration range. If the variability of the data changes from one end of the range to the other, the data are said to be heteroscedastic.


26. How do I test for homoscedasticity / heteroscedasticity?


Whether one is dealing with homo- or heteroscedasticity can be determined from the residual plot. In the case of homoscedasticity the residuals are more or less all within a band parallel to the x-axis. In the case of heteroscedasticity the residuals assume a fan shape, from tight at one end to spread out at the opposite end (see Figures 7 & 8).

Figure 7: homoscedastic data (residual plot: residuals versus fitted values)

Figure 8: heteroscedastic data (residual plot: residuals versus fitted values)
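As a numerical complement to the visual check, and not a method described in this guide, the variances of replicate measurements at the lowest and the highest calibration level can be compared by means of a variance ratio (F-test). The sketch below assumes that such replicates are available; the function name and the data are invented.

```python
import statistics


def variance_ratio(low_replicates, high_replicates):
    """Ratio of the larger to the smaller replicate variance."""
    v_low = statistics.variance(low_replicates)
    v_high = statistics.variance(high_replicates)
    return max(v_low, v_high) / min(v_low, v_high)


# Invented replicate signals at the lowest and highest calibration level.
low = [100, 101, 99, 100, 102, 98]
high = [10000, 10100, 9900, 10000, 10200, 9800]
ratio = variance_ratio(low, high)
```

The ratio is then compared with a tabulated F value for the respective degrees of freedom (here 5 and 5; the 99% value is 10.97). A ratio far above this value, as in this example, indicates heteroscedastic data.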

27. Linear regression or weighted linear regression: which shall I apply?


For homoscedastic data, ordinary linear regression is appropriate. If the data are heteroscedastic, however, ordinary linear regression will result in inflated estimates of the residual standard deviation. Weighted linear regression should therefore be used in such cases.
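A weighted straight-line fit can be sketched as follows. The choice of weights is an assumption of this example: weights of 1/x² are used here merely as one frequently applied option; reciprocal variances of replicate measurements at each level are another. The function name and the data are invented.

```python
def weighted_line_fit(x, y, weights):
    """Weighted least-squares straight line; returns (intercept, slope)."""
    w_sum = sum(weights)
    # Weighted means of x and y.
    x_w = sum(w * xi for w, xi in zip(weights, x)) / w_sum
    y_w = sum(w * yi for w, yi in zip(weights, y)) / w_sum
    sxx_w = sum(w * (xi - x_w) ** 2 for w, xi in zip(weights, x))
    sxy_w = sum(w * (xi - x_w) * (yi - y_w) for w, xi, yi in zip(weights, x, y))
    slope = sxy_w / sxx_w
    return y_w - slope * x_w, slope


# Invented data whose scatter grows with concentration.
x = [1, 2, 5, 10, 20, 50]
y = [10.2, 19.7, 51, 98, 205, 488]

ordinary = weighted_line_fit(x, y, [1.0] * len(x))             # equal weights
weighted = weighted_line_fit(x, y, [1 / xi ** 2 for xi in x])  # 1/x^2 weights
```

With equal weights the function reduces to ordinary linear regression, in which the highest standards dominate the fit; with 1/x² weights the low standards gain influence, which typically matters most for results near the lower end of the calibration range.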

28. May I remove outliers?


If an outlying value can be traced back to a failure in the system (e.g. injection error, bad chromatography, pipetting error, etc.) then it is permissible to remove it or better yet to repeat the measurement in question. If such a retrace does not come up with any failure then the outlying value should be considered as a real but rare incident and kept in the data set.


29. How do I estimate confidence and prediction intervals?


They are estimated based on the estimates of the intercept ($\hat{\beta}_0$), the slope ($\hat{\beta}_1$), and the residual standard deviation ($\hat{\sigma}$) according to Eq 6.

$$y_C = \hat{\beta}_0 + \hat{\beta}_1 x_C \pm t_{p,n-2}\,\hat{\sigma}\,\sqrt{\frac{1}{n} + \frac{(x_C - \bar{x})^2}{\sum (x_n - \bar{x})^2}} \qquad \text{(Eq 6)}$$

with
$y_C$ = upper or lower bound of the confidence interval for $x_C$
$x_C$ = value of x for which to compute the confidence interval
$\hat{\beta}_0$ = estimate of the intercept
$\hat{\beta}_1$ = estimate of the slope
$\hat{\sigma}$ = estimate of the residual standard deviation
$t_{p,n-2}$ = Student's t for probability p and n-2 degrees of freedom
$n$ = number of observations
$\bar{x}$ = average of all x-values of the calibration
$x_n$ = individual x-values of the calibration

The confidence interval, or in the case of regression analysis more aptly the confidence band, defines the region in which the regression line would be found with a certain probability (usually 95%) if the calibration were repeated under similar conditions. As such, the confidence band is of minor interest. More important for the task of calibration is the prediction band, which is wider than the confidence band.

$$y_P = \hat{\beta}_0 + \hat{\beta}_1 x_P \pm t_{p,n-2}\,\hat{\sigma}\,\sqrt{1 + \frac{1}{n} + \frac{(x_P - \bar{x})^2}{\sum (x_n - \bar{x})^2}} \qquad \text{(Eq 7)}$$

The subscript C (Confidence) was replaced by the subscript P (Prediction). Otherwise the same definitions as for Eq 6 apply. The projection of the outer bounds of this prediction band onto the y-axis defines the range of values which could reasonably be expected if one were to predict a new y for a new x. The above formulas are for the ordinary least squares approach. If a weighted least squares approach has to be used because of heteroscedasticity, the weighted equivalents of all the estimates are used in Eq 6 and 7.
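The two formulas can be evaluated with a few lines of code. The sketch below covers the ordinary least squares case only; the function name and the data are invented, and the value of Student's t is assumed to be taken from a table for n-2 degrees of freedom.

```python
import math


def interval_half_widths(x, y, x0, t_crit):
    """Return the fitted value at x0 and the half widths of the
    confidence band (Eq 6) and the prediction band (Eq 7)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s_res = math.sqrt(ss_res / (n - 2))

    # Term shared by both bands: 1/n + (x0 - mean)^2 / sum of squares of x.
    shared = 1 / n + (x0 - x_mean) ** 2 / sxx
    confidence = t_crit * s_res * math.sqrt(shared)
    prediction = t_crit * s_res * math.sqrt(1 + shared)
    return intercept + slope * x0, confidence, prediction


# Invented calibration data; t for 95 % (two-sided) and 4 degrees of freedom.
x = [1, 2, 3, 4, 5, 6]
y = [102, 199, 298, 403, 499, 599]
y_mid, conf_mid, pred_mid = interval_half_widths(x, y, x0=3.5, t_crit=2.776)
y_end, conf_end, pred_end = interval_half_widths(x, y, x0=6, t_crit=2.776)
```

Both bands are narrowest at the mean of the x-values and widen towards the ends of the calibration range, and the prediction band is always wider than the confidence band.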


General
30. Is there any internationally harmonised document on calibration?
ILAC/OIML Guide on Calibration: ILAC Guide G24:2007 / OIML D 10:2007, "Guidelines for the determination of calibration intervals of measuring instruments", ILAC, Silverwater, Australia, 2007.
IUPAC Recommendations 1998: K. Danzer, L.A. Currie (1998), "Guidelines for Calibration in Analytical Chemistry. Part 1. Fundamentals and single component calibration", Pure & Appl. Chem., 70: 993-1014.
ISO Guide: ISO Guide 32:1997, "Calibration in analytical chemistry and use of certified reference materials", ISO, Geneva, Switzerland, 1997.
ISO Standard: ISO 8466-1:1990, "Water quality - Calibration and evaluation of analytical methods and estimation of performance characteristics. Part 1: Statistical evaluation of the linear calibration function", ISO, Geneva, Switzerland, 1990.
ISO Standard: ISO 11095:1996, "Linear calibration using reference materials", ISO, Geneva, Switzerland, 1996.

31. Where can I get guidance on calibration?


LGC Best practice guide for calibration design: LGC Document "Preparation of calibration curves - a guide to best practice" (2003).
L. Cuadros-Rodríguez, L. Gámiz-Gracia, E.M. Almansa-López, J.M. Bosque-Sendra (2003), "Calibration in chemical measurement processes. II. A methodological approach", Trends in Anal. Chem., 20: 620-636.


1 United States Pharmacopeia, Chapter 41, 28th Edition, Rockville, Maryland, USA, 2005.
2 Ch. Burgess and R.D. McDowall, "A question of balance? Part 2: Putting principles into practice", LCGC Europe, 19/3 (2006).
