fit - Gnuplot: An Interactive Plotting Program

3.6 fit

The fit command can fit a user-supplied expression to a set of data points (x,z) or (x,y,z), using an implementation of the nonlinear least-squares (NLLS) Marquardt-Levenberg algorithm. Any user-defined variable occurring in the expression may serve as a fit parameter, but the return type of the expression must be real.

Syntax:

           fit {<ranges>} <expression>
               '<datafile>' {datafile-modifiers}
               via '<parameter file>' | <var1>{,<var2>,...}

Ranges may be specified to temporarily restrict the data to be fitted; any out-of-range data points are ignored. The syntax is

           [{dummy_variable=}{<min>}{:<max>}],

analogous to `plot`; see ranges.
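
As a brief sketch of the range syntax, the following restricts a fit to part of the data and, in the second command, also renames the dummy variable. The file name 'measured.dat' is borrowed from the examples further down; the limits and the parameter names a, b, c are illustrative assumptions.

           # fit only the points with 0 <= x <= 10
           f(x) = a*x**2 + b*x + c
           fit [0:10] f(x) 'measured.dat' via a, b, c

           # same restriction, with the dummy variable renamed to t
           fit [t=0:10] a*t + b 'measured.dat' via a, b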

<expression> is any valid `gnuplot` expression, although it is usual to use a previously user-defined function of the form f(x) or f(x,y).

<datafile> is treated as in the `plot` command. All the datafile modifiers (using, every,...) except smooth and the deprecated thru are applicable to fit. See datafile.

The default data formats for fitting functions with a single independent variable, z=f(x), are z or x:z. That is, if there is only a single column then it is the dependent variable and the line number is the independent variable. If there are two columns, the first is the independent variable and the second is the dependent variable.

Those formats can be changed with the datafile using qualifier, for example to take the z value from a different column or to calculate it from several columns. A third using qualifier (a column number or an expression), if present, is interpreted as the standard deviation of the corresponding z value and is used to compute a weight for the datum, 1/s**2. Otherwise, all data points are weighted equally, with a weight of one. Note that if you don't specify a using option at all, no z standard deviations are read from the datafile even if it does have a third column, so you'll always get unit weights.
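
The difference between weighted and unweighted fits can be sketched as follows. This assumes 'measured.dat' holds x, z and the standard deviation of z in columns 1 to 3; that column layout is an assumption made for illustration only.

           f(x) = a*x**2 + b*x + c

           # no using option: column 3 is ignored, every point has weight 1
           fit f(x) 'measured.dat' via a, b, c

           # third using item read as s, so each point is weighted by 1/s**2
           fit f(x) 'measured.dat' using 1:2:3 via a, b, c

           # constant expression as the third item: explicit unit weights
           fit f(x) 'measured.dat' using 1:2:(1) via a, b, c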

To fit a function with two independent variables, z=f(x,y), the required format is using with four items, x:y:z:s. The complete format must be given; no default columns are assumed for a missing token. Weights for each data point are evaluated from 's' as above. If error estimates are not available, a constant value can be specified as a constant expression (see using), e.g., `using 1:2:3:(1)`.

The fit function may have up to five independent variables. There must be two more using qualifiers than there are independent variables, unless there is only one variable. The allowed formats, and the default dummy variable names, are as follows:

           z
           x:z
           x:z:s
           x:y:z:s
           x:y:t:z:s
           x:y:t:u:z:s
           x:y:t:u:v:z:s

The dummy variable names may be changed with ranges as noted above. The first range corresponds to the first using spec, etc. A range may also be given for z (the dependent variable), but that name cannot be changed.

Multiple datasets may be simultaneously fit with functions of one independent variable by making y a 'pseudo-variable', e.g., the dataline number, and fitting as two independent variables. See multi-branch.

The `via` qualifier specifies which parameters are to be adjusted, either directly, or by referencing a parameter file.
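
The two forms of `via` might look like this in practice. The starting values below are arbitrary placeholders, and the contents shown for 'start.par' are an assumed example of a parameter file holding one `name = value` line per parameter.

           # parameters named directly; they are given starting values first
           f(x) = a*x**2 + b*x + c
           a = 1.0
           b = 0.5
           c = 0.0
           fit f(x) 'measured.dat' via a, b, c

           # parameters and starting values read from a file instead;
           # 'start.par' is assumed to contain lines such as
           #     a = 1.0
           #     b = 0.5
           #     c = 0.0
           fit f(x) 'measured.dat' via 'start.par'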

Examples:

           f(x) = a*x**2 + b*x + c
           g(x,y) = a*x**2 + b*y**2 + c*x*y
           FIT_LIMIT = 1e-6
           fit f(x) 'measured.dat' via 'start.par'
           fit f(x) 'measured.dat' using 3:($7-5) via 'start.par'
           fit f(x) './data/trash.dat' using 1:2:3 via a, b, c
           fit g(x,y) 'surface.dat' using 1:2:3:(1) via a, b, c
           fit a0 + a1*x/(1 + a2*x/(1 + a3*x)) 'measured.dat' via a0,a1,a2,a3
           fit a*x + b*y 'surface.dat' using 1:2:3:(1) via a,b
           fit [*:*][yaks=*:*] a*x+b*yaks 'surface.dat' u 1:2:3:(1) via a,b
           fit a*x + b*y + c*t 'foo.dat' using 1:2:3:4:(1) via a,b,c
           h(x,y,t,u,v) = a*x + b*y + c*t + d*u + e*v
           fit h(x,y,t,u,v) 'foo.dat' using 1:2:3:4:5:6:(1) via a,b,c,d,e

After each iteration step, detailed information about the current state of the fit is written to the display. The same information about the initial and final states is written to a log file, "fit.log". This file is always appended to, so as to not lose any previous fit history; it should be deleted or renamed as desired. By using the command `set fit logfile`, the name of the log file can be changed.
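
For instance, to keep the history of one series of fits apart from the default "fit.log", the log file can be renamed before fitting. The name 'run1.log' is a hypothetical choice.

           # later fit commands append to run1.log instead of fit.log
           set fit logfile 'run1.log'
           fit f(x) 'measured.dat' via a, b, c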

If gnuplot was built with support for error variables and you have enabled them with `set fit errorvariables`, the error for each fitted parameter is stored in a variable named after the parameter, with "_err" appended. The errors can thus be used as input for further computations.
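
A minimal sketch of using the stored errors, assuming a gnuplot build that supports this feature and the parameters a and b of a straight-line model:

           set fit errorvariables
           fit a*x + b 'measured.dat' via a, b

           # a_err and b_err now hold the errors of a and b
           print a, a_err
           print b, b_err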

The fit may be interrupted by pressing Ctrl-C. After the current iteration completes, you have the option to (1) stop the fit and accept the current parameter values, (2) continue the fit, or (3) execute a `gnuplot` command as specified by the environment variable FIT_SCRIPT. The default for FIT_SCRIPT is replot, so if you had previously plotted both the data and the fitting function in one graph, you can display the current state of the fit.

Once fit has finished, the update command may be used to store final values in a file for subsequent use as a parameter file. See update for details.
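
A typical sequence might look like the following sketch. It assumes the parameter file 'start.par' from the examples above; the output name 'final.par' is hypothetical, and the exact behavior of the one- and two-file forms is described under update.

           fit f(x) 'measured.dat' via 'start.par'

           # write the fitted values to a new parameter file
           update 'start.par' 'final.par'

           # a later session can resume from the stored values
           fit f(x) 'measured.dat' via 'final.par'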