Context Navigation

#11928 closed Bugs (fixed)

surveyor area method is inaccurate

Reported by:	Charles Karney <charles@…>	Owned by:	Barend Gehrels
Milestone:	Boost 1.61.0	Component:	geometry
Version:	Boost 1.58.0	Severity:	Problem
Keywords:	area	Cc:	charles.karney@…

Description

The following code illustrates a problem with the "surveyor" strategy for computing planar areas. It computes the area of a square of area 2 and the same square translated by 10⁸ in x and y. bg::area incorrectly computes the area in the second case as 0.

#include <iostream>
#include <boost/geometry.hpp>

namespace bg = boost::geometry;
typedef bg::model::point<double, 2, bg::cs::cartesian> pt;

int main() {
  bg::model::polygon<pt, false, false> poly;
  // A simple square of area 2
  bg::read_wkt("POLYGON((1 0,0 1,-1 0,0 -1))", poly);
  // Computes the correct area (= 2)
  std::cout << "area " << bg::area(poly) << "\n";
  // The same square shifted by (10^8 10^8)
  bg::read_wkt("POLYGON((100000001 100000000, \
                         100000000 100000001, \
                          99999999 100000000, \
                         100000000  99999999))", poly);
  // Should give 2 but instead gives 0
  std::cout << "area " << bg::area(poly) << "\n";
}

Running this code gives

area 2
area 0

Discussion:

Surely it's a bad idea to embed the algorithm name "surveyor" into the API. Computing 2d areas is really straightforward. Users shouldn't need to worry about the underlying method. Maybe "planar_area" is a better name.
In any case "surveyor" is a rather silly name of the type beloved of Wikipedia editors. There's no evidence that surveyors actually used this method.
It's more expensive (at 2*n multiplies and 2*n additions) than applying the trapezium rule (otherwise known as "doing the integral"), i.e., summing (x[i+1] - x[i]) * (y[i+1] + y[i]) / 2, which clocks in at n multiplies and 3*n additions.
Finally, as this example illustrates, it's subject to catastrophic loss of accuracy, while the trapezium rule will give the exact result. (Kahan contrasts evaluating x*x - y*y compared with (x - y) * (x + y) with x = 1e15+1, y = 1e15-1. The former result is wrong by 1.5% while the latter is exact.)

Change History (4)

comment:1 by awulkiew, 7 years ago

The name of the formula seems to be older than Wikipedia [1][2].

I haven't measured it but I'd rather guess that nowadays the most expensive part in both cases is data access.

Furthermore I guess that in the case of the trapezium rule the result is also not precise in general when coordinates are big. So I'd rather fight with this issue by moving all of the coordinates closer to the origin of the coordinate system before passing them into the strategy, so doing it on the fly while calculating the area with any strategy. Similar technique was used before to improve the precision of centroid(). This solution should improve the precision of a function no matter what strategy was used.

With that said I'm not against changing the formula. But should trapezium rule be better in all cases or is there a reason why the old one could be left as it was?

[1] Flint, Abel. "A System of Geometry and Trigonometry: together with a treatise on surveying: teaching various ways of taking the survey of a field; also to protract the same and find the area. Likewise, Rectangular Surveying; Or, an accurate method of calculating the area of any field arithmetically, without the necessity of plotting it.", 1808

[2] Braden, Bart. "The surveyor’s area formula.", 1986

comment:2 by Charles Karney <charles@…>, 7 years ago

Thanks for the reference to Abel Flint (a great name!). You will see on pp. 59-63 that he describes the trapezium rule and not the so-called surveyor's method! Most of his contemporaries who were mathematicians would have recognized this as such and not, therefore, have seen it as some special technique invented by a surveyor.

The analysis of why

sum x_i y_i+1 - x_i+1 y_i

is typically subject to worse round off errors than

sum (x_i+1 - x_i) (y_i+1 + y_i)

goes as follows. Each formula contains

products
differences in forming each term in the sum
differences in forming the sum itself

In the surveyors formula 2 follows 1 and so the mild round off errors from the products are sometimes catastrophically magnified. In the trapezium rule 2 precedes 1 and there is no round off error if x_i+1 and x_i are within a factor of 2 of each other (which is a common situation).

Both formulas are subject to potentially dangerous round off with 3. Centering the numbers might ameliorate this problem. Alternatively, merely switching to the other trapezium rule involving (y_i+1 - y_i) would help, e.g., in the case of UTM coordinates where often the northings are much larger than the eastings.

comment:3 by Charles Karney <charles@…>, 7 years ago

By the way, in GeographicLib, I use an Accumulator class to perform the summation. This effectively sums at double the normal precision.

comment:4 by awulkiew, 7 years ago

Milestone:	To Be Determined → Boost 1.61.0
Resolution:	→ fixed
Status:	new → closed

I tweaked the formula as you suggested. Thanks.

Note: See TracTickets for help on using tickets.

Download in other formats: