A recurring issue I encounter is how best to model non-trivial Calculations in software: tasks that take multiple input parameters, do some computation, and provide a result object with multiple fields.
Some real examples from my career:
- For a customer with a loan contract, calculating the amount they must pay to pay out the loan early and terminate the contract. Typically there are extra fees/charges to end a loan early. Also, we will want to know not just the final figure, but a breakdown of it into components.
- In a game world, finding the best path(s) a unit should take to a goal location, with some input constraints on where that path should go, using the A* search algorithm. We might also specify constraints on how much CPU time or memory the search is allowed to consume.
- To obtain an accurate numerical prediction function, we perform a Genetic Algorithm (GA) search over some training data to find optimal input parameters for it (there are many of these). We might start with an existing function as input and improve it, or start from scratch. Configuring a GA is itself complex. And the result object may contain data on performance during training, as well as the desired function params themselves.
Storing The Inputs and Results of a Calculation
The simplest calculations typically return a number, a date or an object. But it doesn't take very long before we find ourselves wanting extra details about the result, or about how the result was derived. A Calculation Object (sometimes: Result Object) groups together all of the results of a calculation, and is almost always recommended.
I also find that it's very useful to record (or identify) the inputs to a calculation in the same calculation object. They may be required as a historical record for audit purposes, or so the calculation can be repeated, or for verification of its correctness.
Who Owns the Calculation’s Logic
If a Calculation Object ends up holding all the input state for the calculation, and all the results, should it also end up with behavioural methods for doing the calculation?
new PayoutCalculation(contract, payoutCharges, payoutDate)
payoutCalculation.getPayoutDueAmount() //calculated field
The above style is object-oriented, but I have come to prefer a separate Calculator object, a Factory that performs the calculation logic and creates Calculation objects as its products…
payoutCalculation = payoutCalculator.calculatePayout(contract, payoutCharges, payoutDate)
…for three reasons:
- A calculator can vary polymorphically and is less coupled to the Calculation Object and the client code.
- Calculation objects become stateless (effectively immutable) – they are only given to the client once all their fields are filled in.
- Often calculators need significant internal data structures during a calculation that are not part of the result.
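A minimal sketch of that split, under some loud assumptions: the contract is reduced to an id and an outstanding principal, payoutCharges is reduced to a single fee rate, and the "fee is a flat percentage of principal" rule is invented purely for illustration. The class names other than PayoutCalculation and calculatePayout are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDate;

/** Calculation Object: inputs plus a breakdown of the result, all final. */
final class PayoutCalculation {
    final String contractId;               // input (identifies the contract)
    final LocalDate payoutDate;            // input
    final BigDecimal outstandingPrincipal; // result component
    final BigDecimal earlyTerminationFee;  // result component

    PayoutCalculation(String contractId, LocalDate payoutDate,
                      BigDecimal outstandingPrincipal, BigDecimal earlyTerminationFee) {
        this.contractId = contractId;
        this.payoutDate = payoutDate;
        this.outstandingPrincipal = outstandingPrincipal;
        this.earlyTerminationFee = earlyTerminationFee;
    }

    /** Final figure; assumed here to be the plain sum of the components. */
    BigDecimal getPayoutDueAmount() {
        return outstandingPrincipal.add(earlyTerminationFee);
    }
}

/** Calculator: a factory whose products are completed Calculation objects. */
interface PayoutCalculator {
    PayoutCalculation calculatePayout(String contractId, BigDecimal outstandingPrincipal,
                                      BigDecimal feeRate, LocalDate payoutDate);
}

/** One hypothetical variant; others can be swapped in behind the interface. */
final class FlatRateFeePayoutCalculator implements PayoutCalculator {
    @Override
    public PayoutCalculation calculatePayout(String contractId, BigDecimal outstandingPrincipal,
                                             BigDecimal feeRate, LocalDate payoutDate) {
        // Illustrative rule only: fee = principal * rate, rounded to cents.
        BigDecimal fee = outstandingPrincipal.multiply(feeRate)
                                             .setScale(2, RoundingMode.HALF_UP);
        return new PayoutCalculation(contractId, payoutDate, outstandingPrincipal, fee);
    }
}
```

The client only ever sees a fully populated PayoutCalculation, and a different fee policy is just another PayoutCalculator implementation.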
In the payout example above, the three input parameters needed to calculate a payout are provided in a single, atomic method invocation that performs the calculation. This is the neatest, preferable case.
Sometimes, a calculation has many overridable defaults for inputs. For simpler cases, this can be solved through overloading…
payoutCalculator.calculatePayout(contract)
//default payoutCharges, date = today
…but as the inputs get more numerous, or when a calculation has many mandatory parameters, it becomes impractical to provide everything as individual parameters to a method call. Then, your options are to:
- Build up the inputs on the Calculation object itself, then pass it to the Calculator to have its outputs filled in. You lose the statelessness of the Calculation this way.
- Create a Recipe object and pass that to the Calculator.
- Bake the inputs into the Calculator itself, so that every Calculation it produces picks up its parameters.
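The Recipe option from the list above might look something like the following sketch. Everything here is hypothetical: PayoutRecipe, its with-methods, the 5% default fee rate and the fee rule are stand-ins I made up to show the shape. To keep it short, the calculator returns only the amount due rather than a full Calculation object.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDate;

/** Hypothetical Recipe: mandatory inputs in the constructor, defaults overridable. */
final class PayoutRecipe {
    final BigDecimal outstandingPrincipal; // mandatory
    final BigDecimal feeRate;              // defaults to an assumed 5%
    final LocalDate payoutDate;            // defaults to today

    PayoutRecipe(BigDecimal outstandingPrincipal) {
        this(outstandingPrincipal, new BigDecimal("0.05"), LocalDate.now());
    }

    private PayoutRecipe(BigDecimal outstandingPrincipal, BigDecimal feeRate,
                         LocalDate payoutDate) {
        this.outstandingPrincipal = outstandingPrincipal;
        this.feeRate = feeRate;
        this.payoutDate = payoutDate;
    }

    // Each with-method returns a new recipe, so a recipe can be shared safely.
    PayoutRecipe withFeeRate(BigDecimal newFeeRate) {
        return new PayoutRecipe(outstandingPrincipal, newFeeRate, payoutDate);
    }

    PayoutRecipe withPayoutDate(LocalDate newPayoutDate) {
        return new PayoutRecipe(outstandingPrincipal, feeRate, newPayoutDate);
    }
}

/** Calculator that takes all of its inputs from a single Recipe argument. */
final class RecipePayoutCalculator {
    BigDecimal calculatePayout(PayoutRecipe recipe) {
        // Illustrative rule only: fee = principal * rate, rounded to cents.
        BigDecimal fee = recipe.outstandingPrincipal.multiply(recipe.feeRate)
                                                    .setScale(2, RoundingMode.HALF_UP);
        return recipe.outstandingPrincipal.add(fee);
    }
}
```

A caller overrides only what differs from the defaults, e.g. new PayoutRecipe(principal).withFeeRate(rate), and the calculator's signature stays the same however many inputs are added later.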
Complex calculators often need lots of state while they are running. For example, A* pathfinding maintains two Sets of nodes, one with additional ordering requirements. Where to store it?
- Storing it inside the Calculator means it becomes stateful, and so cannot be shared across threads. Care needs to be taken to reset the calculator before its next use. This choice is appropriate when you have a known, single-threaded audience, and/or calculator state is heavy and expensive to create.
- Throwaway calculators are created/obtained by each client as needed, used once, then discarded. Obviously thread-safe.
- Calculators that hold multiple copies of the required state, one associated with each client request, tracked e.g. by a record in a hash table.
Be aware that if the calculator is asynchronous and long-running, there may be concurrent, overlapping calculations running even in a single-threaded scenario.
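One way to combine the throwaway idea with a shareable entry point is sketched below: a facade that holds no per-search state and creates a throwaway object, owning the open and closed sets, for each call. The grid representation, class names and the choice to return only a step count are my assumptions for illustration; it assumes a non-empty rectangular grid with 4-connected, unit-cost moves.

```java
import java.util.HashSet;
import java.util.PriorityQueue;
import java.util.Set;

/** Shareable facade: holds no per-search state, so callers cannot interfere. */
final class GridPathCalculator {
    /** Returns the number of steps on a shortest path, or -1 if there is none. */
    int shortestPathLength(boolean[][] blocked, int startRow, int startCol,
                           int goalRow, int goalCol) {
        // All working state lives in a throwaway object made for this one call.
        return new Search(blocked, goalRow, goalCol).run(startRow, startCol);
    }

    /** Throwaway per-call state: the open and closed sets of A*. */
    private static final class Search {
        private final boolean[][] blocked;
        private final int rows, cols, goalRow, goalCol;
        // Open set ordered by f = g + h; each entry is {f, g, row, col}.
        private final PriorityQueue<int[]> open =
                new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        // Closed set of already expanded cells, encoded as row * cols + col.
        private final Set<Integer> closed = new HashSet<>();

        Search(boolean[][] blocked, int goalRow, int goalCol) {
            this.blocked = blocked;
            this.rows = blocked.length;
            this.cols = blocked[0].length;
            this.goalRow = goalRow;
            this.goalCol = goalCol;
        }

        int run(int startRow, int startCol) {
            int[][] moves = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
            open.add(new int[] {heuristic(startRow, startCol), 0, startRow, startCol});
            while (!open.isEmpty()) {
                int[] node = open.poll();
                int g = node[1], row = node[2], col = node[3];
                if (row == goalRow && col == goalCol) return g;
                if (!closed.add(row * cols + col)) continue; // stale duplicate entry
                for (int[] move : moves) {
                    int r = row + move[0], c = col + move[1];
                    if (r < 0 || c < 0 || r >= rows || c >= cols) continue;
                    if (blocked[r][c] || closed.contains(r * cols + c)) continue;
                    open.add(new int[] {g + 1 + heuristic(r, c), g + 1, r, c});
                }
            }
            return -1; // open set exhausted without reaching the goal
        }

        /** Manhattan distance: admissible and consistent for unit-cost 4-way moves. */
        private int heuristic(int row, int col) {
            return Math.abs(goalRow - row) + Math.abs(goalCol - col);
        }
    }
}
```

Because nothing survives between calls, there is nothing to reset and a single GridPathCalculator can be handed to any number of threads; the cost is rebuilding the working sets on every search.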
SimpleDateFormat is perhaps an example of inappropriately handled statefulness. It's in a Java API widely used by experts and beginners alike, with an immutable-sounding name, and yet it's (unintuitively) stateful, so that if multiple threads access the same format, they can get crosstalk between their formatting jobs.
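For comparison, here are two common ways to sidestep that crosstalk, sketched with standard JDK classes: give each thread its own SimpleDateFormat via ThreadLocal, or use java.time's DateTimeFormatter, which is documented as immutable and thread-safe. The DateFormatting class, its method names and the "yyyy-MM-dd" pattern are just illustrative choices.

```java
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;

final class DateFormatting {
    // Immutable and thread-safe: one instance can be shared by every thread.
    static final DateTimeFormatter SHARED =
            DateTimeFormatter.ofPattern("yyyy-MM-dd", Locale.ROOT);

    // SimpleDateFormat mutates internal state while formatting,
    // so each thread lazily gets its own private copy.
    static final ThreadLocal<SimpleDateFormat> PER_THREAD =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT));

    static String formatModern(LocalDate date) {
        return SHARED.format(date);
    }

    static String formatLegacy(Date date) {
        // Uses the JVM's default time zone, as SimpleDateFormat does by default.
        return PER_THREAD.get().format(date);
    }
}
```

These map onto the storage choices above: the ThreadLocal variant is one copy of the state per client, while DateTimeFormatter avoids the problem by keeping no per-job state in the shared object at all.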